In the previous post, we focused on setting up declarative Jenkins pipelines with emphasis on parametrizing builds and using environment variables across pipeline stages.
In this post, we look at various tips that can be useful when automating R application testing and continuous integration, with regard to orchestrating parallelization, combining sources from multiple Git repositories and ensuring proper access rights for the Jenkins agent.
There are numerous ways to achieve parallel computation in the context of an R application; those native to R are, for example, ...
Jenkins is a popular open-source tool that helps teams automate and implement continuous integration and deployment pipelines, comparable to, for example, Atlassian’s Bamboo, GitLab CI or, to some extent, Travis CI.
In this post, we share some practical lessons learned when integrating R applications via Jenkins for the purpose of continuous integration and regression testing, on runner nodes configured through declarative pipelines defined in a Jenkinsfile.
Recently, I was involved in a task that included reading and writing quite large amounts of data, totaling more than 1 TB of CSV files, without the standard big data infrastructure. After trying multiple approaches, the one that made this possible was using data.table’s reading and writing facilities, fread() and fwrite().
This motivated me to benchmark data.table’s fread() and see how it compares to other options, such as the tidyverse’s readr and base R, for reading tabular data from text files such as CSVs.
As pointed out by the recent “Read the R source” post on the R-hub website, reading the actual code, not just the documentation, is a great way to learn more about programming and implementation details. But there is one more activity that gives even more hands-on experience and understanding of the code in practice.
In this post, we provide tips on how to interactively debug R code step-by-step and investigate the values of objects in the middle of function execution.
As we wrote in Should you start your R blog now?, blogging has probably never been more accessible to the general population, R users included. Usually, the simplest solution is to host your blog via a service that provides hosting for free, such as Netlify, GitHub or GitLab Pages. But what if you want to host that awesome blog on your own HTTPS-enabled domain?
In this post, we will look at how to port a Hugo-based website, such as a blogdown blog, to our own domain, specifically focusing on GitLab Pages.
In the previous post, we looked at how to easily automate R analysis, modeling, and development work for free using GitLab’s CI/CD. Together with the fantastic R-hub project, we can use GitLab CI/CD to do much more.
In this post, we will take it to the next level by using R-hub to test our development work on many different platforms, such as multiple Linux setups, MS Windows and macOS.
Automating the execution, testing and deployment of R work is a very powerful way to ensure the reproducibility, quality and overall robustness of the code that we are building, be it for data analysis and modeling purposes, developing R packages or even blogging. Modern tools also provide a free and easy-to-use way of achieving this goal.
In this post, we will show a quick and simple way to automate R data analysis and package development checking, testing and installation with GitLab CI/CD and provide example files that can be used for testing packages and deploying blogdown-based websites.
It has been a year since I posted the first post on this blog. Since that time, I have learned many lessons, but the main one is probably that blogging has never been as accessible as it is now.
In this anniversary post, I would like to give you a few reasons to start your own R blog and write about what I have learned in my first year of blogging about R.
In the practical tips for R Markdown post, we talked briefly about how we can easily create professional reports directly from R scripts, without the need for converting them manually to Rmd and creating code chunks. In this one, we will provide useful tips on advanced options for styling, using themes and producing lightweight HTML reports directly from R scripts. We will also provide a repository with an example R script and rendering code to get differently styled and sized outputs easily.
Data manipulation and aggregation is one of the classic tasks anyone working with data will come across. We can, of course, perform data transformation and aggregation with base R, but when speed and memory efficiency come into play, data.table is my package of choice.
In this post, we will look at one of the fresh and very useful pieces of functionality that came to data.table only last year: grouping sets, which enable us, for example, to create pivot table-like reports with sub-totals and a grand total quickly and easily.