Automating the execution, testing and deployment of R work is a very powerful tool to ensure the reproducibility, quality and overall robustness of the code that we are building, be it for data analysis and modeling purposes, developing R packages or even blogging. Modern tools also provide a free an easy to use way of achieving this goal.
In this post, we will show a quick and simple way to automate R data analysis and package development checking, testing and installation with GitLab CI/CD and provide example files that can be used for testing packages and deploying blogdown-based websites.
It has been a year since I posted the first post on this blog. Since that time, I have learned many lessons, but the main one is probably that blogging has never been as accessible as it is now.
In this anniversary post, I would like to give you a few reasons to start your own R blog and write about what I have learned in my first year of blogging about R.
If the practical tips for R Markdown post we talked briefly about how we can easily create professional reports directly from R scripts, without the need for converting them manually to Rmd and creating code chunks. In this one, we will provide useful tips on advanced options for styling, using themes and producing light-weight HTML reports directly from R scripts. We will also provide a repository with example R script and rendering code to get different styled and sized outputs easily.
Data manipulation and aggregation is one of the classic tasks anyone working with data will come across. We of course can perform data transformation and aggregation with base R, but when speed and memory efficiency come into play, data.table is my package of choice.
In this post we will look at of the fresh and very useful functionality that came to data.table only last year - grouping sets, enabling us, for example, to create pivot table-like reports with sub-totals and grand total quickly and easily.
When speed and memory efficiency is important, the data.table package is one of the ways to improve those aspects of our R code dramatically. Including data.table in a package also comes with the added benefit of only importing the methods package, which is part of base R. We must also however pay attention to correctly importing and using methods, as data.table handles data.frame subsetting operators in a special way.
R Markdown is a great tool to use for creating reports, presentations and even websites that contain evaluated and rendered code. This can help us immensely when presenting data science type of work to audiences, while still being able to version control the content creation process.
One of the challenges that stay is reproducibility of the rendered results. In this post, I will list a few sources of reproducibility issues I came across and how I tried to solve them.
Including R Markdown in the workflow for presenting and publishing analyses that use code in R or other languages is a great way to make presentations, dashboards or reports good looking, reproducible and version controllable.
In this post, we will look at three simple ways to improve that workflow even further with methods that are lesser known and can make producing results with R Markdown more efficient and reviewing them more interactive.
It has been more than ten years since I wrote my first R code. And in those years, the R world has changed dramatically, and mostly to the better. I believe that the current time may be one of the best times to start working with R.
In this new year’s post we will look at the R world 10 years ago and today, and provide links to many tools that helped it become a great language to solve and present everyday tasks with a welcoming community of users and developers.
It is Christmas time! And what better time than this to write about the great tools that are available to all who like R and would like to publish their R work or even blog about it. This post is meant as a praise to the tools that are helping me to write this blog and make it a very nice experience, allowing me to focus on the content.
In this post in the R:case4base series we will examine sorting (ordering) data in base R. We will learn to sort our data based on one or multiple columns, with ascending or descending order and as always look at alternatives to base R, namely the tidyverse’s dplyr and data.table to show how we can achieve the same results.
It is recommended to first have a look at the post on subsetting to understand the concepts underlying the sorting process in more detail.