When speed and memory efficiency is important, the data.table package is one of the ways to improve those aspects of our R code dramatically. Including data.table in a package also comes with the added benefit of only importing the methods package, which is part of base R. We must also however pay attention to correctly importing and using methods, as data.table handles data.frame subsetting operators in a special way.
R Markdown is a great tool to use for creating reports, presentations and even websites that contain evaluated and rendered code. This can help us immensely when presenting data science type of work to audiences, while still being able to version control the content creation process.
One of the challenges that stay is reproducibility of the rendered results. In this post, I will list a few sources of reproducibility issues I came across and how I tried to solve them.
Including R Markdown in the workflow for presenting and publishing analyses that use code in R or other languages is a great way to make presentations, dashboards or reports good looking, reproducible and version controllable.
In this post, we will look at three simple ways to improve that workflow even further with methods that are lesser known and can make producing results with R Markdown more efficient and reviewing them more interactive.
It has been more than ten years since I wrote my first R code. And in those years, the R world has changed dramatically, and mostly to the better. I believe that the current time may be one of the best times to start working with R.
In this new year’s post we will look at the R world 10 years ago and today, and provide links to many tools that helped it become a great language to solve and present everyday tasks with a welcoming community of users and developers.
It is Christmas time! And what better time than this to write about the great tools that are available to all who like R and would like to publish their R work or even blog about it. This post is meant as a praise to the tools that are helping me to write this blog and make it a very nice experience, allowing me to focus on the content.
In this post in the R:case4base series we will examine sorting (ordering) data in base R. We will learn to sort our data based on one or multiple columns, with ascending or descending order and as always look at alternatives to base R, namely the tidyverse’s dplyr and data.table to show how we can achieve the same results.
It is recommended to first have a look at the post on subsetting to understand the concepts underlying the sorting process in more detail.
In this post in the R:case4base series we will look at string manipulation with base R, and provide an overview of a wide range of functions for our string working needs.
We will use simple examples to learn to perform basic string operations, concatenate strings, work with substrings, switch cases, quote, find and replace within strings and more. Some interesting bonuses will also be included.
As always, some popular alternatives to base R will also be suggested and many useful references provided for further reading.
In this post we will look at yet another productivity increasing feature of the RStudio IDE - Code Snippets. Code Snippets let us easily insert and potentially execute predefined pieces of code and work not just for R code, but many other languages as well.
In this post we will cover 4 different ways to increase productivity using Code Snippets and provide 11 real-life examples of their use that you can take advantage of instantly.
In this post in the R:case4base series we will look at one of the most common operations on multiple data frames - merge, also known as JOIN in SQL terms.
We will learn how to do the 4 basic types of join - inner, left, right and full join with base R and show how to perform the same with tidyverse’s dplyr and data.table’s methods. A quick benchmark will also be included.
Inspired by a recent post on how to import a directory of csv files at once using purrr and readr by Garrick, in this post we will try achieving the same using base R with no extra packages, and with data·table, another very popular package and as an added bonus, we will play a bit with benchmarking to see which of the methods is the fastest, including the tidyverse approach in the benchmark.