I have a new book in progress called A Short Introduction to Applied Statistical Programming in R, which can be viewed online as a Gitbook or as a PDF. [EDIT 2020-04-01: I will primarily focus on the Gitbook version, as I am running into some typesetting issues with the PDF at the moment.]
[EDIT 2020-04-02: The Gitbook version is fairly complete, and I do not foresee major updates to it unless they are requested or I think of something significant to add.]
Introduction
Maintaining data frame consistency within base R can be difficult. The purrr package from the tidyverse solves this problem with its map_df() function. However, we can achieve similar results, and expand upon them, with base R functions. To do so, two methods will be used.
Method 1: Use lapply(), data.frame(), and do.call()
To replicate purrr’s map_df(), we use three functions: lapply() to apply the function to some data; data.
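As a rough illustration of how these three base R functions can fit together, here is a minimal sketch. It is not necessarily the exact approach taken in the post: the helper summarise_column() and the choice of mtcars columns are assumptions made purely for the example.

```r
# Hypothetical helper (not from the post): returns a one-row data frame
# of summary statistics for a numeric vector.
summarise_column <- function(x) {
  data.frame(mean = mean(x), sd = sd(x))
}

# lapply() applies the helper to each selected column and returns
# a list of one-row data frames.
pieces <- lapply(mtcars[, c("mpg", "hp", "wt")], summarise_column)

# do.call() passes the list elements to rbind() as separate arguments,
# stacking them into a single data frame with one row per column.
result <- do.call(rbind, pieces)
print(result)
```

Because every piece is built with data.frame(), each has the same column names and types, which is what keeps the combined result consistent.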
Introduction
This blog post compares sweep() with mop(), a function I’ve created. I argue that mop() is preferable because it is more concise.
The Old Way: sweep()
The function sweep() lets one process data based on a summary statistic, for example dividing each element by its column’s mean. A problem arises, however: you must explicitly supply the summary statistic values in the STATS argument.
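To make the point concrete, here is a small sketch of the sweep() pattern described above. The matrix m is made-up example data. Note that the column means have to be computed separately and passed in through STATS.

```r
# Made-up example data: a 3 x 2 matrix with columns (1, 2, 3) and (4, 5, 6).
m <- matrix(1:6, nrow = 3)

# The summary statistic must be computed explicitly first...
col_means <- colMeans(m)

# ...and then handed to sweep() via STATS. MARGIN = 2 means "by column",
# and FUN = "/" divides each element by its column's mean.
scaled <- sweep(m, MARGIN = 2, STATS = col_means, FUN = "/")
print(scaled)
```

After this call, each column of scaled has a mean of 1, since every element was divided by its own column's mean.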