Use data.table the tidy way: An ultimate tutorial of tidyfst

I love the tidy syntax of dplyr and the ultimate speed of data.table. Why not have both? That is why I started working on tidyfst, which bridges tidy syntax and computational performance by translating tidy verbs into data.table code. This tool is especially friendly for dplyr users who want to learn some data.table, but data.table users could also benefit from it (more or less).

A great comparison of data.table and dplyr is available at https://atrebas.github.io/post/2019-03-03-datatable-dplyr/ (thanks to Atrebas). I love this tutorial very much because it digs rather deep into many features of both packages. Here I’ll try to implement all operations from that tutorial, so that potential users can see why they might prefer tidyfst for some (if not most) tasks.

The examples below have all been checked with tidyfst. Now let’s begin.

Create example data

library(tidyfst)

set.seed(1L)

## Create a data table
DF <- data.table(V1 = rep(c(1L, 2L), 5)[-10],
                V2 = 1:9,
                V3 = c(0.5, 1.0, 1.5),
                V4 = rep(LETTERS[1:3], 3))
copy(DF) -> DT

class(DF)
DF

Basic operations

Filter rows
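
As a minimal sketch of row filtering (not necessarily the exact calls used in the original tutorial), the snippet below rebuilds the example table and uses tidyfst's `slice_dt()` and `filter_dt()`. The result names (`rows_3_4`, `big_v2`, `a_or_c_odd`) are made up for illustration.

```r
library(tidyfst)

# Same example table as in "Create example data"
DT <- data.table(V1 = rep(c(1L, 2L), 5)[-10],
                 V2 = 1:9,
                 V3 = c(0.5, 1.0, 1.5),
                 V4 = rep(LETTERS[1:3], 3))

# Keep rows by position
rows_3_4 <- DT %>% slice_dt(3:4)

# Keep rows that satisfy a logical condition
big_v2 <- DT %>% filter_dt(V2 > 5)

# Combine several conditions with & inside one expression
a_or_c_odd <- DT %>% filter_dt(V4 %in% c("A", "C") & V1 == 1L)
```

Each call returns a new data.table, so the results can be piped straight into further `*_dt()` verbs.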

Sort rows
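
Sorting can be sketched with `arrange_dt()`, where a leading minus sign requests descending order. This is an illustrative example on the table defined earlier; the result names are hypothetical.

```r
library(tidyfst)

# Same example table as in "Create example data"
DT <- data.table(V1 = rep(c(1L, 2L), 5)[-10],
                 V2 = 1:9,
                 V3 = c(0.5, 1.0, 1.5),
                 V4 = rep(LETTERS[1:3], 3))

# Ascending order by one column
by_v3 <- DT %>% arrange_dt(V3)

# Ascending by V1, then descending by V2 within each V1 value
by_v1_then_v2 <- DT %>% arrange_dt(V1, -V2)
```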

Select columns
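
Column selection might look like the following sketch, which uses `select_dt()` with bare column names, a range, and a negated name. The table is the one created earlier, and the result names are illustrative only.

```r
library(tidyfst)

# Same example table as in "Create example data"
DT <- data.table(V1 = rep(c(1L, 2L), 5)[-10],
                 V2 = 1:9,
                 V3 = c(0.5, 1.0, 1.5),
                 V4 = rep(LETTERS[1:3], 3))

# Pick columns by name
two_cols <- DT %>% select_dt(V2, V3)

# Pick a contiguous range of columns
range_cols <- DT %>% select_dt(V2:V4)

# Drop a column with a leading minus sign
without_v2 <- DT %>% select_dt(-V2)
```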

Summarise data
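
A small sketch of summarising the whole table with `summarise_dt()`: each named argument becomes one column of a one-row result. The output names (`sumV1`, `meanV3`) and result variables are chosen here for illustration.

```r
library(tidyfst)

# Same example table as in "Create example data"
DT <- data.table(V1 = rep(c(1L, 2L), 5)[-10],
                 V2 = 1:9,
                 V3 = c(0.5, 1.0, 1.5),
                 V4 = rep(LETTERS[1:3], 3))

# One summary statistic
total <- DT %>% summarise_dt(sumV1 = sum(V1))

# Several statistics at once
stats <- DT %>% summarise_dt(sumV1 = sum(V1), meanV3 = mean(V3))
```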

Add/update/delete columns
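
The sketch below shows one way to add, update, and delete columns: `mutate_dt()` handles the first two, and dropping a column is done here with a negated `select_dt()`. Column `V5` and the result names are hypothetical.

```r
library(tidyfst)

# Same example table as in "Create example data"
DT <- data.table(V1 = rep(c(1L, 2L), 5)[-10],
                 V2 = 1:9,
                 V3 = c(0.5, 1.0, 1.5),
                 V4 = rep(LETTERS[1:3], 3))

# Add a new column computed from existing ones
added <- DT %>% mutate_dt(V5 = V1 * V2)

# Update an existing column by reusing its name
updated <- added %>% mutate_dt(V3 = V3 * 2)

# Delete a column by selecting everything except it
removed <- updated %>% select_dt(-V5)
```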

Group computation (by)
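
Grouped computation can be sketched with the `by` argument of `summarise_dt()` and `mutate_dt()`. The former returns one row per group, while the latter keeps every row and attaches the group-level value. Names such as `sumV1` and `meanV2` are illustrative.

```r
library(tidyfst)

# Same example table as in "Create example data"
DT <- data.table(V1 = rep(c(1L, 2L), 5)[-10],
                 V2 = 1:9,
                 V3 = c(0.5, 1.0, 1.5),
                 V4 = rep(LETTERS[1:3], 3))

# One row per group: sum of V1 within each level of V4
by_v4 <- DT %>% summarise_dt(sumV1 = sum(V1), by = V4)

# Keep all rows and add the group mean of V2 as a new column
with_mean <- DT %>% mutate_dt(meanV2 = mean(V2), by = V4)
```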

Going further

Advanced columns manipulation

Indexing and Keys

set*() modifications

Advanced use of by

Miscellaneous

Read / Write data

tidyfst exports data.table::fread and data.table::fwrite directly.
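
Because these are the data.table functions themselves, a round trip works as it would with data.table loaded. The sketch below writes the example table to a temporary CSV file and reads it back; the file path and the name `DT_back` are placeholders.

```r
library(tidyfst)

# Same example table as in "Create example data"
DT <- data.table(V1 = rep(c(1L, 2L), 5)[-10],
                 V2 = 1:9,
                 V3 = c(0.5, 1.0, 1.5),
                 V4 = rep(LETTERS[1:3], 3))

# Write to a temporary CSV file (placeholder path)
tmp <- tempfile(fileext = ".csv")
fwrite(DT, tmp)

# Read it back; fread() returns a data.table
DT_back <- fread(tmp)

# Clean up the temporary file
unlink(tmp)
```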

Reshape data

Other

Join/Bind data sets

Join
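
A minimal sketch of joins, using two small tables invented for this example and the `*_join_dt()` family with a shared key column passed through `by`:

```r
library(tidyfst)

# Two illustrative tables sharing the key column "Id"
x <- data.table(Id = c("A", "B", "C", "C"), X1 = c(1L, 3L, 5L, 7L))
y <- data.table(Id = c("A", "B", "B", "D"), Y1 = c(1L, 3L, 5L, 7L))

# Keep every row of x, adding matches from y where they exist
left <- left_join_dt(x, y, by = "Id")

# Keep only rows whose Id appears in both tables
inner <- inner_join_dt(x, y, by = "Id")

# Keep rows of x that have no match in y
anti <- anti_join_dt(x, y, by = "Id")
```

Here "B" matches twice in `y`, so the left and inner joins each contain two "B" rows, while the anti join keeps only the two "C" rows.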

Bind
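
Binding can be sketched with data.table's own `rbindlist()` and `cbind()`, which work alongside tidyfst. As an assumption for safety, data.table is attached explicitly here so that `rbindlist()` is certainly available. The tables and result names are invented for illustration.

```r
library(tidyfst)
library(data.table)  # attached explicitly so rbindlist() is certainly available

# Illustrative tables
x <- data.table(id = 1:3, a = c("x", "y", "z"))
y <- data.table(id = 4:6, a = c("p", "q", "r"))
z <- data.table(id = 7:8, b = c(TRUE, FALSE))
w <- data.table(flag = c(TRUE, FALSE, TRUE))

# Stack tables that have identical columns
stacked <- rbindlist(list(x, y))

# Stack tables with different columns, filling the gaps with NA
filled <- rbindlist(list(x, z), fill = TRUE)

# Record which input each row came from
tagged <- rbindlist(list(first = x, second = y), idcol = "source")

# Bind columns side by side (both tables have three rows)
side_by_side <- cbind(x, w)
```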

Set operations

Summary

To get all of this code working, tidyfst has been improved bit by bit. If you use tidyfst frequently, you’ll find that although it has a tidy syntax, it feels more like using data.table in another style. Compared with many other packages with similar goals, tidyfst sticks to many principles of data.table (and is more like data.table in many ways). However, the ultimate goal is still clear: providing users with state-of-the-art data manipulation tools with the least pain. Therefore, keep it simple and make it fast. Enjoy tidyfst~