Example 4: Nest

This vignette shows how to use nesting in tidyfst. It follows the tidyr vignette at https://tidyr.tidyverse.org/articles/nest.html. First, we nest the “mtcars” data.frame by the “cyl” column.

library(tidyfst)

# nest by "cyl" column
mtcars_nested <- mtcars %>% 
  nest_dt(cyl) # the quoted form "cyl" works too

# inspect the output data.table
mtcars_nested
#>      cyl                 ndt
#>    <num>              <list>
#> 1:     6  <data.table[7x10]>
#> 2:     4 <data.table[11x10]>
#> 3:     8 <data.table[14x10]>
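
For intuition, nesting is close to splitting a data frame by group in base R. The sketch below is only a rough base R analogue, not how nest_dt is implemented. The variable name groups is illustrative, and split() orders the groups by sorted key rather than by first appearance, so the order differs from the output above.

```r
# Base R analogue of nesting: one data frame per "cyl" value.
# The grouping column is dropped from each piece, as in the nested tables above.
groups <- split(mtcars[, names(mtcars) != "cyl"], mtcars$cyl)

names(groups)         # group keys, stored as character strings
sapply(groups, nrow)  # number of rows in each group
sapply(groups, ncol)  # number of columns in each group
```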

Now we want to fit a regression within each nested “cyl” group. We use lapply to fit one model per nested data.table:

mtcars_nested2 <- mtcars_nested %>% 
  mutate_dt(model = lapply(ndt,function(df) lm(mpg ~ wt, data = df)))

mtcars_nested2
#>      cyl                 ndt    model
#>    <num>              <list>   <list>
#> 1:     6  <data.table[7x10]> <lm[12]>
#> 2:     4 <data.table[11x10]> <lm[12]>
#> 3:     8 <data.table[14x10]> <lm[12]>
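
Once each group has its own model, any function that accepts an lm object can be mapped over the list. As a minimal sketch in base R only (the names models and slopes are illustrative and not part of the vignette), the slope on wt can be extracted per group like this:

```r
# Fit mpg ~ wt separately for each "cyl" group, using base R only.
models <- lapply(split(mtcars, mtcars$cyl), function(df) lm(mpg ~ wt, data = df))

# Pull the coefficient on "wt" out of each fitted model.
slopes <- sapply(models, function(m) coef(m)[["wt"]])
slopes
```

The same extractor function should also work on the “model” list column inside mutate_dt, in the same way lapply is used above, although that variant is not shown in the original vignette.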

We can see that the fitted models are stored in the “model” column. Next, we extract the fitted values from each model.

mtcars_nested3 <- mtcars_nested2 %>% 
  mutate_dt(model_predict = lapply(model, predict))
mtcars_nested3$model_predict
#> [[1]]
#>        1        2        3        4        5        6        7 
#> 21.12497 20.41604 19.47080 18.78968 18.84528 18.84528 20.70795 
#> 
#> [[2]]
#>        1        2        3        4        5        6        7        8 
#> 26.47010 21.55719 21.78307 27.14774 30.45125 29.20890 25.65128 28.64420 
#>        9       10       11 
#> 27.48656 31.02725 23.87247 
#> 
#> [[3]]
#>        1        2        3        4        5        6        7        8 
#> 16.32604 16.04103 14.94481 15.69024 15.58061 12.35773 11.97625 12.14945 
#>        9       10       11       12       13       14 
#> 16.15065 16.33700 15.44907 15.43811 16.91800 16.04103

We can see that “model_predict” is a list of numeric vectors. Let’s unnest the target column “model_predict”.

mtcars_nested3 %>% unnest_dt(model_predict)
#>       cyl model_predict
#>     <num>         <num>
#>  1:     6      21.12497
#>  2:     6      20.41604
#>  3:     6      19.47080
#>  4:     6      18.78968
#>  5:     6      18.84528
#>  6:     6      18.84528
#>  7:     6      20.70795
#>  8:     4      26.47010
#>  9:     4      21.55719
#> 10:     4      21.78307
#> 11:     4      27.14774
#> 12:     4      30.45125
#> 13:     4      29.20890
#> 14:     4      25.65128
#> 15:     4      28.64420
#> 16:     4      27.48656
#> 17:     4      31.02725
#> 18:     4      23.87247
#> 19:     8      16.32604
#> 20:     8      16.04103
#> 21:     8      14.94481
#> 22:     8      15.69024
#> 23:     8      15.58061
#> 24:     8      12.35773
#> 25:     8      11.97625
#> 26:     8      12.14945
#> 27:     8      16.15065
#> 28:     8      16.33700
#> 29:     8      15.44907
#> 30:     8      15.43811
#> 31:     8      16.91800
#> 32:     8      16.04103
#>       cyl model_predict

Unnesting automatically drops all the other list columns. In our case, the columns “ndt” and “model” are both removed.
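
To make the reshaping concrete, here is a rough base R sketch of what unnesting a list of numeric vectors amounts to: each group key is repeated once per element and the vectors are concatenated. This illustrates the idea and is not the unnest_dt implementation; the names preds and flat are illustrative, and the groups come out in sorted order (4, 6, 8) rather than in order of first appearance.

```r
# One vector of fitted values per "cyl" group (base R only).
models <- lapply(split(mtcars, mtcars$cyl), function(df) lm(mpg ~ wt, data = df))
preds <- lapply(models, predict)

# Flatten: repeat each key once per fitted value, then concatenate the values.
flat <- data.frame(
  cyl = rep(as.numeric(names(preds)), lengths(preds)),
  model_predict = unlist(preds, use.names = FALSE)
)

nrow(flat)       # one row per car
table(flat$cyl)  # rows contributed by each group
```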
