The goal of edar is to provide some convenient functions for common tasks in exploratory data analysis.
Sou T (2025). edar: Convenient Functions for Exploratory Data Analysis. R package version 0.0.3.9000, https://github.com/soutomas/edar.
citation("edar")
#> To cite package 'edar' in publications use:
#>
#> Sou T (2025). _edar: Convenient Functions for Exploratory Data
#> Analysis_. R package version 0.0.3.9000, <https://github.com/soutomas/edar>.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Manual{,
#> title = {edar: Convenient Functions for Exploratory Data Analysis},
#> author = {Tomas Sou},
#> year = {2025},
#> note = {R package version 0.0.3.9000},
#> url = {https://github.com/soutomas/edar},
#> }
You can install the development version of edar from GitHub with:
# install.packages("pak")
pak::pak("soutomas/edar")
Commonly, we want to generate a quick summary of variables in a dataset.
library(edar)
# Data
dat = mtcars |> dplyr::mutate(vs=factor(vs), am=factor(am))
# Summary for continuous variables in a data frame.
dat |> summ_by()
#> Dropped: vs am
#> Adding missing grouping variables: `name`
#> # A tibble: 9 × 10
#> name n nNA Mean Med SD Min P25 P75 Max
#> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 carb 32 0 2.81 2 1.62 1 2 4 8
#> 2 cyl 32 0 6.19 6 1.79 4 4 8 8
#> 3 disp 32 0 231. 196. 124. 71.1 121. 326 472
#> 4 drat 32 0 3.60 3.70 0.535 2.76 3.08 3.92 4.93
#> 5 gear 32 0 3.69 4 0.738 3 3 4 5
#> 6 hp 32 0 147. 123 68.6 52 96.5 180 335
#> 7 mpg 32 0 20.1 19.2 6.03 10.4 15.4 22.8 33.9
#> 8 qsec 32 0 17.8 17.7 1.79 14.5 16.9 18.9 22.9
#> 9 wt 32 0 3.22 3.32 0.978 1.51 2.58 3.61 5.42
# Summary of selected variable after grouping.
dat |> summ_by("mpg",vs)
#> Adding missing grouping variables: `vs`
#> # A tibble: 2 × 10
#> vs n nNA Mean Med SD Min P25 P75 Max
#> <fct> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0 18 0 16.6 15.6 3.86 10.4 14.8 19.1 26
#> 2 1 14 0 24.6 22.8 5.38 17.8 21.4 29.6 33.9
dat |> summ_by("mpg",vs,am)
#> Adding missing grouping variables: `vs`, `am`
#> # A tibble: 4 × 11
#> # Groups: vs [2]
#> vs am n nNA Mean Med SD Min P25 P75 Max
#> <fct> <fct> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0 0 12 0 15.0 15.2 2.77 10.4 14.0 16.6 19.2
#> 2 0 1 6 0 19.8 20.4 4.01 15 16.8 21 26
#> 3 1 0 7 0 20.7 21.4 2.47 17.8 18.6 22.2 24.4
#> 4 1 1 7 0 28.4 30.4 4.76 21.4 25.0 31.4 33.9
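For readers who want to see what a grouped summary of this kind involves, the following is a minimal sketch in plain dplyr. It is not edar's implementation: the column names simply mirror the output above, the name `mpg_by_vs` is made up for illustration, and whether `summ_by()` counts `n` before or after dropping missing values is an assumption here.

```r
library(dplyr)

# Same example data as above.
dat <- mtcars |> mutate(vs = factor(vs), am = factor(am))

# Hypothetical equivalent of a grouped summary of `mpg` by `vs`.
# Assumption: `n` counts all rows in the group, `nNA` counts missing values.
mpg_by_vs <- dat |>
  group_by(vs) |>
  summarise(
    n    = n(),
    nNA  = sum(is.na(mpg)),
    Mean = mean(mpg, na.rm = TRUE),
    Med  = median(mpg, na.rm = TRUE),
    SD   = sd(mpg, na.rm = TRUE),
    Min  = min(mpg, na.rm = TRUE),
    P25  = quantile(mpg, 0.25, na.rm = TRUE, names = FALSE),
    P75  = quantile(mpg, 0.75, na.rm = TRUE, names = FALSE),
    Max  = max(mpg, na.rm = TRUE)
  )

print(mpg_by_vs)
```

Adding further grouping variables to `group_by()` (for example `am`) would give one row per combination, as in the `summ_by("mpg",vs,am)` call above.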
# Summary for categorical variables in a data frame.
dat |> summ_cat()
#> Dropped: mpg cyl disp hp drat wt qsec gear carb
#> $vs
#> vs n percent
#> 0 18 0.5625
#> 1 14 0.4375
#> Total 32 1.0000
#>
#> $am
#> am n percent
#> 0 19 0.59375
#> 1 13 0.40625
#> Total 32 1.00000
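The frequency tables above can be approximated with base R alone. This is only a sketch under assumptions, not how `summ_cat()` is implemented: the name `vs_tab` is invented for the example, and the layout (levels, counts, proportions, and a `Total` row) is copied from the printed output.

```r
# Same example data as above, built with base R only.
dat <- transform(mtcars, vs = factor(vs), am = factor(am))

# Count each level of `vs` and convert the counts to proportions.
counts <- table(dat$vs)
vs_tab <- data.frame(
  vs      = names(counts),
  n       = as.integer(counts),
  percent = as.numeric(prop.table(counts))
)

# Append a totals row, mirroring the `Total` line in the output above.
vs_tab <- rbind(
  vs_tab,
  data.frame(vs = "Total", n = sum(vs_tab$n), percent = sum(vs_tab$percent))
)

print(vs_tab)
```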
# Summary for selected categorical variable.
dat |> summ_cat("vs")
#> Dropped: mpg cyl disp hp drat wt qsec gear carb
#> vs n percent
#> 0 18 0.5625
#> 1 14 0.4375
#> Total 32 1.0000
Results can be viewed directly as a flextable object.
# Show data frame in a flextable object.
dat |> summ_by("mpg",vs) |> ft()
It is often helpful to add a label in the output indicating the source file.
# A label indicating the current source file can be easily generated.
lab = label_src(1)
# A source label can be directly added to the flextable output.
dat |> summ_cat("am") |> ft(src=1)
# A source label can be easily added to a ggplot object.
library(ggplot2)
p = ggplot(mtcars, aes(mpg, wt)) + geom_point()
p |> ggsrc()