An introduction to baseverse

Overview

baseverse is intended to be a relatively minimal suite of packages, supporting the use of base R with native piping.

Several functions wrap existing base-R functions, taking the data frame as their first argument so that they work with the native pipe:

p_cor(): a wrapper for cor()
p_glm(): a wrapper for glm()
p_lm(): a wrapper for lm()
p_t.test(): a wrapper for t.test()
p_table(): a wrapper for table()
p_wilcox.test(): a wrapper for wilcox.test()
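
As a rough illustration of what such a wrapper involves, the sketch below defines a hypothetical my_p_lm() that takes the data frame first, so a data frame can be piped straight into it. The name and body are assumptions made for illustration, not the package's source, and the built-in mtcars data stands in for real data.

```r
# Hypothetical data-first wrapper around lm(); not baseverse's implementation.
my_p_lm <- function(data, formula) {
  stats::lm(formula = formula, data = data)
}

# lm() takes the formula as its first argument, so a data-first wrapper
# lets a data frame be piped in without naming any arguments.
fit_direct  <- lm(mpg ~ wt, data = mtcars)
fit_wrapped <- mtcars |> my_p_lm(mpg ~ wt)

# Both routes estimate the same coefficients.
all.equal(coef(fit_direct), coef(fit_wrapped))
```

A side effect of this design is that the recorded call refers to the wrapper's own argument names rather than to the user's formula and data.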

Other functions wrap base-R operators:

bang(): a wrapper for !, similar to not() from magrittr
bracket(): a wrapper for []
dollar(): a wrapper for $, similar to pull() from dplyr
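
For context, operators in R are ordinary functions, but the native pipe does not accept syntactically special functions such as operators on its right-hand side (see ?pipeOp), which is presumably why named wrappers are convenient. The base-R sketch below shows the underlying operators called as functions. It does not use baseverse, and the object names are chosen for illustration only.

```r
# Plain base R; none of these names come from baseverse.
x <- c(TRUE, FALSE, NA)

# Negation: via an anonymous function in a pipe, or by calling `!` directly.
negated <- x |> (\(v) !v)()
same    <- `!`(x)
identical(negated, same)

# Subsetting and extraction operators called as functions.
first_rows <- `[`(mtcars, 1:3, c("mpg", "wt"))  # same as mtcars[1:3, c("mpg", "wt")]
mpg_vec    <- `$`(mtcars, mpg)                   # same as mtcars$mpg
```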

Other functions mimic tidyverse functions:

base_match(): mimics case_match() from dplyr, but returns a factor and respects the user’s desired order of groups
base_when(): mimics case_when() from dplyr, but returns a factor and respects the user’s desired order of groups
et(): mimics count() from dplyr

Loading the package

Load the package:

library(baseverse)

Loading the data

This vignette will draw from the built-in nhanes data:

data(nhanes)

Country of birth

Table the dmdborn4 variable:

nhanes |> p_table(dmdborn4)

## 
##     1     2 
## 10039  1875

Create a new, labelled version of dmdborn4:

nhanes<-nhanes |> transform(
  country=base_match(dmdborn4,'USA'=1,'Other'=2)
)

Table the new variable using p_table():

nhanes |> p_table(country)

## 
##   USA Other 
## 10039  1875

Or, table the new variable using et(), which also reports the 19 missing values:

nhanes |> et(country)

##   country     n
## 1     USA 10039
## 2   Other  1875
## 3    <NA>    19

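Notice that the USA group is listed first, matching the order in which the groups were supplied to base_match(). This differs deliberately from case_match(), which would return a character vector here, so that tabulating it lists the groups alphabetically (Other before USA).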
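
The ordering difference comes down to how base R tabulates character vectors versus factors. The sketch below uses plain base R on a small invented vector of codes (not the nhanes data) to show both the ordering and the handling of missing values.

```r
# Toy codes standing in for dmdborn4 (values invented for illustration).
codes <- c(1, 2, 1, NA, 1)

# A character vector is tabulated in alphabetical order.
as_character <- ifelse(codes == 1, "USA", "Other")
names(table(as_character))   # "Other" "USA"

# A factor with explicit levels keeps the order the user asked for.
as_factor <- factor(codes, levels = c(1, 2), labels = c("USA", "Other"))
names(table(as_factor))      # "USA" "Other"

# table() drops missing values unless asked to keep them.
table(as_factor, useNA = "ifany")
```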

Total cholesterol

Summarize the lbxtc variable:

nhanes$lbxtc |> summary()

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    62.0   151.0   178.0   181.5   207.0   438.0    5043

Or, using dollar():

nhanes |> dollar(lbxtc) |> summary()

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    62.0   151.0   178.0   181.5   207.0   438.0    5043

Create a categorical variable for total cholesterol:

nhanes<-nhanes |>
  transform(
    cholesterol=base_when(
      'Desirable' = (lbxtc<200),
      'Borderline high' = (lbxtc>=200)&(lbxtc<240),
      'High' = (lbxtc>=240)
    )
  )

Table the new variable using p_table():

nhanes |> p_table(cholesterol)

## 
##       Desirable Borderline high            High 
##            4797            1460             633

Or, table the new variable using et(), which also reports the 5043 missing values:

nhanes |> et(cholesterol)

##       cholesterol    n
## 1       Desirable 4797
## 2 Borderline high 1460
## 3            High  633
## 4            <NA> 5043

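Notice that the Desirable group is listed first, matching the order of the conditions given to base_when(). This differs deliberately from case_when(), which would return a character vector here, so that tabulating it lists the groups alphabetically (Borderline high, Desirable, High).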
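
For comparison, the same three bands can be built in base R alone with cut(), which also returns a factor whose levels follow the order of the labels. This is a sketch on invented values, not a description of how base_when() works internally.

```r
# Invented cholesterol values for illustration, including a missing one.
tc <- c(150, 199.9, 200, 239, 240, NA)

# right = FALSE closes each interval on the left:
# [-Inf, 200), [200, 240), [240, Inf).
chol_cat <- cut(
  tc,
  breaks = c(-Inf, 200, 240, Inf),
  labels = c("Desirable", "Borderline high", "High"),
  right  = FALSE
)

table(chol_cat, useNA = "ifany")
```

cut() only handles contiguous numeric intervals, whereas a set of arbitrary logical conditions, as in the base_when() call above, is more general.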

Linear regression

Fit a linear model for systolic blood pressure (bpxosy1) with age (ridageyr), country of birth (country), and total cholesterol (lbxtc) as predictors:

model_1<-nhanes |> 
  p_lm(bpxosy1~ridageyr+country+lbxtc)

Summarize the model:

model_1 |>
  summary()

## 
## Call:
## stats::lm(formula = formula, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -57.672 -10.213  -1.227   8.520 107.359 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  97.440904   0.907807 107.337  < 2e-16 ***
## ridageyr      0.401313   0.009199  43.626  < 2e-16 ***
## countryOther -0.095695   0.509981  -0.188    0.851    
## lbxtc         0.020008   0.004775   4.190 2.82e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 16 on 6553 degrees of freedom
##   (5376 observations deleted due to missingness)
## Multiple R-squared:  0.2417, Adjusted R-squared:  0.2414 
## F-statistic: 696.4 on 3 and 6553 DF,  p-value: < 2.2e-16

Obtain 95% confidence intervals for the coefficients:

model_1 |>
  confint()

##                    2.5 %      97.5 %
## (Intercept)  95.66130724 99.22050174
## ridageyr      0.38328041  0.41934603
## countryOther -1.09542428  0.90403489
## lbxtc         0.01064781  0.02936871
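
For a linear model, each of these intervals is the estimate plus or minus a t quantile times the standard error. With 6553 residual degrees of freedom the quantile is roughly 1.96, so the intercept's interval is about 97.44 ± 1.96 × 0.908, or (95.66, 99.22). The sketch below checks the same arithmetic against confint() on the built-in mtcars data, used here as a stand-in; the object names are illustrative.

```r
# Stand-in model on a built-in data set.
fit <- lm(mpg ~ wt + hp, data = mtcars)

est    <- coef(fit)
se     <- sqrt(diag(vcov(fit)))
t_crit <- qt(0.975, df = df.residual(fit))

# Estimate plus or minus the critical value times the standard error.
manual <- cbind(est - t_crit * se, est + t_crit * se)

# Compare with confint(), ignoring row and column names.
all.equal(unname(manual), unname(confint(fit)))
```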
