Codebook example with SPSS dataset

Ruben Arslan

2018-08-01

knit_by_pkgdown <- !is.null(knitr::opts_chunk$get("fig.retina"))
pander::panderOptions("table.split.table", Inf)
ggplot2::theme_set(ggplot2::theme_bw())
knitr::opts_chunk$set(warning = TRUE, message = TRUE, error = TRUE, echo = TRUE)
library(dplyr)
library(codebook)

In this vignette, you can see how to use the metadata that is often already stored in SPSS and Stata files. All we need is the rio::import function. For files with a recognised file extension, rio::import automatically picks the right way to import the data. Here, we’re downloading straight from the Open Science Framework, and the download URL has no file extension, so we have to specify the format ourselves.
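To illustrate the extension-based import, here is a minimal sketch that does not touch the real data. The toy data frame, its values, and the temporary file are made up for illustration, and we assume the haven and tibble packages are installed alongside rio.

```r
library(haven)  # used here only to write the example SPSS file
library(rio)

# A tiny labelled dataset; all names and values are made up
toy <- tibble::tibble(
  sex = labelled(c(1, 2, 1), labels = c(female = 1, male = 2), label = "sex")
)

# Save it with a .sav extension so that rio can infer the format
path <- tempfile(fileext = ".sav")
write_sav(toy, path)

# No format argument needed: the extension tells rio this is an SPSS file
imported <- import(path)

# The variable label and value labels are stored as attributes of the column
attributes(imported$sex)
```

If the path or URL carries no extension, as with the OSF link in this vignette, the same call needs format = "sav".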

We select a subset of variables, just to keep it short. The data were shared by Emanuel Jauk in a project called “How alluring are dark personalities? The Dark Triad and attractiveness in speed dating”.

Often, files imported from SPSS or Stata into R will not have their missing values coded properly. Here, that is not the case, but if you find yourself with such a dataset, the detect_missings function makes it easy to recognise common ways of coding missing data (e.g. negative values, labelled values, 99/999).
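As a rough sketch of how this could look, the following calls detect_missings with its default settings on a made-up data frame. The variable, its values, and the "no answer" label are hypothetical; which codes actually get flagged depends on the function's arguments, so check its help page before relying on the result.

```r
library(codebook)

# Made-up survey data in which 999 was used to code a missing answer
survey <- tibble::tibble(
  age = haven::labelled(
    c(25, 31, 999, 40, 28),
    labels = c("no answer" = 999),
    label = "age in years"
  )
)

# Let codebook look for common missing-value codes using its defaults
cleaned <- detect_missings(survey)

# Inspect which values are now treated as missing
is.na(cleaned$age)
```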

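# Import the SPSS file from the OSF; the URL has no extension, so we set the format explicitly
darktriad <- rio::import("https://osf.io/j4fcb/download", format = "sav") %>%
  # Keep only a handful of variables so that the codebook stays short
  select(DG, sex, relStat, education, NPI_avg)
# Hide the code of later chunks unless the vignette is rendered by pkgdown
if (!knit_by_pkgdown) knitr::opts_chunk$set(echo = FALSE)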

Now, we can immediately generate a codebook.
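The chunk that produces the output below is hidden in the rendered vignette. A minimal sketch of what it presumably contains is a single call to codebook() on the imported data. The import is repeated here so that the block stands on its own. As far as we know, codebook() is best placed in an R Markdown chunk, because it emits the report as knitted output.

```r
library(dplyr)
library(codebook)

# Same import as above: SPSS file from the OSF, reduced to five variables
darktriad <- rio::import("https://osf.io/j4fcb/download", format = "sav") %>%
  select(DG, sex, relStat, education, NPI_avg)

# Generate the codebook from the labels and other metadata stored in the data
codebook(darktriad)
```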

Items

DG

dating group

Distribution

0 missings.

Summary statistics

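| name | label | data_type | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist | format.spss |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DG | dating group | numeric | 0 | 90 | 90 | 2.03 | 0.77 | 1 | 1 | 2 | 3 | 3 | ▆▁▁▇▁▁▁▆ | F8.0 |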

sex

sex

Distribution

0 missings.

Summary statistics

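| name | label | data_type | value_labels | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist | format.spss | display_width |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| sex | sex | numeric | 1. female, 2. male | 0 | 90 | 90 | 1.49 | 0.5 | 1 | 1 | 1 | 2 | 2 | ▇▁▁▁▁▁▁▇ | F1.0 | 5 |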

Value labels

  • female: 1
  • male: 2

relStat

relationship status

Distribution

1 missings.

Summary statistics

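| name | label | data_type | value_labels | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist | format.spss | display_width |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| relStat | relationship status | numeric | 1. single, 2. in a relationship, 3. living separately / divorced | 1 | 89 | 90 | 1.09 | 0.32 | 1 | 1 | 1 | 1 | 3 | ▇▁▁▁▁▁▁▁ | F8.0 | 10 |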

Value labels

  • single: 1
  • in a relationship: 2
  • living separately / divorced: 3

education

highest educational attainment

Distribution

1 missings.

Summary statistics

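| name | label | data_type | value_labels | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist | format.spss | display_width |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| education | highest educational attainment | numeric | 1. nine years schooling only, 2. professional training, 3. vocational school, 4. university-entrance diploma, 5. academic degree | 1 | 89 | 90 | 4.17 | 0.38 | 4 | 4 | 4 | 4 | 5 | ▇▁▁▁▁▁▁▂ | F1.0 | 5 |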

Value labels

  • nine years schooling only: 1
  • professional training: 2
  • vocational school: 3
  • university-entrance diploma: 4
  • academic degree: 5

NPI_avg

narcissistic personality inventory - average

Distribution

0 missings.

Summary statistics

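| name | label | data_type | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist | format.spss | display_width |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NPI_avg | narcissistic personality inventory - average | numeric | 0 | 90 | 90 | 2.61 | 0.35 | 1.7 | 2.42 | 2.6 | 2.82 | 3.65 | ▁▂▃▇▆▂▁▁ | F8.2 | 10 |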

Missingness report

Among those who finished the survey. Only variables that have missings are shown.

## Warning: Could not figure out who finished the surveys, because the
## variables expired and ended were missing.
| description | relStat | education | var_miss | n_miss |
| --- | --- | --- | --- | --- |
| Missings in 0 variables | 1 | 1 | 0 | 88 |
| Missings per variable | 1 | 1 | 2 | 2 |
| Missings in 1 variables | 1 | 0 | 1 | 1 |
| Missings in 1 variables | 0 | 1 | 1 | 1 |

Codebook table

| name | label | data_type | value_labels | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist | format.spss | display_width |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DG | dating group | numeric | NA | 0 | 90 | 90 | 2.03 | 0.77 | 1 | 1 | 2 | 3 | 3 | ▆▁▁▇▁▁▁▆ | F8.0 | NA |
| sex | sex | numeric | 1. female, 2. male | 0 | 90 | 90 | 1.49 | 0.5 | 1 | 1 | 1 | 2 | 2 | ▇▁▁▁▁▁▁▇ | F1.0 | 5 |
| relStat | relationship status | numeric | 1. single, 2. in a relationship, 3. living separately / divorced | 1 | 89 | 90 | 1.09 | 0.32 | 1 | 1 | 1 | 1 | 3 | ▇▁▁▁▁▁▁▁ | F8.0 | 10 |
| education | highest educational attainment | numeric | 1. nine years schooling only, 2. professional training, 3. vocational school, 4. university-entrance diploma, 5. academic degree | 1 | 89 | 90 | 4.17 | 0.38 | 4 | 4 | 4 | 4 | 5 | ▇▁▁▁▁▁▁▂ | F1.0 | 5 |
| NPI_avg | narcissistic personality inventory - average | numeric | NA | 0 | 90 | 90 | 2.61 | 0.35 | 1.7 | 2.42 | 2.6 | 2.82 | 3.65 | ▁▂▃▇▆▂▁▁ | F8.2 | 10 |
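If only an overview table like the one above is needed, without the per-item sections, the codebook_table function from the codebook package can be called on its own. The sketch below uses a made-up data frame rather than the dark triad data, and we assume the haven and tibble packages are available. The exact columns returned may differ between package versions.

```r
library(codebook)

# Made-up data: one labelled variable and one plain numeric variable
toy <- tibble::tibble(
  sex = haven::labelled(
    c(1, 2, 1, 2),
    labels = c(female = 1, male = 2),
    label = "sex"
  ),
  score = c(2.5, 3.1, NA, 2.8)
)

# One row per variable, combining the metadata with summary statistics
overview <- codebook_table(toy)
overview
```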