chmsflow uses two CSV metadata files to define how raw CHMS variables are harmonized. These files are bundled with the package in inst/extdata/ and are also available as data objects (variables and variable_details).
- variables.csv – lists every harmonized variable with its name, label, type, and unit
- variable-details.csv – defines the row-by-row recoding rules that rec_with_table() applies

This vignette is a column-by-column reference for both files. For an explanation of how these files fit into the harmonization workflow, see Methodology.
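
As a minimal sketch of how the bundled files could be located, the snippet below uses base R's system.file() with the inst/extdata/ location and file names given above. It assumes chmsflow is installed; when it is not, system.file() returns an empty string and the guarded block is skipped.

```r
# Locate the bundled CSV files; system.file() returns "" when the
# package or the file cannot be found.
variables_path <- system.file("extdata", "variables.csv", package = "chmsflow")
details_path <- system.file("extdata", "variable-details.csv", package = "chmsflow")

if (nzchar(variables_path) && nzchar(details_path)) {
  variables_from_csv <- read.csv(variables_path, stringsAsFactors = FALSE)
  details_from_csv <- read.csv(details_path, stringsAsFactors = FALSE)
  str(variables_from_csv)
}
```

The same content is described above as available through the data objects variables and variable_details, so reading the CSV files directly is mainly useful for inspecting or editing them.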
variables.csv

#> There are 211 variables, grouped in 24 subjects and 5 sections.
1. variable – the name of the harmonized variable.
2. label – a short label for the variable.
3. labelLong – a more detailed label for the variable.
4. section – the broad grouping where this variable belongs (e.g., sociodemographics, health behaviour, health status).
5. subject – the specific topic the variable pertains to (e.g., age, smoking, blood pressure).
6. variableType – whether the harmonized variable is Categorical or Continuous.
7. units – the units of the harmonized variable, or N/A if unitless.
8. databaseStart – the CHMS cycles that contain the variable, separated by commas.
9. variableStart – the source variable names as listed in each CHMS cycle. Uses the same format conventions as variable-details.csv (see below).
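
Because databaseStart stores the cycles as a comma-separated string, selecting the variables available in a given cycle means splitting that string first. The sketch below uses a toy data frame: the clc_sex row follows the running example in this vignette, while toy_var and the helper in_cycle() are hypothetical names introduced only for illustration.

```r
# Toy stand-in for variables.csv with only the two columns needed here.
variables_toy <- data.frame(
  variable = c("clc_sex", "toy_var"),
  databaseStart = c(
    "cycle1, cycle2, cycle3, cycle4, cycle5, cycle6",
    "cycle1, cycle2"
  ),
  stringsAsFactors = FALSE
)

# TRUE for each row whose databaseStart lists the requested cycle.
in_cycle <- function(database_start, cycle) {
  vapply(
    strsplit(database_start, ",", fixed = TRUE),
    function(cycles) cycle %in% trimws(cycles),
    logical(1)
  )
}

available_in_cycle5 <- variables_toy$variable[
  in_cycle(variables_toy$databaseStart, "cycle5")
]
available_in_cycle5
# "clc_sex"
```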
variable-details.csv

#> There are 1111 rows and 17 columns.
Each row defines the recoding rule for one category of one variable. For a categorical variable with 4 categories, plus a not-applicable category, a missing category, and an else row, there are 7 rows.
Missing data rows use haven::tagged_na():

- NA::a – valid skip (not applicable)
- NA::b – missing (don’t know, refusal, not stated)

The else row catches values not matched by any other row.
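
To illustrate what a tagged missing value looks like in R, the following sketch builds a small vector with haven::tagged_na() and inspects it. It assumes the haven package is installed and is skipped otherwise; the vector x is invented for the example and is not CHMS data.

```r
if (requireNamespace("haven", quietly = TRUE)) {
  # Two valid codes followed by a "valid skip" and a "missing" value.
  x <- c(1, 2, haven::tagged_na("a"), haven::tagged_na("b"))

  # Tagged values behave like ordinary NA for is.na() ...
  missing_flags <- is.na(x)
  # ... but keep their tag, which na_tag() recovers.
  tags <- haven::na_tag(x)
  # is_tagged_na() tests for one specific tag.
  valid_skip <- haven::is_tagged_na(x, "a")

  print(tags)
  # NA  NA  "a" "b"
}
```

Keeping the tag lets an analysis distinguish "not applicable" from "missing" while still treating both as NA in ordinary computations.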
We use clc_sex as a running example.
1. variable – name of the harmonized variable.
| variable |
|---|
| clc_sex |
| clc_sex |
| clc_sex |
| clc_sex |
| clc_sex |
2. dummyVariable – dummy variable name for each category (categorical variables only; N/A for continuous).
| | variable | dummyVariable |
|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 |
| 298 | clc_sex | clc_sex_cat2_2 |
| 299 | clc_sex | clc_sex_cat2_NAa |
| 300 | clc_sex | clc_sex_cat2_NAb |
| 301 | clc_sex | clc_sex_cat2_NAb |
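
The dummy variable names in this example appear to combine the variable name, the number of valid categories, and the recoded value (the numValidCat and recEnd columns described later in this list). The sketch below reproduces that pattern; it is inferred from the clc_sex rows only, and make_dummy_name() is a hypothetical helper, not a chmsflow function.

```r
# Build a dummy variable name as <variable>_cat<numValidCat>_<recEnd>,
# dropping the "::" from tagged-NA codes (NA::a becomes NAa).
make_dummy_name <- function(variable, num_valid_cat, rec_end) {
  paste0(
    variable, "_cat", num_valid_cat, "_",
    gsub("::", "", rec_end, fixed = TRUE)
  )
}

dummy_names <- make_dummy_name("clc_sex", 2, c("1", "2", "NA::a", "NA::b"))
dummy_names
# "clc_sex_cat2_1" "clc_sex_cat2_2" "clc_sex_cat2_NAa" "clc_sex_cat2_NAb"
```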
3. typeEnd – variable type of the harmonized variable (cat or cont).
| | variable | dummyVariable | typeEnd |
|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat |
| 298 | clc_sex | clc_sex_cat2_2 | cat |
| 299 | clc_sex | clc_sex_cat2_NAa | cat |
| 300 | clc_sex | clc_sex_cat2_NAb | cat |
| 301 | clc_sex | clc_sex_cat2_NAb | cat |
4. databaseStart – CHMS cycles containing this variable, separated by commas.
| | variable | dummyVariable | typeEnd | databaseStart |
|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 |
5. variableStart – source variable names in each CHMS cycle. Supports several formats:
| Format | Meaning | Example |
|---|---|---|
| [variable_name] | Same name across all cycles | [clc_sex] |
| cycle1::name1, [default_name] | Cycle-specific exception with a default | cycle1::amsdmva1, [ammdmva1] |
| DerivedVar::[var1, var2, ...] | Computed by a function from listed inputs | DerivedVar::[lab_bcre, pgdcgt, clc_sex, clc_age] |
| | variable | dummyVariable | typeEnd | databaseStart | variableStart |
|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] |
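
The three variableStart formats can be read with a few lines of string handling. The following is a sketch in base R, not the parser used by rec_with_table(); resolve_variable_start() is a hypothetical helper that returns the source name for one cycle, or the list of inputs for a derived variable.

```r
resolve_variable_start <- function(variable_start, cycle) {
  spec <- trimws(variable_start)

  # Derived variables: return the input names listed inside the brackets.
  if (startsWith(spec, "DerivedVar::")) {
    inner <- sub("^DerivedVar::\\[(.*)\\]$", "\\1", spec)
    return(trimws(strsplit(inner, ",", fixed = TRUE)[[1]]))
  }

  parts <- trimws(strsplit(spec, ",", fixed = TRUE)[[1]])

  # A cycle-specific entry such as "cycle1::amsdmva1" takes priority.
  prefix <- paste0(cycle, "::")
  hit <- parts[startsWith(parts, prefix)]
  if (length(hit) > 0) {
    return(substring(hit[1], nchar(prefix) + 1))
  }

  # Otherwise fall back to the bracketed default name.
  default <- parts[grepl("^\\[.*\\]$", parts)]
  if (length(default) > 0) {
    return(gsub("^\\[|\\]$", "", default[1]))
  }
  NA_character_
}

resolve_variable_start("[clc_sex]", "cycle3")
# "clc_sex"
resolve_variable_start("cycle1::amsdmva1, [ammdmva1]", "cycle1")
# "amsdmva1"
resolve_variable_start("cycle1::amsdmva1, [ammdmva1]", "cycle2")
# "ammdmva1"
```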
6. typeStart – variable type in the source CHMS data (cat or cont).
| | variable | dummyVariable | typeEnd | databaseStart | variableStart | typeStart |
|---|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat |
7. recEnd – the value to recode each category to. Special values:
- copy – pass through unchanged (for continuous variables)
- NA::a – not applicable
- NA::b – missing
- Func::function_name – derived variable computed by the named function

| | variable | dummyVariable | typeEnd | databaseStart | variableStart | typeStart | recEnd |
|---|---|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 1 |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 2 |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::a |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b |
8. numValidCat – number of non-missing categories (categorical only; N/A for continuous). Not used by rec_with_table().
| | variable | dummyVariable | typeEnd | databaseStart | variableStart | typeStart | recEnd | numValidCat |
|---|---|---|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 1 | 2 |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 2 | 2 |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::a | 2 |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 |
9. catLabel – short label for the category.
| | variable | dummyVariable | typeEnd | databaseStart | variableStart | typeStart | recEnd | numValidCat | catLabel |
|---|---|---|---|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 1 | 2 | Male |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 2 | 2 | Female |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::a | 2 | not applicable |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing |
10. catLabelLong – detailed label, matching CHMS documentation where possible.
| | variable | dummyVariable | typeEnd | databaseStart | variableStart | typeStart | recEnd | numValidCat | catLabel | catLabelLong |
|---|---|---|---|---|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 1 | 2 | Male | Male |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 2 | 2 | Female | Female |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::a | 2 | not applicable | not applicable |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing |
11. units – units of the variable, or N/A. Must be consistent across all rows of the same variable.
| | variable | dummyVariable | typeEnd | databaseStart | variableStart | typeStart | recEnd | numValidCat | catLabel | catLabelLong | units |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 1 | 2 | Male | Male | N/A |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 2 | 2 | Female | Female | N/A |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::a | 2 | not applicable | not applicable | N/A |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A |
12. recStart – the source value or range to match. Uses interval notation:
- [1, 4] – all integer values from 1 to 4
- [1, 2.5] – all values from 1 to 2.5 (2.55 would not match)
- else – all values not matched by other rows
- copy – combined with else, copies unmatched values unchanged

| | variable | dummyVariable | typeEnd | databaseStart | variableStart | typeStart | recEnd | numValidCat | catLabel | catLabelLong | units | recStart |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 1 | 2 | Male | Male | N/A | 1 |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 2 | 2 | Female | Female | N/A | 2 |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::a | 2 | not applicable | not applicable | N/A | 6 |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A | [7, 9] |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A | else |
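
To make the recStart matching concrete, the sketch below applies the five clc_sex rules from the table above to a handful of source values. It is a simplified illustration and not the logic inside rec_with_table(): it assumes closed intervals, evaluates rows in order with the first match winning, and returns the recEnd code as text. The helpers matches_rec_start() and recode_one() are hypothetical.

```r
# Does a single source value satisfy one recStart entry?
matches_rec_start <- function(value, rec_start) {
  rec_start <- trimws(rec_start)
  if (rec_start == "else") {
    return(TRUE)
  }
  if (grepl("^\\[.*\\]$", rec_start)) {
    # Interval such as "[7, 9]": both bounds are treated as inclusive.
    bounds <- as.numeric(strsplit(gsub("^\\[|\\]$", "", rec_start), ",")[[1]])
    return(value >= bounds[1] && value <= bounds[2])
  }
  value == as.numeric(rec_start)
}

# Return the recEnd of the first rule whose recStart matches.
recode_one <- function(value, rec_start, rec_end) {
  for (i in seq_along(rec_start)) {
    if (matches_rec_start(value, rec_start[i])) {
      return(rec_end[i])
    }
  }
  NA_character_
}

# The clc_sex rules, copied from the recStart and recEnd columns above.
rec_start <- c("1", "2", "6", "[7, 9]", "else")
rec_end <- c("1", "2", "NA::a", "NA::b", "NA::b")

recoded <- vapply(
  c(1, 2, 6, 8, 3), recode_one, character(1),
  rec_start = rec_start, rec_end = rec_end
)
recoded
# "1" "2" "NA::a" "NA::b" "NA::b"
```

The value 3 matches none of the explicit rows, so it falls through to the else row and is recoded as NA::b.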
13. catStartLabel – label for the source category, matching CHMS documentation. For missing rows, describes each missing code and its value.
| | variable | dummyVariable | typeEnd | databaseStart | variableStart | typeStart | recEnd | numValidCat | catLabel | catLabelLong | units | recStart | catStartLabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 1 | 2 | Male | Male | N/A | 1 | Male |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 2 | 2 | Female | Female | N/A | 2 | Female |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::a | 2 | not applicable | not applicable | N/A | 6 | Valid skip |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A | [7, 9] | Don’t know (7); Refusal (8); Not stated (9) |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A | else | else |
14. variableStartShortLabel – short label for the source variable.
| | variable | dummyVariable | typeEnd | databaseStart | variableStart | typeStart | recEnd | numValidCat | catLabel | catLabelLong | units | recStart | catStartLabel | variableStartShortLabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 1 | 2 | Male | Male | N/A | 1 | Male | Sex |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 2 | 2 | Female | Female | N/A | 2 | Female | Sex |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::a | 2 | not applicable | not applicable | N/A | 6 | Valid skip | Sex |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A | [7, 9] | Don’t know (7); Refusal (8); Not stated (9) | Sex |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A | else | else | Sex |
15. variableStartLabel – detailed label for the source variable, matching CHMS documentation.
| | variable | dummyVariable | typeEnd | databaseStart | variableStart | typeStart | recEnd | numValidCat | catLabel | catLabelLong | units | recStart | catStartLabel | variableStartShortLabel | variableStartLabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 1 | 2 | Male | Male | N/A | 1 | Male | Sex | Sex |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 2 | 2 | Female | Female | N/A | 2 | Female | Sex | Sex |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::a | 2 | not applicable | not applicable | N/A | 6 | Valid skip | Sex | Sex |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A | [7, 9] | Don’t know (7); Refusal (8); Not stated (9) | Sex | Sex |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A | else | else | Sex | Sex |
16. notes – relevant notes about changes between CHMS cycles, missing categories, or variable type changes.
| | variable | dummyVariable | typeEnd | databaseStart | variableStart | typeStart | recEnd | numValidCat | catLabel | catLabelLong | units | recStart | catStartLabel | variableStartShortLabel | variableStartLabel | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 297 | clc_sex | clc_sex_cat2_1 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 1 | 2 | Male | Male | N/A | 1 | Male | Sex | Sex | |
| 298 | clc_sex | clc_sex_cat2_2 | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | 2 | 2 | Female | Female | N/A | 2 | Female | Sex | Sex | |
| 299 | clc_sex | clc_sex_cat2_NAa | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::a | 2 | not applicable | not applicable | N/A | 6 | Valid skip | Sex | Sex | |
| 300 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A | [7, 9] | Don’t know (7); Refusal (8); Not stated (9) | Sex | Sex | |
| 301 | clc_sex | clc_sex_cat2_NAb | cat | cycle1, cycle2, cycle3, cycle4, cycle5, cycle6 | [clc_sex] | cat | NA::b | 2 | missing | missing | N/A | else | else | Sex | Sex | |
Derived variables use two special column values:
- variableStart: DerivedVar::[var1, var2, var3] – lists the input variables
- recEnd: Func::function_name – names the R function that computes the derived variable

See Derived variables for details on how derived variables work.
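
One way to picture how these two values work together is a small dispatcher that reads the input names from variableStart, looks up the function named in recEnd, and calls it on the matching columns. This is only a sketch of the idea under those assumptions, not how chmsflow evaluates derived variables; apply_derived(), toy_ratio(), and the toy data are all hypothetical.

```r
# Hypothetical derivation function: the ratio of two inputs.
toy_ratio <- function(a, b) a / b

apply_derived <- function(data, variable_start, rec_end) {
  # Input column names from "DerivedVar::[var1, var2]".
  inner <- sub("^DerivedVar::\\[(.*)\\]$", "\\1", trimws(variable_start))
  inputs <- trimws(strsplit(inner, ",", fixed = TRUE)[[1]])

  # Function name from "Func::function_name".
  fun <- get(sub("^Func::", "", trimws(rec_end)), mode = "function")

  # Pass the input columns positionally, in the order listed.
  do.call(fun, unname(as.list(data[inputs])))
}

toy <- data.frame(var1 = c(10, 20), var2 = c(2, 5))
derived <- apply_derived(toy, "DerivedVar::[var1, var2]", "Func::toy_ratio")
derived
# 5 4
```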