Raw UKB phenotype data contains encoded column names and values that need to be converted before analysis.
| Source | Column names | Column values |
|---|---|---|
| extract_pheno() | participant.p31 | Raw integer codes; needs decode_values() |
| extract_batch() | p31, p53_i0 | Usually already decoded; decode_values() typically not needed |

Both outputs need decode_names() to convert field ID column names to human-readable snake_case.
Call order matters: when using extract_pheno() output, always run decode_values() before decode_names(), because value decoding relies on the numeric field ID still being present in the column name.

    library(ukbflow)
    df <- extract_pheno(c(31, 54, 20116, 21022))
    df <- decode_values(df)  # 0/1 → "Female"/"Male", etc.
    df <- decode_names(df)   # participant.p31 → sex

decode_values() converts raw integer codes to human-readable labels for categorical fields that have UKB encoding mappings. Continuous, date, text, and already-decoded fields are left unchanged.
It requires two metadata files from the UKB Showcase. Download them once with fetch_metadata(), then point decode_values() to the same directory (its default matches that of fetch_metadata()).
| Column | Raw value | Decoded value |
|---|---|---|
| p31 | 0 / 1 | "Female" / "Male" |
| p54 | 11012 | "Leeds" |
| p20116_i0 | 0 / 1 / 2 | "Never" / "Previous" / "Current" |

Codes absent from the encoding table (including UKB missing codes -1, -3, -7) are returned as NA.
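
As a rough illustration of the lookup behaviour described above, the following base R sketch maps raw codes to labels through a named vector. It does not reproduce ukbflow's implementation: the encoding vector smoking_encoding and the helper decode_with_lookup() are hypothetical names introduced only for this example, and the codes and labels are copied from the p20116_i0 row of the table above.

```r
# Hypothetical encoding for a categorical field (codes and labels as in the
# p20116_i0 row above); an assumption, not the real UKB metadata file.
smoking_encoding <- c("0" = "Never", "1" = "Previous", "2" = "Current")

# Hypothetical helper: look each raw code up by name in the encoding vector.
# Codes without an entry (for example -3) yield NA.
decode_with_lookup <- function(codes, encoding) {
  unname(encoding[as.character(codes)])
}

raw_codes <- c(0L, 2L, 1L, -3L)
decoded <- decode_with_lookup(raw_codes, smoking_encoding)
print(decoded)
```

The code -3 has no entry in the toy encoding, so it comes back as NA, which mirrors the documented treatment of codes absent from the encoding table.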
decode_names() renames columns from field ID format to
snake_case labels using the approved UKB field dictionary available to
your project.
| Raw name | Decoded name |
|---|---|
| participant.eid | eid |
| participant.p31 | sex |
| participant.p21022 | age_at_recruitment |
| participant.p53_i0 | date_of_attending_assessment_centre_i0 |
| p31 | sex |
| p53_i0 | date_of_attending_assessment_centre_i0 |

Both extract_pheno() format (participant.p31) and extract_batch() format (p31) are handled automatically.
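
To make the two name formats concrete, here is a minimal base R sketch that pulls the numeric field ID out of either style. The helper field_id_from_name() and its regular expression are assumptions made for illustration, not part of ukbflow, and the pattern only covers the name shapes shown in the table above.

```r
# Hypothetical helper: extract the numeric field ID from a UKB-style column
# name. Accepts an optional "participant." prefix and an optional "_i<n>"
# instance suffix; returns NA for names that do not fit this pattern.
field_id_from_name <- function(x) {
  pattern <- "^(participant\\.)?p([0-9]+)(_i[0-9]+)?$"
  ids <- ifelse(grepl(pattern, x), sub(pattern, "\\2", x), NA_character_)
  as.integer(ids)
}

ids <- field_id_from_name(c("participant.p31", "p53_i0", "participant.eid", "sex"))
print(ids)
```

A name that has already been converted, such as sex, no longer carries a field ID in this sketch. That is consistent with the advice to run decode_values() before decode_names().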
Some UKB field titles are verbose. Names exceeding max_nchar characters are flagged with a warning (default: 60). Lower the threshold to flag more names:

    df <- decode_names(df, max_nchar = 30)
    #> ! 1 column name longer than 30 characters - consider renaming manually:
    #> • date_of_attending_assessment_centre_i0

Rename such columns manually to something concise.
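
One way to rename a flagged column is plain base R assignment to names(). The sketch below uses a toy data frame with made-up values, and the replacement name assessment_date_i0 is a hypothetical choice rather than anything ukbflow produces.

```r
# Toy data frame standing in for decoded output (values are made up).
df <- data.frame(
  eid = c(1L, 2L),
  date_of_attending_assessment_centre_i0 = as.Date(c("2008-03-01", "2009-07-15"))
)

# List the column names longer than 30 characters.
long_names <- names(df)[nchar(names(df)) > 30]
print(long_names)

# Replace the verbose name with a shorter one (assessment_date_i0 is a
# hypothetical choice for this example).
names(df)[names(df) == "date_of_attending_assessment_centre_i0"] <- "assessment_date_i0"
print(names(df))
```

Matching on the exact old name keeps the rename independent of column position, so it still works if the column order changes between extractions.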
- ?decode_values, ?decode_names
- vignette("extract"): extracting phenotype data