The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Using the Student and Country Data

Introduction

The goal of learningtower is to provide a user-friendly access to a subset of variables from the Programme for International Student Assessment (PISA) data collected by the OECD. The data is collected on a three year basis, between the years 2000-2018.

You can explore more on this dataset for various analysis and statistical computations.

This vignette documents how to access these dataset, and shows a few typical methods to explore the data.

Exploring the student Data

Usage of the student subset data

Below is a quick example of loading the 2018 subset student data.

library(dplyr)
library(ggplot2)
library(learningtower)

#load the subset student data for the year 2018
data(student_subset_2018)
#load the countrycode data
data(countrycode)

glimpse(student_subset_2018)
#> Rows: 4,000
#> Columns: 22
#> Groups: country [80]
#> $ year        <fct> 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018…
#> $ country     <fct> ALB, ALB, ALB, ALB, ALB, ALB, ALB, ALB, ALB, ALB, ALB, ALB…
#> $ school_id   <fct> 800059, 800084, 800093, 800278, 800055, 800279, 800029, 80…
#> $ student_id  <fct> 805376, 802061, 800674, 803561, 801356, 804382, 802763, 80…
#> $ mother_educ <fct> "ISCED 3A", "ISCED 3A", "ISCED 3A", "ISCED 2", "ISCED 3A",…
#> $ father_educ <fct> "ISCED 3A", "ISCED 3B, C", "ISCED 2", "ISCED 2", "ISCED 2"…
#> $ gender      <fct> male, female, male, male, female, male, female, female, fe…
#> $ computer    <fct> yes, yes, yes, yes, NA, yes, yes, yes, yes, yes, yes, no, …
#> $ internet    <fct> yes, no, yes, yes, NA, yes, yes, yes, yes, yes, yes, no, n…
#> $ math        <dbl> 429.666, 435.103, 372.294, 473.772, 441.331, 402.099, 368.…
#> $ read        <dbl> 347.710, 427.958, 271.743, 336.271, 408.466, 402.166, 350.…
#> $ science     <dbl> 273.827, 460.357, 337.150, 368.131, 418.960, 322.163, 342.…
#> $ stu_wgt     <dbl> 9.30357, 3.45567, 3.80952, 1.54402, 2.91673, 3.54291, 6.64…
#> $ desk        <fct> yes, yes, yes, yes, yes, yes, yes, yes, yes, yes, yes, yes…
#> $ room        <fct> yes, no, yes, yes, yes, yes, yes, yes, yes, yes, yes, yes,…
#> $ dishwasher  <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ television  <fct> 2, 2, 2, 2, 2, 3+, 1, 2, 2, 1, 3+, 1, 1, 2, 1, 2, 2, 3+, 1…
#> $ computer_n  <fct> 3+, 1, 1, 1, 0, 1, 1, 2, 1, 1, 3+, 0, 0, 2, 1, 1, 1, 2, 1,…
#> $ car         <fct> 2, 0, 1, 3+, NA, 3+, 1, 1, 1, 0, 2, 0, 2, 1, 1, 1, 1, 1, 1…
#> $ book        <fct> 11-25, 11-25, 11-25, 26-100, 0-10, more than 500, 26-100, …
#> $ wealth      <dbl> 1.2825, -2.4364, -1.1034, -0.5200, -2.9037, -0.1675, -0.96…
#> $ escs        <dbl> 0.7018, 0.2091, -1.1074, -0.4944, -1.9712, -0.1840, -0.602…
student_subset_2018 %>% 
  group_by(country, gender) %>% 
  dplyr::filter(country %in% c("AUS", "QAT", "USA" , "JPN", 
                               "ALB", "PER", "FIN",  "SGP")) %>% 
  dplyr::left_join(countrycode, by = "country") %>% 
  dplyr::mutate(country_name = factor(
    country_name, 
    levels = c("Singapore", "Australia", "Japan", 
                "United States", "Finland", "Albania", "Peru", "Qatar"))) %>% 
  ggplot(aes(x = math,
             y = country_name,
             fill = gender)) +
  geom_boxplot() +
  scale_fill_manual(values = c("#FF7F0EFF", "#1F77B4FF")) +
  theme_classic() +
  labs(x = "Math score", 
       y = "")

Usage of the entire student data

#load the entire student data for the year 2018
student_data_2018 <- load_student(2018)

#load the entire student data for two/three years (2000, 2012, 2018)
student_data_2012_2018 <- load_student(c(2012, 2018))
student_data_2000_2012_2018 <- load_student(c(2000, 2012, 2018))

#load the entire student 
student_data_all <- load_student("all")

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.