
3-way crosstabs

library(pollster)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(knitr)
library(ggplot2)

It’s common to want to view a crosstab of two variables by a third variable, for instance educational attainment by sex and marital status. The function crosstab_3way accomplishes this. Row and cell percents are both supported; column percents are not.

illinois %>%
  # filter for recent years & limited ages
  filter(year > 2009,
         age > 34) %>%
  crosstab_3way(x = sex, y = educ6, z = maritalstatus, weight = weight,
                remove = c("widow/divorced/sep"),
                n = FALSE) %>%
  kable(digits = 0, caption = "Educational attainment by sex and marital status among Illinois residents ages 35+",
        format = "html")
Educational attainment by sex and marital status among Illinois residents ages 35+

sex     maritalstatus  LT HS  HS  Some Col  AA  BA  Post-BA
Male    Married            7  28        16   8  24       17
Male    Never Married     13  35        19  11  15        8
Female  Married            6  28        16  10  24       16
Female  Never Married     11  27        21   8  17       15
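
The call above returns row percents, which appears to be the default. As a minimal sketch, the variant below asks for cell percents instead. It assumes that crosstab_3way takes a pct_type argument accepting "cell", as the two-way crosstab function in this package does; check ?crosstab_3way if the call fails. The object name cell_tab is made up for illustration.

```r
library(pollster)
library(dplyr)

# Same filters as the row-percent example; only pct_type changes.
# Assumption: pct_type = "cell" is accepted by crosstab_3way.
cell_tab <- illinois %>%
  filter(year > 2009,
         age > 34) %>%
  crosstab_3way(x = sex, y = educ6, z = maritalstatus, weight = weight,
                remove = c("widow/divorced/sep"),
                pct_type = "cell",
                n = FALSE)

cell_tab
```

With cell percents the figures no longer sum to 100 across each row, so they should not be read the same way as the row-percent table.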

Three-way crosstabs plot well as small multiples using ggplot facets.

illinois %>%
  # filter for recent years & limited ages
  filter(year > 2009,
         age > 34) %>%
  crosstab_3way(x = sex, y = educ6, z = maritalstatus, weight = weight,
                remove = c("widow/divorced/sep"), 
                format = "long") %>%
  ggplot(aes(educ6, pct, fill = maritalstatus)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  facet_wrap(facets = vars(sex)) +
  labs(title = "Educational attainment by sex and marital status",
       subtitle = "Illinois residents ages 35+") +
  theme(legend.position = "top")

The same plot can also be drawn with margins of error. (See the “crosstabs” vignette for a more detailed discussion of margins of error.)

illinois %>%
  # filter for recent years & limited ages
  filter(year > 2009,
         age > 34) %>%
  moe_crosstab_3way(x = sex, y = educ6, z = maritalstatus, weight = weight,
                    remove = c("widow/divorced/sep"), format = "long") %>%
  ggplot(aes(educ6, pct, fill = maritalstatus)) +
  geom_bar(stat = "identity", position = position_dodge(),
           alpha = 0.5) +
  geom_errorbar(aes(ymin = (pct - moe), ymax = (pct + moe),
                    color = maritalstatus),
                position = position_dodge()) +
  facet_wrap(facets = vars(sex)) +
  labs(title = "Educational attainment by sex and marital status",
       subtitle = "Illinois residents ages 35+",
       caption = "Current Population Survey, 2010-2018") +
  theme(legend.position = "top")
#> Your data includes weights equal to zero. These are removed before calculating the design effect.

Special case, when the z-variable identifies survey waves

If the z-variable in your crosstab uniquely identifies survey waves for which the weights were independently generated, it is best practice to calculate the design effect independently for each wave. moe_wave_crosstab_3way does just that. All of the arguments remain the same as in moe_crosstab_3way.

moe_wave_crosstab_3way(df = illinois, x = sex, y = educ6, z = year, weight = weight)
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Joining with `by = join_by(year)`
#> # A tibble: 144 × 6
#>     year sex    educ6      pct   moe        n
#>    <dbl> <fct>  <fct>    <dbl> <dbl>    <dbl>
#>  1  1996 Male   LT HS    15.1   1.80 3889089.
#>  2  1996 Male   HS       32.5   2.35 3889089.
#>  3  1996 Male   Some Col 20.3   2.02 3889089.
#>  4  1996 Male   AA        6.11  1.20 3889089.
#>  5  1996 Male   BA       17.7   1.91 3889089.
#>  6  1996 Male   Post-BA   8.38  1.39 3889089.
#>  7  1996 Female LT HS    14.2   1.65 4193383.
#>  8  1996 Female HS       34.8   2.25 4193383.
#>  9  1996 Female Some Col 22.8   1.98 4193383.
#> 10  1996 Female AA        6.72  1.18 4193383.
#> # … with 134 more rows
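
To make the per-wave idea concrete, the sketch below computes a design effect separately for each wave on a small made-up data frame, and once more on the pooled data for comparison. The data, the column names wave and w, and the use of Kish's approximation (deff = n * sum(w^2) / sum(w)^2) are all assumptions for illustration; this vignette does not state which formula moe_wave_crosstab_3way uses internally.

```r
library(dplyr)

# Hypothetical toy data: two waves whose weights were generated independently
toy <- tibble(
  wave = c(1, 1, 1, 1, 2, 2, 2, 2),
  w    = c(1, 1, 1, 1, 0.5, 0.5, 1.5, 1.5)
)

# Kish's approximation, computed within each wave
deff_by_wave <- toy %>%
  filter(w > 0) %>%                 # zero weights are dropped first
  group_by(wave) %>%
  summarise(deff = n() * sum(w^2) / sum(w)^2)

# The same formula applied to all rows at once, ignoring waves
deff_pooled <- with(filter(toy, w > 0), length(w) * sum(w^2) / sum(w)^2)

deff_by_wave   # wave 1: 1, wave 2: 1.25
deff_pooled    # 1.125
```

In this toy case the pooled value understates the design effect for wave 2 and overstates it for wave 1, which is the kind of distortion a per-wave calculation avoids.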
