Adding new segregation indices is not much trouble. Please open an issue on GitHub to request that an index be added.
If you use the dplyr package, one pattern that works well is to use group_modify. Here, we compute the pairwise Black-White dissimilarity index for each state separately:
library("segregation")
library("dplyr")

schools00 %>%
  filter(race %in% c("black", "white")) %>%
  group_by(state) %>%
  group_modify(~ dissimilarity(
    data = .x,
    group = "race",
    unit = "school",
    weight = "n"
  ))
#> # A tibble: 3 × 3
#> # Groups:   state [3]
#>   state stat    est
#>   <fct> <chr> <dbl>
#> 1 A     D     0.706
#> 2 B     D     0.655
#> 3 C     D     0.704
A similar pattern also works well with data.table:
library("data.table")

schools00 <- as.data.table(schools00)
schools00[
  race %in% c("black", "white"),
  dissimilarity(data = .SD, group = "race", unit = "school", weight = "n"),
  by = .(state)
]
#>     state   stat       est
#>    <fctr> <char>     <num>
#> 1:      A      D 0.7063595
#> 2:      B      D 0.6548485
#> 3:      C      D 0.7042057
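For intuition about what both snippets compute per state, here is a minimal base R sketch of the same split-apply-combine idea. It does not use the segregation package: the data frame `toy` and the helper `dissimilarity_by_hand` are hypothetical names introduced only for illustration, and the helper implements the standard two-group formula D = 0.5 * sum(|b_i / B - w_i / W|).

```r
# Hypothetical toy data: two states with two schools each, counts by race.
toy <- data.frame(
  state  = c("A", "A", "A", "A", "B", "B", "B", "B"),
  school = c("A1", "A1", "A2", "A2", "B1", "B1", "B2", "B2"),
  race   = rep(c("black", "white"), times = 4),
  n      = c(90, 10, 10, 90, 50, 50, 50, 50)
)

# Two-group dissimilarity index computed by hand (illustrative helper,
# not part of the segregation package).
dissimilarity_by_hand <- function(df) {
  counts <- xtabs(n ~ school + race, data = df)
  shares <- prop.table(counts, margin = 2) # each race column sums to 1
  0.5 * sum(abs(shares[, "black"] - shares[, "white"]))
}

# Split by state, apply the helper to each piece, combine into a named vector.
d_by_state <- sapply(split(toy, toy$state), dissimilarity_by_hand)
d_by_state
```

In this toy data, state A is strongly segregated (D = 0.8), while state B has identical racial compositions in both schools (D = 0).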
To compute many decompositions at once, it’s easiest to combine the data for the two time points. For instance, here’s a dplyr solution to decompose the state-specific M indices between 2000 and 2005:
# helper function for decomposition
diff <- function(df, group) {
  data1 <- filter(df, year == 2000)
  data2 <- filter(df, year == 2005)
  mutual_difference(data1, data2, group = "race", unit = "school", weight = "n")
}

# add year indicators
schools00$year <- 2000
schools05$year <- 2005
combine <- bind_rows(schools00, schools05)

combine %>%
  group_by(state) %>%
  group_modify(diff) %>%
  head(5)
#> # A tibble: 5 × 3
#> # Groups:   state [1]
#>   state stat          est
#>   <fct> <chr>       <dbl>
#> 1 A     M1         0.409
#> 2 A     M2         0.445
#> 3 A     diff       0.0359
#> 4 A     additions -0.0159
#> 5 A     removals   0.0390
Again, here’s also a data.table solution:
setDT(combine)
combine[, diff(.SD), by = .(state)] %>% head(5)
#>     state      stat         est
#>    <fctr>    <char>       <num>
#> 1:      A        M1  0.40859652
#> 2:      A        M2  0.44454379
#> 3:      A      diff  0.03594727
#> 4:      A additions -0.01585879
#> 5:      A  removals  0.03903106
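To make the M1, M2 and diff rows above more concrete, the following base R sketch computes the M index (the mutual information between unit and group) for two time points and takes the difference. It is an illustration under stated assumptions, not the package's implementation: `m_index`, `toy00` and `toy05` are hypothetical names, the natural logarithm is assumed, and the further decomposition into additions, removals and other terms is not reproduced.

```r
# Illustrative helper: M = sum over cells of p_ug * log(p_ug / (p_u * p_g)).
m_index <- function(df) {
  p <- prop.table(xtabs(n ~ school + race, data = df)) # joint proportions
  expected <- outer(rowSums(p), colSums(p))            # under independence
  nz <- p > 0                                          # skip empty cells
  sum(p[nz] * log(p[nz] / expected[nz]))
}

# Hypothetical toy data for two time points.
toy00 <- data.frame(
  school = c("s1", "s1", "s2", "s2"),
  race   = c("black", "white", "black", "white"),
  n      = c(90, 10, 10, 90)
)
toy05 <- transform(toy00, n = c(50, 50, 50, 50))

m1 <- m_index(toy00)
m2 <- m_index(toy05)
c(M1 = m1, M2 = m2, diff = m2 - m1)
```

Here the second time point is perfectly integrated, so M2 is zero and the difference equals the negative of M1.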
How can I use tidycensus to compute segregation indices?
Here are a few examples, thanks to Kyle Walker, the author of the tidycensus package.
First, download the data:
library("tidycensus")

cook_data <- get_acs(
  geography = "tract",
  variables = c(
    white = "B03002_003",
    black = "B03002_004",
    asian = "B03002_006",
    hispanic = "B03002_012"
  ),
  state = "IL",
  county = "Cook"
)
#> Getting data from the 2017-2021 5-year ACS
Because this data is in “long” format, it’s easy to compute segregation indices:
# compute index of dissimilarity
cook_data %>%
  filter(variable %in% c("black", "white")) %>%
  dissimilarity(
    group = "variable",
    unit = "GEOID",
    weight = "estimate"
  )
#>      stat       est
#>    <char>     <num>
#> 1:      D 0.7855711

# compute multigroup M/H indices
cook_data %>%
  mutual_total(
    group = "variable",
    unit = "GEOID",
    weight = "estimate"
  )
#>      stat       est
#>    <char>     <num>
#> 1:      M 0.5114435
#> 2:      H 0.4089561
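The "long" layout mentioned above has one row per tract and group, with the count in a single column. If your counts instead arrive in a wide table with one column per group, a base R reshaping sketch could look like the following. The data frame `wide` and its values are hypothetical; the column names GEOID, variable and estimate simply mirror those used in the calls above.

```r
# Hypothetical wide table: one row per tract, one column per group.
wide <- data.frame(
  GEOID = c("t1", "t2", "t3"),
  white = c(100, 20, 50),
  black = c(10, 80, 50)
)

groups <- c("white", "black")

# Stack the group columns so that each row is one tract-group count.
long <- data.frame(
  GEOID    = rep(wide$GEOID, times = length(groups)),
  variable = rep(groups, each = nrow(wide)),
  estimate = unlist(wide[groups], use.names = FALSE)
)
long
```

The resulting `long` table has six rows (three tracts times two groups) and can be passed to the index functions with group = "variable", unit = "GEOID" and weight = "estimate", as in the examples above.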
Producing a map of local segregation scores is also not hard:
library("tigris")
library("ggplot2")

local_seg <- mutual_local(cook_data,
  group = "variable",
  unit = "GEOID",
  weight = "estimate",
  wide = TRUE
)

# download shapefile
seg_geom <- tracts("IL", "Cook", cb = TRUE, progress_bar = FALSE) %>%
  left_join(local_seg, by = "GEOID")
#> Retrieving data for the year 2021

ggplot(seg_geom, aes(fill = ls)) +
  geom_sf(color = NA) +
  coord_sf(crs = 3435) +
  scale_fill_viridis_c() +
  theme_void() +
  labs(
    title = "Local segregation scores for Cook County, IL",
    fill = NULL
  )
When using mutual_difference, supply method = "shapley_detailed" to obtain two margins-adjusted local segregation scores: one from adjusting forward, the other from adjusting backward. Averaging them yields a single margins-adjusted local segregation score:
diff <- mutual_difference(schools00, schools05, "race", "school",
  weight = "n", method = "shapley_detailed"
)

diff[stat %in% c("ls_diff1", "ls_diff2"),
  .(ls_diff_adjusted = mean(est)),
  by = .(school)
]
#>       school ls_diff_adjusted
#>       <fctr>            <num>
#>    1:   A1_3     -0.088983164
#>    2:   A2_2     -0.044338042
#>    3:   A2_3     -0.101696519
#>    4:   A2_4     -0.020134162
#>    5:   A2_6     -0.138567163
#>   ---
#> 1706: C164_2     -0.031329845
#> 1707: C165_1     -0.023978101
#> 1708: C165_3      0.003781632
#> 1709: C166_1      0.010270713
#> 1710: C167_1     -0.002663687
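The same averaging step can be written without data.table. The sketch below uses base R on a small hypothetical table, `toy_diff`, that only imitates the school/stat/est layout shown above; the values and the "other" row label are made up for illustration.

```r
# Hypothetical stand-in for the detailed output: one row per school and stat.
toy_diff <- data.frame(
  school = c("s1", "s1", "s1", "s2", "s2", "s2"),
  stat   = c("ls_diff1", "ls_diff2", "other", "ls_diff1", "ls_diff2", "other"),
  est    = c(-0.10, -0.06, 9, 0.02, 0.04, 9)
)

# Keep only the forward- and backward-adjusted scores, then average per school.
keep <- toy_diff$stat %in% c("ls_diff1", "ls_diff2")
adjusted <- aggregate(est ~ school, data = toy_diff[keep, ], FUN = mean)
names(adjusted)[2] <- "ls_diff_adjusted"
adjusted
```

Rows with any other stat label are dropped before averaging, so each school's result is the mean of exactly its two adjusted scores.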