The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
Start with the necessary packages for the vignette.
loadedPackages <- c('dplyr', 'ggplot2', 'ndi', 'sf', 'tidycensus', 'tigris')
invisible(lapply(loadedPackages, library, character.only = TRUE))
options(tigris_use_cache = TRUE)
Set your U.S. Census Bureau access key. Follow this link to
obtain one. Specify your access key in the messer()
or
powell_wiley()
functions using the key
argument of the get_acs()
function from the tidycensus
package called within each or by using the census_api_key()
function from the tidycensus
package before running the messer()
or
powell_wiley()
functions (see an example of the latter
below).
Compute the NDI (Messer) values (2006-2010 5-year ACS) for Georgia, U.S.A., census tracts. This metric is based on Messer et al. (2006) with the following socio-economic status (SES) variables:
Characteristic | SES dimension | ACS table source | Description |
---|---|---|---|
OCC | Occupation | C24030 | Percent males in management, science, and arts occupation |
CWD | Housing | B25014 | Percent of crowded housing |
POV | Poverty | B17017 | Percent of households in poverty |
FHH | Poverty | B25115 | Percent of female headed households with dependents |
PUB | Poverty | B19058 | Percent of households on public assistance |
U30 | Poverty | B19001 | Percent households earning <$30,000 per year |
EDU | Education | B06009 | Percent earning less than a high school education |
EMP | Employment | B23001 (2010 only); B23025 (2011 onward) | Percent unemployed |
One output from the messer()
function is a tibble
containing the identification, geographic name, NDI (Messer)
values, and raw census characteristics for each tract.
## # A tibble: 1,969 × 14
## GEOID state county tract NDI NDIQuart OCC CWD POV FHH PUB U30
## <chr> <chr> <chr> <chr> <dbl> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1300… Geor… Appli… 9501 -0.0075 2-Below… 0 0 0.1 0.1 0.1 0.3
## 2 1300… Geor… Appli… 9502 0.0458 4-Most … 0 0 0.3 0.1 0.2 0.5
## 3 1300… Geor… Appli… 9503 0.0269 3-Above… 0 0 0.2 0 0.2 0.4
## 4 1300… Geor… Appli… 9504 -0.0083 2-Below… 0 0 0.1 0 0.1 0.3
## 5 1300… Geor… Appli… 9505 0.0231 3-Above… 0 0 0.2 0 0.2 0.4
## 6 1300… Geor… Atkin… 9601 0.0619 4-Most … 0 0.1 0.2 0.2 0.2 0.5
## 7 1300… Geor… Atkin… 9602 0.0593 4-Most … 0 0.1 0.3 0.1 0.2 0.4
## 8 1300… Geor… Atkin… 9603 0.0252 3-Above… 0 0 0.3 0.1 0.2 0.4
## 9 1300… Geor… Bacon… 9701 0.0061 3-Above… 0 0 0.2 0 0.2 0.4
## 10 1300… Geor… Bacon… 9702… 0.0121 3-Above… 0 0 0.2 0.1 0.1 0.5
## # ℹ 1,959 more rows
## # ℹ 2 more variables: EDU <dbl>, EMP <dbl>
A second output from the messer()
function is the
results from the principal component analysis used to compute the
NDI (Messer) values.
## Principal Components Analysis
## Call: psych::principal(r = ndi_data_pca, nfactors = 1, n.obs = nrow(ndi_data_pca),
## covar = FALSE, scores = TRUE, missing = imp)
## Standardized loadings (pattern matrix) based upon correlation matrix
## PC1 h2 u2 com
## OCC -0.59 0.35 0.65 1
## CWD 0.47 0.22 0.78 1
## POV 0.87 0.76 0.24 1
## FHH 0.67 0.45 0.55 1
## PUB 0.89 0.79 0.21 1
## U30 0.90 0.81 0.19 1
## EDU 0.79 0.62 0.38 1
## EMP 0.46 0.21 0.79 1
##
## PC1
## SS loadings 4.21
## Proportion Var 0.53
##
## Mean item complexity = 1
## Test of the hypothesis that 1 component is sufficient.
##
## The root mean square of the residuals (RMSR) is 0.11
## with the empirical chi square 1221.09 with prob < 2.3e-246
##
## Fit based upon off diagonal values = 0.95
A third output from the messer()
function is a tibble
containing a breakdown of the missingness of the census characteristics
used to compute the NDI (Messer) values.
## # A tibble: 8 × 4
## variable total n_missing percent_missing
## <chr> <int> <int> <chr>
## 1 CWD 1969 14 0.71 %
## 2 EDU 1969 13 0.66 %
## 3 EMP 1969 13 0.66 %
## 4 FHH 1969 14 0.71 %
## 5 OCC 1969 15 0.76 %
## 6 POV 1969 14 0.71 %
## 7 PUB 1969 14 0.71 %
## 8 U30 1969 14 0.71 %
We can visualize the NDI (Messer) values geographically by linking them to spatial information from the tigris package and plotting with the ggplot2 package suite.
# Obtain the 2010 counties from the 'tigris' package
county2010GA <- counties(state = 'GA', year = 2010, cb = TRUE)
# Remove first 9 characters from GEOID for compatibility with tigris information
county2010GA$GEOID <- substring(county2010GA$GEO_ID, 10)
# Obtain the 2010 census tracts from the 'tigris' package
tract2010GA <- tracts(state = 'GA', year = 2010, cb = TRUE)
# Remove first 9 characters from GEOID for compatibility with tigris information
tract2010GA$GEOID <- substring(tract2010GA$GEO_ID, 10)
# Join the NDI (Messer) values to the census tract geometry
GA2010messer <- tract2010GA %>%
left_join(messer2010GA$ndi, by = 'GEOID')
# Visualize the NDI (Messer) values (2006-2010 5-year ACS) for Georgia, U.S.A., census tracts
## Continuous Index
ggplot() +
geom_sf(
data = GA2010messer,
aes(fill = NDI),
size = 0.05,
color = 'transparent'
) +
geom_sf(
data = county2010GA,
fill = 'transparent',
color = 'white',
size = 0.10
) +
theme_minimal() +
scale_fill_viridis_c() +
labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2006-2010 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Messer)',
subtitle = 'GA census tracts as the referent'
)
## Categorical Index
### Rename '9-NDI not avail' level as NA for plotting
GA2010messer$NDIQuartNA <-
factor(
replace(
as.character(GA2010messer$NDIQuart),
GA2010messer$NDIQuart == '9-NDI not avail',
NA
),
c(levels(GA2010messer$NDIQuart)[-5], NA)
)
ggplot() +
geom_sf(
data = GA2010messer,
aes(fill = NDIQuartNA),
size = 0.05,
color = 'transparent'
) +
geom_sf(
data = county2010GA,
fill = 'transparent',
color = 'white',
size = 0.10
) +
theme_minimal() +
scale_fill_viridis_d(guide = guide_legend(reverse = TRUE), na.value = 'grey80') +
labs(fill = 'Index (Categorical)', caption = 'Source: U.S. Census ACS 2006-2010 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Messer) Quartiles',
subtitle = 'GA census tracts as the referent'
)
The results above are at the tract level. The NDI (Messer) values can also be calculated at the county level.
messer2010GA_county <- messer(geo = 'county', state = 'GA', year = 2010)
# Join the NDI (Messer) values to the county geometry
GA2010messer_county <- county2010GA %>%
left_join(messer2010GA_county$ndi, by = 'GEOID')
# Visualize the NDI (Messer) values (2006-2010 5-year ACS) for Georgia, U.S.A., counties
## Continuous Index
ggplot() +
geom_sf(
data = GA2010messer_county,
aes(fill = NDI),
size = 0.10,
color = 'white'
) +
theme_minimal() +
scale_fill_viridis_c() +
labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2006-2010 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Messer)',
subtitle = 'GA counties as the referent'
)
## Categorical Index
### Rename '9-NDI not avail' level as NA for plotting
GA2010messer_county$NDIQuartNA <-
factor(
replace(
as.character(GA2010messer_county$NDIQuart),
GA2010messer_county$NDIQuart == '9-NDI not avail',
NA
),
c(levels(GA2010messer_county$NDIQuart)[-5], NA)
)
ggplot() +
geom_sf(
data = GA2010messer_county,
aes(fill = NDIQuartNA),
size = 0.10,
color = 'white'
) +
theme_minimal() +
scale_fill_viridis_d(guide = guide_legend(reverse = TRUE), na.value = 'grey80') +
labs(fill = 'Index (Categorical)', caption = 'Source: U.S. Census ACS 2006-2010 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Messer) Quartiles',
subtitle = 'GA counties as the referent'
)
Compute the NDI (Powell-Wiley) values (2016-2020 5-year ACS) for Maryland, Virginia, Washington, D.C., and West Virginia, U.S.A., census tracts. This metric is based on Andrews et al. (2020) and Slotman et al. (2022) with socio-economic status (SES) variables chosen by Roux and Mair (2010):
Characteristic | SES dimension | ACS table source | Description |
---|---|---|---|
MedHHInc | Wealth and income | B19013 | Median household income (dollars) |
PctRecvIDR | Wealth and income | B19054 | Percent of households receiving dividends, interest, or rental income |
PctPubAsst | Wealth and income | B19058 | Percent of households receiving public assistance |
MedHomeVal | Wealth and income | B25077 | Median home value (dollars) |
PctMgmtBusSciArt | Occupation | C24060 | Percent in a management, business, science, or arts occupation |
PctFemHeadKids | Occupation | B11005 | Percent of households that are female headed with any children under 18 years |
PctOwnerOcc | Housing conditions | DP04 | Percent of housing units that are owner occupied |
PctNoPhone | Housing conditions | DP04 | Percent of households without a telephone |
PctNComPlmb | Housing conditions | DP04 | Percent of households without complete plumbing facilities |
PctEducHSPlus | Education | S1501 | Percent with a high school degree or higher (population 25 years and over) |
PctEducBchPlus | Education | S1501 | Percent with a college degree or higher (population 25 years and over) |
PctFamBelowPov | Wealth and income | S1702 | Percent of families with incomes below the poverty level |
PctUnempl | Occupation | S2301 | Percent unemployed |
More information about the codebook and computation of the NDI (Powell-Wiley) can be found on a GIS Portal for Cancer Research website.
powell_wiley2020DMVW <- powell_wiley(
state = c('DC', 'MD', 'VA', 'WV'),
year = 2020,
round_output = TRUE
)
One output from the powell_wiley()
function is a tibble
containing the identification, geographic name, NDI
(Powell-Wiley) values, and raw census characteristics for each
tract.
## # A tibble: 4,425 × 20
## GEOID state county tract NDI NDIQuint MedHHInc PctRecvIDR PctPubAsst
## <chr> <chr> <chr> <chr> <dbl> <fct> <dbl> <dbl> <dbl>
## 1 11001000101 Distr… Distr… 1.01 -2.13 1-Least… 187839 50.9 0.8
## 2 11001000102 Distr… Distr… 1.02 -2.46 1-Least… 184167 52.2 0.6
## 3 11001000201 Distr… Distr… 2.01 NA 9-NDI n… NA NaN NaN
## 4 11001000202 Distr… Distr… 2.02 -2.30 1-Least… 164261 49.6 0.9
## 5 11001000300 Distr… Distr… 3 -2.06 1-Least… 156483 46 0.6
## 6 11001000400 Distr… Distr… 4 -2.09 1-Least… 153397 47.8 0
## 7 11001000501 Distr… Distr… 5.01 -2.11 1-Least… 119911 44.5 0.8
## 8 11001000502 Distr… Distr… 5.02 -2.21 1-Least… 153264 46.8 0.5
## 9 11001000600 Distr… Distr… 6 -2.16 1-Least… 154266 60.8 7.4
## 10 11001000702 Distr… Distr… 7.02 -1.20 1-Least… 71747 22.9 0
## # ℹ 4,415 more rows
## # ℹ 11 more variables: MedHomeVal <dbl>, PctMgmtBusScArti <dbl>,
## # PctFemHeadKids <dbl>, PctOwnerOcc <dbl>, PctNoPhone <dbl>,
## # PctNComPlmb <dbl>, PctEducHSPlus <dbl>, PctEducBchPlus <dbl>,
## # PctFamBelowPov <dbl>, PctUnempl <dbl>, TotalPop <dbl>
A second output from the powell_wiley()
function is the
results from the principal component analysis used to compute the
NDI (Powell-Wiley) values.
## $loadings
##
## Loadings:
## PC1 PC2 PC3
## logMedHHInc -0.638 -0.364
## PctNoIDRZ 0.612 0.319
## PctPubAsstZ 0.379 0.615
## logMedHomeVal -0.893
## PctWorkClassZ 0.974
## PctFemHeadKidsZ 0.128 0.697 -0.233
## PctNotOwnerOccZ -0.375 0.923
## PctNoPhoneZ 0.329 0.524
## PctNComPlmbZ -0.141 0.869
## PctEducLTHSZ 0.642 0.164
## PctEducLTBchZ 1.020 -0.121
## PctFamBelowPovZ 0.219 0.698
## PctUnemplZ 0.596
##
## PC1 PC2 PC3
## SS loadings 4.340 2.971 1.102
## Proportion Var 0.334 0.229 0.085
## Cumulative Var 0.334 0.562 0.647
##
## $rotmat
## [,1] [,2] [,3]
## [1,] 0.68516447 0.4403017 0.04517229
## [2,] -0.95446519 1.0806634 0.03196818
## [3,] -0.09078904 -0.2028145 1.02053725
##
## $rotation
## [1] "promax"
##
## $Phi
## [,1] [,2] [,3]
## [1,] 1.0000000 0.5250277 0.1649550
## [2,] 0.5250277 1.0000000 0.1923867
## [3,] 0.1649550 0.1923867 1.0000000
##
## $Structure
## [,1] [,2] [,3]
## [1,] -0.8432264 -0.71542812 -0.26037289
## [2,] 0.7750210 0.63477178 0.13357439
## [3,] 0.7001489 0.81237803 0.17181801
## [4,] -0.8931597 -0.46211477 -0.19777858
## [5,] 0.9253217 0.42037509 0.12424330
## [6,] 0.4559503 0.71962940 -0.07780352
## [7,] 0.1195748 0.73799969 0.17788881
## [8,] 0.2552300 0.42753150 0.58693047
## [9,] 0.1155170 0.05008712 0.84948765
## [10,] 0.7325510 0.50620197 0.16403994
## [11,] 0.9557341 0.41335682 0.13787215
## [12,] 0.5881663 0.81650181 0.18717861
## [13,] 0.3841044 0.63036827 0.09925687
##
## $communality
## [1] 0.5468870 0.4774810 0.5220533 0.8012746 0.9576793 0.5566938 0.9968490
## [8] 0.3829550 0.7774209 0.4398519 1.0560478 0.5359573 0.3616523
##
## $uniqueness
## logMedHHInc PctNoIDRZ PctPubAsstZ logMedHomeVal PctWorkClassZ
## 0.453113047 0.522518997 0.477946655 0.198725386 0.042320711
## PctFemHeadKidsZ PctNotOwnerOccZ PctNoPhoneZ PctNComPlmbZ PctEducLTHSZ
## 0.443306153 0.003150984 0.617045044 0.222579117 0.560148142
## PctEducLTBchZ PctFamBelowPovZ PctUnemplZ
## -0.056047809 0.464042693 0.638347726
##
## $Vaccounted
## [,1] [,2] [,3]
## SS loadings 4.5987837 3.2131788 1.10386926
## Proportion Var 0.3537526 0.2471676 0.08491302
## Cumulative Var 0.3537526 0.6009202 0.68583321
## Proportion Explained 0.5157997 0.3603902 0.12381002
## Cumulative Proportion 0.5157997 0.8761900 1.00000000
A third output from the powell_wiley()
function is a
tibble containing a breakdown of the missingness of the census
characteristics used to compute the NDI (Powell-Wiley)
values.
## # A tibble: 13 × 4
## variable total n_missing percent_missing
## <chr> <int> <int> <chr>
## 1 PctEducLTBchZ 4425 47 1.06 %
## 2 PctEducLTHSZ 4425 47 1.06 %
## 3 PctFamBelowPovZ 4425 63 1.42 %
## 4 PctFemHeadKidsZ 4425 60 1.36 %
## 5 PctNComPlmbZ 4425 60 1.36 %
## 6 PctNoIDRZ 4425 60 1.36 %
## 7 PctNoPhoneZ 4425 60 1.36 %
## 8 PctNotOwnerOccZ 4425 60 1.36 %
## 9 PctPubAsstZ 4425 60 1.36 %
## 10 PctUnemplZ 4425 57 1.29 %
## 11 PctWorkClassZ 4425 57 1.29 %
## 12 logMedHHInc 4425 73 1.65 %
## 13 logMedHomeVal 4425 148 3.34 %
A fourth output from the powell_wiley()
function is a
character string or numeric value of a standardized Cronbach’s alpha. A
value greater than 0.7 is desired.
## [1] 0.9321693
We can visualize the NDI (Powell-Wiley) values geographically by linking them to spatial information from the tigris package and plotting with the ggplot2 package suite.
# Obtain the 2020 counties from the 'tigris' package
county2020 <- counties(cb = TRUE)
county2020DMVW <- county2020[county2020$STUSPS %in% c('DC', 'MD', 'VA', 'WV'), ]
# Obtain the 2020 census tracts from the 'tigris' package
tract2020D <- tracts(state = 'DC', year = 2020, cb = TRUE)
tract2020M <- tracts(state = 'MD', year = 2020, cb = TRUE)
tract2020V <- tracts(state = 'VA', year = 2020, cb = TRUE)
tract2020W <- tracts(state = 'WV', year = 2020, cb = TRUE)
tracts2020DMVW <- rbind(tract2020D, tract2020M, tract2020V, tract2020W)
# Join the NDI (Powell-Wiley) values to the census tract geometry
DMVW2020pw <- tracts2020DMVW %>%
left_join(powell_wiley2020DMVW$ndi, by = 'GEOID')
# Visualize the NDI (Powell-Wiley) values (2016-2020 5-year ACS)
## Maryland, Virginia, Washington, D.C., and West Virginia, U.S.A., census tracts
## Continuous Index
ggplot() +
geom_sf(
data = DMVW2020pw,
aes(fill = NDI),
color = NA
) +
geom_sf(
data = county2020DMVW,
fill = 'transparent',
color = 'white'
) +
theme_minimal() +
scale_fill_viridis_c(na.value = 'grey80') +
labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2016-2020 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Powell-Wiley)',
subtitle = 'DC, MD, VA, and WV tracts as the referent'
)
## Categorical Index (Population-weighted quintiles)
### Rename '9-NDI not avail' level as NA for plotting
DMVW2020pw$NDIQuintNA <-
factor(replace(
as.character(DMVW2020pw$NDIQuint),
DMVW2020pw$NDIQuint == '9-NDI not avail',
NA
),
c(levels(DMVW2020pw$NDIQuint)[-6], NA))
ggplot() +
geom_sf(data = DMVW2020pw, aes(fill = NDIQuintNA), color = NA) +
geom_sf(data = county2020DMVW, fill = 'transparent', color = 'white') +
theme_minimal() +
scale_fill_viridis_d(guide = guide_legend(reverse = TRUE), na.value = 'grey80') +
labs(fill = 'Index (Categorical)', caption = 'Source: U.S. Census ACS 2016-2020 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Powell-Wiley) Population-weighted Quintiles',
subtitle = 'DC, MD, VA, and WV tracts as the referent'
)
Like the NDI (Messer), we also compute county-level NDI (Powell-Wiley).
# Obtain the 2020 counties from the 'tigris' package
county2020DMVW <- counties(state = c('DC', 'MD', 'VA', 'WV'), year = 2020, cb = TRUE)
# NDI (Powell-Wiley) at the county level (2016-2020)
powell_wiley2020DMVW_county <- powell_wiley(
geo = 'county',
state = c('DC', 'MD', 'VA', 'WV'),
year = 2020
)
# Join the NDI (Powell-Wiley) values to the county geometry
DMVW2020pw_county <- county2020DMVW %>%
left_join(powell_wiley2020DMVW_county$ndi, by = 'GEOID')
# Visualize the NDI (Powell-Wiley) values (2016-2020 5-year ACS)
## Maryland, Virginia, Washington, D.C., and West Virginia, U.S.A., counties
## Continuous Index
ggplot() +
geom_sf(
data = DMVW2020pw_county,
aes(fill = NDI),
size = 0.10,
color = 'white'
) +
theme_minimal() +
scale_fill_viridis_c() +
labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2016-2020 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Powell-Wiley)',
subtitle = 'DC, MD, VA, and WV counties as the referent'
)
## Categorical Index
### Rename '9-NDI not avail' level as NA for plotting
DMVW2020pw_county$NDIQuintNA <-
factor(
replace(
as.character(DMVW2020pw_county$NDIQuint),
DMVW2020pw_county$NDIQuint == '9-NDI not avail',
NA
),
c(levels(DMVW2020pw_county$NDIQuint)[-6], NA)
)
ggplot() +
geom_sf(
data = DMVW2020pw_county,
aes(fill = NDIQuint),
size = 0.10,
color = 'white'
) +
theme_minimal() +
scale_fill_viridis_d(guide = guide_legend(reverse = TRUE), na.value = 'grey80') +
labs(fill = 'Index (Categorical)', caption = 'Source: U.S. Census ACS 2016-2020 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Powell-Wiley) Population-weighted Quintiles',
subtitle = 'DC, MD, VA, and WV counties as the referent'
)
In the messer()
and powell_wiley()
functions, missing census characteristics can be imputed using the
missing
and impute
arguments of the
pca()
function in the psych
package called within the messer()
and
powell_wiley()
functions. Impute values using the logical
imp
argument (currently only calls
impute = 'median'
by default, which assigns the median
values of each missing census variable for a geography).
powell_wiley2020DC <- powell_wiley(state = 'DC', year = 2020) # without imputation
powell_wiley2020DCi <- powell_wiley(state = 'DC', year = 2020, imp = TRUE) # with imputation
table(is.na(powell_wiley2020DC$ndi$NDI)) # n=13 tracts without NDI (Powell-Wiley) values
table(is.na(powell_wiley2020DCi$ndi$NDI)) # n=0 tracts without NDI (Powell-Wiley) values
# Obtain the 2020 census tracts from the 'tigris' package
tract2020DC <- tracts(state = 'DC', year = 2020, cb = TRUE)
# Join the NDI (Powell-Wiley) values to the census tract geometry
DC2020pw <- tract2020DC %>%
left_join(powell_wiley2020DC$ndi, by = 'GEOID')
DC2020pw <- DC2020pw %>%
left_join(powell_wiley2020DCi$ndi, by = 'GEOID', suffix = c('_nonimp', '_imp'))
# Visualize the NDI (Powell-Wiley) values (2016-2020 5-year ACS) for
## Washington, D.C., census tracts
## Continuous Index
ggplot() +
geom_sf(
data = DC2020pw,
aes(fill = NDI_nonimp),
size = 0.10,
color = 'white'
) +
theme_minimal() +
scale_fill_viridis_c() +
labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2016-2020 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Powell-Wiley), Non-Imputed',
subtitle = 'DC census tracts as the referent'
)
ggplot() +
geom_sf(
data = DC2020pw,
aes(fill = NDI_imp),
size = 0.10,
color = 'white'
) +
theme_minimal() +
scale_fill_viridis_c() +
labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2016-2020 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Powell-Wiley), Imputed',
subtitle = 'DC census tracts as the referent'
)
## Categorical Index
### Rename '9-NDI not avail' level as NA for plotting
DC2020pw$NDIQuintNA_nonimp <-
factor(
replace(
as.character(DC2020pw$NDIQuint_nonimp),
DC2020pw$NDIQuint_nonimp == '9-NDI not avail',
NA
),
c(levels(DC2020pw$NDIQuint_nonimp)[-6], NA)
)
ggplot() +
geom_sf(
data = DC2020pw,
aes(fill = NDIQuintNA_nonimp),
size = 0.10,
color = 'white'
) +
theme_minimal() +
scale_fill_viridis_d(guide = guide_legend(reverse = TRUE), na.value = 'grey80') +
labs(fill = 'Index (Categorical)', caption = 'Source: U.S. Census ACS 2016-2020 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Powell-Wiley) Quintiles, Non-Imputed',
subtitle = 'DC census tracts as the referent'
)
### Rename '9-NDI not avail' level as NA for plotting
DC2020pw$NDIQuintNA_imp <-
factor(
replace(
as.character(DC2020pw$NDIQuint_imp),
DC2020pw$NDIQuint_imp == '9-NDI not avail',
NA
),
c(levels(DC2020pw$NDIQuint_imp)[-6], NA)
)
ggplot() +
geom_sf(
data = DC2020pw,
aes(fill = NDIQuintNA_imp),
size = 0.10,
color = 'white'
) +
theme_minimal() +
scale_fill_viridis_d(guide = guide_legend(reverse = TRUE), na.value = 'grey80') +
labs(fill = 'Index (Categorical)', caption = 'Source: U.S. Census ACS 2016-2020 estimates') +
ggtitle(
'Neighborhood Deprivation Index (Powell-Wiley) Quintiles, Imputed',
subtitle = 'DC census tracts as the referent'
)
To conduct a contiguous US-standardized index, compute an
NDI for all states as in the example below that replicates the
nationally standardized NDI (Powell-Wiley) values (2013-2017
ACS-5) found in Slotman et
al. (2022) and available from a GIS Portal for
Cancer Research website. To replicate the nationally standardized
NDI (Powell-Wiley) values (2006-2010 ACS-5) found in Andrews et
al. (2020) change the year
argument to
2010
(i.e., year = 2010
).
us <- states()
n51 <- c(
'Commonwealth of the Northern Mariana Islands',
'Guam',
'American Samoa',
'Puerto Rico',
'United States Virgin Islands'
)
y51 <- us$STUSPS[!(us$NAME %in% n51)]
start_time <- Sys.time() # record start time
powell_wiley2017US <- powell_wiley(state = y51, year = 2017)
end_time <- Sys.time() # record end time
time_srr <- end_time - start_time # Calculate run time
ggplot(powell_wiley2017US$ndi, aes(x = NDI)) +
geom_histogram(color = 'black', fill = 'white') +
theme_minimal() +
ggtitle(
'Histogram of US-standardized NDI (Powell-Wiley) values (2013-2017)',
subtitle = 'U.S. census tracts as the referent (including AK, HI, and DC)'
)
The process to compute a US-standardized NDI (Powell-Wiley) took about 12.7 minutes to run on a machine with the features listed at the end of the vignette.
## R version 4.5.1 (2025-06-13 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26100)
##
## Matrix products: default
## LAPACK version 3.12.1
##
## locale:
## [1] LC_COLLATE=English_United States.utf8
## [2] LC_CTYPE=English_United States.utf8
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.utf8
##
## time zone: America/New_York
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] tigris_2.2.1 tidycensus_1.7.3 sf_1.0-21 ndi_0.2.0
## [5] ggplot2_3.5.2 dplyr_1.1.4 knitr_1.50
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 psych_2.5.6 viridisLite_0.4.2 farver_2.1.2
## [5] fastmap_1.2.0 digest_0.6.37 rpart_4.1.24 lifecycle_1.0.4
## [9] cluster_2.1.8.1 Cairo_1.6-2 magrittr_2.0.3 compiler_4.5.1
## [13] rlang_1.1.6 Hmisc_5.2-3 sass_0.4.10 tools_4.5.1
## [17] utf8_1.2.6 yaml_2.3.10 data.table_1.17.8 labeling_0.4.3
## [21] htmlwidgets_1.6.4 classInt_0.4-11 mnormt_2.1.1 curl_6.4.0
## [25] xml2_1.3.8 RColorBrewer_1.1-3 abind_1.4-8 KernSmooth_2.23-26
## [29] withr_3.0.2 foreign_0.8-90 purrr_1.1.0 nnet_7.3-20
## [33] grid_4.5.1 e1071_1.7-16 colorspace_2.1-1 scales_1.4.0
## [37] MASS_7.3-65 cli_3.6.5 rmarkdown_2.29 crayon_1.5.3
## [41] generics_0.1.4 rstudioapi_0.17.1 httr_1.4.7 tzdb_0.5.0
## [45] DBI_1.2.3 cachem_1.1.0 proxy_0.4-27 stringr_1.5.1
## [49] rvest_1.0.4 parallel_4.5.1 base64enc_0.1-3 vctrs_0.6.5
## [53] Matrix_1.7-3 jsonlite_2.0.0 carData_3.0-5 car_3.1-3
## [57] hms_1.1.3 Formula_1.2-5 htmlTable_2.4.3 jquerylib_0.1.4
## [61] tidyr_1.3.1 units_0.8-7 glue_1.8.0 stringi_1.8.7
## [65] gtable_0.3.6 tibble_3.3.0 pillar_1.11.0 rappdirs_0.3.3
## [69] htmltools_0.5.8.1 R6_2.6.1 evaluate_1.0.5 lattice_0.22-7
## [73] readr_2.1.5 backports_1.5.0 bslib_0.9.0 class_7.3-23
## [77] Rcpp_1.1.0 uuid_1.2-1 gridExtra_2.3 nlme_3.1-168
## [81] checkmate_2.3.2 xfun_0.52 pkgconfig_2.0.3
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.