CongressData is a package designed to allow a user with only basic
knowledge of R to interact with CongressData, a dataset with nearly 800
variables that compiles information about all US congressional districts
from 1789 to 2023, and its codebook. The dataset tracks district
characteristics, members of Congress, and the behavior of those members
in policymaking. Users can find variables related to demographics,
politics, and policy; subset the data across multiple dimensions; create
custom aggregations of the dataset; and access citations in both plain
text and BibTeX for every variable.
CongressData is a functional package that interacts with the
CongressData dataset via the internet. Install the package from GitHub
like so:
# use the devtools library to download the package from GitHub
library(devtools)
# if there are issues or you only want to download CongressData
install_github("ippsr/CongressData")
get_var_info: Retrieve information regarding variables in CongressData
and identify variables of interest with get_var_info. The function
allows you to search the codebook to find the years each variable is
observed in the data; a short and long description of each variable; and
the source and citation(s) for each variable. Citations are available in
both BibTeX and plain text. Use the function to search for broad terms
like ‘tax’ with the related_to argument and/or partial-match variable
names with var_names.
suppressMessages(library(dplyr))
library(CongressData)
#> Please cite:
#> Grossmann, M., Lucas, C., McCrain, J, & Ostrander, I. (2022). CongressData.
#> East Lansing, MI: Institute for Public Policy and Social Research (IPPSR).
#>
#> Run `CongressData::get_congress_version()` to print the version of CongressData the package is using.
# variables related to health insurance
h_ins_cong <- get_var_info(related_to = "health insurance")
cat("There are",nrow(h_ins_cong),"variables related to health insurance in CongressData")
#> There are 41 variables related to health insurance in CongressData
head(h_ins_cong$variable)
#> [1] "percent_under18_healthins" "percent_private_under18"
#> [3] "percent_public_under18" "percent_privpub_under18"
#> [5] "percent_pop18_34" "percent_private_18_34"
# variables with 'under18' in their name
under18_cong <- get_var_info(var_names = "under18")
head(under18_cong$variable)
#> [1] "percent_under18" "percent_under18_healthins"
#> [3] "percent_private_under18" "percent_public_under18"
#> [5] "percent_privpub_under18" "under18"
To simplify using CongressData, get_var_info returns the years each
variable is observed, its short and long descriptions, and its source
and citations.
get_cong_data: Access all or a part of CongressData with get_cong_data.
Subset by state names with states and by years with years (either a
single year or a two-year vector that represents the min/max of what you
want). You can also use the related_to argument to search across
variable names, short/long descriptions from the codebook, and citations
for non-exact matches of a supplied term. For example, searching ‘tax’
will return variables with words like ‘taxes’ and ‘taxable’ in any of
those columns.
# load the entire dataset
all_the_dat <- get_cong_data()
# subset by state, topic, and years
cong_subset <- get_cong_data(states = c("Indiana", "Kentucky", "Michigan"),
                             related_to = "tax",
                             years = c(1960, 1980))
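To make the matching rules above concrete, here is a minimal sketch on a toy codebook. It does not call the package and is not the package's implementation: the rows of `codebook` are invented, and `matches_term` and `in_year_range` are hypothetical helpers. It also assumes that the search is case-insensitive and that the min/max year bounds are inclusive, neither of which the text above states.

```r
# Toy codebook; these rows are invented for illustration only
codebook <- data.frame(
  variable   = c("inc_tax_rate", "percent_bus", "med_income"),
  short_desc = c("Income tax rate", "Percent commuting by bus",
                 "Median taxable income"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for rows where any column contains the term
# (assumption: case-insensitive substring match)
matches_term <- function(df, term) {
  hits <- lapply(df, function(col) grepl(term, col, ignore.case = TRUE))
  Reduce(`|`, hits)
}

# "tax" matches a variable name and, separately, the word "taxable"
tax_vars <- codebook$variable[matches_term(codebook, "tax")]

# Hypothetical helper: one year keeps that year only; two values act as
# min/max bounds (assumption: bounds are inclusive)
in_year_range <- function(year, years) {
  year >= min(years) & year <= max(years)
}

keep <- in_year_range(c(1950, 1960, 1975, 1990), c(1960, 1980))
```

Here `tax_vars` holds the two variables whose name or description contains ‘tax’, and `keep` flags only the years that fall within 1960 to 1980.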
Run get_congress_version to see what version of the dataset is
available in CongressData.
CongressData::get_congress_version()
#> You are using CongressData version: 1.1
get_var_info: Each variable in CongressData was collected from external
sources, so please use get_var_info to obtain their citations (plain
text and BibTeX). We’ve made it easy to cite the source of each variable
you use with the get_var_info function described above. Supply a vector
of variable names to the function with the var_names argument and
collect the citations provided in the plain text or BibTeX columns.
NOTE: Some variables have multiple citations, so do check you have them
all.
# bibtex is also available
get_var_info(var_names = "com_benghazi_299") %>%
pull(plaintext_cite)
#> [1] "Charles Stewart III and Jonathan Woon. Congressional Committee Assignments, 103rd to 114th Congresses, 1993--2017: House of Representatives, 2017.\n"
# bibtex is also available
get_var_info(var_names = "percent_bus") %>%
pull(plaintext_cite)
#> [1] "U.S. Census Bureau. (2022). 2009-2019 American Community Survey 1-year Estimates. Retrieved from the Census Bureau Data API."
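When citing several variables at once, the same source often appears more than once. The sketch below shows one way to gather a deduplicated list. `collect_cites` is a hypothetical helper, not part of the package, and the `info` data frame is an invented stand-in for the output of get_var_info; only the `variable` and `plaintext_cite` column names are taken from the examples above. How multiple citations for a single variable are stored is not specified here, so check the result by eye.

```r
# Hypothetical helper: unique, non-missing plain-text citations from a
# data frame shaped like the output of get_var_info()
collect_cites <- function(info) {
  cites <- trimws(info$plaintext_cite)  # drop stray trailing newlines
  unique(cites[!is.na(cites) & cites != ""])
}

# Invented stand-in for get_var_info(var_names = c(...))
info <- data.frame(
  variable = c("var_a", "var_b", "var_c"),
  plaintext_cite = c("Source One (2020).",
                     "Source One (2020).\n",
                     "Source Two (2021)."),
  stringsAsFactors = FALSE
)

cite_list <- collect_cites(info)

# With the package loaded and an internet connection, the same helper
# could be applied to real output:
# collect_cites(get_var_info(var_names = c("com_benghazi_299", "percent_bus")))
```

Trimming whitespace before deduplicating matters because some citation strings end in a newline, as in the first example above.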
In addition to citing each variable’s source, we ask that you cite CongressData if you use this package or the dataset. A recommended citation is below.
Grossmann, M., Lucas, C., McCrain, J, & Ostrander, I. (2022). CongressData. East Lansing, MI: Institute for Public Policy and Social Research (IPPSR).
For questions about the CongressData dataset, contact Ben Yoel (yoelbenj@msu.edu).