| Type: | Package |
| Title: | Import Data from Spanish Sociological Research Center (CIS) |
| Version: | 0.1.0 |
| Description: | Search and import data directly to R from the Spanish Sociological Research Center (CIS) https://www.cis.es/inicio. The CIS is a public institution that conducts electoral and sociological research on Spanish society. It maintains a large database of surveys that can be accessed through its website. The package includes functions to search for surveys, survey questions and time series, and to import the data directly to R. |
| License: | GPL (≥ 3) |
| Encoding: | UTF-8 |
| Language: | en-US |
| RoxygenNote: | 7.3.3 |
| Depends: | R (≥ 3.5.0) |
| Imports: | httr (≥ 1.4.7), tibble (≥ 3.0.0), purrr (≥ 1.0.0), haven (≥ 2.5.3), magrittr (≥ 2.0.0), rvest (≥ 1.0.0), stringr (≥ 1.0.0), memoise (≥ 2.0.0) |
| URL: | https://opencis.spainelectoralproject.com, https://github.com/hmeleiro/opencis |
| BugReports: | https://github.com/hmeleiro/opencis/issues |
| Suggests: | knitr, testthat (≥ 3.0.0), rmarkdown |
| VignetteBuilder: | knitr |
| Config/Needs/website: | hmeleiro/spainelectoraltheme |
| Config/testthat/edition: | 3 |
| NeedsCompilation: | no |
| Packaged: | 2026-04-29 07:47:02 UTC; hmele |
| Author: | Héctor Meleiro [aut, cre] |
| Maintainer: | Héctor Meleiro <hmeleiros@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-04-29 19:10:07 UTC |
opencis: Import Data from Spanish Sociological Research Center (CIS)
Description
Search and import data directly to R from the Spanish Sociological Research Center (CIS) https://www.cis.es/inicio. The CIS is a public institution that conducts electoral and sociological research on Spanish society. It maintains a large database of surveys that can be accessed through its website. The package includes functions to search for surveys, survey questions and time series, and to import the data directly to R.
Author(s)
Maintainer: Héctor Meleiro hmeleiros@gmail.com
See Also
Useful links:
Report bugs at https://github.com/hmeleiro/opencis/issues
Open the questionnaire PDF of a CIS study
Description
Opens a PDF document from a CIS study in the default browser.
Usage
browse_pdf(study_code, wanted_file = "cues")
Arguments
study_code |
A string with the study code. |
wanted_file |
A keyword used to match the PDF filename inside the ZIP.
Use "cues" (the default) for the questionnaire or "ft" for the technical sheet; see Details. |
Details
CIS study ZIP files typically contain two PDF documents:
- The questionnaire (cuestionario): use wanted_file = "cues".
- The technical sheet (ficha técnica): use wanted_file = "ft".
Value
Called for its side effect of opening the PDF in the browser.
Returns NULL invisibly.
Examples
if (interactive()) {
# Open the questionnaire (cuestionario) for study 3328
browse_pdf("3328")
# Open the technical sheet (ficha técnica) for study 3328
browse_pdf("3328", wanted_file = "ft")
}
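The keyword matching described in Details can be pictured as a case-insensitive filter over the PDF names found in the archive. The following is a minimal sketch under that assumption, not the package's internals: `pick_pdf` is a hypothetical helper and the file names are invented for illustration.

```r
# Hypothetical helper: choose the PDF whose file name contains a keyword.
pick_pdf <- function(files, wanted_file = "cues") {
  # Keep only the PDF entries of the archive listing
  pdfs <- files[grepl("\\.pdf$", files, ignore.case = TRUE)]
  # Match the keyword anywhere in the base file name, ignoring case
  hits <- pdfs[grepl(wanted_file, basename(pdfs), ignore.case = TRUE)]
  if (length(hits) == 0) {
    stop("No PDF matching '", wanted_file, "' found in the archive.")
  }
  hits[1]
}

# Invented file names, for illustration only
files <- c("3328.sav", "cues3328.pdf", "FT3328.pdf")
pick_pdf(files)
pick_pdf(files, wanted_file = "ft")
```

A real implementation would then extract the selected file and hand its path to a viewer, for instance with utils::browseURL().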
Build CIS catalog URL with date range
Description
Constructs a URL for querying the CIS catalog with optional date range filters.
Usage
cis_catalog_url_date(
start = 1,
q = "",
from = NULL,
to = NULL,
sort = "relevance",
catalogo = "estudio",
...
)
Arguments
start |
Integer. The starting page for the search results. Default is 1; increment it to retrieve further pages. |
q |
String. The search query. Default is an empty string. |
from |
Date or NULL. The start date for filtering results. Default is NULL. |
to |
Date or NULL. The end date for filtering results. Default is NULL. |
sort |
String. The sorting order for the results ("publishDate-", "publishDate+", "relevance"). Default is "relevance". |
catalogo |
String. The catalog type ("estudio", "pregunta", "serie"). Default is "estudio". |
... |
Additional parameters (not used). |
Value
A string representing the constructed URL.
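As a rough illustration of how optional date filters can be folded into a query string, here is a sketch in base R. The base URL and the query parameter names are placeholders chosen for this example; the real CIS endpoint and its parameters are not documented here, and `build_catalog_url` is a hypothetical name.

```r
# Hypothetical sketch: the base URL and parameter names are placeholders,
# not the real CIS catalog endpoint.
build_catalog_url <- function(start = 1, q = "", from = NULL, to = NULL,
                              sort = "relevance", catalogo = "estudio",
                              base = "https://example.org/catalogo") {
  params <- list(start = start, q = q, sort = sort, catalogo = catalogo)
  # Add the date filters only when they are supplied
  if (!is.null(from)) params$from <- format(as.Date(from), "%Y-%m-%d")
  if (!is.null(to)) params$to <- format(as.Date(to), "%Y-%m-%d")
  # Percent-encode each value and join everything as key=value pairs
  encoded <- vapply(
    params,
    function(x) utils::URLencode(as.character(x), reserved = TRUE),
    character(1)
  )
  paste0(base, "?", paste(names(encoded), encoded, sep = "=", collapse = "&"))
}

build_catalog_url(q = "postelectoral", from = "2011-01-01")
```

Leaving `from` and `to` as NULL simply omits them from the query, which matches the "optional date range" behaviour described above.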
Clear the opencis session cache
Description
Clears the in-memory cache used by search_cis and
read_cis. Call this when you want to force fresh data
to be retrieved from the CIS server within the same R session.
Usage
clear_cache()
Value
NULL invisibly.
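To show what a session cache with a clear operation looks like in general, here is a small base R sketch built on an environment. It is a stand-in for illustration only: the package imports memoise, but how its cache is actually implemented is not stated here, and `make_cached`, `slow_lookup` and `store` are hypothetical names.

```r
# Illustrative in-memory cache; not the package's actual implementation.
make_cached <- function(fun) {
  cache <- new.env(parent = emptyenv())
  cached <- function(key) {
    # Compute and store the value only on the first request for this key
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, fun(key), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
  clear <- function() {
    # Drop every stored entry so the next request is recomputed
    rm(list = ls(cache, all.names = TRUE), envir = cache)
    invisible(NULL)
  }
  list(get = cached, clear = clear)
}

calls <- 0
slow_lookup <- function(code) {
  calls <<- calls + 1  # count how often the expensive step really runs
  paste("data for study", code)
}

store <- make_cached(slow_lookup)
store$get("3328")  # computed
store$get("3328")  # served from the cache
store$clear()
store$get("3328")  # computed again after clearing
```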
Download and read a CIS study from a given URL
Description
Download and read a CIS study from a given URL
Usage
download_file(url, destfile = tempfile(fileext = ".zip"))
Arguments
url |
A string with the URL of the CIS study page. |
destfile |
A string with the path where the ZIP file will be saved. Defaults to a temporary file. |
Download a CIS study ZIP file to disk
Description
Downloads the data ZIP file for a CIS study to a specified directory, instead of a temporary folder. Useful for projects that need to keep the raw data files.
Usage
download_study(study_code, destdir = ".")
Arguments
study_code |
A string with the study code. |
destdir |
A string with the directory where the ZIP file will be saved. Defaults to the current working directory. |
Value
The path to the saved ZIP file, invisibly.
Examples
# Save the ZIP file to a temporary directory
path <- download_study("3328", destdir = tempdir())
cat("Saved to:", path, "\n")
Find CIS study data URL in HTML content
Description
Searches for the URL of the CIS study data ZIP file within the provided HTML content.
Usage
find_url(html, ids = NULL, allow_uuid = FALSE)
Arguments
html |
A character string containing the HTML content to search. |
ids |
An optional vector of two strings representing the numeric subroutes of the URL (e.g., c("3411", "3411") for "https://www.cis.es/documents/3411/3411/MD3411.zip"). If NULL, the function searches for any valid CIS study data URL. |
allow_uuid |
A boolean indicating whether to allow UUIDs in the URL. Defaults to FALSE. |
Value
A character vector of unique URLs found in the HTML content.
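One plausible way to locate such URLs is a regular expression modelled on the example URL given for the ids argument. The sketch below assumes absolute URLs of that exact shape and leaves out the allow_uuid case; `extract_zip_urls` is a hypothetical name and real CIS pages may be structured differently.

```r
# Sketch: extract study ZIP URLs from HTML with a regular expression.
# The pattern mirrors the documented example URL; real pages may differ.
extract_zip_urls <- function(html, ids = NULL) {
  # Use the caller's two numeric path segments, or accept any digits
  segs <- if (is.null(ids)) c("[0-9]+", "[0-9]+") else ids
  pattern <- paste0(
    "https://www\\.cis\\.es/documents/", segs[1], "/", segs[2],
    "/[A-Za-z0-9_]+\\.zip"
  )
  # gregexpr finds every match; unique() drops repeated links
  unique(regmatches(html, gregexpr(pattern, html))[[1]])
}

html <- paste0(
  '<a href="https://www.cis.es/documents/3411/3411/MD3411.zip">Datos</a>',
  '<a href="https://www.cis.es/documents/3411/3411/MD3411.zip">Datos</a>'
)
extract_zip_urls(html)
extract_zip_urls(html, ids = c("3411", "3411"))
```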
Extract a data dictionary from a CIS study data frame
Description
Returns a tibble listing each variable in the data along with its
variable label and value labels, as loaded by haven.
Usage
get_data_dictionary(data)
Arguments
data |
A data.frame loaded from a CIS study. |
Value
A tibble with columns:
- variable
Variable name.
- label
Variable label, or NA if none.
- value_labels
A named numeric vector of value labels, or NULL for unlabelled variables (list-column).
Examples
# Create a small labelled data frame
df <- data.frame(
SEXO = haven::labelled(c(1, 2, 1), labels = c(Hombre = 1, Mujer = 2)),
EDAD = c(34, 51, 29)
)
attr(df$SEXO, "label") <- "Sexo"
attr(df$EDAD, "label") <- "Edad"
# Inspect its variable dictionary
dict <- get_data_dictionary(df)
print(dict)
# Find variables with a specific keyword in their label
dict[grepl("sexo", dict$label, ignore.case = TRUE), ]
# Inspect value labels for a specific variable
sex_var <- match("SEXO", dict$variable)
if (!is.na(sex_var)) {
dict$value_labels[[sex_var]]
}
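For readers curious how such a dictionary can be assembled, the sketch below reads the "label" and "labels" attributes that haven attaches to imported columns, using base R only. It is an assumption about the general approach rather than the package's code; `build_dictionary` is a hypothetical name and it returns a plain data.frame instead of a tibble.

```r
# Sketch of a dictionary builder that relies only on column attributes.
build_dictionary <- function(data) {
  # Variable label, or NA when the column carries none
  labels <- vapply(data, function(x) {
    lab <- attr(x, "label", exact = TRUE)
    if (is.null(lab)) NA_character_ else as.character(lab)
  }, character(1))
  # Value labels (named vector), or NULL for unlabelled columns
  value_labels <- lapply(data, function(x) attr(x, "labels", exact = TRUE))
  dict <- data.frame(
    variable = names(data),
    label = unname(labels),
    stringsAsFactors = FALSE
  )
  dict$value_labels <- unname(value_labels)  # stored as a list-column
  dict
}

# Attributes set by hand here, mimicking what haven stores on import
df <- data.frame(SEXO = c(1, 2, 1), EDAD = c(34, 51, 29))
attr(df$SEXO, "label") <- "Sexo"
attr(df$SEXO, "labels") <- c(Hombre = 1, Mujer = 2)
attr(df$EDAD, "label") <- "Edad"

dict <- build_dictionary(df)
dict$label
dict$value_labels[[1]]
```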
Get metadata of a CIS study
Description
Retrieves the technical metadata of a CIS study from its detail page, including study dates, type, country, author, and thematic indices.
Usage
get_metadata(study_code)
Arguments
study_code |
A string with the study code. |
Value
A tibble with two columns: field and value.
Examples
# Get metadata for study 3328
meta <- get_metadata("3328")
print(meta)
# Access a specific field
meta$value[meta$field == "Tipo de estudio"]
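When several fields are needed, the long field/value layout can be converted into a named vector for direct lookup. The sketch uses a mock table with invented values so that it runs offline; only the two column names and the "Tipo de estudio" field come from the documentation above.

```r
# Mock of the two-column structure described above; the values are invented.
meta <- data.frame(
  field = c("Tipo de estudio", "Autor"),
  value = c("Cuantitativo", "CIS"),
  stringsAsFactors = FALSE
)

# Turn the field/value table into a named vector for lookup by field name
meta_lookup <- stats::setNames(meta$value, meta$field)
meta_lookup[["Tipo de estudio"]]
```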
Get the URL of a CIS study
Description
Retrieves the URL of a specific CIS study using its study ID.
Usage
get_study_url(study_code)
Arguments
study_code |
A string with the study ID. |
Value
A string with the URL of the study, or NULL if not found.
List file paths inside a ZIP archive
Description
Returns a data.frame with the files contained in a ZIP archive, optionally filtered by file extension.
Usage
list_file_paths(zip_file, type = NULL)
Arguments
zip_file |
A string with the path to the ZIP file. |
type |
A string with the file extension to filter by. Defaults to NULL (no filtering). |
Value
A data.frame with the files in the ZIP archive.
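A listing of this kind can be obtained in base R with utils::unzip(list = TRUE), which returns a data.frame with the columns Name, Length and Date. The sketch below shows one way to add the extension filter on top of it; `filter_listing` and `list_zip_files` are hypothetical names, the file names are invented, and the package's own function may behave differently.

```r
# Sketch: filter an archive listing by file extension.
filter_listing <- function(listing, type = NULL) {
  if (is.null(type)) return(listing)
  # Accept both "pdf" and ".pdf", and compare case-insensitively
  wanted <- tolower(sub("^\\.", "", type))
  keep <- tolower(tools::file_ext(listing$Name)) == wanted
  listing[keep, , drop = FALSE]
}

list_zip_files <- function(zip_file, type = NULL) {
  # list = TRUE reads the archive index without extracting anything
  filter_listing(utils::unzip(zip_file, list = TRUE), type)
}

# Invented listing, shaped like the output of utils::unzip(list = TRUE)
listing <- data.frame(
  Name = c("3328.sav", "cues3328.pdf", "FT3328.pdf"),
  Length = c(1000, 200, 100),
  stringsAsFactors = FALSE
)
filter_listing(listing, "pdf")
```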
Imports
Description
Package import declarations
Details
Package imports
These declarations ensure NAMESPACE keeps required imports.
Parse CIS question search results
Description
Parse CIS question search results
Usage
parse_question(resp)
Arguments
resp |
The HTTP response object from the CIS search. |
Value
A tibble with the parsed question information.
Parse CIS data series search results
Description
Parse CIS data series search results
Usage
parse_serie(resp)
Arguments
resp |
The HTTP response object from the CIS search. |
Value
A tibble with the parsed data series information.
Parse CIS study search results
Description
Parse CIS study search results
Usage
parse_study(resp)
Arguments
resp |
The HTTP response object from the CIS search. |
Value
A tibble with the parsed studies information.
Import a CIS study
Description
Download and import the data of a CIS study.
Usage
read_cis(study_code)
Arguments
study_code |
A string with the study code. |
Value
A data.frame with the study data.
Examples
# If you know the study code you can just read it into R
df <- read_cis("3328")
print(df)
# If you don't know the study code, you can search for a study with search_cis():
studies <- search_cis(q = "gastronomia")
print(studies)
df <- read_cis(studies$study[1])
print(df)
Read a SAV file from a ZIP archive
Description
Extracts and reads the SPSS (.sav) data file contained in a ZIP archive.
Usage
read_sav_from_zip(zip_path)
Arguments
zip_path |
A string with the path to the ZIP file. |
Value
A data.frame with the data read from the .sav file.
Search all CIS results with automatic pagination
Description
Calls search_cis repeatedly, incrementing the page index until
no more results are returned, and returns all results in a single tibble.
Usage
search_all_cis(
q = "",
from = NULL,
to = NULL,
sort = "relevance",
catalogo = "estudio",
...
)
Arguments
q |
String. The search query. Default is an empty string. |
from |
Date or NULL. The start date for filtering results. Default is NULL. The date format must be "YYYY-MM-DD". |
to |
Date or NULL. The end date for filtering results. Default is NULL. The date format must be "YYYY-MM-DD". |
sort |
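String. The sorting order for the results ("publishDate-", "publishDate+", "relevance"). Default is "relevance". |
catalogo |
String. The catalog type ("estudio", "pregunta", "serie"). Default is "estudio". |
... |
Additional parameters passed to search_cis. |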
Value
A tibble with all search results across all pages.
Examples
# Retrieve all postelectoral studies (all pages)
all_studies <- search_all_cis(q = "postelectoral")
print(nrow(all_studies))
# Filter by date range
studies_2010_2020 <- search_all_cis(
q = "ideologia",
from = "2010-01-01",
to = "2020-12-31"
)
print(studies_2010_2020)
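The pagination described above amounts to a loop that requests page 1, 2, 3 and so on until a page comes back empty. The sketch below shows that loop against a mock page source so it can run offline; `fetch_all`, `mock_page` and the `max_pages` safety cap are assumptions made for illustration and are not part of the package.

```r
# Sketch of a pagination loop; `fetch_page` stands in for a search call.
fetch_all <- function(fetch_page, max_pages = 100) {
  pages <- list()
  page <- 1
  while (page <= max_pages) {
    res <- fetch_page(page)
    # Stop as soon as a page returns no rows
    if (is.null(res) || nrow(res) == 0) break
    pages[[page]] <- res
    page <- page + 1
  }
  if (length(pages) == 0) return(data.frame())
  # Stack all pages into a single data.frame
  do.call(rbind, pages)
}

# Mock source: 25 invented study codes served 10 per page
mock_page <- function(page, n = 25, size = 10) {
  idx <- seq_len(n)
  idx <- idx[ceiling(idx / size) == page]
  data.frame(study = as.character(3000 + idx), stringsAsFactors = FALSE)
}

all_rows <- fetch_all(mock_page)
nrow(all_rows)
```

A cap such as `max_pages` is a defensive choice: it prevents an endless loop if the server keeps returning the same non-empty page.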
Search for CIS studies
Description
Searches for CIS studies using the CIS search engine.
Usage
search_cis(
start = 1,
q = "",
from = NULL,
to = NULL,
sort = "relevance",
catalogo = "estudio",
...
)
Arguments
start |
Integer. The starting page for the search results. Default is 1; increment it to retrieve further pages. |
q |
String. The search query. Default is an empty string. |
from |
Date or NULL. The start date for filtering results. Default is NULL. The date format must be "YYYY-MM-DD". |
to |
Date or NULL. The end date for filtering results. Default is NULL. The date format must be "YYYY-MM-DD". |
sort |
String. The sorting order for the results ("publishDate-", "publishDate+", "relevance"). Default is "relevance". |
catalogo |
String. The catalog type ("estudio", "pregunta", "serie"). Default is "estudio". |
... |
Additional parameters (not used). |
Value
A data.frame with the search results.
Examples
# Search by search terms
studies <- search_cis(q = "postelectoral")
print(studies)
# Narrow the search by dates
studies <- search_cis(q = "postelectoral",
from = "2011-01-01",
to = "2020-01-01")
print(studies)
# Use the catalogo parameter to search for questions ("pregunta") or data series ("serie")
studies <- search_cis(q = "ideologia",
from = "2011-01-01",
to = "2020-01-01",
catalogo = "serie")
print(studies)
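Because from and to must follow the "YYYY-MM-DD" format, it can help to validate dates before running a search. The helper below is a hypothetical convenience for that purpose, not a function of the package; it accepts NULL, a Date, or a correctly formatted string and fails on anything else.

```r
# Hypothetical helper: accept NULL, a Date, or a "YYYY-MM-DD" string.
check_date <- function(x) {
  if (is.null(x)) return(NULL)
  if (inherits(x, "Date")) return(format(x, "%Y-%m-%d"))
  # Require the exact textual shape first, then a valid calendar date
  ok <- is.character(x) && length(x) == 1 &&
    grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)
  parsed <- if (ok) as.Date(x, format = "%Y-%m-%d") else as.Date(NA)
  if (is.na(parsed)) {
    stop("Dates must be in 'YYYY-MM-DD' format: ", x)
  }
  format(parsed, "%Y-%m-%d")
}

check_date("2011-01-01")
check_date(as.Date("2020-12-31"))
```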