
gutenbergr


Search, download, and process public domain texts from the Project Gutenberg collection.

Installation

Install the released version from CRAN:

install.packages("gutenbergr")

Install the development version from GitHub:

# install.packages("pak")
pak::pak("ropensci/gutenbergr")

Quick Start

Load the package and any other required libraries:

library(gutenbergr)
library(dplyr)

We’ll get and set our Project Gutenberg mirror:

gutenberg_get_mirror()
#> [1] "https://aleph.pglaf.org"

Search through the metadata to find Jane Austen’s Persuasion:

gutenberg_works(title == "Persuasion")
#> # A tibble: 1 × 8
#>   gutenberg_id title      author       gutenberg_author_id language
#>          <int> <chr>      <chr>                      <int> <fct>   
#> 1          105 Persuasion Austen, Jane                  68 en      
#>   gutenberg_bookshelf                           rights                    has_text
#>   <chr>                                         <fct>                     <lgl>   
#> 1 Category: Novels/Category: British Literature Public domain in the USA. TRUE
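The same filter syntax works on other metadata columns. As a minimal sketch, assuming the author string is stored exactly as shown in the output above ("Austen, Jane"), the following lists every Austen work that passes the default filters of gutenberg_works(). The variable name austen is only illustrative.

```r
library(gutenbergr)
library(dplyr)

# Filter the bundled metadata by author instead of by title.
# The author string must match the "Surname, Forename" form used in the metadata.
austen <- gutenberg_works(author == "Austen, Jane")

# Keep only the columns needed to pick a work to download.
austen |>
  select(gutenberg_id, title) |>
  arrange(title)
```

Any gutenberg_id from this result can be passed to gutenberg_download().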

Persuasion’s gutenberg_id is 105. We’ll use this ID to download it and also set our cache option to "persistent" so that we don’t have to re-download it later.

options(gutenbergr_cache_type = "persistent")
persuasion <- gutenberg_download(105)
persuasion
#> # A tibble: 8,357 × 2
#>    gutenberg_id text            
#>           <int> <chr>           
#>  1          105 "Persuasion"    
#>  2          105 ""              
#>  3          105 ""              
#>  4          105 "by Jane Austen"
#>  5          105 ""              
#>  6          105 "(1818)"        
#>  7          105 ""              
#>  8          105 ""              
#>  9          105 ""              
#> 10          105 ""              
#> # ℹ 8,347 more rows

Multiple works can be downloaded at once. We’ll also download Edna St. Vincent Millay’s Renascence, and Other Poems (gutenberg_id 161) and use the meta_fields argument to add each work’s title from the metadata.

books <- gutenberg_download(c(105, 161), meta_fields = "title")
books |> count(title)
#> # A tibble: 2 × 2
#>   title                           n
#>   <chr>                       <int>
#> 1 Persuasion                   8357
#> 2 Renascence, and Other Poems  1222
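Because gutenberg_download() returns one row per line of text, ordinary dplyr verbs are enough for basic cleanup. The sketch below is illustrative only: first_lines is a hypothetical stand-in that copies the first six rows of the Persuasion output shown earlier, so the example runs without network access. With real data, the same pipeline could be applied to persuasion or books.

```r
library(dplyr)

# Stand-in for the first rows of gutenberg_download(105), copied from the
# output shown earlier; the real tibble has one row per line of the book.
first_lines <- tibble(
  gutenberg_id = 105L,
  text = c("Persuasion", "", "", "by Jane Austen", "", "(1818)")
)

# Drop blank lines, then number the remaining lines within each work.
non_blank <- first_lines |>
  filter(text != "") |>
  group_by(gutenberg_id) |>
  mutate(line = row_number()) |>
  ungroup()

non_blank
```

Grouping by gutenberg_id keeps the line numbers separate per work, which matters once several works are downloaded into one tibble.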

Vignettes

See the package vignettes for more advanced usage of gutenbergr.

FAQ

How were the metadata files generated?

See the data-raw directory for scripts. Metadata was generated from the Project Gutenberg catalog on 13 March 2026.

Do you respect robot access rules?

Yes! The package follows Project Gutenberg’s rules. See their Terms of Use for details.

Contributing

See CONTRIBUTING.md.

Note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
