Title: Interface to the arXiv API
Version: 0.12
Date: 2025-07-29
Description: An interface to the API for 'arXiv', a repository of electronic preprints for computer science, mathematics, physics, quantitative biology, quantitative finance, and statistics.
URL: https://docs.ropensci.org/aRxiv/, https://github.com/ropensci/aRxiv
BugReports: https://github.com/ropensci/aRxiv/issues
Depends: R (≥ 3.5.0)
License: MIT + file LICENSE
Imports: httr, utils, XML
Suggests: devtools, knitr, rmarkdown, roxygen2, testthat
VignetteBuilder: knitr
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2025-07-29 14:37:34 UTC; kbroman
Author: Karthik Ram [aut], Karl Broman [aut, cre]
Maintainer: Karl Broman <broman@wisc.edu>
Repository: CRAN
Date/Publication: 2025-07-29 15:20:09 UTC

arXiv subject classifications

Description

arXiv subject classifications: their abbreviations and corresponding descriptions.

Usage

data(arxiv_cats)

Format

A data frame with five columns: the abbreviations of the subject classifications (category), the field of study, subfield of study (within Physics; NA otherwise), a short description, and a longer description.

Source

https://arxiv.org/category_taxonomy

Examples

arxiv_cats
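
Because arxiv_cats is an ordinary data frame, it can be filtered with base R. The sketch below uses a small stand-in data frame so that it runs without the package. Only the category column name comes from the Format section above; short_description is an assumed name for the short-description column.

```r
# Stand-in for arxiv_cats (three rows only); with the package loaded,
# the real data set could be used in its place.
cats <- data.frame(
  category = c("stat.AP", "stat.ME", "cs.LG"),
  # assumed column name, not confirmed by the Format section
  short_description = c("Applications", "Methodology", "Machine Learning"),
  stringsAsFactors = FALSE
)

# Keep the rows whose abbreviation starts with "stat."
stat_cats <- cats[startsWith(cats$category, "stat."), ]
stat_cats$category
```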

Count number of results for a given search

Description

Count the number of results for a given search. Useful to check before attempting to pull down a very large number of records.

Usage

arxiv_count(query = NULL, id_list = NULL)

Arguments

query

Search pattern as a string; a vector of such strings is also allowed, in which case the elements are combined with AND.

id_list

arXiv document IDs, as a comma-delimited string or a vector of such strings.

Value

Number of results (integer). An attribute "search_info" contains information about the search parameters and the time at which it was performed.

See Also

arxiv_search(), query_terms, arxiv_cats

Examples



# count papers in category stat.AP (applied statistics)
arxiv_count(query = "cat:stat.AP")

# count papers by Peter Hall in any stat category
arxiv_count(query = 'au:"Peter Hall" AND cat:stat*')

# count papers for a range of dates
#    here, everything in 2013
arxiv_count("submittedDate:[2013 TO 2014]")
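
The description suggests counting before downloading. A minimal sketch of such a guard follows. The names search_if_small, max_records, count_fn, and search_fn are hypothetical, introduced here for illustration, and the default functions assume the aRxiv package is installed. Passing the functions as arguments makes it possible to try the logic offline with stand-ins.

```r
# Hypothetical helper: run the search only when the match count is manageable.
search_if_small <- function(query, max_records = 500,
                            count_fn = aRxiv::arxiv_count,
                            search_fn = aRxiv::arxiv_search) {
  n <- as.integer(count_fn(query))  # plain integer for the comparison
  if (n > max_records) {
    message("Query matches ", n, " records; refine it or raise max_records.")
    return(invisible(NULL))
  }
  search_fn(query, limit = max_records)
}

# Offline check with stand-in functions (no network access needed)
fake_count <- function(query) 3L
fake_search <- function(query, limit) data.frame(id = c("a", "b", "c"))

res <- search_if_small("cat:stat.AP",
                       count_fn = fake_count, search_fn = fake_search)
blocked <- search_if_small("cat:stat.AP", max_records = 2,
                           count_fn = fake_count, search_fn = fake_search)
nrow(res)
is.null(blocked)
```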




Open abstract for results of arXiv search

Description

Open, in a web browser, the abstract page for each of a set of arXiv search results.

Usage

arxiv_open(search_results, limit = 20)

Arguments

search_results

Data frame of search results, as returned from arxiv_search().

limit

Maximum number of abstracts to open in one call.

Details

There is a delay between calls to utils::browseURL(), with the amount taken from the R option "aRxiv_delay" (in seconds); if missing, the default is 3 sec.
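
As a brief illustration of the option described above, the following sketch sets "aRxiv_delay" for the current session and reads it back with a fallback of 3 seconds. It uses only base R and shows one way such a lookup could work; it is not the package's internal code.

```r
# Lengthen the pause to 5 seconds, remembering the previous setting
old <- options(aRxiv_delay = 5)

# Look up the delay, falling back to 3 seconds when the option is unset
delay <- getOption("aRxiv_delay", default = 3)
delay

# Restore the previous setting
options(old)
```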

Value

(Invisibly) Vector of character strings with URLs of abstracts opened.

See Also

arxiv_search()

Examples

z <- arxiv_search('au:"Peter Hall" AND ti:deconvolution')
arxiv_open(z)


The main search function for aRxiv

Description

Allows for programmatic searching of the arXiv preprint repository.

Usage

arxiv_search(
  query = NULL,
  id_list = NULL,
  start = 0,
  limit = 10,
  sort_by = c("submitted", "updated", "relevance"),
  ascending = TRUE,
  batchsize = 100,
  force = FALSE,
  output_format = c("data.frame", "list"),
  sep = "|"
)

Arguments

query

Search pattern as a string; a vector of such strings is also allowed, in which case the elements are combined with AND.

id_list

arXiv document IDs, as a comma-delimited string or a vector of such strings.

start

An offset for the start of the search (0 is the first matching record).

limit

Maximum number of records to return.

sort_by

How to sort the results (ignored if id_list is provided)

ascending

If TRUE, sort in ascending order; else descending (ignored if id_list is provided)

batchsize

Maximum number of records to request at one time

force

If TRUE, force search request even if it seems extreme

output_format

Indicates whether output should be a data frame or a list.

sep

String to use to separate multiple authors, affiliations, DOI links, and categories, in the case that output_format="data.frame".
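
The start, limit, and batchsize arguments together imply that a large search is split into several requests. The sketch below works through that arithmetic. The helper plan_batches is hypothetical, and how the package actually divides its requests is not stated here, so treat this as an illustration of the idea only.

```r
# Hypothetical helper: offsets and sizes of the requests needed for a search
plan_batches <- function(start = 0, limit = 10, batchsize = 100) {
  n_batches <- ceiling(limit / batchsize)
  offsets <- start + batchsize * (seq_len(n_batches) - 1)
  # the last batch may be smaller than batchsize
  sizes <- pmin(batchsize, start + limit - offsets)
  data.frame(offset = offsets, size = sizes)
}

p <- plan_batches(start = 0, limit = 250, batchsize = 100)
p
```

With limit = 250 and batchsize = 100, this gives three requests of 100, 100, and 50 records.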

Value

If output_format="data.frame", the result is a data frame with each row being a manuscript and columns being the various fields.

If output_format="list", the result is a list parsed from the XML output of the search, closer to the raw output from arXiv.

The data frame format has the following columns.

[,1]   id                 arXiv ID
[,2]   submitted          date first submitted
[,3]   updated            date last updated
[,4]   title              manuscript title
[,5]   summary            abstract
[,6]   authors            author names
[,7]   affiliations       author affiliations
[,8]   link_abstract      hyperlink to abstract
[,9]   link_pdf           hyperlink to pdf
[,10]  link_doi           hyperlink to DOI
[,11]  comment            authors' comment
[,12]  journal_ref        journal reference
[,13]  doi                published DOI
[,14]  primary_category   primary category
[,15]  categories         all categories

The contents are all strings; missing values are empty strings ("").

The columns authors, affiliations, link_doi, and categories may have multiple entries separated by sep (by default, "|").
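
Multi-valued columns can be split back into their entries with base R. The sketch below uses a stand-in data frame with placeholder author names, so it runs without the package or a network connection. It assumes the default sep = "|".

```r
# Stand-in for two rows of search output, using the default sep = "|"
z <- data.frame(
  authors = c("A. Author|B. Author", "C. Author"),  # placeholder names
  categories = c("stat.ME|math.ST", "stat.AP"),
  stringsAsFactors = FALSE
)

# fixed = TRUE treats "|" literally; as a regular expression it means "or"
authors <- strsplit(z$authors, "|", fixed = TRUE)
lengths(authors)  # number of authors per manuscript
```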

The result includes an attribute "search_info" that includes information about the details of the search parameters, including the time at which it was completed. Another attribute "total_results" is the total number of records that match the query.

See Also

arxiv_count(), arxiv_open(), query_terms, arxiv_cats

Examples



# search for author Peter Hall with deconvolution in title
z <- arxiv_search(query = 'au:"Peter Hall" AND ti:deconvolution', limit=2)
attr(z, "total_results") # total no. records matching query
z$title

# search for a set of documents by arxiv identifiers
z <- arxiv_search(id_list = c("0710.3491v1", "0804.0713v1", "1003.0315v1"))
# can also use a comma-separated string
z <- arxiv_search(id_list = "0710.3491v1,0804.0713v1,1003.0315v1")
# Journal references, if available
z$journal_ref

# search for a range of dates (in this case, one day)
z <- arxiv_search("submittedDate:[199701010000 TO 199701012400]", limit=2)
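
Date-range queries like the one above can be assembled from R dates. The helper below is a hypothetical sketch that assumes the timestamps follow a YYYYMMDDHHMM pattern, as the example suggests. It ends the range at 2359 on the last day rather than 2400.

```r
# Hypothetical helper: query covering whole days from `from` to `to` (inclusive)
date_range_query <- function(from, to = from) {
  from <- as.Date(from)
  to <- as.Date(to)
  stopifnot(!is.na(from), !is.na(to), from <= to)
  sprintf("submittedDate:[%s0000 TO %s2359]",
          format(from, "%Y%m%d"), format(to, "%Y%m%d"))
}

q <- date_range_query("1997-01-01")
q
```

The resulting string could then be passed as the query argument of arxiv_search() or arxiv_count().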




Check for connection to arXiv API

Description

Check for connection to arXiv API

Usage

can_arxiv_connect(max_time = 5)

Arguments

max_time

Maximum wait time in seconds

Value

Returns TRUE if connection is established and FALSE otherwise.

Examples


can_arxiv_connect(2)



arXiv query field terms

Description

Possible terms that correspond to different fields in arXiv searches.

Usage

data(query_terms)

Format

A data frame with two columns: the term and corresponding description.

Author(s)

Karl W Broman

Source

https://arxiv.org/help/api/user-manual.html

Examples

query_terms
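
Field terms are combined with a value as term:value, as in the search examples in this manual. The sketch below builds such a query from its parts. The helper field_term is hypothetical, and only the au and ti terms already used in this manual's examples are shown.

```r
# Hypothetical helper: prefix a value with a field term,
# quoting values that contain a space
field_term <- function(field, value) {
  if (grepl(" ", value, fixed = TRUE)) value <- paste0('"', value, '"')
  paste0(field, ":", value)
}

terms <- c(field_term("au", "Peter Hall"), field_term("ti", "deconvolution"))

# Join the terms with AND, as is done for a vector passed to `query`
query <- paste(terms, collapse = " AND ")
query
```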
