Title: Interface to the arXiv API
Version: 0.16
Date: 2025-12-08
Description: An interface to the API for 'arXiv', a repository of electronic preprints for computer science, mathematics, physics, quantitative biology, quantitative finance, and statistics.
URL: https://docs.ropensci.org/aRxiv/, https://github.com/ropensci/aRxiv
BugReports: https://github.com/ropensci/aRxiv/issues
Depends: R (≥ 3.5.0)
License: MIT + file LICENSE
Imports: httr, utils, XML
Suggests: devtools, knitr, rmarkdown, roxygen2, testthat
VignetteBuilder: knitr
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.3
NeedsCompilation:
Packaged: 2025-12-08 19:45:51 UTC; kbroman
Author: Karthik Ram [aut], Karl Broman [aut, cre]
Maintainer: Karl Broman <broman@wisc.edu>
Repository: CRAN
Date/Publication: 2025-12-08 21:30:02 UTC

arXiv subject classifications

Description

arXiv subject classifications: their abbreviations and corresponding descriptions.

Usage

data(arxiv_cats)

Format

A data frame with five columns: the abbreviations of the subject classifications (category), the field of study, subfield of study (within Physics; NA otherwise), a short description, and a longer description.

Source

https://arxiv.org/category_taxonomy

Examples

arxiv_cats

Count number of results for a given search

Description

Count the number of results for a given search. Useful to check before attempting to pull down a very large number of records.

Usage

arxiv_count(query = NULL, id_list = NULL)

Arguments

query

Search pattern as a string; a vector of such strings is also allowed, in which case the elements are combined with AND.

id_list

arXiv document IDs, as a comma-delimited string or a vector of such strings.

Value

Number of results (integer). An attribute "search_info" contains information about the search parameters and the time at which it was performed.

Examples



# count papers in category stat.AP (applied statistics)
arxiv_count(query = "cat:stat.AP")

# count papers by Peter Hall in any stat category
arxiv_count(query = 'au:"Peter Hall" AND cat:stat*')

# count papers for a range of dates
#    here, everything in 2013
arxiv_count("submittedDate:[2013 TO 2013]")
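
The query argument also accepts a vector whose elements are combined with AND. The sketch below builds the single string that such a vector is presumably equivalent to. Here combine_query is a hypothetical helper written for illustration, and the plain paste with " AND " is an assumption rather than the package's confirmed internals.

```r
# Hypothetical helper (not part of aRxiv): join query terms with AND,
# assuming a plain paste matches what the package does with a vector.
combine_query <- function(query) {
  stopifnot(is.character(query), length(query) >= 1)
  paste(query, collapse = " AND ")
}

terms <- c('au:"Peter Hall"', "cat:stat*")
combined <- combine_query(terms)
print(combined)
```

Under that assumption, passing terms directly as query should count the same records as passing the combined string.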

Open abstract for results of arXiv search

Description

Open, in a web browser, the abstract page for each of a set of arXiv search results.

Usage

arxiv_open(search_results, limit = 20)

Arguments

search_results

Data frame of search results, as returned from arxiv_search().

limit

Maximum number of abstracts to open in one call.

Details

There is a delay between calls to utils::browseURL(), with the amount taken from the R option "aRxiv_delay" (in seconds); if missing, the default is 3 sec.
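
The delay can be adjusted through the R option named above. The following is a minimal sketch of how such an option might be read and set. The helper get_delay is hypothetical (it is not exported by aRxiv), and the package's own lookup may differ in detail.

```r
# Sketch of the lookup described above: use the option if set, else 3 seconds.
# get_delay is a hypothetical helper, not a function exported by aRxiv.
get_delay <- function() {
  delay <- getOption("aRxiv_delay", default = 3)
  stopifnot(is.numeric(delay), length(delay) == 1, delay >= 0)
  delay
}

old <- options(aRxiv_delay = NULL)   # make sure the option is unset
delay_default <- get_delay()         # falls back to the default of 3
options(aRxiv_delay = 1)             # shorten the pause to 1 second
delay_custom <- get_delay()
options(old)                         # restore the previous setting
```

Saving the return value of options() and restoring it afterwards keeps the change from leaking into the rest of the session.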

Value

(Invisibly) Vector of character strings with URLs of abstracts opened.

Examples

z <- arxiv_search('au:"Peter Hall" AND ti:deconvolution')
arxiv_open(z)

The main search function for aRxiv

Description

Allows for programmatic searching of the arXiv preprint repository.

Usage

arxiv_search(
  query = NULL,
  id_list = NULL,
  start = 0,
  limit = 10,
  sort_by = c("submitted", "updated", "relevance"),
  ascending = TRUE,
  batchsize = 100,
  force = FALSE,
  output_format = c("data.frame", "list"),
  sep = "|"
)

Arguments

query

Search pattern as a string; a vector of such strings is also allowed, in which case the elements are combined with AND.

id_list

arXiv document IDs, as a comma-delimited string or a vector of such strings.

start

An offset for the start of the search.

limit

Maximum number of records to return (must be > 0).

sort_by

How to sort the results (ignored if id_list is provided)

ascending

If TRUE, sort in ascending order; else descending (ignored if id_list is provided)

batchsize

Maximum number of records to request at one time

force

If TRUE, force search request even if it seems extreme

output_format

Indicates whether output should be a data frame or a list.

sep

String to use to separate multiple authors, affiliations, DOI links, and categories, in the case that output_format="data.frame".

Value

If output_format="data.frame", the result is a data frame with each row being a manuscript and columns being the various fields.

If output_format="list", the result is a list parsed from the XML output of the search, closer to the raw output from arXiv.

The data frame format has the following columns.

[,1]	id	arXiv ID
[,2]	submitted	date first submitted
[,3]	updated	date last updated
[,4]	title	manuscript title
[,5]	summary	abstract
[,6]	authors	author names
[,7]	affiliations	author affiliations
[,8]	link_abstract	hyperlink to abstract
[,9]	link_pdf	hyperlink to pdf
[,10]	link_doi	hyperlink to DOI
[,11]	comment	authors' comment
[,12]	journal_ref	journal reference
[,13]	doi	published DOI
[,14]	primary_category	primary category
[,15]	categories	all categories

The contents are all strings; missing values are empty strings ("").

The columns authors, affiliations, link_doi, and categories may have multiple entries separated by sep (by default, "|").

The result includes an attribute "search_info" that includes information about the details of the search parameters, including the time at which it was completed. Another attribute "total_results" is the total number of records that match the query.
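
Because multi-valued columns are stored as single strings joined by sep, they usually need to be split before further analysis. The sketch below does this on a small mock data frame. The IDs and author names are placeholders invented for illustration, not real search results.

```r
# Mock data frame imitating two columns of the data-frame output;
# the IDs and names are placeholders, not real arXiv records.
z <- data.frame(
  id = c("0000.0001v1", "0000.0002v1"),
  authors = c("Author A|Author B", "Author C"),
  stringsAsFactors = FALSE
)

# fixed = TRUE treats "|" literally rather than as regex alternation
author_list <- strsplit(z$authors, "|", fixed = TRUE)
names(author_list) <- z$id

# number of authors per manuscript
n_authors <- lengths(author_list)
print(n_authors)
```

Using fixed = TRUE matters here: without it, "|" is interpreted as a regular-expression operator and the strings are split character by character.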

Examples



# search for author Peter Hall with deconvolution in title
z <- arxiv_search(query = 'au:"Peter Hall" AND ti:deconvolution', limit=2)
attr(z, "total_results") # total no. records matching query
z$title

# search for a set of documents by arxiv identifiers
z <- arxiv_search(id_list = c("0710.3491v1", "0804.0713v1", "1003.0315v1"))
# can also use a comma-separated string
z <- arxiv_search(id_list = "0710.3491v1,0804.0713v1,1003.0315v1")
# Journal references, if available
z$journal_ref

# search for a range of dates (in this case, one day)
z <- arxiv_search("submittedDate:[199701010000 TO 199701012359]", limit=2)
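
Date-range queries like the one above use timestamps in the form YYYYMMDDHHMM. The sketch below builds such a string from ordinary dates. The helper date_range_query is hypothetical (it is not part of aRxiv) and assumes whole days, running from 00:00 on the first day to 23:59 on the last.

```r
# Hypothetical helper (not part of aRxiv): build a submittedDate range
# covering whole days, from 00:00 on `from` to 23:59 on `to`.
date_range_query <- function(from, to = from) {
  from <- as.Date(from)
  to <- as.Date(to)
  stopifnot(!is.na(from), !is.na(to), from <= to)
  sprintf("submittedDate:[%s0000 TO %s2359]",
          format(from, "%Y%m%d"), format(to, "%Y%m%d"))
}

one_day <- date_range_query("1997-01-01")
print(one_day)
```

The resulting string can be passed as the query argument to arxiv_search() or arxiv_count().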

Check for connection to arXiv API

Description

Check for connection to arXiv API

Usage

can_arxiv_connect(max_time = 5)

Arguments

max_time

Maximum wait time in seconds

Value

Returns TRUE if connection is established and FALSE otherwise.

Examples


can_arxiv_connect(2)
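
The connection check combines naturally with arxiv_count() to avoid failed or oversized requests. The following is one possible sketch of that pattern. The wrapper search_if_small and its max_records threshold are hypothetical, and the code needs the aRxiv package and network access to return results.

```r
# Hypothetical wrapper (not part of aRxiv): search only when the API is
# reachable and the result set is no larger than max_records.
search_if_small <- function(query, max_records = 50) {
  if (!requireNamespace("aRxiv", quietly = TRUE)) {
    message("Package 'aRxiv' is not installed.")
    return(NULL)
  }
  if (!aRxiv::can_arxiv_connect(max_time = 5)) {
    message("Cannot reach the arXiv API.")
    return(NULL)
  }
  n <- aRxiv::arxiv_count(query = query)
  # limit must be > 0, so an empty result set is also skipped
  if (n == 0 || n > max_records) {
    message("Query matches ", n, " records; not searching.")
    return(NULL)
  }
  aRxiv::arxiv_search(query = query, limit = n)
}

res <- search_if_small('au:"Peter Hall" AND ti:deconvolution')
```

The function returns NULL whenever the search is skipped, so callers can test the result with is.null() before using it.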

arXiv query field terms

Description

Possible terms that correspond to different fields in arXiv searches.

Usage

data(query_terms)

Format

A data frame with two columns: the term and corresponding description.

Author(s)

Karl W Broman

Source

https://arxiv.org/help/api/user-manual.html

Examples

query_terms
