The rplos
package interacts with the API services of PLoS (Public Library of Science) Journals. In order to use rplos
, you need to obtain your own key to their API services. Instruction for obtaining and installing keys so they load automatically when you launch R are on our GitHub Wiki page Installation and use of API keys.
This tutorial will go through three use cases to demonstrate the kinds
of things possible in rplos
.
install.packages("rplos")
library(rplos)
searchplos
is a general search, and in this case searches for the term
Helianthus and returns the DOI's of matching papers
searchplos(q= "Helianthus", fl= "id", limit = 5)
## id
## 1 10.1371/journal.pone.0057533
## 2 10.1371/journal.pone.0045899
## 3 10.1371/journal.pone.0037191
## 4 10.1371/journal.pone.0051360
## 5 10.1371/journal.pone.0070347
Get only full article DOIs
searchplos(q="*:*", fl='id', fq='doc_type:full', start=0, limit=5)
## id
## 1 10.1371/journal.pntd.0001738
## 2 10.1371/journal.pone.0035044
## 3 10.1371/journal.pone.0067070
## 4 10.1371/journal.pone.0095362
## 5 10.1371/journal.pone.0031015
Get DOIs for only PLoS One articles
searchplos(q="*:*", fl='id', fq='cross_published_journal_key:PLoSONE', start=0, limit=5)
## id
## 1 10.1371/journal.pone.0067079/title
## 2 10.1371/journal.pone.0067079/abstract
## 3 10.1371/journal.pone.0067079/references
## 4 10.1371/journal.pone.0067079/body
## 5 10.1371/journal.pone.0095366/body
Get DOIs for full article in PLoS One
searchplos(q="*:*", fl='id',
fq=list('cross_published_journal_key:PLoSONE', 'doc_type:full'),
start=0, limit=5)
## id
## 1 10.1371/journal.pone.0035044
## 2 10.1371/journal.pone.0067070
## 3 10.1371/journal.pone.0095362
## 4 10.1371/journal.pone.0031015
## 5 10.1371/journal.pone.0059231
Serch for many terms
q <- c('ecology','evolution','science')
lapply(q, function(x) searchplos(x, limit=2))
## [[1]]
## id
## 1 10.1371/journal.pone.0059813
## 2 10.1371/journal.pone.0001248
##
## [[2]]
## id
## 1 10.1371/annotation/c55d5089-ba2f-449d-8696-2bc8395978db
## 2 10.1371/annotation/9773af53-a076-4946-a3f1-83914226c10d
##
## [[3]]
## id
## 1 10.1371/journal.pbio.0020122
## 2 10.1371/journal.pbio.1001166
A suite of functions were created as light wrappers around searchplos
as a shorthand to search specific sections of a paper.
plosauthor
searchers in authorsplosabstract
searches in abstractsplostitle
searches in titlesplosfigtabcaps
searches in figure and table captionsplossubject
searches in subject areasplosauthor
searches across authors, and in this case returns the authors of the matching papers. the fl parameter determines what is returned
plosauthor(q = "Eisen", fl = "author", limit = 5)
## author
## 1 Jonathan A Eisen
## 2 Jonathan A Eisen
## 3 Jonathan A Eisen
## 4 Jonathan A Eisen
## 5 Jonathan A Eisen
plosabstract
searches across abstracts, and in this case returns the id and title of the matching papers
plosabstract(q = 'drosophila', fl='id,title', limit = 5)
## id
## 1 10.1371/journal.pbio.0040198
## 2 10.1371/journal.pbio.0030246
## 3 10.1371/journal.pone.0012421
## 4 10.1371/journal.pbio.0030389
## 5 10.1371/journal.pone.0002817
## title
## 1 All for All
## 2 School Students as Drosophila Experimenters
## 3 Host Range and Specificity of the Drosophila C Virus
## 4 New Environments Set the Stage for Changing Tastes in Mates
## 5 High-Resolution, In Vivo Magnetic Resonance Imaging of Drosophila at 18.8 Tesla
plostitle
searches across titles, and in this case returns the title and journal of the matching papers
plostitle(q='drosophila', fl='title,journal', limit=5)
## journal title
## 1 PLoS Biology Combinatorial Coding for Drosophila Neurons
## 2 PLoS Biology School Students as Drosophila Experimenters
## 3 PLoS ONE A Tripartite Synapse Model in Drosophila
## 4 PLoS ONE A DNA Virus of Drosophila
## 5 PLoS Genetics Phenotypic Plasticity of the Drosophila Transcriptome
Facet by journal
facetplos(q='*:*', facet.field='journal')
## $facet_queries
## NULL
##
## $facet_fields
## $facet_fields$journal
## X1 X2
## 1 plos one 792679
## 2 plos genetics 36699
## 3 plos pathogens 32259
## 4 plos computational biology 26944
## 5 plos biology 25123
## 6 plos neglected tropical diseases 21071
## 7 plos medicine 17693
## 8 plos clinical trials 521
## 9 plos collections 20
## 10 plos medicin 9
##
##
## $facet_dates
## NULL
##
## $facet_ranges
## NULL
Using facet.query
to get counts
facetplos(q='*:*', facet.field='journal', facet.query='cell,bird')
## $facet_queries
## term value
## 1 cell,bird 14
##
## $facet_fields
## $facet_fields$journal
## X1 X2
## 1 plos one 792679
## 2 plos genetics 36699
## 3 plos pathogens 32259
## 4 plos computational biology 26944
## 5 plos biology 25123
## 6 plos neglected tropical diseases 21071
## 7 plos medicine 17693
## 8 plos clinical trials 521
## 9 plos collections 20
## 10 plos medicin 9
##
##
## $facet_dates
## NULL
##
## $facet_ranges
## NULL
Date faceting
facetplos(q='*:*', url=url, facet.date='publication_date',
facet.date.start='NOW/DAY-5DAYS', facet.date.end='NOW', facet.date.gap='+1DAY')
## $facet_queries
## NULL
##
## $facet_fields
## NULL
##
## $facet_dates
## $facet_dates$publication_date
## date value
## 1 2014-05-08T00:00:00Z 2872
## 2 2014-05-09T00:00:00Z 1208
## 3 2014-05-10T00:00:00Z 0
## 4 2014-05-11T00:00:00Z 1111
## 5 2014-05-12T00:00:00Z 2379
## 6 2014-05-13T00:00:00Z 1268
##
##
## $facet_ranges
## NULL
Search for the term alcohol in the abstracts of articles, return only 10 results
highplos(q='alcohol', hl.fl = 'abstract', rows=2)
## $`10.1371/journal.pmed.0040151`
## $`10.1371/journal.pmed.0040151`$abstract
## [1] "Background: <em>Alcohol</em> consumption causes an estimated 4% of the global disease burden, prompting"
##
##
## $`10.1371/journal.pone.0027752`
## $`10.1371/journal.pone.0027752`$abstract
## [1] "Background: The negative influences of <em>alcohol</em> on TB management with regard to delays in seeking"
Search for the term alcohol in the abstracts of articles, and return fragment size of 20 characters, return only 5 results
highplos(q='alcohol', hl.fl='abstract', hl.fragsize=20, rows=2)
## $`10.1371/journal.pmed.0040151`
## $`10.1371/journal.pmed.0040151`$abstract
## [1] "Background: <em>Alcohol</em>"
##
##
## $`10.1371/journal.pone.0027752`
## $`10.1371/journal.pone.0027752`$abstract
## [1] " of <em>alcohol</em> on TB management"
Search for the term experiment across all sections of an article, return id (DOI) and title fl only, search in full articles only (via fq='doc_type:full'
), and return only 10 results
highplos(q='everything:"experiment"', fl='id,title', fq='doc_type:full',
rows=2)
## $`10.1371/journal.pone.0039681`
## $`10.1371/journal.pone.0039681`$everything
## [1] " Selection of Transcriptomics <em>Experiments</em> Improves Guilt-by-Association Analyses Transcriptomics <em>Experiment</em>"
##
##
## $`10.1371/annotation/9b8741e2-0f5f-49f9-9eaa-1b0cb9b8d25f`
## $`10.1371/annotation/9b8741e2-0f5f-49f9-9eaa-1b0cb9b8d25f`$everything
## [1] " in the labels under the bars. The labels should read <em>Experiment</em> 3 / <em>Experiment</em> 4 instead of <em>Experiment</em>"
plosword
allows you to search for 1 to K words and visualize the results
as a histogram, comparing number of matching papers for each word
out <- plosword(list("monkey", "Helianthus", "sunflower", "protein", "whale"),
vis = "TRUE")
out$table
## No_Articles Term
## 1 7340 monkey
## 2 245 Helianthus
## 3 673 sunflower
## 4 79470 protein
## 5 876 whale
out$plot
You can also pass in curl options, in this case get verbose information on the curl call.
plosword('Helianthus', callopts=list(verbose=TRUE))
## Number of articles with search term
## 245
plot_througtime
allows you to search for up to 2 words and visualize the results as a line plot through time, comparing number of articles matching through time. Visualize with the ggplot2 package, only up to two terms for now.
plot_throughtime(terms = "phylogeny", limit = 200)
OR using google visualizations through the googleVis package, check it your self using, e.g. (not shown here)
plot_throughtime(terms = list("drosophila", "flower"), limit = 200, gvis = TRUE)
…And a google visualization will render on your local browser and you can play with three types of plots (point, histogram, line), all through time. The plot is not shown here, but try it out for yourself!!