
Type: Package
Title: Automatically Fetching References Metadata from Literature Databases
Version: 0.2.1
Maintainer: Thomas Dumond <thomas.dumond@adelaide.edu.au>
Description: Provides functions to automatically retrieve and deduplicate reference metadata based on saved search strings. Access to Web of Science and Scopus requires personal API keys, while PubMed can be queried without one. The optional deduplication functionality requires the package 'ASySD' available from https://github.com/camaradesuk/ASySD.
License: MIT + file LICENSE
Encoding: UTF-8
Imports: dplyr, httr, jsonlite, openxlsx, purrr, readxl, xml2
Suggests: ASySD, cronR, knitr, rmarkdown, taskscheduleR, testthat (≥ 3.0.0)
RoxygenNote: 7.3.3
Config/testthat/edition: 3
URL: https://github.com/thomasdumond/LitFetchR, https://thomasdumond.github.io/LitFetchR/
BugReports: https://github.com/thomasdumond/LitFetchR/issues
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2026-02-07 11:09:31 UTC; a1224158
Author: Thomas Dumond [cre, aut, cph], Charles Caraguel [ctb], Torben Nielsen [ctb]
Repository: CRAN
Date/Publication: 2026-02-10 20:40:02 UTC

Automates the retrieval of references based on saved search string(s).

Description

Creates a read-only R script and a scheduled task that runs it automatically at a specified frequency and time, retrieving the references that match the saved search string(s) on up to three platforms (Web of Science, Scopus and PubMed).

Usage

auto_LitFetchR_setup(
  task_id = "task_id",
  when = "DAILY",
  time = "08:00",
  wos = FALSE,
  scp = FALSE,
  pmd = FALSE,
  directory,
  dedup = FALSE,
  open_file = FALSE,
  dry_run = FALSE
)

Arguments

task_id

Name of the automated reference retrieval task (e.g. one keyword describing your review).

when

Frequency of the automated reference retrieval task (DAILY, WEEKLY or MONTHLY).

time

Time of the automated reference retrieval task (must be in HH:MM 24-hour clock format).

wos

Runs the search on Web of Science (TRUE or FALSE).

scp

Runs the search on Scopus (TRUE or FALSE).

pmd

Runs the search on PubMed (TRUE or FALSE).

directory

Choose the directory in which the search string is saved (the project's directory). The reference metadata will also be saved there.

dedup

Deduplicates the retrieved references (TRUE or FALSE).

open_file

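Automatically opens the CSV file after reference retrieval (TRUE or FALSE).

dry_run

Simulation run option (TRUE or FALSE). When TRUE, no task is scheduled; the function only shows how it would react.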

Value

NULL (invisibly). Called for its side effects: writes an R script and schedules a task (Windows Task Scheduler or cron) to run the script automatically.

Examples

# This is a "dry run" example.
# No task will actually be scheduled;
# it only shows how the function should react.
auto_LitFetchR_setup(task_id = "fish_vibrio",
                     when = "WEEKLY",
                     time = "14:00",
                     wos = TRUE,
                     scp = TRUE,
                     pmd = TRUE,
                     directory = tempdir(),
                     dedup = FALSE,
                     open_file = FALSE,
                     dry_run = TRUE
                     )
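
The when and time arguments follow fixed formats (DAILY, WEEKLY or MONTHLY; HH:MM on a 24-hour clock). A minimal sketch of how such inputs could be checked before calling the function, assuming a hypothetical helper check_schedule() that is not part of LitFetchR:

```r
# Hypothetical helper (not part of LitFetchR): checks the documented formats
# of `when` (DAILY, WEEKLY or MONTHLY) and `time` (HH:MM, 24-hour clock).
check_schedule <- function(when, time) {
  valid_when <- when %in% c("DAILY", "WEEKLY", "MONTHLY")
  # Hours 00-23, minutes 00-59
  valid_time <- grepl("^([01][0-9]|2[0-3]):[0-5][0-9]$", time)
  valid_when && valid_time
}

check_schedule("WEEKLY", "14:00")  # TRUE
check_schedule("WEEKLY", "24:00")  # FALSE
```

How the package itself validates these arguments is not stated on this page; the check above only mirrors the documented formats.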


Creates a unique name for any document.

Description

Creates a unique name for any document.

Usage

build_sheet_name(time = Sys.time())

Arguments

time

System time at the time the function is run.

Value

Character scalar. A unique name based on the system time with the format YYYY-MM-DD-HHMMSS.
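
For illustration, a name in the YYYY-MM-DD-HHMMSS format can be produced with base R's format(). This is a sketch under the assumption that a plain time stamp is all that is needed; stamp_name() is a hypothetical stand-in, not the package's own code:

```r
# Sketch only: formats a time stamp as YYYY-MM-DD-HHMMSS.
# `stamp_name()` is a hypothetical stand-in, not the package's code.
stamp_name <- function(time = Sys.time()) {
  format(time, "%Y-%m-%d-%H%M%S")
}

fixed <- as.POSIXct("2026-02-07 11:09:31", tz = "UTC")
stamp_name(fixed)  # "2026-02-07-110931"
```

Because the name resolves to the second, two calls within the same second would yield the same string.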


Creates an Excel file to store the deduplication history.

Description

Creates an Excel file to store the deduplication history.

Usage

create_dedup_history(directory)

Arguments

directory

Choose the directory in which the references deduplication history will be saved.

Value

A list with elements:

history_dedup

A Workbook object (from openxlsx).

hist_dedup_path

Character. Path to the created .xlsx file.


Creates an Excel file to store the identifiers of the references retrieved at each search.

Description

Creates an Excel file to store the identifiers of the references retrieved at each search.

Usage

create_id_history(directory)

Arguments

directory

Choose the directory in which the references identification history will be saved.

Value

A list with element:

history_id

A Workbook object (from openxlsx).


Description

An interactive function that asks the user to enter a search string and provides the number of results from three platforms: Web of Science, Scopus and PubMed. You can then save one or more search strings to retrieve the references later.

Usage

create_save_search(
  wos = FALSE,
  scp = FALSE,
  pmd = FALSE,
  directory,
  dry_run = FALSE
)

Arguments

wos

Runs the search on Web of Science (TRUE or FALSE).

scp

Runs the search on Scopus (TRUE or FALSE).

pmd

Runs the search on PubMed (TRUE or FALSE).

directory

Choose the directory in which the search string and the search history will be saved.

dry_run

Simulation run option (TRUE or FALSE). When TRUE, no search is created and no database is accessed.

Value

NULL (invisibly). Called for its side effects: interactive querying and writing search history files.

Examples

# This is a "dry run" example.
# No search will be created and no database will be accessed.
# It only shows how the function should react.
create_save_search(wos = TRUE,
                   scp = TRUE,
                   pmd = TRUE,
                   directory = tempdir(),
                   dry_run = TRUE)


Creates an Excel file to store the history of searches made using create_save_search().

Description

Creates an Excel file to store the history of searches made using create_save_search().

Usage

create_search_history(directory)

Arguments

directory

Choose the directory in which the search history will be saved.

Value

A list with elements:

history_search

Workbook object.

sheet_name

Character scalar.


Deduplicates the references from up to three dataframes.

Description

Deduplicates the references from up to three dataframes.

Usage

dedup_refs(
  df1 = NULL,
  df2 = NULL,
  df3 = NULL,
  directory,
  open_file = FALSE,
  dry_run = FALSE
)

Arguments

df1

Dataframe 1 (can be NULL)

df2

Dataframe 2 (can be NULL)

df3

Dataframe 3 (can be NULL)

directory

Choose the directory in which the references deduplication history will be saved.

open_file

Automatically opens the CSV file after reference retrieval.

dry_run

Simulation run option (TRUE or FALSE). When TRUE, no deduplication happens.

Value

NULL (invisibly). Called for its side effects: writes a CSV of deduplicated citations and an Excel workbook recording the deduplication history.

Examples

# This is a "dry run" example.
# No deduplication will happen.
# It only shows how the function should react.
dedup_refs(df1 = df_vibrio_wos,
           df2 = df_vibrio_scp,
           df3 = df_vibrio_pmd,
           directory = tempdir(),
           open_file = FALSE,
           dry_run = TRUE
           )
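
To make the idea of merging up to three data frames concrete, here is a deliberately naive sketch that keeps one record per DOI. It is an illustrative stand-in and not the ASySD algorithm the package relies on; naive_dedup() and the sample data are hypothetical:

```r
# Illustrative stand-in, NOT the ASySD algorithm: stack the data frames and
# keep the first record for each normalised DOI.
naive_dedup <- function(...) {
  dfs <- Filter(Negate(is.null), list(...))  # NULL inputs are skipped
  if (length(dfs) == 0) return(NULL)
  refs <- do.call(rbind, dfs)
  key <- tolower(trimws(refs$doi))
  # Records without a DOI cannot be matched here, so they are all kept
  keep <- is.na(key) | key == "" | !duplicated(key)
  refs[keep, , drop = FALSE]
}

wos <- data.frame(title = "A", doi = "10.1/ABC", source = "wos")
pmd <- data.frame(title = "A", doi = "10.1/abc", source = "pmd")
naive_dedup(wos, NULL, pmd)  # one row, from "wos"
```

Matching on DOI alone misses duplicates that lack a DOI or carry different ones, which is one reason a dedicated tool is preferable for real reviews.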


Extracts the metadata from the new references found on PubMed based on the search string(s) saved in "search_list.txt".

Description

Extracts the metadata from the new references found on PubMed based on the search string(s) saved in "search_list.txt".

Usage

extract_pmd_list(search_list_path, directory)

Arguments

search_list_path

Path to "search_list.txt".

directory

Choose the directory in which the references identification history will be saved.

Value

A data.frame with one row per retrieved PubMed record and columns:

author

Character. Publication authors.

year

Character. Publication year.

title

Character. Publication title.

journal

Character. Publication journal name.

volume

Character. Publication journal volume.

issue

Character. Publication journal issue.

abstract

Character. Publication abstract.

doi

Character. Publication Digital Object Identifier (DOI).

source

Character. Data source.

platform_id

Character. Publication unique identifier in data source.

If search_list_path does not exist, returns NULL.
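
Since the function can return either a data frame or NULL, downstream code may want to check the result before using it. A minimal sketch, assuming a hypothetical helper is_ref_table() that is not part of LitFetchR:

```r
# Column names as listed in the Value section above
ref_cols <- c("author", "year", "title", "journal", "volume",
              "issue", "abstract", "doi", "source", "platform_id")

# Hypothetical check (not part of LitFetchR): TRUE only for a data frame
# that carries every documented column; FALSE for NULL or anything else.
is_ref_table <- function(x) {
  is.data.frame(x) && all(ref_cols %in% names(x))
}

is_ref_table(NULL)  # FALSE
```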


Extracts the metadata from the new references found on Scopus based on the search string(s) saved in "search_list.txt".

Description

Extracts the metadata from the new references found on Scopus based on the search string(s) saved in "search_list.txt".

Usage

extract_scp_list(search_list_path, directory)

Arguments

search_list_path

Path to "search_list.txt".

directory

Choose the directory in which the references identification history will be saved.

Value

A data.frame with one row per retrieved Scopus record and columns:

author

Character. Publication authors.

year

Character. Publication year.

title

Character. Publication title.

journal

Character. Publication journal name.

volume

Character. Publication journal volume.

issue

Character. Publication journal issue.

abstract

Character. Publication abstract.

doi

Character. Publication Digital Object Identifier (DOI).

source

Character. Data source.

platform_id

Character. Publication unique identifier in data source.

If search_list_path does not exist, returns NULL.


Extracts the metadata from the new references found on Web of Science based on the search string(s) saved in "search_list.txt".

Description

Extracts the metadata from the new references found on Web of Science based on the search string(s) saved in "search_list.txt".

Usage

extract_wos_list(search_list_path, directory)

Arguments

search_list_path

Path to "search_list.txt".

directory

Choose the directory in which the references identification history will be saved.

Value

A data.frame with one row per retrieved Web of Science record and columns:

author

Character. Publication authors.

year

Character. Publication year.

title

Character. Publication title.

journal

Character. Publication journal name.

volume

Character. Publication journal volume.

issue

Character. Publication journal issue.

abstract

Character. Publication abstract.

doi

Character. Publication Digital Object Identifier (DOI).

source

Character. Data source.

platform_id

Character. Publication unique identifier in data source.

If search_list_path does not exist, returns NULL.


Transforms a long file path into a shorter one.

Description

Transforms a long file path into a shorter one.

Usage

get_short_path(path)

Arguments

path

Path to the document.

Value

Character scalar. A shorter path for use on Windows.
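
One way to obtain a short path on Windows is base R's utils::shortPathName(). The sketch below is an assumption about how such a conversion could be done, not a description of the package's implementation; short_path_sketch() is a hypothetical name:

```r
# Sketch only, not the package's implementation. `utils::shortPathName()`
# exists on Windows builds of R only, so other systems get the path back.
short_path_sketch <- function(path) {
  if (.Platform$OS.type == "windows") {
    utils::shortPathName(path)
  } else {
    path
  }
}

short_path_sketch(tempdir())
```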


Manual literature retrieval.

Description

Retrieves references corresponding to the saved search string(s) on up to three platforms (Web of Science, Scopus and PubMed).

Usage

manual_fetch(
  wos = FALSE,
  scp = FALSE,
  pmd = FALSE,
  directory,
  dedup = FALSE,
  open_file = FALSE,
  dry_run = FALSE
)

Arguments

wos

Runs the search on Web of Science (TRUE or FALSE).

scp

Runs the search on Scopus (TRUE or FALSE).

pmd

Runs the search on PubMed (TRUE or FALSE).

directory

Choose the directory in which the search string is saved (the project's directory). The reference metadata will also be saved there.

dedup

Deduplicates the retrieved references (TRUE or FALSE).

open_file

Automatically opens the CSV file after reference retrieval.

dry_run

Simulation run option (TRUE or FALSE). When TRUE, no references are retrieved.

Value

NULL (invisibly). Called for its side effects: creates a CSV file with the reference metadata, a history file of the references retrieved and, if dedup = TRUE, a history file of the deduplication.

Examples

# This is a "dry run" example.
# No references will actually be retrieved; it only shows how the function should react.
manual_fetch(wos = TRUE,
             scp = TRUE,
             pmd = TRUE,
             directory = tempdir(),
             dedup = TRUE,
             open_file = FALSE,
             dry_run = TRUE
             )



Removes a scheduled task.

Description

Removes a scheduled task using the "task_id" from Task Scheduler (Windows) or cron (macOS/Linux).

Usage

remove_scheduled_task(task_id, dry_run = FALSE)

Arguments

task_id

Name/ID of the scheduled task (Windows Task Scheduler or cron).

dry_run

Simulation run option (TRUE or FALSE). When TRUE, no task is removed.

Value

NULL (invisibly). Called for its side effects: removes a scheduled task saved using the function 'auto_LitFetchR_setup'.

Examples

# This is a "dry run" example.
# No task will actually be removed, it only shows how the function should react.
remove_scheduled_task("fish_vibrio",
                      dry_run = TRUE
                      )


Saves Web of Science and/or Scopus API keys in .Renviron.

Description

You can set wos_api_key, scp_api_key, or both at the same time. Remember to restart the R session after saving your API keys.

Usage

save_api_keys(wos_api_key = NULL, scp_api_key = NULL, dry_run = FALSE)

Arguments

wos_api_key

The API key value for Web of Science (use quotation marks).

scp_api_key

The API key value for Scopus (use quotation marks).

dry_run

Simulation run option.

Value

Logical. TRUE if at least one value was written, FALSE if left unchanged.

Examples

save_api_keys(wos_api_key = "abcd01234",
              scp_api_key = "efgh5678",
              dry_run = TRUE
              )
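
R reads .Renviron at startup, which is why a restart is needed before newly saved keys become visible. The sketch below shows the general mechanism with base R only, using a temporary file instead of the real .Renviron. The variable name DEMO_WOS_API_KEY is hypothetical; this page does not state the names LitFetchR uses:

```r
# Sketch only: write a KEY=value line to a temporary file and load it with
# readRenviron(). The variable name below is hypothetical.
env_file <- tempfile(fileext = ".Renviron")
writeLines("DEMO_WOS_API_KEY=abcd01234", env_file)

readRenviron(env_file)          # loads the file into the current session
Sys.getenv("DEMO_WOS_API_KEY")  # "abcd01234"
```

Sys.getenv() returns an empty string for a variable that is not set, which gives a quick way to confirm whether a key is available in the current session.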
