Type:

Package

Title:

Automatically Fetching References Metadata from Literature Databases

Version:

0.2.1

Maintainer:

Thomas Dumond <thomas.dumond@adelaide.edu.au>

Description:

Provides functions to automatically retrieve and deduplicate reference metadata based on saved search strings. Access to Web of Science and Scopus requires personal API keys, while PubMed can be queried without one. The optional deduplication functionality requires the package 'ASySD' available from https://github.com/camaradesuk/ASySD.

License:

MIT + file LICENSE

Encoding:

UTF-8

Imports:

dplyr, httr, jsonlite, openxlsx, purrr, readxl, xml2

Suggests:

ASySD, cronR, knitr, rmarkdown, taskscheduleR, testthat (≥ 3.0.0)

RoxygenNote:

7.3.3

Config/testthat/edition:

URL:

https://github.com/thomasdumond/LitFetchR, https://thomasdumond.github.io/LitFetchR/

BugReports:

https://github.com/thomasdumond/LitFetchR/issues

VignetteBuilder:

knitr

NeedsCompilation:

Packaged:

2026-02-07 11:09:31 UTC; a1224158

Author:

Thomas Dumond [cre, aut, cph], Charles Caraguel [ctb], Torben Nielsen [ctb]

Repository:

CRAN

Date/Publication:

2026-02-10 20:40:02 UTC

Automates the retrieval of references based on saved search string(s).

Description

Creates a read-only R script and a scheduled task that runs it automatically at a specified frequency and time, retrieving the references that match the saved search string(s) on up to three platforms (Web of Science, Scopus and PubMed).

Usage

auto_LitFetchR_setup(
  task_id = "task_id",
  when = "DAILY",
  time = "08:00",
  wos = FALSE,
  scp = FALSE,
  pmd = FALSE,
  directory,
  dedup = FALSE,
  open_file = FALSE,
  dry_run = FALSE
)

Arguments

task_id

Name of the automated reference retrieval task (e.g. one keyword describing your review).

when

Frequency of the automated reference retrieval task (DAILY, WEEKLY or MONTHLY).

time

Time of the automated reference retrieval task (must be HH:MM 24-hour clock format).

wos

Runs the search on Web of Science (TRUE or FALSE).

scp

Runs the search on Scopus (TRUE or FALSE).

pmd

Runs the search on PubMed (TRUE or FALSE).

directory

Directory in which the search string is saved (the project's directory). The reference metadata is saved there as well.

dedup

Deduplicates the retrieved references (TRUE or FALSE).

open_file

Automatically opens the CSV file after reference retrieval.

dry_run

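Simulation run option (TRUE or FALSE). If TRUE, no task is actually scheduled; the function only shows how it would react.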

Value

NULL (invisibly). Called for its side effects: writes an R script and schedules a task (Windows Task Scheduler or cron) to run the script automatically.

Examples

# This is a "dry run" example.
# No task will actually be scheduled,
# it only shows how the function should react.
auto_LitFetchR_setup(task_id = "fish_vibrio",
                     when = "WEEKLY",
                     time = "14:00",
                     wos = TRUE,
                     scp = TRUE,
                     pmd = TRUE,
                     directory = tempdir(),
                     dedup = FALSE,
                     open_file = FALSE,
                     dry_run = TRUE
                     )
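
The constraints documented for when (DAILY, WEEKLY or MONTHLY) and time (HH:MM on a 24-hour clock) can be checked before a task is scheduled. The following is a minimal sketch in base R, assuming a hypothetical helper check_schedule() that is not part of LitFetchR; the package's own validation may differ.

```r
# Hypothetical helper (not a LitFetchR function): validates the `when` and
# `time` values against the formats described in the Arguments section.
check_schedule <- function(when = "DAILY", time = "08:00") {
  # `when` must be one of the three documented frequencies.
  when <- match.arg(when, c("DAILY", "WEEKLY", "MONTHLY"))
  # `time` must be HH:MM on a 24-hour clock (00:00 to 23:59).
  if (!grepl("^([01][0-9]|2[0-3]):[0-5][0-9]$", time)) {
    stop("`time` must use the HH:MM 24-hour format, e.g. \"14:00\".")
  }
  list(when = when, time = time)
}

schedule <- check_schedule(when = "WEEKLY", time = "14:00")
```

A value such as "25:00" or a frequency such as "HOURLY" stops with an error in this sketch, so the problem surfaces before any scheduler is involved.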

Creates a unique name for any document.

Description

Creates a unique name for any document.

Usage

build_sheet_name(time = Sys.time())

Arguments

time

System time at the time the function is run.

Value

Character scalar. A unique name based on the system time with the format YYYY-MM-DD-HHMMSS.
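
As an illustration of the documented YYYY-MM-DD-HHMMSS format, the sketch below builds such a name with base R's format(). The function sketch_sheet_name() is a hypothetical stand-in and is not claimed to match the internals of build_sheet_name().

```r
# Hypothetical stand-in for illustration only: formats a time stamp as
# YYYY-MM-DD-HHMMSS, the pattern described in the Value section.
sketch_sheet_name <- function(time = Sys.time()) {
  format(time, "%Y-%m-%d-%H%M%S")
}

# A fixed time makes the result reproducible.
fixed_time <- as.POSIXct("2026-02-07 11:09:31", tz = "UTC")
sheet_name <- sketch_sheet_name(fixed_time)
sheet_name
# "2026-02-07-110931"
```

A name built this way is unique only to a resolution of one second, so two calls within the same second would produce the same name.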

Creates an Excel file to store the deduplication history.

Description

Creates an Excel file to store the deduplication history.

Usage

create_dedup_history(directory)

Arguments

directory

Choose the directory in which the references deduplication history will be saved.

Value

A list with elements:

history_dedup: A Workbook object (from openxlsx).
hist_dedup_path: Character. Path to the created .xlsx file.

Creates an Excel file to store the identifiers of the references retrieved at each search.

Description

Creates an Excel file to store the identifiers of the references retrieved at each search.

Usage

create_id_history(directory)

Arguments

directory

Choose the directory in which the references identification history will be saved.

Value

A list with element:

history_id: A Workbook object (from openxlsx).

Creates and saves search string(s).

Description

An interactive function that asks the user to enter a search string and reports the number of results on up to three platforms: Web of Science, Scopus and PubMed. You can then save one or more search strings to retrieve the references later.

Usage

create_save_search(
  wos = FALSE,
  scp = FALSE,
  pmd = FALSE,
  directory,
  dry_run = FALSE
)

Arguments

wos

Runs the search on Web of Science (TRUE or FALSE).

scp

Runs the search on Scopus (TRUE or FALSE).

pmd

Runs the search on PubMed (TRUE or FALSE).

directory

Choose the directory in which the search string and the search history will be saved.

dry_run

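Simulation run option (TRUE or FALSE). If TRUE, no search is created and no database is accessed; the function only shows how it would react.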

Value

NULL (invisibly). Called for its side effects: interactive querying and writing search history files.

Examples

# This is a "dry run" example.
# No search will be created and no database will be accessed.
# It only shows how the function should react.
create_save_search(wos = TRUE,
                   scp = TRUE,
                   pmd = TRUE,
                   directory = tempdir(),
                   dry_run = TRUE)

Creates an Excel file to store the history of searches made using `create_save_search()`.

Description

Creates an Excel file to store the history of searches made using create_save_search().

Usage

create_search_history(directory)

Arguments

directory

Choose the directory in which the search history will be saved.

Value

A list with elements:

history_search: Workbook object.
sheet_name: Character scalar.

Deduplicates the references from up to three data frames.

Description

Deduplicates the references from up to three data frames.

Usage

dedup_refs(
  df1 = NULL,
  df2 = NULL,
  df3 = NULL,
  directory,
  open_file = FALSE,
  dry_run = FALSE
)

Arguments

df1

Data frame 1 (can be NULL).

df2

Data frame 2 (can be NULL).

df3

Data frame 3 (can be NULL).

directory

Choose the directory in which the references deduplication history will be saved.

open_file

Automatically opens the CSV file after reference retrieval.

dry_run

Simulation run option (TRUE or FALSE). If TRUE, no deduplication happens; the function only shows how it would react.

Value

NULL (invisibly). Called for its side effects: writes a CSV of deduplicated citations and an Excel workbook recording the deduplication history.

Examples

# This is a "dry run" example.
# No deduplication will happen.
# It only shows how the function should react.
dedup_refs(df1 = df_vibrio_wos,
           df2 = df_vibrio_scp,
           df3 = df_vibrio_pmd,
           directory = tempdir(),
           open_file = FALSE,
           dry_run = TRUE
           )
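
To make the idea of merging up to three data frames concrete, the sketch below stacks the non-NULL inputs and drops rows that repeat a DOI. This is a deliberately naive, DOI-only approach in base R and is not the ASySD algorithm that the package relies on; naive_dedup() and the example data are hypothetical, and the inputs are assumed to share the same columns.

```r
# Hypothetical illustration (NOT the ASySD method used by dedup_refs()):
# stack the non-NULL data frames and keep the first record for each DOI.
naive_dedup <- function(df1 = NULL, df2 = NULL, df3 = NULL) {
  inputs <- Filter(Negate(is.null), list(df1, df2, df3))
  if (length(inputs) == 0) return(NULL)
  combined <- do.call(rbind, inputs)            # assumes identical columns
  key <- tolower(trimws(combined$doi))          # DOIs are case-insensitive
  has_doi <- !is.na(key) & nzchar(key)
  # Records without a DOI cannot be compared here, so they are all kept.
  keep <- !has_doi | !duplicated(key)
  combined[keep, , drop = FALSE]
}

# Made-up records for demonstration purposes.
refs_a <- data.frame(title = c("Paper A", "Paper B"),
                     doi = c("10.1000/a", "10.1000/b"),
                     source = "platform_1", stringsAsFactors = FALSE)
refs_b <- data.frame(title = c("Paper A (copy)", "Paper C"),
                     doi = c("10.1000/A", NA),
                     source = "platform_2", stringsAsFactors = FALSE)

unique_refs <- naive_dedup(df1 = refs_a, df2 = refs_b)
nrow(unique_refs)
# 3
```

Matching on DOI alone misses duplicates that lack a DOI or carry a mistyped one, which is the kind of case a dedicated tool such as ASySD is meant to handle.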

Extracts the metadata from the new references found on PubMed based on the search string(s) saved in "search_list.txt".

Description

Extracts the metadata from the new references found on PubMed based on the search string(s) saved in "search_list.txt".

Usage

extract_pmd_list(search_list_path, directory)

Arguments

search_list_path

Path to "search_list.txt".

directory

Choose the directory in which the references identification history will be saved.

Value

A data.frame with one row per retrieved PubMed record and columns:

author: Character. Publication authors.
year: Character. Publication year.
title: Character. Publication title.
journal: Character. Publication journal name.
volume: Character. Publication journal volume.
issue: Character. Publication journal issue.
abstract: Character. Publication abstract.
doi: Character. Publication Digital Object Identifier (DOI).
source: Character. Data source.
platform_id: Character. Publication unique identifier in data source.

If search_list_path does not exist, returns NULL.

Extracts the metadata from the new references found on Scopus based on the search string(s) saved in "search_list.txt".

Description

Extracts the metadata from the new references found on Scopus based on the search string(s) saved in "search_list.txt".

Usage

extract_scp_list(search_list_path, directory)

Arguments

search_list_path

Path to "search_list.txt".

directory

Choose the directory in which the references identification history will be saved.

Value

A data.frame with one row per retrieved Scopus record and columns:

author: Character. Publication authors.
year: Character. Publication year.
title: Character. Publication title.
journal: Character. Publication journal name.
volume: Character. Publication journal volume.
issue: Character. Publication journal issue.
abstract: Character. Publication abstract.
doi: Character. Publication Digital Object Identifier (DOI).
source: Character. Data source.
platform_id: Character. Publication unique identifier in data source.

If search_list_path does not exist, returns NULL.

Extracts the metadata from the new references found on Web of Science based on the search string(s) saved in "search_list.txt".

Description

Extracts the metadata from the new references found on Web of Science based on the search string(s) saved in "search_list.txt".

Usage

extract_wos_list(search_list_path, directory)

Arguments

search_list_path

Path to "search_list.txt".

directory

Choose the directory in which the references identification history will be saved.

Value

A data.frame with one row per retrieved Web of Science record and columns:

author: Character. Publication authors.
year: Character. Publication year.
title: Character. Publication title.
journal: Character. Publication journal name.
volume: Character. Publication journal volume.
issue: Character. Publication journal issue.
abstract: Character. Publication abstract.
doi: Character. Publication Digital Object Identifier (DOI).
source: Character. Data source.
platform_id: Character. Publication unique identifier in data source.

If search_list_path does not exist, returns NULL.
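
The three extract_*_list() functions document the same ten character columns and a NULL return when search_list_path does not exist. A caller may want to check both before using a result. The sketch below is one way to do that in base R; has_expected_schema() and the example data frame are hypothetical and are not part of LitFetchR.

```r
# Column names taken from the Value sections of the extract_*_list() functions.
expected_cols <- c("author", "year", "title", "journal", "volume",
                   "issue", "abstract", "doi", "source", "platform_id")

# Hypothetical helper: TRUE only for a data frame that has all ten
# documented columns, each of type character.
has_expected_schema <- function(refs) {
  if (is.null(refs)) return(FALSE)  # NULL is documented for a missing search list
  is.data.frame(refs) &&
    all(expected_cols %in% names(refs)) &&
    all(vapply(refs[expected_cols], is.character, logical(1)))
}

# A one-row placeholder data frame with the documented columns.
example_refs <- as.data.frame(
  as.list(setNames(rep("x", length(expected_cols)), expected_cols)),
  stringsAsFactors = FALSE
)

has_expected_schema(example_refs)
has_expected_schema(NULL)
```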

Transforms a long file path into a shorter one.

Description

Transforms a long file path into a shorter one.

Usage

get_short_path(path)

Arguments

path

Path to the document.

Value

Character scalar, a shorter path to use in Windows OS.
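
On Windows, base R offers utils::shortPathName() to convert a path to its short (8.3) form. The sketch below shows one way to obtain a shorter path with it while leaving paths untouched on other systems. Whether get_short_path() works this way is an assumption; sketch_short_path() is a hypothetical name.

```r
# Hypothetical illustration: shorten a path on Windows, return it unchanged
# elsewhere. utils::shortPathName() exists only in Windows builds of R, so it
# is called only inside the Windows branch.
sketch_short_path <- function(path) {
  if (.Platform$OS.type == "windows") {
    utils::shortPathName(path)
  } else {
    path
  }
}

short_path <- sketch_short_path(tempdir())
```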

Manual literature retrieval.

Description

Retrieves references corresponding to the saved search string(s) on up to three platforms (e.g. Web of Science, Scopus and PubMed).

Usage

manual_fetch(
  wos = FALSE,
  scp = FALSE,
  pmd = FALSE,
  directory,
  dedup = FALSE,
  open_file = FALSE,
  dry_run = FALSE
)

Arguments

wos

Runs the search on Web of Science (TRUE or FALSE).

scp

Runs the search on Scopus (TRUE or FALSE).

pmd

Runs the search on PubMed (TRUE or FALSE).

directory

Directory in which the search string is saved (the project's directory). The reference metadata is saved there as well.

dedup

Deduplicates the retrieved references (TRUE or FALSE).

open_file

Automatically opens the CSV file after reference retrieval.

dry_run

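Simulation run option (TRUE or FALSE). If TRUE, no references are actually retrieved; the function only shows how it would react.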

Value

NULL (invisibly). Called for its side effects: creates a CSV file with the reference metadata, a history file of the references retrieved and, if deduplication is selected, a history file of the deduplication.

Examples

# This is a "dry run" example.
# No references will actually be retrieved, it only shows how the function should react.
manual_fetch(wos = TRUE,
             scp = TRUE,
             pmd = TRUE,
             directory = tempdir(),
             dedup = TRUE,
             open_file = FALSE,
             dry_run = TRUE
             )

Removes a scheduled task.

Description

Removes a scheduled task using the "task_id" from Task Scheduler (Windows) or Cron (Mac/Linux).

Usage

remove_scheduled_task(task_id, dry_run = FALSE)

Arguments

task_id

Name/ID of the scheduled task (Windows Task Scheduler or Cron).

dry_run

Simulation run option (TRUE or FALSE). If TRUE, no task is actually removed; the function only shows how it would react.

Value

NULL (invisibly). Called for its side effects: removes a scheduled task saved using the function 'auto_LitFetchR_setup'.

Examples

# This is a "dry run" example.
# No task will actually be removed, it only shows how the function should react.
remove_scheduled_task("fish_vibrio",
                      dry_run = TRUE
                      )

Saves Web of Science and/or Scopus API keys in .Renviron.

Description

You can set wos_api_key, scp_api_key, or both at the same time. Remember to restart the R session after saving your API keys.

Usage

save_api_keys(wos_api_key = NULL, scp_api_key = NULL, dry_run = FALSE)

Arguments

wos_api_key

The API key value for Web of Science (use quotation marks).

scp_api_key

The API key value for Scopus (use quotation marks).

dry_run

Simulation run option.

Value

Logical. TRUE if at least one value was written, FALSE if left unchanged.

Examples

save_api_keys(wos_api_key = "abcd01234",
               scp_api_key = "efgh5678",
               dry_run = TRUE
               )
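
The restart reminder exists because R reads .Renviron at startup, so values written to that file are not visible in a session that is already running. The sketch below demonstrates the mechanism with a temporary file and base R's readRenviron(). The variable name DEMO_WOS_API_KEY is hypothetical; the names that save_api_keys() actually writes are not stated in this documentation.

```r
# Write a KEY=value line to a temporary file that mimics .Renviron.
# DEMO_WOS_API_KEY is a made-up name used for illustration only.
env_file <- tempfile(fileext = ".Renviron")
writeLines("DEMO_WOS_API_KEY=abcd01234", env_file)

# Writing the file does not change the running session...
before <- Sys.getenv("DEMO_WOS_API_KEY", unset = NA)

# ...until the file is read, which R does for .Renviron at startup.
readRenviron(env_file)
after <- Sys.getenv("DEMO_WOS_API_KEY")

# Clean up the demonstration variable and file.
Sys.unsetenv("DEMO_WOS_API_KEY")
unlink(env_file)
```

Calling readRenviron("~/.Renviron") in a running session is an alternative to restarting, assuming the keys were saved to the user-level file.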
