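A research compendium is a self-contained collection of data, code, and documentation that accompanies a research project. By structuring a project as an R package, you gain the standard package tooling: declared metadata and dependencies (DESCRIPTION), function documentation (roxygen2), long-form documentation (vignettes), and automated tests (testthat).
SCIproj automates the creation of such a compendium, adding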
opinionated defaults for reproducible workflows (targets),
dependency snapshots (renv), and FAIR-compliant metadata
(CITATION.cff).
Install SCIproj from GitHub:
Create a new project with a single call:
This creates a fully scaffolded research compendium with
renv and targets enabled by default.
    create_proj("~/projects/baltic_cod",
      add_license = "MIT",
      license_holder = "Jane Doe",
      orcid = "0000-0001-2345-6789",
      use_docker = TRUE,
      use_git = TRUE
    )

Directory names with underscores or hyphens are fine: the R package
name in DESCRIPTION is automatically sanitized (e.g.,
baltic_cod becomes baltic.cod).
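The renaming rule can be illustrated with a few lines of base R. This is only a sketch of one plausible rule (replace every character that is not an ASCII letter, digit, or dot with a dot). The helper name sanitize_pkg_name() is hypothetical, and SCIproj's actual implementation may differ.

```r
# Hypothetical helper, not part of SCIproj: derive a valid R package name
# from a directory name. R package names may contain only ASCII letters,
# digits, and dots, must start with a letter, and must not end with a dot.
sanitize_pkg_name <- function(path) {
  name <- basename(path)
  # Replace every run of disallowed characters with a single dot.
  name <- gsub("[^A-Za-z0-9.]+", ".", name)
  # Drop leading characters that are not letters, and any trailing dots.
  name <- sub("^[^A-Za-z]+", "", name)
  name <- sub("\\.+$", "", name)
  name
}

sanitize_pkg_name("~/projects/baltic_cod")   # "baltic.cod"
sanitize_pkg_name("north-sea-herring")       # "north.sea.herring"
```

The directory on disk keeps its original name; only the Package field in DESCRIPTION needs the sanitized form.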
After creation, the project directory looks like this:
your-project/
├── DESCRIPTION # Project metadata, dependencies, and author info (with ORCID).
├── README.Rmd # Top-level project description.
├── your-project.Rproj # RStudio project file.
├── CITATION.cff # Machine-readable citation metadata for FAIR compliance.
├── CONTRIBUTING.md # Contribution guidelines.
├── LICENSE.md # Full license text (here: MIT).
├── NAMESPACE # Auto-generated by roxygen2 (do not edit by hand).
│
├── data-raw/ # Raw data files and pre-processing scripts.
│ ├── clean_data.R # Script template for data cleaning.
│ ├── DATA_SOURCES.md # Data provenance: source, license, DOI, download date.
│ └── ...
│
├── data/ # Cleaned datasets stored as .rda files.
│
├── R/ # Custom R functions and dataset documentation.
│ ├── function_ex.R # Template for custom functions.
│ ├── data.R # Template for dataset documentation.
│ └── ...
│
├── analyses/ # R scripts or R Markdown/Quarto documents for analyses.
│ ├── figures/ # Generated plots.
│ └── ...
│
├── docs/ # Publication-ready documents (article, report, presentation).
├── trash/ # Temporary files that can be safely deleted.
│
├── _targets.R # Pipeline definition for reproducible workflow.
├── renv/ # renv library and settings.
├── renv.lock # Lockfile for reproducible package versions.
└── Dockerfile # Container definition for full reproducibility.
| Directory / File | Purpose |
|---|---|
| R/ | Reusable R functions (documented with roxygen2) |
| data/ | Cleaned, analysis-ready datasets (.rda format) |
| data-raw/ | Raw data files and the script that cleans them |
| analyses/ | Analysis scripts, R Markdown reports, figures |
| docs/ | Manuscripts, presentations, supplementary material |
| trash/ | Temporary files not under version control |
| _targets.R | Pipeline definition for targets |
| CITATION.cff | Machine-readable citation metadata |
| CONTRIBUTING.md | Guidelines for collaborators |
SCIproj encourages FAIR (Findable, Accessible, Interoperable, Reusable) research practices through several built-in features:
A Citation File Format file is created automatically. It includes the project title, author name, version, release date, and optionally a license and ORCID iD. Services like GitHub and Zenodo can parse this file to generate proper citations.
When data_raw = TRUE (the default), a
DATA_SOURCES.md template is placed in
data-raw/. Use it to document the provenance of every
dataset: source, URL, DOI, license, download date, and file names.
Pass your ORCID iD via the
orcid parameter to embed it in CITATION.cff,
making your authorship unambiguously machine-readable.
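For orientation, the snippet below writes a minimal CITATION.cff by hand using base R. The field names follow the Citation File Format 1.2.0 schema. The title, version, and release date are placeholder values, the ORCID is the example iD used earlier, and the file SCIproj actually generates may contain different or additional fields.

```r
# Sketch only: write a minimal CITATION.cff by hand (placeholder values).
# SCIproj generates this file for you; its exact contents may differ.
cff <- c(
  "cff-version: 1.2.0",
  'message: "If you use this compendium, please cite it as below."',
  'title: "baltic.cod"',
  'version: "0.0.1"',
  "date-released: 2024-01-01",
  "license: MIT",
  "authors:",
  '  - family-names: "Doe"',
  '    given-names: "Jane"',
  '    orcid: "https://orcid.org/0000-0001-2345-6789"'
)

cff_path <- file.path(tempdir(), "CITATION.cff")
writeLines(cff, cff_path)

# Quick check that the ORCID is stored as a full URL, as the schema expects.
grepl("https://orcid.org/", readLines(cff_path)[10], fixed = TRUE)
```

Note that the schema stores the ORCID as a full URL, whereas the orcid argument shown earlier takes the bare iD.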
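By default (use_targets = TRUE), SCIproj adds a
_targets.R pipeline template. The targets package
provides dependency tracking, so only outdated targets are rebuilt; cached results in the
_targets/ data store; and visualization: tar_visnetwork() shows
the pipeline as a graph.
A typical workflow: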
    # 1. Define targets in _targets.R
    # 2. Inspect the pipeline
    targets::tar_manifest()
    targets::tar_visnetwork()
    # 3. Run the pipeline
    targets::tar_make()
    # 4. Read a result
    targets::tar_read(my_result)

Edit _targets.R to define your data-loading, analysis,
and reporting steps. Each step is a target that depends on upstream
targets and R functions in R/.
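A _targets.R file along these lines might look like the sketch below. The raw file data-raw/catch.csv, the columns year and weight_kg, and the helpers clean_catch() and summarise_catch() are hypothetical; the template that SCIproj writes may be organised differently. The final target is named my_result to match the tar_read() call above.

```r
# _targets.R (sketch; file names, columns, and helpers are hypothetical)
library(targets)

# In a compendium these helpers would normally live in R/ and be loaded
# with tar_source(); they are defined inline to keep the sketch short.
clean_catch <- function(raw) {
  raw[!is.na(raw$weight_kg), ]
}
summarise_catch <- function(clean) {
  aggregate(weight_kg ~ year, data = clean, FUN = mean)
}

# The script must end with a list of target objects.
list(
  tar_target(raw_file, "data-raw/catch.csv", format = "file"),
  tar_target(raw_data, read.csv(raw_file)),
  tar_target(clean_data, clean_catch(raw_data)),
  tar_target(my_result, summarise_catch(clean_data))
)
```

Declaring raw_file with format = "file" makes targets watch the file itself, so editing the CSV invalidates every downstream target, while changing only summarise_catch() reruns just my_result.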
By default (use_renv = TRUE), SCIproj initializes renv with the
"explicit" snapshot type. This means renv discovers
dependencies from DESCRIPTION rather than scanning all R
files, which is the recommended approach for package-based
compendia.
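To make the "explicit" mode concrete, the base-R sketch below lists the packages declared in a DESCRIPTION file, which is the kind of information renv relies on in this mode. The DESCRIPTION content and the helper declared_deps() are hypothetical illustrations and not part of renv or SCIproj; which fields renv actually consults is not covered here. In renv itself the mode is set with renv::settings$snapshot.type("explicit").

```r
# Sketch: list the packages a DESCRIPTION file declares (hypothetical content).
desc_path <- file.path(tempdir(), "DESCRIPTION")
writeLines(c(
  "Package: baltic.cod",
  "Version: 0.0.1",
  "Imports: targets, dplyr (>= 1.1.0)",
  "Suggests: testthat"
), desc_path)

# Hypothetical helper: extract package names from dependency fields.
declared_deps <- function(path, fields = c("Depends", "Imports", "Suggests")) {
  desc <- read.dcf(path, fields = fields)
  entries <- unlist(strsplit(desc[!is.na(desc)], ","))
  # Strip version constraints such as "(>= 1.1.0)" and surrounding whitespace.
  trimws(sub("\\(.*\\)", "", entries))
}

declared_deps(desc_path)
```

The practical consequence is that a package used in a script but never added to DESCRIPTION will not end up in the lockfile, so new dependencies should be declared there before snapshotting.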
Key commands:
    renv::status()   # check if lockfile is in sync
    renv::snapshot() # update the lockfile after adding packages
    renv::restore()  # reinstall packages from the lockfile

The renv.lock file should be committed to version
control so collaborators can reproduce your exact package versions.
Set use_docker = TRUE to add a Dockerfile
and .dockerignore. The Dockerfile provides a template for
building a container that reproduces your computational environment,
independent of the host system.
Set create_github_repo = TRUE to create a GitHub
repository (requires a configured GITHUB_PAT). Add
ci = "gh-actions" to include a GitHub Actions workflow for
automated R CMD check on push.
Choose from "MIT", "GPL",
"AGPL", "LGPL", "Apache",
"CCBY", or "CC0" via the
add_license parameter. The selected license is applied to
DESCRIPTION and recorded in CITATION.cff.
Set testthat = TRUE to add testing infrastructure
(tests/testthat.R and tests/testthat/).
Writing tests for your analysis functions helps catch regressions
early.
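A test file for an analysis function could look like the following sketch. The file name test-standardise.R and the function standardise() are hypothetical; in a compendium the function would be defined in R/ rather than in the test file.

```r
# tests/testthat/test-standardise.R (sketch; names are hypothetical)
library(testthat)

# Hypothetical analysis function; in a compendium it would live in R/.
standardise <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

test_that("standardise() centres and scales its input", {
  z <- standardise(c(2, 4, 6, 8))
  expect_equal(mean(z), 0)
  expect_equal(sd(z), 1)
})

test_that("standardise() keeps missing values in place", {
  z <- standardise(c(1, NA, 3))
  expect_true(is.na(z[2]))
  expect_equal(length(z), 3)
})
```

expect_equal() compares with a numerical tolerance, which is why the mean can be checked against exactly 0 despite floating-point rounding.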
Set makefile = TRUE to add a makefile.R
script as an alternative to targets for orchestrating your
workflow.
1. Create the project.
2. Open the .Rproj file in RStudio.
3. Add raw data to data-raw/ and document it in DATA_SOURCES.md.
4. Write cleaning code in data-raw/clean_data.R; save cleaned data to data/ with usethis::use_data().
5. Write analysis functions in R/ and document them with roxygen2.
6. Define the pipeline in _targets.R to connect data, functions, and reports.
7. Run targets::tar_make() to execute the pipeline.
8. Write reports in analyses/ using R Markdown or Quarto, reading results with targets::tar_read().
9. Snapshot dependencies with renv::snapshot() before sharing.
10. Push to GitHub and let CI run R CMD check automatically.
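The data-cleaning step of the workflow above might look like this in data-raw/clean_data.R. The dataset, its columns, and the object name cod_catch are hypothetical, and an inline data frame stands in for reading a real raw file. The call to usethis::use_data() is guarded so the sketch also runs outside a package directory.

```r
# data-raw/clean_data.R (sketch; dataset and column names are hypothetical)

# Stand-in for reading a raw file, e.g. read.csv("data-raw/catch.csv").
raw <- data.frame(
  year      = c(2019, 2020, 2020, 2021),
  weight_kg = c(1.2, NA, 0.9, 1.5)
)

# Cleaning step: drop records with missing weights.
cod_catch <- raw[!is.na(raw$weight_kg), ]

# Inside the compendium, store the cleaned object in data/ as an .rda file.
if (requireNamespace("usethis", quietly = TRUE) && file.exists("DESCRIPTION")) {
  usethis::use_data(cod_catch, overwrite = TRUE)
}
```

Keeping the cleaning logic in a script rather than editing files by hand means the path from raw to cleaned data can be rerun and reviewed.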