An R package for structured, reproducible data analysis projects.
Status: Active development. APIs may change before version 1.0.
# Install from GitHub
remotes::install_github("table1/framework")
# One-time global setup (author info, preferences)
framework::setup()
# Create projects using your saved defaults
framework::new()
framework::new("my-analysis", "~/projects/my-analysis")
framework::new_presentation("quarterly-review", "~/talks/q4")
framework::new_course("stats-101", "~/teaching/stats")

Example project structure:
project/
├── notebooks/ # Exploratory analysis
├── scripts/ # Production pipelines
├── inputs/
│ ├── raw/ # Raw data (gitignored)
│ ├── intermediate/ # Cleaned datasets (gitignored)
│ ├── final/ # Curated analytic datasets (gitignored)
│ └── reference/ # External documentation (gitignored)
├── outputs/
│ ├── private/ # Tables, figures, models, cache (gitignored)
│ └── public/ # Share-ready artifacts
├── functions/ # Custom functions
├── docs/ # Documentation
├── settings.yml # Project configuration
├── framework.db # Metadata tracking database
└── .env # Secrets (gitignored)
Framework reduces boilerplate and enforces best practices: a single scaffold() call replaces repeated library() calls and per-notebook setup code.
library(framework)
scaffold() # Loads packages, functions, config, standardizes working directory
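For comparison, here is the kind of per-notebook boilerplate scaffold() is meant to replace. This is a rough sketch only; the package names, helper file, and settings.yml handling are illustrative, not framework internals.

```r
# Rough equivalent of what scaffold() automates (illustrative, not framework internals)
setwd(rprojroot::find_root(rprojroot::has_file("settings.yml")))  # standardize working directory
library(dplyr)                             # packages listed in settings.yml
library(ggplot2)
source("functions/helpers.R")              # hypothetical project function file
config <- yaml::read_yaml("settings.yml")  # project configuration
```

Notebooks, scripts, and presentations are created from templates: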
# Quarto notebook (default)
make_notebook("exploration") # → notebooks/exploration.qmd
make_qmd("analysis") # Always Quarto
make_rmd("report") # RMarkdown
# Presentations
make_revealjs("slides") # reveal.js presentation
# Scripts
make_script("process-data") # → scripts/process-data.R
# List available templates
stubs_list()

Custom stubs: create a stubs/ directory with your own templates.
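For example, a project-local template could be added like this. The file name and contents are invented, and the assumption that stubs_list() picks up the new file is ours, not documented above.

```r
# Hypothetical custom stub; the file name and contents are invented
dir.create("stubs", showWarnings = FALSE)
writeLines(
  c("---", "title: Analysis memo", "format: html", "---", "", "## Notes"),
  "stubs/memo.qmd"
)
stubs_list()  # assumption: the custom template now shows up alongside the built-ins
```

Data can then be loaded either via the config or by a direct path.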
Via config (recommended):
# settings.yml
data:
inputs:
raw:
survey:
path: inputs/raw/survey.csv
type: csv
locked: true # Errors if file changes
df <- data_load("inputs.raw.survey")

Direct path:
df <- data_load("inputs/raw/my_file.csv") # CSV
df <- data_load("inputs/raw/stata_file.dta") # Stata
df <- data_load("inputs/raw/spss_file.sav") # SPSSEvery read is logged with a SHA-256 hash for integrity tracking.
model <- get_or_cache("model_v1", {
expensive_model_fit(df)
}, expire_after = 1440) # 24 hours

Save data files:
data_save(processed_df, "intermediate.cleaned_data")
# → saves to inputs/intermediate/cleaned_data.rds
data_save(final_df, "final.analysis_ready", type = "csv")
# → saves to inputs/final/analysis_ready.csv

Save analysis outputs:
result_save("regression_model", model, type = "model")
result_save("report", file = "report.html", type = "notebook", blind = TRUE)# settings.yml
# settings.yml
connections:
db:
driver: postgresql
host: env("DB_HOST")
database: env("DB_NAME")
user: env("DB_USER")
password: env("DB_PASS")df <- query_get("SELECT * FROM users WHERE active = true", "db")view_detail() provides rich, browser-based data
view_detail() provides rich, browser-based data exploration:
view_detail(mtcars) # Interactive table with search/filter/export
view_detail(config) # Tabbed YAML + R structure for lists
view_detail(ggplot(mtcars, aes(mpg, hp)) + geom_point()) # Interactive plots

A simple settings.yml:
default:
packages:
- dplyr
- ggplot2
data:
example: data/example.csv

Advanced (split files):
default:
data: settings/data.yml
packages: settings/packages.yml
connections: settings/connections.yml
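Each split file presumably holds the block that would otherwise sit inline; for instance, settings/packages.yml might look like the sketch below (the exact layout is a guess, not documented above).

```yaml
# settings/packages.yml (illustrative contents only; the exact layout is a guess)
packages:
  - dplyr
  - ggplot2
```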
Secrets in .env:
DB_HOST=localhost
DB_PASS=secret
Reference in config:
connections:
db:
host: env("DB_HOST")
password: env("DB_PASS", "default")Framework creates instruction files for AI coding assistants:
framework::configure_ai_agents()

Supported: Claude Code (CLAUDE.md), GitHub Copilot, AGENTS.md.
| Function | Purpose |
|---|---|
| scaffold() | Initialize session (load packages, functions, config) |
| data_load() | Load data from path or config |
| data_save() | Save data with integrity tracking |
| view_detail() | Browser-based data viewer with search/export |
| query_get() | Execute SQL query, return data |
| query_execute() | Execute SQL command |
| get_or_cache() | Lazy evaluation with caching |
| result_save() | Save analysis output |
| result_get() | Retrieve saved result |
| scratch_capture() | Quick debug/temp file save |
| renv_enable() | Enable renv for reproducibility |
| packages_snapshot() | Save package versions to renv.lock |
| packages_restore() | Restore packages from renv.lock |
| security_audit() | Scan for data leaks and security issues |
security_audit() detects data leaks, and sensitive data can be stored encrypted:

# Save encrypted data
data_save(sensitive_df, "private.data", encrypted = TRUE)
# Load (auto-detects encryption)
data <- data_load("private.data")Password from ENCRYPTION_PASSWORD env var or interactive
prompt.
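For non-interactive runs, the password can be supplied through the environment. The ENCRYPTION_PASSWORD name comes from the docs above; the value and placement are illustrative.

```r
# Supply the encryption password via the environment (value is illustrative)
Sys.setenv(ENCRYPTION_PASSWORD = "a-strong-passphrase")
data_save(sensitive_df, "private.data", encrypted = TRUE)
data <- data_load("private.data")
```

Equivalently, an ENCRYPTION_PASSWORD entry can go in the project's .env file, which is gitignored. Run the audit to scan the project: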
audit <- security_audit() # Full audit
audit <- security_audit(auto_fix = TRUE) # Auto-fix .gitignore issues

Optional renv integration (off by default):
renv_enable() # Enable for this project
packages_snapshot() # Save current versions
packages_restore() # Restore from renv.lock
renv_disable() # Disable (keeps renv.lock)

Version pinning in settings.yml:
packages:
- dplyr # Latest from CRAN
- ggplot2@3.4.0 # Specific version
- tidyverse/dplyr@main # GitHub with branch