
Modernising Citation Metadata in R: Introducing bibrecord

library(dataset)

Descriptive metadata is often added as an afterthought or stored separately from the data it describes. This separation can lead to loss of context when datasets are shared, archived, or reused. To avoid this, the dataset package encourages metadata to be embedded at the time of dataset creation.

For a dataset_df, this means not only providing variable-level definitions, units, and namespaces, but also including a complete, standards-aligned citation record for the dataset itself. Encoding citation information early ensures that it travels with the data, supports the FAIR principles (Findable, Accessible, Interoperable, Reusable), and is ready for export to modern metadata formats.

In "Design Principles & Future Work: Semantically Enriched, Standards-Aligned Datasets in R", we identify three objectives for dataset-level citation metadata:

  1. Full compliance with standards such as Dublin Core Terms (DCTERMS) and DataCite
  2. Interoperability with the R ecosystem, including dataset_df and base R tools
  3. Preservation of meaning throughout the dataset’s lifecycle, from creation to publication and reuse

Purpose

The base R function utils::bibentry() offers a way to structure citation metadata and works well for simple references. However, it does not fully support DCTERMS or DataCite, which require:

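The bibrecord class builds on bibentry to bridge this gap while remaining fully compatible with base R. It adds support for contributor roles, richer metadata fields, and export to standards such as DCTERMS and DataCite.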

Ideally, bibrecord should evolve in close coordination with utils::bibentry() or be replaced by a modernised bibentry that supports these capabilities natively, achieving the three objectives described above.

What is bibrecord

A bibrecord is a standard bibentry object with additional fields stored as attributes. This means:
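
One way to picture this design, using only utils, is sketched below. It is an illustration of the general "bibentry plus attributes" idea, not the dataset package's actual implementation: the attribute name "contributor" and the class name "sketch_bibrecord" are assumptions made for the example.

```r
# Illustrative sketch only; NOT the dataset package's implementation.
# The attribute name "contributor" and the class "sketch_bibrecord"
# are hypothetical names chosen for this example.
library(utils)

# An ordinary bibentry built with base R tools.
base_entry <- bibentry(
  bibtype = "Misc",
  title   = "GDP of Small States",
  author  = person("Jane", "Doe", role = "cre"),
  year    = "2023"
)

# Attach an extra field as an attribute and prepend a subclass,
# leaving the underlying bibentry untouched.
sketch <- base_entry
attr(sketch, "contributor") <- list(person("Alice", "Smith", role = "dtm"))
class(sketch) <- c("sketch_bibrecord", class(base_entry))

inherits(sketch, "bibentry")            # the object is still a bibentry
attr(sketch, "contributor")[[1]]$role   # the contributor's role code
```

Because the extra field lives in an attribute rather than in the bibentry fields themselves, functions that only know about bibentry can ignore it, while aware code can read it back with attr().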

Creating a bibrecord

person_jane <- person("Jane", "Doe", role = "cre")
person_alice <- person("Alice", "Smith", role = "dtm")

rec <- bibrecord(
  title = "GDP of Small States",
  author = list(person_jane),
  contributor = list(person_alice),
  publisher = "Tinystat",
  identifier = "doi:10.1234/example",
  date = "2023-05-01",
  subject = "Economic indicators"
)

Printing a bibrecord

print(rec)
#> Doe J (2023). "GDP of Small States."
#> 
#> Contributors:
#> {Alice Smith [dtm]}

When printed, a bibrecord shows the standard citation along with clearly labelled contributor and metadata fields.

Compatibility with existing infrastructure

Because bibrecord inherits from bibentry:
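
As a rough illustration of what inheriting from bibentry makes available, the sketch below applies two standard utils tools, format() and toBibtex(), to a plain bibentry. The plain bibentry is a stand-in assumed for this example; whether a given bibrecord produces identical output depends on the package's own methods.

```r
# A plain bibentry is used as a stand-in for a bibrecord here;
# the field values mirror the bibrecord() call shown earlier.
library(utils)

entry <- bibentry(
  bibtype   = "Misc",
  title     = "GDP of Small States",
  author    = person("Jane", "Doe"),
  year      = "2023",
  publisher = "Tinystat"
)

# Base tools that operate on any object inheriting from bibentry.
text_citation <- format(entry, style = "text")  # plain-text citation
bibtex_lines  <- toBibtex(entry)                # BibTeX lines, class "Bibtex"

cat(text_citation, sep = "\n")
print(bibtex_lines)
```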

Future extensions

Planned enhancements to bibrecord include:

In the broader context described in "Design Principles & Future Work: Semantically Enriched, Standards-Aligned Datasets in R", the long-term goal is to ensure that dataset-level citation metadata in R meets three objectives:

  1. Full compliance with modern metadata standards such as Dublin Core Terms (DCTERMS) and DataCite
  2. Seamless interoperability with the R ecosystem, including dataset_df and base R tools
  3. Preservation of meaning across the entire data lifecycle, from dataset creation to long-term publication and reuse

To achieve this, bibrecord should either evolve in close coordination with utils::bibentry() or, ideally, be replaced entirely by a modernised version of bibentry that supports these capabilities natively.

Summary

The bibrecord class extends base R’s bibentry to provide structured, standards-aligned citation metadata that can be embedded directly into a dataset_df. It keeps full compatibility with existing R workflows while adding support for contributor roles, richer metadata fields, and export to standards like DCTERMS and DataCite.

Embedding a bibrecord in a dataset_df ensures that citation information is:

By adopting bibrecord, you can create datasets that are ready for FAIR-compliant publishing, are easier to share, and maintain their full descriptive context throughout their lifecycle.
