Title: Cora Data for Entity Resolution
Version: 0.1.0
Description: Duplicated publication data (pre-processed and formatted) for entity resolution. The data set contains 1879 records with the following variables: id, title, book_title, authors, address, date, year, editor, journal, volume, pages, publisher, institution, type, tech, note. A companion gold data set indicates which records match, based on id.
URL: https://github.com/resteorts/cora
BugReports: https://github.com/resteorts/cora/issues
Depends: R (≥ 3.4.0)
License: CC0
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.1.1.9000
NeedsCompilation: no
Packaged: 2020-10-05 10:57:07 UTC; rebeccasteorts
Author: Rebecca Steorts [aut, cre], Andee Kaplan [aut], Srini Sunil [aut]
Maintainer: Rebecca Steorts <beka@stat.duke.edu>
Repository: CRAN
Date/Publication: 2020-10-13 12:50:06 UTC

CORA data set

Description

A record linkage data set containing information about CORA research papers.

Usage

cora

Format

A data frame with 16 variables: id, title, book_title, authors, address, date, year, editor, journal, volume, pages, publisher, institution, type, tech, note.

This data set includes 1879 records of CORA research papers. It is suitable for various types of record linkage, and the results can be assessed with standard record linkage evaluation methods.

Examples

head(cora)
dim(cora)
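
As an illustration of a first record linkage step on data shaped like cora, the sketch below groups records by a normalised title and flags groups with more than one record as candidate duplicates. It is a minimal example on made-up toy data, not a method provided by this package; normalise() and toy_records are hypothetical names introduced here.

```r
# Hypothetical helper: lower-case, strip punctuation, collapse whitespace.
normalise <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9 ]", " ", x)
  x <- gsub(" +", " ", x)
  trimws(x)
}

# Toy records standing in for the id and title columns of cora (made up).
toy_records <- data.frame(
  id = 1:4,
  title = c("Learning to Link Records",
            "learning to link records.",
            "A Different Paper",
            "Learning  to link Records"),
  stringsAsFactors = FALSE
)

# Records that share a normalised title are candidate duplicates.
toy_records$block <- normalise(toy_records$title)
block_sizes <- table(toy_records$block)
shared <- names(block_sizes[block_sizes > 1])
candidates <- toy_records[toy_records$block %in% shared, ]
print(candidates)
```

With the package loaded, the same steps could be tried on cora$title; real titles are noisier, so exact matching on a normalised key will usually miss some true duplicates.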


Cora Gold

Description

This data set includes the matched record pairs based on ID.

Usage

cora_gold

Format

A data frame with 2 variables: id1, id2

This data set includes the matched record pairs, based on ID, from the CORA data set. It can be used to evaluate the performance of record linkage methods applied to the CORA data set.

Examples

head(cora_gold)
dim(cora_gold)
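
The evaluation described above can be sketched as a precision and recall calculation over record pairs. This is a minimal example, assuming predicted matches are stored in a data frame with the same id1 and id2 columns as cora_gold; evaluate_pairs(), pair_keys(), toy_gold and toy_pred are hypothetical names, and the toy data is made up.

```r
# Build an order-independent key so that (a, b) and (b, a) compare equal.
pair_keys <- function(id1, id2) {
  paste(pmin(id1, id2), pmax(id1, id2), sep = "-")
}

# Precision and recall of predicted pairs against gold pairs.
evaluate_pairs <- function(predicted, gold) {
  pred_keys <- unique(pair_keys(predicted$id1, predicted$id2))
  gold_keys <- unique(pair_keys(gold$id1, gold$id2))
  true_pos <- length(intersect(pred_keys, gold_keys))
  c(precision = true_pos / length(pred_keys),
    recall = true_pos / length(gold_keys))
}

# Toy data standing in for cora_gold and for a method's output (made up).
toy_gold <- data.frame(id1 = c(1, 1, 2), id2 = c(2, 3, 3))
toy_pred <- data.frame(id1 = c(2, 4), id2 = c(1, 5))

result <- evaluate_pairs(toy_pred, toy_gold)
print(result)
```

With the package loaded, evaluate_pairs(predicted, cora_gold) would apply the same calculation to the gold pairs, where predicted is whatever pair list a linkage method produces.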


Cora Gold Update

Description

This data set includes the matched record pairs based on ID.

Usage

cora_gold_update

Format

A data frame with 2 variables: cora_id, unique_id

This data set includes the matched record pairs, based on ID, from the CORA data set. It can be used to evaluate the performance of record linkage methods applied to the CORA data set.

Examples

head(cora_gold_update)
dim(cora_gold_update)
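
The column names cora_id and unique_id suggest one row per record with an entity label, although the documentation above does not state this. Under that assumption, the sketch below expands such a membership table into a list of matching pairs in the id1, id2 layout. membership_to_pairs() and the toy vectors are hypothetical and not part of the package.

```r
# Expand a record-to-entity membership into all within-entity record pairs.
# Assumes each record_id appears once and entity_id labels its entity.
membership_to_pairs <- function(record_id, entity_id) {
  groups <- unname(split(record_id, entity_id))
  groups <- groups[lengths(groups) > 1]  # singletons yield no pairs
  if (length(groups) == 0) {
    return(data.frame(id1 = record_id[0], id2 = record_id[0]))
  }
  # combn() is safe here because every group has at least two ids.
  pairs <- lapply(groups, function(ids) t(combn(sort(ids), 2)))
  pairs <- do.call(rbind, pairs)
  data.frame(id1 = pairs[, 1], id2 = pairs[, 2])
}

# Toy membership standing in for cora_id and unique_id (made up).
toy_record_id <- 1:5
toy_entity_id <- c("a", "a", "b", "a", "c")

toy_pairs <- membership_to_pairs(toy_record_id, toy_entity_id)
print(toy_pairs)
```

If the assumption holds, membership_to_pairs(cora_gold_update$cora_id, cora_gold_update$unique_id) would give a pair list that can be compared with the output of a linkage method.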
