zoomerjoin: Superlatively Fast Fuzzy Joins

Empowers users to fuzzily merge data frames with millions or tens of millions of rows in minutes with low memory usage. The package uses the locality-sensitive hashing algorithms developed by Datar, Immorlica, Indyk and Mirrokni (2004) <doi:10.1145/997817.997857> and Broder (1998) <doi:10.1109/SEQUEN.1997.666900> to avoid comparing every pair of records across the two datasets, so that fuzzy merges finish in linear time.
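
To illustrate why locality-sensitive hashing avoids the all-pairs comparison, here is a minimal Python sketch of MinHash banding for Jaccard similarity, the idea behind the Broder reference. It is not zoomerjoin's implementation (the package is written in R and Rust), and every name and default in it (shingles, candidate_pairs, fuzzy_join, n_bands, band_width, k) is hypothetical and chosen only for illustration. It does not cover the Euclidean scheme of Datar et al.

```python
import hashlib
from collections import defaultdict


def shingles(s, k=2):
    """Set of lowercase character k-grams of s (a single short token if s is shorter than k)."""
    s = s.lower()
    return {s[i:i + k] for i in range(max(len(s) - k + 1, 1))}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token; each seed acts as a separate hash function."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, n_hashes):
    """MinHash signature: the minimum hash value of the token set under each seed."""
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(n_hashes))


def candidate_pairs(left, right, n_bands=20, band_width=2, k=2):
    """Index pairs (i, j) whose signatures agree on at least one whole band."""
    n_hashes = n_bands * band_width
    buckets = defaultdict(lambda: ([], []))  # band key -> (left ids, right ids)
    for side, records in enumerate((left, right)):
        for idx, rec in enumerate(records):
            sig = minhash_signature(shingles(rec, k), n_hashes)
            for b in range(n_bands):
                key = (b, sig[b * band_width:(b + 1) * band_width])
                buckets[key][side].append(idx)
    pairs = set()
    for left_ids, right_ids in buckets.values():
        for i in left_ids:
            for j in right_ids:
                pairs.add((i, j))
    return pairs


def jaccard(a, b):
    """Exact Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def fuzzy_join(left, right, threshold=0.6, n_bands=20, band_width=2, k=2):
    """Verify only the candidate pairs and keep those at or above the threshold."""
    matches = []
    for i, j in sorted(candidate_pairs(left, right, n_bands, band_width, k)):
        if jaccard(shingles(left[i], k), shingles(right[j], k)) >= threshold:
            matches.append((left[i], right[j]))
    return matches
```

Each record is hashed a fixed number of times and dropped into buckets, and exact similarity is computed only for records that share a bucket. As long as buckets stay small, the work grows roughly with the number of records rather than with the number of possible pairs. More bands make a near-match more likely to collide, while wider bands make collisions between dissimilar records less likely.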

Version: 0.2.0
Depends: R (≥ 2.10)
Imports: collapse, dplyr, tibble, tidyr
Suggests: babynames, covr, fuzzyjoin, igraph, knitr, microbenchmark, profmem, purrr, rmarkdown, stringdist, testthat (≥ 3.0.0), tidyverse, vdiffr
Published: 2024-09-24
DOI: 10.32614/CRAN.package.zoomerjoin
Author: Beniamino Green [aut, cre, cph], Etienne Bacher [ctb], The authors of the dependency Rust crates [ctb, cph] (see inst/AUTHORS file for details)
Maintainer: Beniamino Green <beniamino.green at yale.edu>
BugReports: https://github.com/beniaminogreen/zoomerjoin/issues
License: GPL (≥ 3)
URL: https://beniamino.org/zoomerjoin/, https://github.com/beniaminogreen/zoomerjoin
NeedsCompilation: yes
SystemRequirements: Cargo (>= 1.56) (Rust's package manager), rustc (>= 1.70)
Materials: README, NEWS
CRAN checks: zoomerjoin results

Documentation:

Reference manual: zoomerjoin.pdf
Vignettes: benchmarks (source, R code)
A Zoomerjoin Guided Tour (source, R code)
matching_vectors (source, R code)

Downloads:

Package source: zoomerjoin_0.2.0.tar.gz
Windows binaries: r-devel: zoomerjoin_0.2.0.zip, r-release: zoomerjoin_0.2.0.zip, r-oldrel: zoomerjoin_0.2.0.zip
macOS binaries: r-release (arm64): zoomerjoin_0.2.0.tgz, r-oldrel (arm64): zoomerjoin_0.2.0.tgz, r-release (x86_64): zoomerjoin_0.2.0.tgz, r-oldrel (x86_64): zoomerjoin_0.1.4.tgz
Old sources: zoomerjoin archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=zoomerjoin to link to this page.
