textreuse: Detect Text Reuse and Document Similarity

Tools for measuring similarity among documents and detecting passages that have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality-sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
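
To make the idea of shingled n-grams and set-based similarity concrete, here is a minimal Python sketch. It is not the package's R API: the helper names `word_shingles` and `jaccard` are hypothetical, and the tokenization rule (lowercased word characters) is an assumption made for illustration only.

```python
import re


def word_shingles(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text.

    Hypothetical helper: words are lowercased runs of word characters.
    """
    words = re.findall(r"\w+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

# The two sentences differ by one word, which changes three of the
# seven trigrams in each document.
score = jaccard(word_shingles(doc_a), word_shingles(doc_b))
print(score)
```

Comparing every pair of documents this way grows quadratically with corpus size, which is presumably why the package also offers minhash and locality-sensitive hashing for finding candidate pairs.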

Version: 0.1.5
Depends: R (≥ 3.1.1)
Imports: assertthat (≥ 0.1), digest (≥ 0.6.8), dplyr (≥ 0.8.0), NLP (≥ 0.1.8), Rcpp (≥ 0.12.0), RcppProgress (≥ 0.1), stringr (≥ 1.0.0), tibble (≥ 3.0.1), tidyr (≥ 0.3.1)
LinkingTo: BH, Rcpp, RcppProgress
Suggests: testthat (≥ 0.11.0), knitr (≥ 1.11), rmarkdown (≥ 0.8), covr
Published: 2020-05-15
Author: Lincoln Mullen [aut, cre]
Maintainer: Lincoln Mullen <lincoln at lincolnmullen.com>
BugReports: https://github.com/ropensci/textreuse/issues
License: MIT + file LICENSE
URL: https://docs.ropensci.org/textreuse, https://github.com/ropensci/textreuse
NeedsCompilation: yes
Materials: README, NEWS
In views: NaturalLanguageProcessing
CRAN checks: textreuse results

Documentation:

Reference manual: textreuse.pdf
Vignettes: Text alignment, Introduction to the textreuse packages, Minhash and locality-sensitive hashing, Pairwise comparisons for document similarity

Downloads:

Package source: textreuse_0.1.5.tar.gz
Windows binaries: r-devel: textreuse_0.1.5.zip, r-release: textreuse_0.1.5.zip, r-oldrel: textreuse_0.1.5.zip
macOS binaries: r-release (arm64): textreuse_0.1.5.tgz, r-oldrel (arm64): textreuse_0.1.5.tgz, r-release (x86_64): textreuse_0.1.5.tgz, r-oldrel (x86_64): textreuse_0.1.5.tgz
Old sources: textreuse archive

Reverse dependencies:

Reverse suggests: textrank

Linking:

Please use the canonical form https://CRAN.R-project.org/package=textreuse to link to this page.
