The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

AhoCorasickTrie: Fast Searching for Multiple Keywords in Multiple Texts

Aho-Corasick is an optimal algorithm for finding many keywords in a text. It can locate all matches in a text in O(N+M) time; i.e., the time needed scales linearly with the number of keywords (N) and the size of the text (M). Compare this to the naive approach which takes O(N*M) time to loop through each pattern and scan for it in the text. This implementation builds the trie (the generic name of the data structure) and runs the search in a single function call. If you want to search multiple texts with the same trie, the function will take a list or vector of texts and return a list of matches to each text. By default, all 128 ASCII characters are allowed in both the keywords and the text. A more efficient trie is possible if the alphabet size can be reduced. For example, DNA sequences use at most 19 distinct characters and usually only 4; protein sequences use at most 26 distinct characters and usually only 20. UTF-8 (Unicode) matching is not currently supported.

Version: 0.1.2
Imports: Rcpp (≥ 0.12.5)
LinkingTo: Rcpp
Suggests: Biostrings, microbenchmark, testthat
Published: 2020-09-29
Author: Matt Chambers [aut, cre], Tomas Petricek [aut, cph], Vanderbilt University [cph]
Maintainer: Matt Chambers <matt.chambers42 at gmail.com>
BugReports: https://github.com/chambm/AhoCorasickTrie/issues
License: Apache License 2.0
URL: https://github.com/chambm/AhoCorasickTrie
NeedsCompilation: yes
SystemRequirements: C++11
CRAN checks: AhoCorasickTrie results

Documentation:

Reference manual: AhoCorasickTrie.pdf

Downloads:

Package source: AhoCorasickTrie_0.1.2.tar.gz
Windows binaries: r-devel: AhoCorasickTrie_0.1.2.zip, r-release: AhoCorasickTrie_0.1.2.zip, r-oldrel: AhoCorasickTrie_0.1.2.zip
macOS binaries: r-release (arm64): AhoCorasickTrie_0.1.2.tgz, r-oldrel (arm64): AhoCorasickTrie_0.1.2.tgz, r-release (x86_64): AhoCorasickTrie_0.1.2.tgz, r-oldrel (x86_64): AhoCorasickTrie_0.1.2.tgz
Old sources: AhoCorasickTrie archive

Reverse dependencies:

Reverse imports: customProDB, prozor

Linking:

Please use the canonical form https://CRAN.R-project.org/package=AhoCorasickTrie to link to this page.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.