This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Beyond text parsing, the package also lets you train annotation models on 'treebanks' data in 'CoNLL-U' format, as described at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functionality for commonly used data manipulations on texts enriched with the parser's output: collocations, token co-occurrence, document-term matrix handling, term frequency-inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.
Version: | 0.8.11 |
Depends: | R (≥ 2.10) |
Imports: | Rcpp (≥ 0.11.5), data.table (≥ 1.9.6), Matrix, methods, stats |
LinkingTo: | Rcpp |
Suggests: | knitr, rmarkdown, topicmodels, lattice, parallel |
Published: | 2023-01-06 |
DOI: | 10.32614/CRAN.package.udpipe |
Author: | Jan Wijffels [aut, cre, cph], BNOSAC [cph], Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic [cph], Milan Straka [ctb, cph], Jana Straková [ctb, cph] |
Maintainer: | Jan Wijffels <jwijffels at bnosac.be> |
License: | MPL-2.0 |
URL: | https://bnosac.github.io/udpipe/en/index.html, https://github.com/bnosac/udpipe |
NeedsCompilation: | yes |
Materials: | README, NEWS |
In views: | NaturalLanguageProcessing |
CRAN checks: | udpipe results |
Package source: | udpipe_0.8.11.tar.gz |
Windows binaries: | r-devel: udpipe_0.8.11.zip, r-release: udpipe_0.8.11.zip, r-oldrel: udpipe_0.8.11.zip |
macOS binaries: | r-release (arm64): udpipe_0.8.11.tgz, r-oldrel (arm64): udpipe_0.8.11.tgz, r-release (x86_64): udpipe_0.8.11.tgz, r-oldrel (x86_64): udpipe_0.8.11.tgz |
Old sources: | udpipe archive |
Reverse imports: | cleanNLP, corpustools, finnsurveytext, MadanText, MadanTextNetwork, TextForecast |
Reverse suggests: | BTM, crfsuite, doc2vec, nametagger, pseudobibeR, ruimtehol, text2vec, textplot, textrank, textrecipes, topicmodels.etm, word2vec |
Reverse enhances: | NLP |
Please use the canonical form https://CRAN.R-project.org/package=udpipe to link to this page.