Functions for extracting text and tables from PDF-based order documents. The package provides an n-gram-based approach for identifying the language of an order document and uses the R package 'pdftools' to extract the document's text. If the PDF contains only an image (because it is a scanned document), the R package 'tesseract' is used for OCR. The package also identifies and extracts order position tables from order documents using a clustering approach.
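The text-extraction step with an OCR fallback described above can be sketched roughly as follows. This is a minimal illustration built directly on 'pdftools' (which calls 'tesseract' for OCR), not the package's own API: the helper names `needs_ocr` and `extract_order_text` and the `min_chars` threshold are hypothetical and chosen only for this example.

```r
# Hypothetical helper (not part of orderanalyzer): decide whether the
# embedded text layer is too sparse to be useful, which suggests a scan.
needs_ocr <- function(pages, min_chars = 20) {
  sum(nchar(trimws(pages))) < min_chars
}

# Hypothetical helper (not part of orderanalyzer): read the text layer
# first and fall back to OCR only when almost no text was found.
extract_order_text <- function(path, min_chars = 20, language = "eng") {
  # pdf_text() returns one character string per page
  pages <- pdftools::pdf_text(path)
  if (needs_ocr(pages, min_chars)) {
    # pdf_ocr_text() renders each page and runs Tesseract on the image;
    # it requires the 'tesseract' package and matching language data
    pages <- pdftools::pdf_ocr_text(path, language = language)
  }
  pages
}
```

Assuming 'pdftools' and 'tesseract' are installed, `extract_order_text("order.pdf")` would return one string per page, where `"order.pdf"` is a placeholder path. The character threshold is an arbitrary assumption and may need tuning for documents that mix a thin text layer with scanned content.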
Version: | 1.0.0 |
Depends: | R (≥ 4.3.0), tidyselect |
Imports: | data.table, dplyr, matrixcalc, quanteda, rlist, stringr, tibble, tidyr, utils, purrr, digest, lubridate |
Suggests: | pdftools, tesseract, xml2 |
Published: | 2024-12-12 |
DOI: | 10.32614/CRAN.package.orderanalyzer |
Author: | Michael Scholz [cre, aut], Joerg Bauer [aut] |
Maintainer: | Michael Scholz <michael.scholz at th-deg.de> |
License: | GPL-3 |
NeedsCompilation: | no |
SystemRequirements: | Tesseract >= 5.0.0, libtesseract-dev (deb), tesseract-devel (rpm), libleptonica-dev (deb), leptonica-devel (rpm), tesseract-ocr-eng (deb), libpoppler-cpp-dev (deb), poppler-cpp-devel (rpm), poppler-data (rpm/deb), libxml2-dev (deb), libxml2-devel (rpm) |
CRAN checks: | orderanalyzer results |
Reference manual: | orderanalyzer.pdf |
Package source: | orderanalyzer_1.0.0.tar.gz |
Windows binaries: | r-devel: orderanalyzer_1.0.0.zip, r-release: not available, r-oldrel: orderanalyzer_1.0.0.zip |
macOS binaries: | r-release (arm64): orderanalyzer_1.0.0.tgz, r-oldrel (arm64): orderanalyzer_1.0.0.tgz, r-release (x86_64): orderanalyzer_1.0.0.tgz, r-oldrel (x86_64): orderanalyzer_1.0.0.tgz |
Please use the canonical form https://CRAN.R-project.org/package=orderanalyzer to link to this page.