
orderanalyzer: Extracting Order Position Tables from PDF-Based Order Documents

Functions for extracting text and tables from PDF-based order documents. The package provides an n-gram-based approach for identifying the language of an order document and uses the R package 'pdftools' to extract the document's text. If the PDF contains only an image (because it is a scanned document), the R package 'tesseract' is used for OCR. The package also identifies and extracts order position tables from order documents using a clustering approach.
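
To illustrate what n-gram-based language identification generally looks like, here is a minimal Python sketch of one common variant: ranked character-trigram profiles compared by an "out-of-place" distance. It is not the package's implementation or API. The function names, the two tiny training samples, and the parameters `n` and `top_k` are all assumptions made up for this example, and a real profile would be built from far more text.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return the character n-grams of a lowercased, whitespace-normalised text."""
    cleaned = " ".join(text.lower().split())
    padded = f" {cleaned} "  # pad so word boundaries at both ends are captured
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_profile(text, n=3, top_k=300):
    """Map the top_k most frequent n-grams to their frequency rank (0 = most frequent)."""
    counts = Counter(char_ngrams(text, n))
    ranked = [gram for gram, _ in counts.most_common(top_k)]
    return {gram: rank for rank, gram in enumerate(ranked)}


def out_of_place_distance(doc_profile, lang_profile, penalty):
    """Sum of rank differences; n-grams missing from the language profile cost `penalty`."""
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else penalty
        for gram, rank in doc_profile.items()
    )


def identify_language(text, lang_profiles, n=3, top_k=300):
    """Return the key of the language profile closest to the text's n-gram profile."""
    doc_profile = build_profile(text, n, top_k)
    return min(
        lang_profiles,
        key=lambda lang: out_of_place_distance(doc_profile, lang_profiles[lang], top_k),
    )


# Hypothetical, deliberately tiny training samples (assumption for illustration only).
SAMPLES = {
    "en": (
        "Please deliver the following order positions to our warehouse. "
        "The order number and the unit price are listed in the table below. "
        "Payment is due within thirty days of the invoice date."
    ),
    "de": (
        "Bitte liefern Sie die folgenden Bestellpositionen an unser Lager. "
        "Die Bestellnummer und der Einzelpreis sind in der Tabelle unten aufgeführt. "
        "Die Zahlung ist innerhalb von dreißig Tagen nach Rechnungsdatum fällig."
    ),
}
LANG_PROFILES = {lang: build_profile(sample) for lang, sample in SAMPLES.items()}

if __name__ == "__main__":
    print(identify_language("Die Lieferung der Bestellung erfolgt an das Lager", LANG_PROFILES))
```

Character n-grams are a convenient choice for order documents because they need no tokenizer or dictionary and tolerate the occasional OCR error, since one misread character only disturbs a few neighbouring n-grams.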

Version: 1.0.0
Depends: R (≥ 4.3.0), tidyselect
Imports: data.table, dplyr, matrixcalc, quanteda, rlist, stringr, tibble, tidyr, utils, purrr, digest, lubridate
Suggests: pdftools, tesseract, xml2
Published: 2024-12-12
DOI: 10.32614/CRAN.package.orderanalyzer
Author: Michael Scholz [cre, aut], Joerg Bauer [aut]
Maintainer: Michael Scholz <michael.scholz at th-deg.de>
License: GPL-3
NeedsCompilation: no
SystemRequirements: Tesseract >= 5.0.0, libtesseract-dev (deb), tesseract-devel (rpm), libleptonica-dev (deb), leptonica-devel (rpm), tesseract-ocr-eng (deb), libpoppler-cpp-dev (deb), poppler-cpp-devel (rpm), poppler-data (rpm/deb), libxml2-dev (deb), libxml2-devel (rpm)
CRAN checks: orderanalyzer results

Documentation:

Reference manual: orderanalyzer.pdf

Downloads:

Package source: orderanalyzer_1.0.0.tar.gz
Windows binaries: r-devel: orderanalyzer_1.0.0.zip, r-release: not available, r-oldrel: orderanalyzer_1.0.0.zip
macOS binaries: r-release (arm64): orderanalyzer_1.0.0.tgz, r-oldrel (arm64): orderanalyzer_1.0.0.tgz, r-release (x86_64): orderanalyzer_1.0.0.tgz, r-oldrel (x86_64): orderanalyzer_1.0.0.tgz

Linking:

Please use the canonical form https://CRAN.R-project.org/package=orderanalyzer to link to this page.
