CRAN Package Check Results for Maintainer ‘Kurt Hornik <Kurt.Hornik at R-project.org>’

Last updated on 2025-12-14 05:51:16 CET.

Package             ERROR  NOTE  OK
bindata                 -     -  13
cclust                  -     -  13
chron                   -     -  13
clue                    -     -  13
date                    -     -  13
ISOcodes                -     -  13
mlbench                 -     -  13
movMF                   -     -  13
NLP                     -     -  13
NLPutils                -     -  13
OAIHarvester            -     -  13
openNLP                 -     3  10
openNLPdata             -     3  10
oz                      -     -  13
relations               -     3  10
RKEA                    -     -  13
RKEAjars                -     3  10
Rpoppler                -     3  10
Rsymphony               -     2  11
RWeka                   -     4   9
RWekajars               -     3  10
skmeans                 -     3  10
slam                    -     4   9
tau                     -     4   9
textcat                 -     -  13
tm                      3     2   8
tm.plugin.mail          -     -  13
tseries                 -     2  11
Unicode                 -     -  13
W3CMarkupValidator      -     -  13
wordnet                 -     3  10
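
Each row of the summary counts check flavors, so the status columns for a package should add up to the number of flavors it was checked on (13 throughout this report). A minimal sketch in base R that confirms this for a few rows copied from the summary; the data frame name `status` and its layout are illustrative assumptions, not something CRAN provides.

```r
# Rows copied from the summary table; empty cells are entered as 0.
status <- data.frame(
  package = c("openNLP", "Rsymphony", "RWeka", "tm"),
  ERROR   = c(0, 0, 0, 3),
  NOTE    = c(3, 2, 4, 2),
  OK      = c(10, 11, 9, 8)
)

# Total number of check flavors per package.
flavors <- rowSums(status[c("ERROR", "NOTE", "OK")])
names(flavors) <- status$package
print(flavors)

# Every package in this report was checked on the same number of flavors.
stopifnot(all(flavors == 13))
```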

Package bindata

Current CRAN status: OK: 13

Package cclust

Current CRAN status: OK: 13

Package chron

Current CRAN status: OK: 13

Package clue

Current CRAN status: OK: 13

Package date

Current CRAN status: OK: 13

Package ISOcodes

Current CRAN status: OK: 13

Package mlbench

Current CRAN status: OK: 13

Package movMF

Current CRAN status: OK: 13

Package NLP

Current CRAN status: OK: 13

Package NLPutils

Current CRAN status: OK: 13

Package OAIHarvester

Current CRAN status: OK: 13

Package openNLP

Current CRAN status: NOTE: 3, OK: 10

Additional issues

donttest

Version: 0.2-7
Check: package dependencies
Result: NOTE
Package suggested but not available for checking: ‘openNLPmodels.en’
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64

Package openNLPdata

Current CRAN status: NOTE: 3, OK: 10

Version: 1.5.3-5
Check: installed package size
Result: NOTE
installed size is 7.2Mb
sub-directories of 1Mb or more:
  java 1.2Mb
  models 6.0Mb
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64

Package oz

Current CRAN status: OK: 13

Package relations

Current CRAN status: NOTE: 3, OK: 10

Version: 0.6-15
Check: package dependencies
Result: NOTE
Package which this enhances but not available for checking: ‘Rcplex’
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64

Package RKEA

Current CRAN status: OK: 13

Package RKEAjars

Current CRAN status: NOTE: 3, OK: 10

Version: 5.0-4
Check: installed package size
Result: NOTE
installed size is 10.8Mb
sub-directories of 1Mb or more:
  java 10.8Mb
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64

Package Rpoppler

Current CRAN status: NOTE: 3, OK: 10

Version: 0.1-3
Check: installed package size
Result: NOTE
installed size is 49.3Mb
sub-directories of 1Mb or more:
  libs 49.2Mb
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64

Package Rsymphony

Current CRAN status: NOTE: 2, OK: 11

Version: 0.1-33
Check: Rd cross-references
Result: NOTE
Package unavailable to check Rd xrefs: ‘Rglpk’
Flavor: r-oldrel-macos-arm64

Version: 0.1-33
Check: installed package size
Result: NOTE
installed size is 5.8Mb
sub-directories of 1Mb or more:
  libs 5.8Mb
Flavor: r-oldrel-windows-x86_64

Package RWeka

Current CRAN status: NOTE: 4, OK: 9

Version: 0.4-46
Check: tests
Result: NOTE
Running ‘data_exchange.R’
Comparing ‘data_exchange.Rout’ to ‘data_exchange.Rout.save’ ...136c136
< 1 2012-12-12 12:12:12 2012-12-12 12:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
159c159
< 1 2012-12-12 12:12:12 2012-12-12 12:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
Flavors: r-devel-linux-x86_64-fedora-clang, r-devel-linux-x86_64-fedora-gcc

Version: 0.4-46
Check: tests
Result: NOTE
Running ‘data_exchange.R’ [1s/1s]
Comparing ‘data_exchange.Rout’ to ‘data_exchange.Rout.save’ ...136c136
< 1 2012-12-12 12:12:12 2012-12-13 01:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
159c159
< 1 2012-12-12 12:12:12 2012-12-13 01:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
Flavor: r-release-macos-arm64

Version: 0.4-46
Check: tests
Result: NOTE
Running ‘data_exchange.R’ [2s/1s]
Comparing ‘data_exchange.Rout’ to ‘data_exchange.Rout.save’ ...136c136
< 1 2012-12-12 12:12:12 2012-12-12 07:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
159c159
< 1 2012-12-12 12:12:12 2012-12-12 07:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
Flavor: r-release-macos-x86_64
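
In all three RWeka diffs the first timestamp matches and only the second one differs from the saved reference (13:12:12), which is what one would expect if that value is rendered in the check machine's local time zone. That reading is an assumption; the report does not state the cause. A minimal sketch in base R of how one instant prints differently under different zones follows. The zone names are illustrative choices, not the zones of the CRAN check machines, and the instant is not taken from RWeka's test file.

```r
# One fixed instant, defined in UTC (illustrative value).
instant <- as.POSIXct("2012-12-12 12:12:12", tz = "UTC")

# Hypothetical zones chosen only to demonstrate the effect.
zones <- c("UTC", "Europe/Vienna", "America/New_York", "Pacific/Auckland")

# Format the same instant as wall-clock time in each zone.
shown <- vapply(
  zones,
  function(z) format(instant, format = "%Y-%m-%d %H:%M:%S", tz = z),
  character(1)
)
print(shown)
```

With a standard time zone database the four strings show 12:12:12, 13:12:12, 07:12:12, and 01:12:12 on the following day. A test that formats with an explicit `tz` instead of the session default would print the same text on every flavor.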

Package RWekajars

Current CRAN status: NOTE: 3, OK: 10

Version: 3.9.3-2
Check: installed package size
Result: NOTE
installed size is 10.8Mb
sub-directories of 1Mb or more:
  java 10.7Mb
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64

Package skmeans

Current CRAN status: NOTE: 3, OK: 10

Version: 0.2-18
Check: package dependencies
Result: NOTE
Package which this enhances but not available for checking: ‘kmndirs’
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64

Package slam

Current CRAN status: NOTE: 4, OK: 9

Version: 0.1-55
Check: tests
Result: NOTE
Running 'abind.R' [0s]
Comparing 'abind.Rout' to 'abind.Rout.save' ... OK
Running 'apply.R' [0s]
Comparing 'apply.Rout' to 'apply.Rout.save' ... OK
Running 'crossprod.R' [0s]
Comparing 'crossprod.Rout' to 'crossprod.Rout.save' ... OK
Running 'dimgets.R' [0s]
Running 'extract.R' [0s]
Comparing 'extract.Rout' to 'extract.Rout.save' ... OK
Running 'matrix.R' [0s]
Comparing 'matrix.Rout' to 'matrix.Rout.save' ... OK
Running 'matrix_dimnames.R' [0s]
Comparing 'matrix_dimnames.Rout' to 'matrix_dimnames.Rout.save' ... OK
Running 'rollup.R' [0s]
Comparing 'rollup.Rout' to 'rollup.Rout.save' ... OK
Running 'split.R' [0s]
Comparing 'split.Rout' to 'split.Rout.save' ... OK
Running 'ssa_valid.R' [0s]
Comparing 'ssa_valid.Rout' to 'ssa_valid.Rout.save' ... OK
Running 'stm.R' [0s]
Comparing 'stm.Rout' to 'stm.Rout.save' ... OK
Running 'stm_apply.R' [0s]
Comparing 'stm_apply.Rout' to 'stm_apply.Rout.save' ... OK
Running 'stm_rollup.R' [0s]
Comparing 'stm_rollup.Rout' to 'stm_rollup.Rout.save' ...113a114,115
> _row_tsums: reduced 1 (3) zeros
> _row_tsums: 0.000s [0.000s/0.000s]
Running 'stm_subassign.R' [0s]
Comparing 'stm_subassign.Rout' to 'stm_subassign.Rout.save' ... OK
Running 'stm_ttcrossprod.R' [0s]
Comparing 'stm_ttcrossprod.Rout' to 'stm_ttcrossprod.Rout.save' ... OK
Running 'stm_valid.R' [0s]
Comparing 'stm_valid.Rout' to 'stm_valid.Rout.save' ... OK
Running 'stm_zeros.R' [0s]
Comparing 'stm_zeros.Rout' to 'stm_zeros.Rout.save' ... OK
Running 'subassign.R' [0s]
Comparing 'subassign.Rout' to 'subassign.Rout.save' ... OK
Running 'util.R' [0s]
Comparing 'util.Rout' to 'util.Rout.save' ... OK
Flavors: r-devel-windows-x86_64, r-release-windows-x86_64

Version: 0.1-55
Check: tests
Result: NOTE
Running ‘abind.R’ [0s/0s]
Comparing ‘abind.Rout’ to ‘abind.Rout.save’ ... OK
Running ‘apply.R’ [0s/0s]
Comparing ‘apply.Rout’ to ‘apply.Rout.save’ ... OK
Running ‘crossprod.R’ [0s/0s]
Comparing ‘crossprod.Rout’ to ‘crossprod.Rout.save’ ... OK
Running ‘dimgets.R’ [0s/0s]
Running ‘extract.R’ [0s/0s]
Comparing ‘extract.Rout’ to ‘extract.Rout.save’ ... OK
Running ‘matrix.R’ [0s/0s]
Comparing ‘matrix.Rout’ to ‘matrix.Rout.save’ ... OK
Running ‘matrix_dimnames.R’ [0s/0s]
Comparing ‘matrix_dimnames.Rout’ to ‘matrix_dimnames.Rout.save’ ... OK
Running ‘rollup.R’ [0s/0s]
Comparing ‘rollup.Rout’ to ‘rollup.Rout.save’ ... OK
Running ‘split.R’ [0s/0s]
Comparing ‘split.Rout’ to ‘split.Rout.save’ ... OK
Running ‘ssa_valid.R’ [0s/0s]
Comparing ‘ssa_valid.Rout’ to ‘ssa_valid.Rout.save’ ... OK
Running ‘stm.R’ [0s/0s]
Comparing ‘stm.Rout’ to ‘stm.Rout.save’ ... OK
Running ‘stm_apply.R’ [0s/0s]
Comparing ‘stm_apply.Rout’ to ‘stm_apply.Rout.save’ ... OK
Running ‘stm_rollup.R’ [0s/0s]
Comparing ‘stm_rollup.Rout’ to ‘stm_rollup.Rout.save’ ...113a114,115
> _row_tsums: reduced 1 (3) zeros
> _row_tsums: 0.000s [0.000s/0.000s]
Running ‘stm_subassign.R’ [0s/0s]
Comparing ‘stm_subassign.Rout’ to ‘stm_subassign.Rout.save’ ... OK
Running ‘stm_ttcrossprod.R’ [0s/0s]
Comparing ‘stm_ttcrossprod.Rout’ to ‘stm_ttcrossprod.Rout.save’ ... OK
Running ‘stm_valid.R’ [0s/0s]
Comparing ‘stm_valid.Rout’ to ‘stm_valid.Rout.save’ ... OK
Running ‘stm_zeros.R’ [0s/0s]
Comparing ‘stm_zeros.Rout’ to ‘stm_zeros.Rout.save’ ... OK
Running ‘subassign.R’ [0s/0s]
Comparing ‘subassign.Rout’ to ‘subassign.Rout.save’ ... OK
Running ‘util.R’ [0s/0s]
Comparing ‘util.Rout’ to ‘util.Rout.save’ ... OK
Flavor: r-release-macos-arm64

Version: 0.1-55
Check: tests
Result: NOTE
Running ‘abind.R’ [0s/0s]
Comparing ‘abind.Rout’ to ‘abind.Rout.save’ ... OK
Running ‘apply.R’ [0s/0s]
Comparing ‘apply.Rout’ to ‘apply.Rout.save’ ... OK
Running ‘crossprod.R’ [0s/0s]
Comparing ‘crossprod.Rout’ to ‘crossprod.Rout.save’ ... OK
Running ‘dimgets.R’ [0s/0s]
Running ‘extract.R’ [0s/0s]
Comparing ‘extract.Rout’ to ‘extract.Rout.save’ ... OK
Running ‘matrix.R’ [0s/1s]
Comparing ‘matrix.Rout’ to ‘matrix.Rout.save’ ... OK
Running ‘matrix_dimnames.R’ [0s/0s]
Comparing ‘matrix_dimnames.Rout’ to ‘matrix_dimnames.Rout.save’ ... OK
Running ‘rollup.R’ [0s/0s]
Comparing ‘rollup.Rout’ to ‘rollup.Rout.save’ ... OK
Running ‘split.R’ [0s/0s]
Comparing ‘split.Rout’ to ‘split.Rout.save’ ... OK
Running ‘ssa_valid.R’ [0s/0s]
Comparing ‘ssa_valid.Rout’ to ‘ssa_valid.Rout.save’ ... OK
Running ‘stm.R’ [0s/0s]
Comparing ‘stm.Rout’ to ‘stm.Rout.save’ ... OK
Running ‘stm_apply.R’ [0s/0s]
Comparing ‘stm_apply.Rout’ to ‘stm_apply.Rout.save’ ... OK
Running ‘stm_rollup.R’ [0s/0s]
Comparing ‘stm_rollup.Rout’ to ‘stm_rollup.Rout.save’ ...113a114,115
> _row_tsums: reduced 1 (3) zeros
> _row_tsums: 0.000s [0.000s/0.000s]
Running ‘stm_subassign.R’ [0s/1s]
Comparing ‘stm_subassign.Rout’ to ‘stm_subassign.Rout.save’ ... OK
Running ‘stm_ttcrossprod.R’ [0s/1s]
Comparing ‘stm_ttcrossprod.Rout’ to ‘stm_ttcrossprod.Rout.save’ ... OK
Running ‘stm_valid.R’ [0s/1s]
Comparing ‘stm_valid.Rout’ to ‘stm_valid.Rout.save’ ... OK
Running ‘stm_zeros.R’ [0s/1s]
Comparing ‘stm_zeros.Rout’ to ‘stm_zeros.Rout.save’ ... OK
Running ‘subassign.R’ [0s/1s]
Comparing ‘subassign.Rout’ to ‘subassign.Rout.save’ ... OK
Running ‘util.R’ [0s/1s]
Comparing ‘util.Rout’ to ‘util.Rout.save’ ... OK
Flavor: r-release-macos-x86_64

Package tau

Current CRAN status: NOTE: 4, OK: 9

Additional issues

rchk

Version: 0.0-26
Check: tests
Result: NOTE
Running 'counting.R' [0s]
Comparing 'counting.Rout' to 'counting.Rout.save' ...26a27,28
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
47a50
> counting ... 9 string(s) using 19 nodes [0.00s]
49a53,54
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
70a76,77
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
77a85,86
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 4 strings [0.00s]
86a96,97
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
96a108,109
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
106a120
> counting ... 2 string(s) using 5 nodes [0.00s]
108a123,124
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Running 'counting_useBytes.R' [0s]
Comparing 'counting_useBytes.Rout' to 'counting_useBytes.Rout.save' ...32a33,34
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
56a59
> counting ... 10 string(s) using 19 nodes [0.00s]
58a62,63
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
82a88,89
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
89a97,98
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 5 strings [0.00s]
99a109,110
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
110a122,123
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
121a135
> counting ... 2 string(s) using 5 nodes [0.00s]
123a138,139
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Flavors: r-devel-windows-x86_64, r-release-windows-x86_64

Version: 0.0-26
Check: tests
Result: NOTE
Running ‘counting.R’ [0s/0s]
Comparing ‘counting.Rout’ to ‘counting.Rout.save’ ...26a27,28
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
47a50
> counting ... 9 string(s) using 19 nodes [0.00s]
49a53,54
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
70a76,77
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
77a85,86
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 4 strings [0.00s]
86a96,97
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
96a108,109
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
106a120
> counting ... 2 string(s) using 5 nodes [0.00s]
108a123,124
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Running ‘counting_useBytes.R’ [0s/0s]
Comparing ‘counting_useBytes.Rout’ to ‘counting_useBytes.Rout.save’ ...32a33,34
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
56a59
> counting ... 10 string(s) using 19 nodes [0.00s]
58a62,63
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
82a88,89
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
89a97,98
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 5 strings [0.00s]
99a109,110
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
110a122,123
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
121a135
> counting ... 2 string(s) using 5 nodes [0.00s]
123a138,139
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Flavors: r-release-macos-arm64, r-release-macos-x86_64

Package textcat

Current CRAN status: OK: 13

Package tm

Current CRAN status: ERROR: 3, NOTE: 2, OK: 8

Version: 0.7-17
Check: whether package can be installed
Result: ERROR
Installation failed.
Flavor: r-devel-windows-x86_64

Version: 0.7-17
Check: examples
Result: ERROR
Running examples in ‘tm-Ex.R’ failed
The error most likely occurred in:
> ### Name: readPDF
> ### Title: Read In a PDF Document
> ### Aliases: readPDF
> ### Keywords: file
>
> ### ** Examples
>
> uri <- paste0("file://",
+ system.file(file.path("doc", "tm.pdf"), package = "tm"))
> engine <- if(nzchar(system.file(package = "pdftools"))) {
+ "pdftools"
+ } else {
+ "ghostscript"
+ }
> reader <- readPDF(engine)
> pdf <- reader(elem = list(uri = uri), language = "en", id = "id1")
> cat(content(pdf)[1])
Introduction to the tm Package
Text Mining in R
Ingo Feinerer
December 10, 2025
Introduction
This vignette gives a short introduction to text mining in R utilizing the text mining framework provided by the tm package. We present methods for data import, corpus handling, preprocessing, metadata management, and creation of term-document matrices. Our focus is on the main aspects of getting started with text mining in R—an in-depth description of the text mining infrastructure offered by tm was published in the Journal of Statistical Software (Feinerer et al., 2008). An introductory article on text mining in R was published in R News (Feinerer, 2008).
Data Import
The main structure for managing documents in tm is a so-called Corpus, representing a collection of text documents. A corpus is an abstract concept, and there can exist several implementations in parallel. The default implementation is the so-called VCorpus (short for Volatile Corpus) which realizes a semantics as known from most R objects: corpora are R objects held fully in memory. We denote this as volatile since once the R object is destroyed, the whole corpus is gone. Such a volatile corpus can be created via the constructor VCorpus(x, readerControl).
Another implementation is the PCorpus which implements a Permanent Corpus semantics, i.e., the documents are physically stored outside of R (e.g., in a database), corresponding R objects are basically only pointers to external structures, and changes to the underlying corpus are reflected to all R objects associated with it. Compared to the volatile corpus the corpus encapsulated by a permanent corpus object is not destroyed if the corresponding R object is released. Within the corpus constructor, x must be a Source object which abstracts the input location. tm provides a set of predefined sources, e.g., DirSource, VectorSource, or DataframeSource, which handle a directory, a vector interpreting each component as document, or data frame like structures (like CSV files), respectively. Except DirSource, which is designed solely for directories on a file system, and VectorSource, which only accepts (char- acter) vectors, most other implemented sources can take connections as input (a character string is interpreted as file path). getSources() lists available sources, and users can create their own sources. The second argument readerControl of the corpus constructor has to be a list with the named components reader and language. The first component reader constructs a text document from elements delivered by a source. The tm package ships with several readers (e.g., readPlain(), readPDF(), readDOC(), . . . ). See getReaders() for an up-to-date list of available readers. Each source has a default reader which can be overridden. E.g., for DirSource the default just reads in the input files and interprets their content as text. Finally, the second component language sets the texts’ language (preferably using ISO 639-2 codes). 
In case of a permanent corpus, a third argument dbControl has to be a list with the named components dbName giving the filename holding the sourced out objects (i.e., the database), and dbType holding a valid database type as supported by package filehash. Activated database support reduces the memory demand, however, access gets slower since each operation is limited by the hard disk’s read and write capabilities. So e.g., plain text files in the directory txt containing Latin (lat) texts by the Roman poet Ovid can be read in with following code:
> txt <- system.file("texts", "txt", package = "tm")
> (ovid <- VCorpus(DirSource(txt, encoding = "UTF-8"),
+ readerControl = list(language = "lat")))
<<VCorpus>>
Metadata: corpus specific: 0, document level (indexed): 0
Content: documents: 5
1
> VCorpus(URISource(uri, mode = ""),
+ readerControl = list(reader = readPDF(engine = "ghostscript")))
sh: : command not found
Error in `<current-expression>` : error in running command
Calls: VCorpus ... mapply -> <Anonymous> -> <Anonymous> -> pdf_info -> system2
Execution halted
Flavor: r-release-macos-arm64

Version: 0.7-17
Check: package dependencies
Result: NOTE
Packages suggested but not available for checking: 'Rcampdf', 'tm.lexicon.GeneralInquirer'
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64

Version: 0.7-17
Check: examples
Result: ERROR
Running examples in ‘tm-Ex.R’ failed
The error most likely occurred in:
> ### Name: readPDF
> ### Title: Read In a PDF Document
> ### Aliases: readPDF
> ### Keywords: file
>
> ### ** Examples
>
> uri <- paste0("file://",
+ system.file(file.path("doc", "tm.pdf"), package = "tm"))
> engine <- if(nzchar(system.file(package = "pdftools"))) {
+ "pdftools"
+ } else {
+ "ghostscript"
+ }
> reader <- readPDF(engine)
> pdf <- reader(elem = list(uri = uri), language = "en", id = "id1")
> cat(content(pdf)[1])
Introduction to the tm Package
Text Mining in R
Ingo Feinerer
December 10, 2025
Introduction
This vignette gives a short introduction to text mining in R utilizing the text mining framework provided by the tm package. We present methods for data import, corpus handling, preprocessing, metadata management, and creation of term-document matrices. Our focus is on the main aspects of getting started with text mining in R—an in-depth description of the text mining infrastructure offered by tm was published in the Journal of Statistical Software (Feinerer et al., 2008). An introductory article on text mining in R was published in R News (Feinerer, 2008).
Data Import
The main structure for managing documents in tm is a so-called Corpus, representing a collection of text documents. A corpus is an abstract concept, and there can exist several implementations in parallel. The default implementation is the so-called VCorpus (short for Volatile Corpus) which realizes a semantics as known from most R objects: corpora are R objects held fully in memory. We denote this as volatile since once the R object is destroyed, the whole corpus is gone. Such a volatile corpus can be created via the constructor VCorpus(x, readerControl).
Another implementation is the PCorpus which implements a Permanent Corpus semantics, i.e., the documents are physically stored outside of R (e.g., in a database), corresponding R objects are basically only pointers to external structures, and changes to the underlying corpus are reflected to all R objects associated with it. Compared to the volatile corpus the corpus encapsulated by a permanent corpus object is not destroyed if the corresponding R object is released. Within the corpus constructor, x must be a Source object which abstracts the input location. tm provides a set of predefined sources, e.g., DirSource, VectorSource, or DataframeSource, which handle a directory, a vector interpreting each component as document, or data frame like structures (like CSV files), respectively. Except DirSource, which is designed solely for directories on a file system, and VectorSource, which only accepts (char- acter) vectors, most other implemented sources can take connections as input (a character string is interpreted as file path). getSources() lists available sources, and users can create their own sources. The second argument readerControl of the corpus constructor has to be a list with the named components reader and language. The first component reader constructs a text document from elements delivered by a source. The tm package ships with several readers (e.g., readPlain(), readPDF(), readDOC(), . . . ). See getReaders() for an up-to-date list of available readers. Each source has a default reader which can be overridden. E.g., for DirSource the default just reads in the input files and interprets their content as text. Finally, the second component language sets the texts’ language (preferably using ISO 639-2 codes). 
In case of a permanent corpus, a third argument dbControl has to be a list with the named components dbName giving the filename holding the sourced out objects (i.e., the database), and dbType holding a valid database type as supported by package filehash. Activated database support reduces the memory demand, however, access gets slower since each operation is limited by the hard disk’s read and write capabilities. So e.g., plain text files in the directory txt containing Latin (lat) texts by the Roman poet Ovid can be read in with following code:
> txt <- system.file("texts", "txt", package = "tm")
> (ovid <- VCorpus(DirSource(txt, encoding = "UTF-8"),
+ readerControl = list(language = "lat")))
<<VCorpus>>
Metadata: corpus specific: 0, document level (indexed): 0
Content: documents: 5
1
> VCorpus(URISource(uri, mode = ""),
+ readerControl = list(reader = readPDF(engine = "ghostscript")))
sh: : command not found
Error in system2(gs_cmd, c("-dNODISPLAY -q", sprintf("-sFile=%s", shQuote(file)), : error in running command
Calls: VCorpus ... mapply -> <Anonymous> -> <Anonymous> -> pdf_info -> system2
Execution halted
Flavor: r-oldrel-macos-arm64
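
The `sh: : command not found` line in both tm example failures suggests the shell was handed an empty command name, i.e. no Ghostscript executable was located on those machines. This is an inference from the log, not something the report states. A minimal sketch of guarding such a call in base R follows: `tools::find_gs_cmd()` is part of R's tools package and returns an empty string when nothing is found, while the helper name `gs_available` and the messages are illustrative assumptions, not code from tm.

```r
# TRUE only if a single, non-empty Ghostscript command string was located.
# 'cmd' defaults to R's own lookup; pass a string to exercise the logic offline.
gs_available <- function(cmd = tools::find_gs_cmd()) {
  is.character(cmd) && length(cmd) == 1L && !is.na(cmd) && nzchar(cmd)
}

if (gs_available()) {
  message("Ghostscript found at: ", tools::find_gs_cmd())
  # ... code that shells out to Ghostscript would go here ...
} else {
  message("Ghostscript not found; skipping the Ghostscript-based step.")
}
```

Wrapping an example in a check like this lets it run where Ghostscript is installed and skip cleanly where it is not.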

Package tm.plugin.mail

Current CRAN status: OK: 13

Package tseries

Current CRAN status: NOTE: 2, OK: 11

Version: 0.10-58
Check: dependencies in R code
Result: NOTE
Namespace in Imports field not imported from: ‘jsonlite’
All declared Imports should be used.
Flavors: r-devel-linux-x86_64-fedora-clang, r-devel-linux-x86_64-fedora-gcc

Package Unicode

Current CRAN status: OK: 13

Package W3CMarkupValidator

Current CRAN status: OK: 13

Package wordnet

Current CRAN status: NOTE: 3, OK: 10

Version: 0.1-17
Check: package dependencies
Result: NOTE
Package suggested but not available for checking: ‘wordnetDicts’
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64
