Please check the latest news (change log) and keep this package updated.
- BERT_info().
- add.tokens and add.method parameters for BERT_vocab() and FMAT_run(): An experimental functionality to add new tokens (e.g., out-of-vocabulary words, compound words, or even phrases) as [MASK] options. Validation is still needed for this novel practice (one of my ongoing projects), so for now please use it only at your own risk until my validation work is published.
- [...] BERT_download() now import local model files only, without automatically downloading models. Users must first use BERT_download() to download models.
- FMAT_load(): Better to use FMAT_run() directly.
- BERT_vocab() and ICC_models().
- summary.fmat(), FMAT_query(), and FMAT_run() (significantly faster because it can now simultaneously estimate all [MASK] options for each unique query sentence, so running time depends only on the number of unique queries and not on the number of [MASK] options).
- With reticulate package version ≥ 1.36.1, FMAT should be updated to ≥ 2024.4. Otherwise, out-of-vocabulary [MASK] words may not be identified and marked. Now FMAT_run() directly uses model vocabulary and token ID to match [MASK] words. To check if a [MASK] word is in the model vocabulary, please use BERT_vocab().
- BERT_download() (downloading models to local cache folder “%USERPROFILE%/.cache/huggingface”) to differentiate from FMAT_load() (loading saved models from local cache). Note, however, that FMAT_load() can also download models silently if they have not been downloaded.
- gpu parameter (see Guidance for GPU Acceleration) in FMAT_run() to allow for specifying an NVIDIA GPU device on which the fill-mask pipeline will be allocated. GPU performs roughly 3x faster than CPU for the fill-mask pipeline. By default, FMAT_run() automatically detects and uses any available GPU with an installed CUDA-supported Python torch package (if not, it uses CPU).
- FMAT_run().
- BERT_download(), FMAT_load(), and FMAT_run().
- parallel in FMAT_run(): FMAT_run(model.names, data, gpu=TRUE) is the fastest.
- progress in FMAT_run().
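The download-first workflow mentioned in the entries above (download with BERT_download(), optionally check the vocabulary with BERT_vocab(), then call FMAT_run() directly) might look roughly like the sketch below. It assumes the FMAT package and its Python dependencies are installed. The helper name run_local_fmat(), the model name, the query sentence, and the [MASK] words are illustrative assumptions, not taken from this change log. list() is used for the [MASK] options on the assumption that FMAT_query() accepts a plain named list.

```r
# Hypothetical helper (not part of FMAT): download first, then run.
run_local_fmat = function(models = "bert-base-uncased") {
  # Step 1: download the model files to the local cache explicitly,
  # since models are no longer downloaded automatically.
  FMAT::BERT_download(models)

  # Step 2 (optional): check whether the [MASK] words are in each
  # model's vocabulary before running the pipeline.
  vocab = FMAT::BERT_vocab(models, c("He", "She"))
  print(vocab)

  # Step 3: build a query and run it against the local model files.
  query = FMAT::FMAT_query(
    "[MASK] is a nurse.",
    MASK = list(Male = "He", Female = "She")  # assumed equivalent to .()
  )
  FMAT::FMAT_run(models, query)
}

# Usage (needs FMAT, Python, and network access for the download step):
# data = run_local_fmat()
# summary(data)
```

Wrapping the steps in a function keeps the example loadable even on a machine where the models have not been downloaded yet; nothing runs until the helper is called.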
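The gpu entry above can be illustrated with a small wrapper. This is only a sketch: run_on_device() and its use_gpu argument are hypothetical names, and the reading of gpu = FALSE as "force CPU" is an assumption. Only gpu = TRUE and the automatic default are stated in this change log.

```r
# Hypothetical wrapper (not part of FMAT) showing the device choices.
run_on_device = function(models, query, use_gpu = NULL) {
  if (is.null(use_gpu)) {
    # gpu left unspecified: FMAT_run() detects an available GPU itself
    # and otherwise falls back to CPU.
    FMAT::FMAT_run(models, query)
  } else {
    # gpu = TRUE requests the GPU, as in the change log's own example.
    # gpu = FALSE is assumed here to force CPU (useful for timing).
    FMAT::FMAT_run(models, query, gpu = use_gpu)
  }
}

# Usage (models must already be downloaded with BERT_download()):
# data_auto = run_on_device(models, query)
# data_gpu  = run_on_device(models, query, use_gpu = TRUE)
```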
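The speed-up noted for FMAT_run() (running time depending on the number of unique queries, not on the number of [MASK] options) can be pictured with a toy design in base R. This does not call FMAT and is not its implementation. It only counts how many sentences would need a model pass if all options of a sentence are scored together. The sentences and words are made up for illustration.

```r
# Toy design: 2 query sentences crossed with 3 [MASK] options each.
design = expand.grid(
  query = c("[MASK] is a nurse.", "[MASK] is a doctor."),
  mask = c("He", "She", "They"),
  stringsAsFactors = FALSE
)

# One row per query-by-option combination (2 x 3 = 6 rows).
n_rows = nrow(design)

# Number of unique query sentences (2): the count that would drive
# running time when all options of a sentence are scored in one pass.
n_unique = length(unique(design$query))

# Group the [MASK] options by sentence so each sentence is visited once.
options_by_query = split(design$mask, design$query)
```

Adding a fourth [MASK] option would increase n_rows but leave n_unique unchanged, which is the point the change log entry makes.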