NEWS

Changes in v0.6.1

Mention doc2vec in package description.
Add perplexity() to assess models’ goodness-of-fit to data.
Save quanteda’s internal docvars in the textmodel_doc2vec objects.
Add group to as.matrix() to average sentence or paragraph vectors from the same documents.

Changes in v0.6.0

Upgrade textmodel_doc2vec() to train the distributed memory (DM) and distributed bag-of-words (DBOW) models.
Add as.textmodel_doc2vec() to create document vectors as a weighted average of word vectors.
Add layer to as.matrix() to choose between word or document vectors.
normalize is now defunct in textmodel_word2vec().

Changes in v0.5.1

Add normalize to textmodel_doc2vec() and pass it to as.matrix().
Add weights to textmodel_doc2vec() to adjust the salience of words in the document vectors.
Add include_data to textmodel_word2vec() to save the original tokens object.

Changes in v0.5.0

Add the model argument to textmodel_word2vec() to update existing models.
The normalize argument is moved from textmodel_word2vec() to as.matrix(). The original argument is deprecated and set to FALSE by default.
Remove weights().
Improve the structure of C++ code.

Changes in v0.4.0

Add the tolower argument, set to TRUE by default, to lower-case tokens.
Allow x to be quanteda’s tokens_xptr object to enhance efficiency.

Changes in v0.3.0

Save docvars in the textmodel_doc2vec objects.
Set zero for empty documents in the textmodel_doc2vec objects.
Add probability() to compute the probability of words.

Changes in v0.2.0

Rename word2vec(), doc2vec() and lsa() to textmodel_word2vec(), textmodel_doc2vec() and textmodel_lsa() respectively.
Simplify the C++ code to make maintenance easier.
Add normalize to word2vec() to enable or disable word vector normalization.
Add weights() to extract back-propagation weights.
Make analogy() convert a formula to a named character vector.
Improve the stability of word2vec() when verbose = TRUE.

Changes in v0.1.0

Fork https://github.com/bnosac/word2vec and change the package name to wordvector.
Replace a list of character vectors with quanteda’s tokens object as the input object.
Recreate word2vec() with new argument names and object structures.
Create lsa() to train word vectors using Latent Semantic Analysis.
Add similarity() and analogy() functions using proxyC.
Add data_corpus_news2014, which contains 20,000 news summaries, as package data.
