
User-based k-nearest neighbors in rrecsys

Given a target user and her positively rated items, the algorithm identifies the \(k\) users most similar to the target user.
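The neighbor search described above can be sketched in a few lines of base R. This is only an illustration, not the rrecsys implementation: the toy `ratings` matrix, the helper names `cosine_sim` and `k_nearest_users`, and the convention that 0 marks a missing rating are all assumptions made for this example.

```r
# Toy user-item rating matrix (hypothetical data); 0 marks a missing rating.
ratings <- matrix(
  c(5, 4, 0, 1,
    4, 5, 1, 0,
    1, 0, 5, 4,
    5, 5, 0, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("u", 1:4), paste0("i", 1:4))
)

# Cosine similarity between two rating vectors, restricted to co-rated items.
cosine_sim <- function(a, b) {
  both <- a > 0 & b > 0
  if (!any(both)) return(0)
  sum(a[both] * b[both]) / (sqrt(sum(a[both]^2)) * sqrt(sum(b[both]^2)))
}

# Names of the k users most similar to `target`, most similar first.
k_nearest_users <- function(ratings, target, k) {
  others <- setdiff(rownames(ratings), target)
  sims <- sapply(others, function(u) cosine_sim(ratings[target, ], ratings[u, ]))
  names(sort(sims, decreasing = TRUE))[seq_len(min(k, length(sims)))]
}

neighbors <- k_nearest_users(ratings, "u1", k = 2)
print(neighbors)
```

With this toy data, the two nearest neighbors of `u1` are `u4` and `u2`, in that order; `u3`, whose tastes run opposite to those of `u1`, is left out.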

The choice of \(k\) for the neighborhood formation involves a tradeoff. A very small \(k\) yields few candidate items to recommend, because there are not enough neighbors to support the predictions. In contrast, a very large \(k\) hurts precision, because the particularities of a user's preferences can be blunted by the large neighborhood. In most related work \(k\) has been set somewhere between 10 and 100, and the optimal \(k\) also depends on data characteristics such as sparsity.

The similarity can be measured with one of two functions: cosine similarity (simFunct = 'cos') and Pearson correlation (simFunct = 'Pearson').
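For reference, the textbook forms of the two similarity measures named above are given here. The notation is an assumption introduced for this note: \(r_{u,i}\) is the rating of user \(u\) for item \(i\), \(\bar{r}_u\) is the mean rating of user \(u\), and \(I_{uv}\) is the set of items rated by both \(u\) and \(v\). The exact variant computed by rrecsys (for example, which items enter the sums or the means) may differ in detail.

\[
\mathrm{sim}_{\cos}(u, v) = \frac{\sum_{i \in I_{uv}} r_{u,i}\, r_{v,i}}{\sqrt{\sum_{i \in I_{uv}} r_{u,i}^{2}}\; \sqrt{\sum_{i \in I_{uv}} r_{v,i}^{2}}}
\]

\[
\mathrm{sim}_{\mathrm{Pearson}}(u, v) = \frac{\sum_{i \in I_{uv}} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I_{uv}} (r_{u,i} - \bar{r}_u)^{2}}\; \sqrt{\sum_{i \in I_{uv}} (r_{v,i} - \bar{r}_v)^{2}}}
\]

The Pearson form is the cosine form applied to mean-centered ratings, which makes it less sensitive to users who rate systematically high or low.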

For the Rating Prediction task, training a model with this algorithm requires one additional argument: neigh, the neighborhood size.

library(rrecsys)

data("ml100k")
d <- defineData(ml100k)          # wrap the rating matrix as a data set
e <- evalModel(d, folds = 2)     # set up 2-fold cross-validation
evalPred(e, "ubknn", simFunct = "Pearson", neigh = 10)

For the Item Recommendation task, two additional arguments are required: positiveThreshold, the threshold for “positive” ratings, and topN, the number of recommended items.

data("ml100k")
d <- defineData(ml100k)
e <- evalModel(d, folds = 2)
evalRec(e, "ubknn", simFunct = "Pearson", neigh = 10, positiveThreshold = 3, topN = 3)

The neigh default value is 10. The positiveThreshold default value is 3. The topN default value is 10.

The returned object is of type UBclass.

For more details about its slots, see the reference manual.
