Creating and refining data nuggets. Data nuggets reduce a large dataset to a small collection of nuggets of data, each containing a center (location), weight (importance), and scale (variability) parameter. Data nugget centers are created by choosing observations in the dataset that are as equally spaced apart as possible. Data nugget weights are created by counting the number of observations closest to a given data nugget center. We then say the data nugget 'contains' these observations, and the data nugget center is recalculated as the mean of these observations. Data nugget scales are created by calculating the trace of the covariance matrix of the observations contained within a data nugget, divided by the dimension of the dataset. Data nuggets are refined by 'splitting' data nuggets whose scales or shapes (shape being defined as the ratio of the two largest eigenvalues of the covariance matrix of the observations contained within the data nugget) are too large.
Reference papers:
[1] Beavers, T. E., Cheng, G., Duan, Y., Cabrera, J., Lubomirski, M., Amaratunga, D., & Teigler, J. E. (2024). Data Nuggets: A Method for Reducing Big Data While Preserving Data Structure. Journal of Computational and Graphical Statistics, 1-21.
[2] Cherasia, K. E., Cabrera, J., Fernholz, L. T., & Fernholz, R. (2022). Data Nuggets in Supervised Learning. In Robust and Multivariate Statistical Methods: Festschrift in Honor of David E. Tyler (pp. 429-449). Cham: Springer International Publishing.
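The weight, center, and scale definitions in the description above can be illustrated with a minimal sketch. It is written in plain Python and is not the package's R interface: the function name `build_nuggets`, the dictionary keys, and the toy data are all hypothetical. It also assumes that the initial centers are already chosen, that the covariance is the sample covariance (denominator n - 1), and that a nugget holding a single observation gets scale 0. The trace of a covariance matrix equals the sum of the per-coordinate variances, so no matrix is formed.

```python
from math import dist  # Euclidean distance, Python 3.8+


def build_nuggets(data, initial_centers):
    """Hypothetical helper: summarise data as (center, weight, scale) nuggets."""
    dim = len(data[0])

    # Each observation is 'contained' by the nugget whose initial center is closest.
    groups = [[] for _ in initial_centers]
    for obs in data:
        nearest = min(range(len(initial_centers)),
                      key=lambda k: dist(obs, initial_centers[k]))
        groups[nearest].append(obs)

    nuggets = []
    for members in groups:
        if not members:
            continue  # assumption: centers that attract no observations are dropped
        n = len(members)
        # Center is recalculated as the mean of the contained observations.
        center = [sum(o[j] for o in members) / n for j in range(dim)]
        if n > 1:
            # Trace of the sample covariance = sum of per-coordinate sample variances.
            trace = sum(
                sum((o[j] - center[j]) ** 2 for o in members) / (n - 1)
                for j in range(dim)
            )
        else:
            trace = 0.0  # assumption: a single observation has no spread
        # Weight is the observation count; scale is trace divided by dimension.
        nuggets.append({"center": center, "weight": n, "scale": trace / dim})
    return nuggets


# Toy data: four points around (1, 1) and two points near (11, 10).
data = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0), (10.0, 10.0), (12.0, 10.0)]
nuggets = build_nuggets(data, initial_centers=[(0.0, 0.0), (10.0, 10.0)])
for nugget in nuggets:
    print(nugget)
```

The sketch leaves out the two other steps named in the description: choosing well-spaced initial centers, and splitting nuggets whose scale or shape is too large.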
Version: | 1.3.1 |
Depends: | R (≥ 4.0), doSNOW (≥ 1.0.16), foreach (≥ 1.5.1), parallel (≥ 4.0.5), Rfast (≥ 2.0.7) |
Published: | 2024-09-14 |
DOI: | 10.32614/CRAN.package.datanugget |
Author: | Yajie Duan [cre, ctb], Traymon Beavers [aut], Javier Cabrera [aut], Ge Cheng [aut], Kunting Qi [aut], Mariusz Lubomirski [aut] |
Maintainer: | Yajie Duan <yajieritaduan at gmail.com> |
License: | GPL-2 |
NeedsCompilation: | no |
CRAN checks: | datanugget results |
Reference manual: | datanugget.pdf |
Package source: | datanugget_1.3.1.tar.gz |
Windows binaries: | r-devel: datanugget_1.3.1.zip, r-release: datanugget_1.3.1.zip, r-oldrel: datanugget_1.3.1.zip |
macOS binaries: | r-release (arm64): datanugget_1.3.1.tgz, r-oldrel (arm64): datanugget_1.3.1.tgz, r-release (x86_64): datanugget_1.3.1.tgz, r-oldrel (x86_64): datanugget_1.3.1.tgz |
Old sources: | datanugget archive |
Reverse depends: | PPbigdata, WCluster |
Please use the canonical form https://CRAN.R-project.org/package=datanugget to link to this page.