Changes requested during the initial CRAN review:

- normality_tests() no longer touches the global random-number state. Large columns are now reduced with a deterministic, evenly-spaced subsample instead of set.seed() + sample(); the seed argument has been removed.

New analysis and reporting:

- report() renders a complete profile to a self-contained HTML file (requires pandoc, via ).
- categorical_association() and plot_association() add Cramér’s V between categorical columns (the categorical analogue of the correlation matrix).
- analyze_dates() profiles date/datetime columns: range, unique count, and the largest gap between consecutive timestamps.
- compare_groups() summarises numeric columns within the levels of a grouping column (grouped/comparative profiling).

Pipeline changes:

- profile_data() gains group_by (adds a grouped comparison to the diagnostics) and distributions (set FALSE to skip the eager per-column distribution plots on wide data). Association and date results are now part of the returned object, and plot() accepts which = "association".
- summary() now also prints date, association and grouped-comparison sections when present.

profile_data() with type inference, missing-value analysis, summary statistics (incl. skewness/kurtosis), normality tests, outlier detection (IQR/z-score/robust), correlation analysis, a data-quality score, and ggplot2 figures, returned as a data_profile S3 object with print(), summary() and plot()
methods.
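
Two of the techniques named in the changelog above, the deterministic evenly-spaced subsample and Cramér’s V, can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names even_subsample() and cramers_v(), the max_n cap of 5000, and the choice of the uncorrected form of Cramér’s V are all assumptions made for this example.

```r
# Hypothetical helper (not part of the package): keep at most `max_n`
# evenly spaced elements of `x`. No random numbers are drawn, so the
# global RNG state is left untouched and the result is reproducible.
even_subsample <- function(x, max_n = 5000L) {
  n <- length(x)
  if (n <= max_n) return(x)
  # Evenly spaced positions from the first to the last element.
  idx <- unique(round(seq(1, n, length.out = max_n)))
  x[idx]
}

# Hypothetical helper: uncorrected Cramer's V between two categorical
# vectors, V = sqrt(chi^2 / (n * (min(rows, cols) - 1))).
cramers_v <- function(a, b) {
  tab <- unclass(table(a, b))  # pairs containing NA are dropped by table()
  # Remove empty rows/columns (e.g. unused factor levels) to avoid 0/0.
  tab <- tab[rowSums(tab) > 0, colSums(tab) > 0, drop = FALSE]
  k <- min(dim(tab)) - 1
  if (k < 1) return(NA_real_)  # undefined with fewer than two levels
  n <- sum(tab)
  expected <- outer(rowSums(tab), colSums(tab)) / n
  chi2 <- sum((tab - expected)^2 / expected)
  sqrt(chi2 / (n * k))
}

x <- seq_len(20000)
length(even_subsample(x))                                   # 5000
cramers_v(c("a", "a", "b", "b"), c("u", "u", "v", "v"))     # 1
```

The default cap of 5000 was chosen here because base R's shapiro.test() accepts at most 5000 observations; the changelog does not say which test or cap the package uses. Likewise, a bias-corrected variant of Cramér’s V exists, and the changelog does not say which form categorical_association() computes.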