This vignette covers the second goal of distionary: to
evaluate properties of probability distributions, even when a property
is not specified in the distribution’s definition.
A distributional representation is a mathematical function that completely defines a probability distribution. Unlike a simple property (such as the mean or variance), a representation contains enough information that any other property or representation can be calculated from it.
The key innovation in distionary is that these
representations are interconnected through a network of relationships,
allowing you to specify a distribution using any available
representation and automatically derive others as needed. For example,
if you specify only a CDF, distionary can compute the
quantile function, mean, variance, and other properties.
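The idea of deriving other quantities from a CDF alone can be sketched in base R. This is only an illustration of the principle, not distionary's internal algorithm: the CDF (an Exponential with rate 2), the search interval, and the helper names quantile_from_cdf and mean_from_cdf are all assumptions made for the example.

```r
# Illustrative only: start from a CDF and nothing else.
cdf <- function(x) pexp(x, rate = 2)

# Quantile function: numerically invert the CDF with a root finder.
# The search interval [0, 100] is an assumption suited to this example.
quantile_from_cdf <- function(p) {
  uniroot(function(x) cdf(x) - p, lower = 0, upper = 100, tol = 1e-10)$root
}

# Mean of a non-negative variable: integrate the survival function 1 - CDF.
mean_from_cdf <- function() {
  integrate(function(x) 1 - cdf(x), lower = 0, upper = Inf)$value
}

quantile_from_cdf(0.5)  # close to the exact median, log(2) / 2
mean_from_cdf()         # close to the exact mean, 1 / 2
```

Numerical inversion and integration are generic fallbacks; a closed form, when one is known, is faster and more accurate.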
Here is a list of representations recognised by
distionary, and the functions for accessing them.
| Representation | distionary Functions |
|---|---|
| Cumulative Distribution Function | eval_cdf(), enframe_cdf() |
| Survival Function | eval_survival(), enframe_survival() |
| Quantile Function | eval_quantile(), enframe_quantile() |
| Hazard Function | eval_hazard(), enframe_hazard() |
| Cumulative Hazard Function | eval_chf(), enframe_chf() |
| Probability Density Function (PDF) | eval_density(), enframe_density() |
| Probability Mass Function (PMF) | eval_pmf(), enframe_pmf() |
| Odds Function | eval_odds(), enframe_odds() |
| Return Level Function | eval_return(), enframe_return() |
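Several of these representations are simple transformations of one another, which is part of what makes a network of relationships possible. As a small check, assuming the survival function is defined as P(X > x), it should equal one minus the CDF at every point. The object names d, x, surv, and surv_from_cdf are chosen for this sketch only.

```r
library(distionary)

d <- dst_geom(0.6)
x <- 0:5

# Survival function evaluated directly...
surv <- eval_survival(d, at = x)
# ...and rebuilt from the CDF, assuming survival(x) = 1 - cdf(x).
surv_from_cdf <- 1 - eval_cdf(d, at = x)

# Compare the two vectors up to floating-point tolerance.
all.equal(surv, surv_from_cdf)
```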
Every representation can be accessed with the
eval_*() family of functions, which return a vector of the
evaluated representation.
d1 <- dst_geom(0.6)
eval_pmf(d1, at = 0:5)
#> [1] 0.600000 0.240000 0.096000 0.038400 0.015360 0.006144

Alternatively, the enframe_*() family of functions
provides the results in a tibble or data frame paired with the inputs,
which is useful in a data wrangling workflow.
enframe_pmf(d1, at = 0:5)
#> # A tibble: 6 × 2
#> .arg pmf
#> <int> <dbl>
#> 1 0 0.6
#> 2 1 0.24
#> 3 2 0.096
#> 4 3 0.0384
#> 5 4 0.0154
#> 6 5 0.00614

The enframe_*() functions accept multiple distributions,
placing a column for each distribution. The
column names can be changed in three ways:

1. The input column .arg can be renamed with the arg_name argument.
2. The pmf prefix on the evaluation columns can be changed with the fn_prefix argument.
3. The distribution names can be changed by passing the distributions as name-value pairs.

Let’s practice this with the addition of a second distribution.
d2 <- dst_geom(0.4)
enframe_pmf(
  model1 = d1, model2 = d2, at = 0:5,
  arg_name = "num_failures", fn_prefix = "probability"
)
#> # A tibble: 6 × 3
#> num_failures probability_model1 probability_model2
#> <int> <dbl> <dbl>
#> 1 0 0.6 0.4
#> 2 1 0.24 0.24
#> 3 2 0.096 0.144
#> 4 3 0.0384 0.0864
#> 5 4 0.0154 0.0518
#> 6 5 0.00614 0.0311

To draw a random sample from a distribution, use the
realise() or realize() function:
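For instance, assuming the geometric distribution d1 defined earlier is the distribution being realised, five draws could be requested as follows. The seed and the name draws are additions for this sketch, and no output is shown because the values are random.

```r
library(distionary)

d1 <- dst_geom(0.6)

set.seed(1)  # optional, only to make the draws reproducible
draws <- realise(d1, n = 5)
draws
```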
You can read this call as “realise distribution d five
times”. By default, n is set to 1, so that realising
converts a distribution to a numeric draw:
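A minimal sketch of the default behaviour, again assuming d1 is the dst_geom(0.6) distribution from earlier (the name x is hypothetical):

```r
library(distionary)

d1 <- dst_geom(0.6)

# With n left at its default of 1, the result is a single number.
x <- realise(d1)
length(x)
is.numeric(x)
```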
While random sampling falls into the same family as the
p*/d*/q*/r* functions from the stats package
(e.g., rnorm()), sampling is not a distributional
representation, and so has no eval_*() or
enframe_*() counterpart. This is because a finite sample
cannot perfectly describe a distribution.
distionary distinguishes between distributional
representations (which fully define a distribution) and
distributional properties (which are characteristics that can
be computed from representations).
A distributional property is any measurable characteristic that can be calculated from a distribution’s representation. Unlike representations, properties do not contain enough information to fully reconstruct the distribution. For example, knowing the mean and variance of a distribution doesn’t tell you whether it belongs to the Normal family, the Gamma family, or some other family. Properties include statistical moments and other summary measures.
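To make the point concrete, here is a small base R sketch (not using distionary) of two distributions that share a mean and variance but are clearly different. The particular parameter values are chosen for illustration only.

```r
# Normal(mean = 2, sd = 2) and Gamma(shape = 1, rate = 0.5)
# both have mean 2 and variance 4.
p_norm  <- pnorm(1, mean = 2, sd = 2)
p_gamma <- pgamma(1, shape = 1, rate = 0.5)

# Yet they assign different probabilities to the event X <= 1 ...
c(normal = p_norm, gamma = p_gamma)

# ... and only the Normal places any probability below zero.
below_zero <- c(
  normal = pnorm(0, mean = 2, sd = 2),
  gamma  = pgamma(0, shape = 1, rate = 0.5)
)
below_zero
```

Matching the first two moments therefore leaves the CDF undetermined, which is why the mean and variance count as properties, not representations.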
Below is a table of the properties incorporated in
distionary, and the corresponding functions for accessing
them.
| Property | distionary Function |
|---|---|
| Mean | mean() |
| Median | median() |
| Variance | variance() |
| Standard Deviation | sd() |
| Skewness | skewness() |
| Excess Kurtosis | kurtosis_exc() |
| Kurtosis | kurtosis() |
| Range | range() |
Here’s the mean and variance of our original distribution.
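A minimal sketch of that computation, assuming d1 is still the dst_geom(0.6) distribution defined earlier:

```r
library(distionary)

d1 <- dst_geom(0.6)

# For a geometric distribution counting failures before the first success,
# the closed forms are (1 - p) / p for the mean and (1 - p) / p^2 for the
# variance, with p = 0.6 here.
mean(d1)      # analytically 0.4 / 0.6
variance(d1)  # analytically 0.4 / 0.36
```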