
Introduction to Analitica

Carlos Jiménez-Gallardo

2025-06-27

1 Overview

The Analitica package provides essential tools for:

Descriptive statistical summaries
Exploratory visualizations
Homoscedasticity tests
Outlier detection
Parametric and non-parametric group comparisons

It is suitable for researchers, educators, and analysts seeking quick and interpretable workflows.

2 Descriptive Analysis

Use descripYG() to summarize a numeric variable (vd), optionally grouped by a categorical variable (vi):

library(Analitica)
data(d_e, package = "Analitica")
descripYG(d_e, vd = Sueldo_actual)

#>     n     Mean Median       SD Kurtosis Skewness        CV   Min    Max   P25
#> 1 474 34419.57  28875 17075.66  8.30863 2.117877 0.4961033 15750 135000 24000
#>       P75     IQR Fence_Low Fence_High
#> 1 36937.5 12937.5   4593.75   56343.75
descripYG(d_e, vd = Sueldo_actual, vi = labor)
#> Picking joint bandwidth of 2460

#>   Group   n     Mean Median        SD  Kurtosis   Skewness         CV   Min
#> 1     1 363 27838.54  26550  7567.995 10.850828  1.8973062 0.27185316 15750
#> 2     2  27 30938.89  30750  2114.616  5.795226 -0.3472238 0.06834817 24300
#> 3     3  84 63977.80  60500 18244.776  4.913269  1.1597365 0.28517355 34410
#>      Max      P25      P75   IQR
#> 1  80000 22800.00 31200.00  8400
#> 2  35250 30150.00 30975.00   825
#> 3 135000 51956.25 71281.25 19325
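
The Fence_Low, Fence_High, and CV columns in the ungrouped output are consistent with Tukey's 1.5 × IQR fences and with SD divided by the mean (for example, 24000 - 1.5 × 12937.5 = 4593.75). The base R sketch below shows that arithmetic on a small hypothetical vector. It is an illustration under that assumption, not the package's source code, and the quantile definition used by descripYG() may differ from R's default.

```r
# Hypothetical data for illustration; this is not the d_e dataset.
x <- c(12, 15, 14, 10, 18, 21, 13, 16, 95)

# Quartiles using R's default quantile definition (type 7).
q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

# Tukey fences: 1.5 * IQR below the first and above the third quartile.
fence_low  <- q[1] - 1.5 * iqr
fence_high <- q[2] + 1.5 * iqr

# Coefficient of variation: standard deviation relative to the mean.
cv <- sd(x) / mean(x)

# Values outside the fences are candidate outliers.
outside <- x[x < fence_low | x > fence_high]

print(c(fence_low = fence_low, fence_high = fence_high, cv = cv))
print(outside)
```

Only the value 95 falls outside the fences here, so it is the only candidate outlier in this toy vector.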

3 Homogeneity of Variance Tests

Check the equal-variance assumption across groups with the package's own implementations of the Levene (median-centered), Bartlett, and Fligner-Killeen tests:

Levene.Test(Sueldo_actual ~ labor, data = d_e)
#> $Statistic
#> [1] 36.089
#> 
#> $df
#> df_between  df_within 
#>          2        471 
#> 
#> $p_value
#> [1] 0
#> 
#> $Significance
#> [1] "***"
#> 
#> $Decision
#> [1] "Heteroscedastic"
#> 
#> $Method
#> [1] "Levene (median)"
#> 
#> attr(,"class")
#> [1] "homocedasticidad"
BartlettTest(Sueldo_actual ~ labor, data = d_e)
#> $Statistic
#> [1] 194.6489
#> 
#> $df
#> [1] 2
#> 
#> $p_value
#> [1] 0
#> 
#> $Significance
#> [1] "***"
#> 
#> $Decision
#> [1] "Heterocedastic"
#> 
#> $Method
#> [1] "Bartlett"
#> 
#> attr(,"class")
#> [1] "homocedasticidad"
FKTest(Sueldo_actual ~ labor, data = d_e)
#> $Statistic
#> [1] 88.2881
#> 
#> $df
#> [1] 2
#> 
#> $p_value
#> [1] 0
#> 
#> $Significance
#> [1] "***"
#> 
#> $Decision
#> [1] "Heteroscedastic"
#> 
#> $Method
#> [1] "Fligner-Killeen"
#> 
#> attr(,"class")
#> [1] "homocedasticidad"
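
As a cross-check, the same three ideas can be sketched with base R alone. The example below uses the built-in PlantGrowth dataset rather than d_e. It assumes that "Levene (median)" refers to the usual median-centered (Brown-Forsythe) variant, computed as a one-way ANOVA on absolute deviations from each group median. It does not reproduce the package's internals.

```r
# Built-in example data: plant weight by treatment group (3 groups, 30 rows).
y <- PlantGrowth$weight
g <- PlantGrowth$group

# Median-centered Levene test: one-way ANOVA on |y - group median|.
z <- abs(y - ave(y, g, FUN = median))
lev <- anova(lm(z ~ g))
lev_F <- lev[["F value"]][1]
lev_p <- lev[["Pr(>F)"]][1]

# Bartlett and Fligner-Killeen tests from the stats package.
bart <- bartlett.test(weight ~ group, data = PlantGrowth)
flig <- fligner.test(weight ~ group, data = PlantGrowth)

print(c(levene_F = lev_F, levene_p = lev_p))
print(c(bartlett_p = bart$p.value, fligner_p = flig$p.value))
```

Bartlett's test is sensitive to departures from normality, so the median-centered Levene and Fligner-Killeen results are often the safer guide for skewed data such as salaries. A p_value printed as 0 in the outputs above is presumably a rounded value rather than an exact zero.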

4 Outlier Detection

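Detect univariate outliers with Grubbs’ test. grubbs_outliers() returns the data with an added logical column, outL, that flags the detected observations: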

res <- grubbs_outliers(d_e, Sueldo_actual)
head(res[res$outL == TRUE, ])
#>      ID Sexo   FechaNAc educacion labor Sueldo_actual Sueldo_inicial antigüedad
#> 18   18    h 20/03/1986        16     3        103750          27510         97
#> 29   29    h 28/01/1964        19     3        135000          79980         96
#> 32   32    h 28/01/1984        19     3        110625          45000         96
#> 34   34    h 02/02/1969        19     3         92000          39990         96
#> 103 103    h 17/03/1989        19     3         97000          35010         91
#> 106 106    h 04/08/1962        19     3         91250          29490         91
#>     experiencia minoria outL
#> 18           70       0 TRUE
#> 29          199       0 TRUE
#> 32          120       0 TRUE
#> 34          175       0 TRUE
#> 103          68       0 TRUE
#> 106          23       0 TRUE
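
For intuition, a single step of the textbook two-sided Grubbs test can be sketched in base R as follows. The data are hypothetical, and how grubbs_outliers() iterates or chooses its significance level is not stated here, so treat this as an illustration of the statistic only.

```r
# Hypothetical measurements with one suspicious value.
x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 15.0)
n <- length(x)
alpha <- 0.05

# Grubbs statistic: largest absolute deviation from the mean, in SD units.
idx <- which.max(abs(x - mean(x)))
G <- abs(x[idx] - mean(x)) / sd(x)

# Two-sided critical value derived from the t distribution with n - 2 df.
t_crit <- qt(1 - alpha / (2 * n), df = n - 2)
G_crit <- (n - 1) / sqrt(n) * sqrt(t_crit^2 / (n - 2 + t_crit^2))

# The most extreme point is flagged when G exceeds the critical value.
is_outlier <- G > G_crit
print(list(index = idx, value = x[idx], G = G, G_crit = G_crit, outlier = is_outlier))
```

Grubbs' test assumes approximately normal data, so for a strongly right-skewed variable such as salary the flagged rows are best read as candidates for inspection.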

5 Multiple Comparisons (Post Hoc Tests)

Fit a one-way ANOVA model with aov() and pass it to a post hoc function such as GHTest() (Games-Howell):

mod <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
resultado <- GHTest(mod)
summary(resultado)
#> =====================================
#>   Multiple Comparison Method Summary
#> =====================================
#> Method used: Games-Howell 
#> 
#> >> Group means:
#>        1        2        3 
#> 27838.54 30938.89 63977.80 
#> 
#> >> Order of means (from highest to lowest):
#> [1] "3" "2" "1"
#> 
#> >> Pairwise comparisons:
#>    Comparacion Diferencia t_value    gl p_value Significancia
#> 1        1 - 2   3100.349  5.4518 93.07       0           ***
#> 11       1 - 3  36139.258 17.8034 89.71       0           ***
#> 2        2 - 3  33038.909 16.2606 89.58       0           ***
plot(resultado)

Other methods include TukeyTest(), ScheffeTest(), DuncanTest(), SNKTest(), T2Test(), and T3Test().
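
The first row of the pairwise table (groups 1 and 2) can be cross-checked from the group summaries printed in the descriptive section. The sketch assumes GHTest() follows the standard Games-Howell formulas: a Welch standard error, Welch-Satterthwaite degrees of freedom, and a p-value from the studentized range distribution. It is not taken from the package source.

```r
# Summary statistics for labor groups 1 and 2, copied from the descripYG() output.
n1 <- 363; m1 <- 27838.54; s1 <- 7567.995
n2 <- 27;  m2 <- 30938.89; s2 <- 2114.616
k  <- 3    # number of groups in the ANOVA

# Welch standard error of the mean difference.
v1 <- s1^2 / n1
v2 <- s2^2 / n2
se <- sqrt(v1 + v2)

# t statistic and Welch-Satterthwaite degrees of freedom.
t_val <- abs(m1 - m2) / se
df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Games-Howell p-value via the studentized range distribution.
p_val <- ptukey(t_val * sqrt(2), nmeans = k, df = df, lower.tail = FALSE)

print(round(c(t_value = t_val, gl = df), 2))
print(p_val)
```

The resulting t statistic and degrees of freedom agree with the t_value and gl columns above to the printed precision, which supports reading gl as Welch-adjusted degrees of freedom.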

6 Non-Parametric Tests

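When the normality or equal-variance assumptions are violated, compare two groups with the Mann-Whitney U test (MWTest()), the Brunner-Munzel test (BMTest()), or its permutation version (BMpTest()):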

g1 <- d_e$Sueldo_actual[d_e$labor == 1]
g2 <- d_e$Sueldo_actual[d_e$labor == 2]
MWTest(g1, g2)
#> $Resultados
#>            Comparacion Diferencia Valor_Critico p_value Significancia
#> Grupo2 Grupo1 - Grupo2   3100.349            NA   1e-04           ***
#> 
#> $Promedios
#>   Grupo1   Grupo2 
#> 27838.54 30938.89 
#> 
#> $Orden_Medias
#> [1] "Grupo2" "Grupo1"
#> 
#> $Metodo
#> [1] "Mann-Whitney U (two.sided, manual)"
#> 
#> attr(,"class")
#> [1] "comparacion" "mannwhitney"
BMTest(g1, g2)
#> $Resultados
#>            Comparacion Diferencia    df     SE t_critical p_value  p_hat
#> Grupo1 Grupo1 - Grupo2  -3100.349 64.98 9.7586     1.9971       0 0.7297
#>        Significancia
#> Grupo1           ***
#> 
#> $Promedios
#>   Grupo1   Grupo2 
#> 27838.54 30938.89 
#> 
#> $df
#> [1] 64.98189
#> 
#> $Orden_Medias
#> [1] "Grupo2" "Grupo1"
#> 
#> $Metodo
#> [1] "Brunner-Munzel (two.sided)"
#> 
#> $p_hat
#> [1] 0.7296704
#> 
#> attr(,"class")
#> [1] "comparacion"   "brunnermunzel"
BMpTest(g1, g2)
#> $Resultados
#>            Comparacion Diferencia Valor_Critico p_value  p_hat Significancia
#> Grupo2 Grupo1 - Grupo2   3100.349            NA       0 0.7297             *
#> 
#> $Promedios
#>   Grupo1   Grupo2 
#> 27838.54 30938.89 
#> 
#> $Orden_Medias
#> [1] "Grupo2" "Grupo1"
#> 
#> $Metodo
#> [1] "Brunner-Munzel (perm, two.sided)"
#> 
#> attr(,"class")
#> [1] "comparacion"        "brunnermunzel_perm"
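
The p_hat value reported by the Brunner-Munzel functions is assumed here to be the usual relative effect estimate, P(X < Y) + 0.5 × P(X = Y). The base R sketch below computes that quantity on hypothetical data and relates it to the statistic returned by wilcox.test(). It illustrates the estimator only and is not the package's implementation.

```r
# Hypothetical samples; the value 3 appears in both groups to show tie handling.
x <- c(1, 2, 3, 4)
y <- c(3, 5, 6)

# Relative effect: share of (x, y) pairs with x < y, counting ties as one half.
p_hat <- mean(outer(x, y, "<") + 0.5 * outer(x, y, "=="))

# wilcox.test() reports W = number of pairs with x > y (ties count one half),
# so p_hat equals 1 - W / (n1 * n2).
wt <- wilcox.test(x, y, exact = FALSE)
W <- unname(wt$statistic)
p_hat_from_W <- 1 - W / (length(x) * length(y))

print(c(p_hat = p_hat, p_hat_from_W = p_hat_from_W))
```

A value above 0.5 indicates that observations in the second group tend to be larger than those in the first, which matches the ordering of the group means in the outputs above.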

7 Conclusion

Analitica integrates descriptive analysis with robust comparison methods for applied data exploration.

For detailed documentation, see ?Analitica or function-specific help pages like ?GHTest or ?descripYG.
