The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Simple Normal Data Example

Aimee Lee Houde

2024-02-04

In this vignette, the fullfact package is explored using simple (functions designated by no number) for the standard model, i.e. containing random effects for dam, sire, and dam by sire, for normal data type or error structure.

Advanced (functions designated by the number 2) for the standard model with the options of including additional random effects for one position (e.g. tank) and/or one block effect (e.g. several blocks of 2 \(\times\) 2 factorial matings) is explored in the vignette Advanced Normal Data Example.

Expert (functions designated by the number 3) for the standard model with the ability of the user to include additional fixed and/or random effects, such as a model including environment treatments and their interactions is explored in the vignette Expert Normal Data Example.

Non-normal error structures (e.g. binary, proportion, and/or count data types) are explored in another three vignettes: (1) Simple Non-normal Data Example, (2) Advanced Non-normal Data Example, and (3) Expert Non-normal Data Example.

Load the package and example data

The example data set is an 11 \(\times\) 11 full factorial mating: 11 dams and 11 sires with all combinations resulting in 121 families. There are 10 observations per family or 5 observations for each of two replicates per family.

library("fullfact")     
data(chinook_length)
head(chinook_length)
#>   family repli dam sire tray cell length egg_size
#> 1     f1    r1  d1   s1   t7   1A   22.7     7.27
#> 2     f1    r2  d1   s1   t8   1A   22.0     7.27
#> 3     f1    r1  d1   s1   t7   1A   23.8     7.27
#> 4     f1    r2  d1   s1   t8   1A   21.8     7.27
#> 5     f1    r1  d1   s1   t7   1A   22.6     7.27
#> 6     f1    r2  d1   s1   t8   1A   22.4     7.27

Displayed are columns for family identities (ID), replicate ID, dam ID, sire ID, incubation tray ID, incubation cell ID (within tray), Chinook salmon length (mm) at hatch, and dam egg size (mm).

Observed variance components

Model random effects are dam, sire, and dam by sire. Extracts the dam, sire, dam, dam by sire, and residual variance components. Calculates the total variance component. Calculates the additive genetic, non-additive genetic, and maternal variance components.

Assuming the effects of epistasis are of negligible importance, the additive genetic variance (VA) component is calculated as four times the sire (VS), the non-additive genetic variance (VN) component as four times the dam by sire interaction (VD\(\times\)S), and the maternal variance component (VM) as the dam (VD) – sire (VS) (Lynch and Walsh 1998, p. 603). When there is epistasis, those variance components will be overestimated and this may explain why the percentage of phenotypic variance explained by the components can add up to more than 100% in certain cases.

Significance values for the random effects are determined using likelihood ratio tests (Bolker et al. 2009).

Lynch M, Walsh B. 1998. Genetics and Analysis of Quantitative Traits. Sinauer Associates, Massachusetts.

Bolker BM, Brooks ME, Clark CJ, Geange SW, Poulsen JR, Stevens MHH, White J-SS. 2009. Generalized linear mixed models: a practical guide for ecology and evolution. Trends in Ecology and Evolution 24(3): 127-135. DOI: 10.1016/j.tree.2008.10.008

length_mod1<- observLmer(observ=chinook_length,dam="dam",sire="sire",response="length")
#> [1] "2024-02-04 15:48:23 PST"
#> Time difference of 1.15672 secs
length_mod1
#> $random
#>     effect     variance      percent     d.AIC      d.BIC       Chi.sq
#> 1 dam:sire 1.719503e-01 1.730762e+01 111.90386 106.805481 1.139039e+02
#> 2     sire 7.576289e-10 7.625898e-08  -2.00000  -7.098376 5.080710e-08
#> 3      dam 1.900456e-01 1.912900e+01  41.61831  36.519932 4.361831e+01
#>        p.value
#> 1 1.367824e-26
#> 2 9.998202e-01
#> 3 3.990881e-11
#> 
#> $other
#>   component  variance   percent
#> 1  Residual 0.6314988  63.56337
#> 2     Total 0.9934947 100.00000
#> 
#> $calculation
#>   component     variance      percent
#> 1  additive 3.030516e-09 3.050359e-07
#> 2    nonadd 6.878013e-01 6.923049e+01
#> 3  maternal 1.900456e-01 1.912900e+01

Produces a list object containing three data frames. Each data frame contains the raw variance components and the variance components as a percentage of the total variance component. The first data frame also contains the difference in AIC and BIC, and likelihood ratio test Chi-square and p-value for all random effects.

Note

Default is Restricted maximum likelihood (REML) as ml = F. Option for maximum likelihood (ML) is ml = T.

Maximum likelihood (ML) estimates the parameters that maximize the likelihood of the observed data and has the advantage of using all the data and accounting for non-independence (Lynch and Walsh 1998, p. 779; Bolker et al. 2009). On the other hand, ML has the disadvantage of assuming that all fixed effects are known without error, producing a downward bias in the estimation of the residual variance component. This bias can be large if there are lots of fixed effects, especially if sample sizes are small. Restricted maximum likelihood (REML) has the advantage of not assuming the fixed effects are known and averages over the uncertainty, so there can be less bias in the estimation of the residual variance component. However, REML only maximizes a portion of the likelihood to estimate the effect parameters, but is the preferred method for analyzing large data sets with complex structure.

Statistical Power analysis

Power values are calculated by stochastically simulating data for a number of iterations and then calculating the proportion of P-values less than \(\alpha\) (e.g. 0.05) for each component (Bolker 2008). Simulated data are specified by inputs for variance component values and the sample sizes.

Bolker BM. 2008. Ecological Models and Data in R. Princeton University Press, Princeton.

Defaults are alpha = 0.05 for 5%, nsim = 100 for 100 simulations, and ml = F for REML.

varcomp is a vector of dam, sire, dam by sire, and residual variance components, i.e. c(dam,sire,dam \(\times\) sire,residual).

nval is a vector of dam, sire, and offspring per family sample sizes, i.e. c(dam,sire,offspring).

For this example, the variance components of observLmer above are used (i.e. dam= 0.1900, sire= 0, dam \(\times\) sire= 0.1719, residual= 0.6315) and the sample size of the Chinook salmon data set (i.e. dam= 11, sire= 11, offspring= 10).

Full analysis is 100 simulations. Example has 25 simulations.

#>powerLmer(varcomp=c(0.1900,0,0.1719,0.6315),nval=c(11,11,10)) #full
powerLmer(varcomp=c(0.1900,0,0.1719,0.6315),nval=c(11,11,10),nsim=25) #25 simulations
#> [1] "2024-02-04 15:48:24 PST"
#> [1] "Starting simulation: 1"
#> [1] "Starting simulation: 2"
#> [1] "Starting simulation: 3"
#> [1] "Starting simulation: 4"
#> [1] "Starting simulation: 5"
#> [1] "Starting simulation: 6"
#> [1] "Starting simulation: 7"
#> [1] "Starting simulation: 8"
#> [1] "Starting simulation: 9"
#> [1] "Starting simulation: 10"
#> [1] "Starting simulation: 11"
#> [1] "Starting simulation: 12"
#> [1] "Starting simulation: 13"
#> [1] "Starting simulation: 14"
#> [1] "Starting simulation: 15"
#> [1] "Starting simulation: 16"
#> [1] "Starting simulation: 17"
#> [1] "Starting simulation: 18"
#> [1] "Starting simulation: 19"
#> [1] "Starting simulation: 20"
#> [1] "Starting simulation: 21"
#> [1] "Starting simulation: 22"
#> [1] "Starting simulation: 23"
#> [1] "Starting simulation: 24"
#> [1] "Starting simulation: 25"
#> Time difference of 4.383722 secs
#>       term   n var_in     var_out power
#> 1      dam  11 0.1900 0.169688375     1
#> 2     sire  11 0.0000 0.003465225     0
#> 3 dam.sire 121 0.1719 0.156794485     1
#> 4 residual  NA 0.6315 0.636592380    NA

There is sufficient power (\(\ge\) 0.8) for dam and dam by sire variance components, whereas there is insufficient power (< 0.8) for the sire variance component. Albeit, the sire component is near zero, so the low power may be an artifact. In the cases of insufficient power, the sample size of dam, sire, and/or offspring can be increased until there is sufficient power.

Taking the reverse approach (can the sample size of dam, sire, or offspring be reduced while maintaining sufficient power?) using the same variance components and offspring sample size, dam and sire sample sizes could be reduced from 11 to 7.

#>powerLmer(varcomp=c(0.1900,0,0.1719,0.6315),nval=c(7,7,10)) #full
powerLmer(varcomp=c(0.1900,0,0.1719,0.6315),nval=c(7,7,10),nsim=25) #25 simulations
#> [1] "2024-02-04 15:48:28 PST"
#> [1] "Starting simulation: 1"
#> [1] "Starting simulation: 2"
#> [1] "Starting simulation: 3"
#> [1] "Starting simulation: 4"
#> [1] "Starting simulation: 5"
#> [1] "Starting simulation: 6"
#> [1] "Starting simulation: 7"
#> [1] "Starting simulation: 8"
#> [1] "Starting simulation: 9"
#> [1] "Starting simulation: 10"
#> [1] "Starting simulation: 11"
#> [1] "Starting simulation: 12"
#> [1] "Starting simulation: 13"
#> [1] "Starting simulation: 14"
#> [1] "Starting simulation: 15"
#> [1] "Starting simulation: 16"
#> [1] "Starting simulation: 17"
#> [1] "Starting simulation: 18"
#> [1] "Starting simulation: 19"
#> [1] "Starting simulation: 20"
#> [1] "Starting simulation: 21"
#> [1] "Starting simulation: 22"
#> [1] "Starting simulation: 23"
#> [1] "Starting simulation: 24"
#> [1] "Starting simulation: 25"
#> Time difference of 3.205772 secs
#>       term  n var_in     var_out power
#> 1      dam  7 0.1900 0.164639547  0.92
#> 2     sire  7 0.0000 0.008090698  0.04
#> 3 dam.sire 49 0.1719 0.159179025  1.00
#> 4 residual NA 0.6315 0.625006352    NA

Bootstrap confidence intervals

Confidence intervals for the additive genetic, non-additive genetic, and maternal variance components can be produced using the bootstrap-t resampling method described by Efron and Tibshirani (1993, p. 160‒162). Observations are resampled with replacement until the original sample size is reproduced. The resampled data are then used in the model and the additive genetic, non-additive genetic, and maternal variance components are extracted. The process is repeated for a number of iterations, typically 1,000 times, to produce a distribution for each component. The confidence interval lower and upper limits and median are extracted from the distribution.

Efron B, Tibshirani R. 1993. An Introduction to the Bootstrap. Chapman and Hall, New York.

Resample observations

The resampRepli function is used to bootstrap resample observations grouped by replicate ID within family ID for a specified number of iterations to create the resampled data set. A similar resampFamily function is able to resample observations grouped by family ID only.

copy is a vector of column numbers (to copy the contents). Does not need to contain the family and/or replicate columns.

Full analysis is 1000 iterations. Example has 5 iterations.

#>resampRepli(dat=chinook_length,copy=c(3:8),family="family",replicate="repli",iter=1000) #full
#>resampFamily(dat=chinook_length,copy=c(3:8),family="family",iter=1000) #family only
resampRepli(dat=chinook_length,copy=c(3:8),family="family",replicate="repli",iter=5) #5 iterations

Because of the large file sizes that can be produced, the resampling of each replicate Y per family X is saved separately as a common separated (X_Y_resampR.csv) file in the working directory. These files are merged to create the final resampled data set (resamp_datR.csv).

If using resampFamily, the file names are X_resampF.csv per family and resamp_datF.csv for the final resampled data set.

Iteration variance components

The equivalent to observLmer is available for the final bootstrap resampled data set, i.e. resampLmer.

Default is ml = F for REML. The starting model number start = and ending model number end = need to be specified.

Full analysis is 1000 iterations. Example has 5 iterations.

#>length_datR<- read.csv("resamp_datR.csv") #1000 iterations
#>length_rcomp1<- resampLmer(resamp=length_datR,dam="dam",sire="sire",response="length",
#>start=1,end=1000) #full
data(chinook_resampL) #5 iterations
head(chinook_resampL)
#>   dam1 sire1 tray1 cell1 length1 egg_size1 dam2 sire2 tray2 cell2 length2
#> 1   d1    s1    t7    1A    23.1      7.27   d1    s1    t7    1A    23.1
#> 2   d1    s1    t7    1A    22.7      7.27   d1    s1    t7    1A    22.7
#> 3   d1    s1    t7    1A    22.9      7.27   d1    s1    t7    1A    22.9
#> 4   d1    s1    t7    1A    23.1      7.27   d1    s1    t7    1A    22.7
#> 5   d1    s1    t7    1A    23.1      7.27   d1    s1    t7    1A    22.9
#> 6   d1    s1    t8    1A    22.4      7.27   d1    s1    t8    1A    22.3
#>   egg_size2 dam3 sire3 tray3 cell3 length3 egg_size3 dam4 sire4 tray4 cell4
#> 1      7.27   d1    s1    t7    1A    23.8      7.27   d1    s1    t7    1A
#> 2      7.27   d1    s1    t7    1A    23.8      7.27   d1    s1    t7    1A
#> 3      7.27   d1    s1    t7    1A    23.8      7.27   d1    s1    t7    1A
#> 4      7.27   d1    s1    t7    1A    23.8      7.27   d1    s1    t7    1A
#> 5      7.27   d1    s1    t7    1A    23.8      7.27   d1    s1    t7    1A
#> 6      7.27   d1    s1    t8    1A    22.4      7.27   d1    s1    t8    1A
#>   length4 egg_size4 dam5 sire5 tray5 cell5 length5 egg_size5
#> 1    22.6      7.27   d1    s1    t7    1A    22.9      7.27
#> 2    23.8      7.27   d1    s1    t7    1A    22.6      7.27
#> 3    23.1      7.27   d1    s1    t7    1A    22.6      7.27
#> 4    22.9      7.27   d1    s1    t7    1A    22.9      7.27
#> 5    23.8      7.27   d1    s1    t7    1A    22.7      7.27
#> 6    22.0      7.27   d1    s1    t8    1A    22.4      7.27
length_rcomp1<- resampLmer(resamp=chinook_resampL,dam="dam",sire="sire",response="length",
start=1,end=5)
#> [1] "2024-02-04 15:48:31 PST"
#> [1] "Working on model: 1"
#> [1] "Working on model: 2"
#> [1] "Working on model: 3"
#> [1] "Working on model: 4"
#> [1] "Working on model: 5"
#> Time difference of 0.2547581 secs
length_rcomp1[1:5,]
#>    dam:sire         sire       dam  Residual     Total     additive    nonadd
#> 1 0.2144147 2.515484e-03 0.1732970 0.5674395 0.9576666 1.006194e-02 0.8576587
#> 2 0.2353632 0.000000e+00 0.1856530 0.5986349 1.0196511 0.000000e+00 0.9414529
#> 3 0.2090227 4.615273e-06 0.1728588 0.5659071 0.9477933 1.846109e-05 0.8360909
#> 4 0.1836323 3.962034e-03 0.1875104 0.5844500 0.9595547 1.584814e-02 0.7345293
#> 5 0.2158531 0.000000e+00 0.1922630 0.5799275 0.9880435 0.000000e+00 0.8634123
#>    maternal
#> 1 0.1707815
#> 2 0.1856530
#> 3 0.1728542
#> 4 0.1835483
#> 5 0.1922630

The function provides a data frame with columns containing the raw variance components for dam, sire, dam by sire, residual, total, additive genetic, non-additive genetic, and maternal. The number of rows in the data frame matches the number of iterations in the resampled data set and each row represents a model number.

Extract confidence intervals

Extract the bootstrap-t confidence intervals (CI) and median for the additive genetic, non-additive genetic, and maternal values from the data frame of models produced using resampLmer.

Default confidence interval is 95% as level = 95.

#>ciMANA(comp=length_rcomp1) #full
data(chinook_bootL) #same as length_rcomp1 1000 models
ciMANA(comp=chinook_bootL)
#> $raw
#>   component lower median upper
#> 1  additive 0.000  0.000 0.030
#> 2    nonadd 0.713  0.844 0.996
#> 3  maternal 0.168  0.201 0.233
#> 
#> $percentage
#>   component lower median upper
#> 1  additive   0.0    0.0   2.9
#> 2    nonadd  69.4   81.3  94.2
#> 3  maternal  16.7   19.3  22.0

The raw values are presented and are converted to a percentage of the total variance for each model. Defaults are the number of decimal places to round CI raw values as rnd_r = 3 and to round the CI percent values as rnd_p = 1.

The bootstrap-t method may produce medians that are largely different from the observed values. However, the chinook_bootL example data for the bootstrap-t CI were produced using another model including a position effect (see the vignette for Advanced Normal Data Example), which may explain differences between the 95% CI and observed values. Nonetheless, options are provided below for 95% CI that are a poor fit.

Bias and accelerated corrected confidence intervals

The BCa method (bias and acceleration) described by Efron and Tibshirani (1993, p.184‒188) can be used for the correction of bootstrap-t confidence intervals.

bias is a vector of additive, non-additive, and maternal variance components, i.e. c(additive,non-additive,maternal), from the raw observed variance components of observLmer.

The raw variance components of the chinook_bootL model were additive= 0, non-additive= 0.7192, and maternal= 0.2030. Typically the variance components would be from observLmer above for the analysis pipeline.

The ‘bias fail’ warning is if the bias calculation is infinity (negative or positive), e.g. bias contains a zero value, so the uncorrected confidence interval is displayed for the component.

ciMANA(comp=chinook_bootL,bias=c(0,0.7192,0.2030)) #bias only
#> $raw
#>   component lower median upper    change
#> 1  additive 0.000  0.000 0.030 bias fail
#> 2    nonadd 0.619  0.624 0.735      <NA>
#> 3  maternal 0.174  0.206 0.240      <NA>
#> 
#> $percentage
#>   component lower median upper    change
#> 1  additive   0.0    0.0   2.9 bias fail
#> 2    nonadd  62.5   62.6  71.5      <NA>
#> 3  maternal  17.0   19.7  22.5      <NA>
#>ciMANA(comp=length_rcomp1,bias=c(0,0.6878,0.1900)) #full, observLmer components

accel for acceleration correction uses the delete-one observation jackknife data set. See length_jack1 all observations in the next section.

data(chinook_jackL)
ciMANA(comp=chinook_bootL,bias=c(0,0.7192,0.2030),accel=chinook_jackL) #bias and acceleration
#> $raw
#>   component lower median upper    change
#> 1  additive 0.000  0.000 0.030 bias fail
#> 2    nonadd 0.619  0.624 0.735      <NA>
#> 3  maternal 0.174  0.206 0.240      <NA>
#> 
#> $percentage
#>   component lower median upper    change
#> 1  additive   0.0    0.0   2.9 bias fail
#> 2    nonadd  62.5   62.6  71.5      <NA>
#> 3  maternal  17.0   19.7  22.5      <NA>
#>ciMANA(comp=length_rcomp1,bias=c(0,0.6878,0.1900),accel=length_jack1) #full, observLmer

Jackknife confidence intervals

Jackknife resampling is another method for producing confidence intervals.

The equivalent to observLmer is available for jackknife resampling, i.e. JackLmer, using the observed data frame.

Default is delete-one jackknife resampling as size = 1 and REML as ml = F.

Full analysis uses all observations. Example has the first 10 observations.

#full, all observations
#>length_jack1<- JackLmer(observ=chinook_length,dam="dam",sire="sire",response="length")
#first 10 observations
length_jack1<- JackLmer(observ=chinook_length,dam="dam",sire="sire",response="length",first=10) 
#> [1] "2024-02-04 15:48:32 PST"
#> [1] "Removing observation: 1 of 1210"
#> [1] "Removing observation: 2 of 1210"
#> [1] "Removing observation: 3 of 1210"
#> [1] "Removing observation: 4 of 1210"
#> [1] "Removing observation: 5 of 1210"
#> [1] "Removing observation: 6 of 1210"
#> [1] "Removing observation: 7 of 1210"
#> [1] "Removing observation: 8 of 1210"
#> [1] "Removing observation: 9 of 1210"
#> [1] "Removing observation: 10 of 1210"
#> Time difference of 0.7191651 secs
head(length_jack1)
#>    dam:sire         sire       dam  Residual     Total     additive    nonadd
#> 1 0.1719272 1.130054e-10 0.1899717 0.6320471 0.9939461 4.520215e-10 0.6877089
#> 2 0.1720704 3.439641e-05 0.1902922 0.6317337 0.9941307 1.375856e-04 0.6882816
#> 3 0.1720430 1.401530e-10 0.1894845 0.6305184 0.9920459 5.606120e-10 0.6881722
#> 4 0.1721860 1.017022e-08 0.1903536 0.6314626 0.9940023 4.068088e-08 0.6887441
#> 5 0.1719302 7.149520e-08 0.1901255 0.6320635 0.9941192 2.859808e-07 0.6877207
#> 6 0.1719799 7.679377e-10 0.1901186 0.6320349 0.9941334 3.071751e-09 0.6879195
#>    maternal
#> 1 0.1899717
#> 2 0.1902578
#> 3 0.1894845
#> 4 0.1903536
#> 5 0.1901254
#> 6 0.1901186

Because the delete-one observation jackknife resampling may be computationally intensive for large data sets, the JackLmer function has the option of delete-d observation jackknife resampling, for which d > 1. The rows of the observed data frame are shuffled and a block of observations of size d is deleted sequentially. For example, delete-5 observation jackknife resampling is specified as size = 5, which deletes a block of 5 observations.

Full analysis uses all observations. Example has the first 10 observations.

#full
#>length_jack1D<- JackLmer(observ=chinook_length,dam="dam",sire="sire",response="length",size=5)
#first 10
length_jack1D<- JackLmer(observ=chinook_length,dam="dam",sire="sire",response="length",
size=5,first=10)
#> [1] "2024-02-04 15:48:33 PST"
#> [1] "Removing block: 1 of 242"
#> [1] "Removing block: 2 of 242"
#> [1] "Removing block: 3 of 242"
#> [1] "Removing block: 4 of 242"
#> [1] "Removing block: 5 of 242"
#> [1] "Removing block: 6 of 242"
#> [1] "Removing block: 7 of 242"
#> [1] "Removing block: 8 of 242"
#> [1] "Removing block: 9 of 242"
#> [1] "Removing block: 10 of 242"
#> Time difference of 0.586534 secs
head(length_jack1D)
#>    dam:sire         sire       dam  Residual     Total     additive    nonadd
#> 1 0.1724557 2.738820e-04 0.1873491 0.6298571 0.9899358 1.095528e-03 0.6898227
#> 2 0.1722088 3.706721e-04 0.1898717 0.6323838 0.9948350 1.482688e-03 0.6888352
#> 3 0.1720427 0.000000e+00 0.1896242 0.6330576 0.9947245 0.000000e+00 0.6881709
#> 4 0.1706830 0.000000e+00 0.1891575 0.6333865 0.9932270 0.000000e+00 0.6827322
#> 5 0.1723912 1.125891e-10 0.1907481 0.6326380 0.9957772 4.503564e-10 0.6895646
#> 6 0.1713051 0.000000e+00 0.1925064 0.6324205 0.9962320 0.000000e+00 0.6852202
#>    maternal
#> 1 0.1870752
#> 2 0.1895010
#> 3 0.1896242
#> 4 0.1891575
#> 5 0.1907481
#> 6 0.1925064

Extract the jackknife confidence intervals (CI) and median for the additive genetic, non-additive genetic, and maternal values from the data frame of models produced using JackLmer.

The mean and the standard error of pseudo-values for each variance component are calculated (Efron and Tibshirani 1993, p.184‒188). The standard error is then used with the Student’s t distribution to provide the lower and upper limits for the confidence interval. For delete-d jackknife resampling, M degrees of freedom were used for producing the confidence interval (Martin et al. 2004): M = N / d, where N is the total number of observations and d is the number of deleted observations. Large values of M, such as 1,000, can translate to the delete-d jackknife resampling method approaching bootstrap resampling expectations (Efron and Tibshirani 1993, p. 149).

Martin, H., Westad, F. & Martens, H. (2004). Improved Jackknife Variance Estimates of Bilinear Model Parameters. COMPSTAT 2004 – Proceedings in Computational Statistics 16th Symposium Held in Prague, Czech Republic, 2004 (ed J. Antoch), pp. 261-275. Physica-Verlag HD, Heidelberg.

Default confidence interval is 95% as level = 95.

full is a vector of additive, non-additive, maternal, and total variance components, i.e. c(additive,non-additive,maternal,total), from the raw observed variance components of observLmer.

The chinook_jackL example data for the jackknife CI were produced using another model including a position effect (see the vignette for Advanced Normal Data Example). The raw variance components of this model were additive= 0, non-additive= 0.7192, maternal= 0.2030, and total= 1.0404. Typically the variance components would be from observLmer above for the analysis pipeline.

data(chinook_jackL) #similar to length_jack1 all observations
ciJack(comp=chinook_jackL,full=c(0,0.7192,0.2030,1.0404))
#> $raw
#>   component lower  mean upper
#> 1  additive 0.000 0.000 0.000
#> 2    nonadd 0.481 0.691 0.900
#> 3  maternal 0.197 0.244 0.291
#> 
#> $percentage
#>   component lower mean upper
#> 1  additive   0.0  0.0   0.0
#> 2    nonadd  52.2 69.7  87.2
#> 3  maternal  20.6 24.4  28.2
#full, all observations, observLmer components
#>ciJack(comp=length_jack1,full=c(0,0.6878,0.1900,0.9935)) 

The raw values are presented and are converted to a percentage of the total variance for each model. Defaults are the number of decimal places to round CI raw values as rnd_r = 3 and to round the CI percent values as rnd_p = 1.

Plotting confidence intervals

The barMANA and boxMANA functions are simple plotting functions for the confidence intervals or all values, respectively, from the bootstrap and jackknife approaches. Default is to display the percentage values as type = perc. Raw values can be displayed as type = raw.

Within the functions, there are simple plot modifications available. For the y-axis, min and max values can be species as ymin and ymax, as well as the increment as yunit. Also, magnification of the axis unit as cex_yaxis and label as cex_ylab. The position of the legend can be specified as leg. Default is “topright”.

Bar plot

The barMANA function produces bar graphs with the bootstrap-t median (ciMANA) or jackknife pseudo-value mean (ciJack) as the top of the shaded bar and error bars covering the range of the confidence interval for each of the additive genetic, non-additive genetic, and maternal values of a phenotypic trait.

The length of the error bar can be specified in inches as bar_len.

length_ci<- ciJack(comp=chinook_jackL,full=c(0,0.7192,0.2030,1.0404))
oldpar<- par(mfrow=c(2,1))
barMANA(ci_dat=length_ci) #basic, top
barMANA(ci_dat=length_ci,bar_len=0.3,yunit=20,ymax=100,cex_ylab=1.3) #modified, bottom

Different traits can also be combined on the same bar plot using trait specified in ciMANA or ciJack. The information is combined into a list object. For the example, the jackknife CI is duplicated to simulate ‘different traits’.

length_ci1<- ciJack(comp=chinook_jackL,full=c(0,0.7192,0.2030,1.0404),trait="length_1")
length_ci2<- ciJack(comp=chinook_jackL,full=c(0,0.7192,0.2030,1.0404),trait="length_2")
comb_bar<- list(raw=rbind(length_ci1$raw,length_ci2$raw),
percentage=rbind(length_ci1$percentage,length_ci2$percentage)) 
barMANA(ci_dat=comb_bar,bar_len=0.3,yunit=20,ymax=100,cex_ylab=1.3)

The legend is slightly off in the presented html version but is fine with the R plotting device.

Box plot

The boxMANA function produces box plots using all values for the bootstrap-t resampling data set (resampLmer) or jackknife resampling data set (JackLmer).

oldpar<- par(mfrow=c(2,1))
boxMANA(comp=chinook_bootL) #from resampLmer, basic, top 
boxMANA(comp=chinook_bootL,yunit=20,ymax=100,cex_ylab=1.3,leg="topleft") #modified, bottom

Different traits can also be combined on the same box plot by adding a “trait” column to the resampling data set. For the example, the bootstrap-t data frame is duplicated to simulate ‘different traits’.

chinook_bootL1<- chinook_bootL; chinook_bootL2<- chinook_bootL #from resampLmer
chinook_bootL1$trait<- "length_1"; chinook_bootL2$trait<- "length_2"
comb_boot<- rbind(chinook_bootL1,chinook_bootL2)
comb_boot$trait<- as.factor(comb_boot$trait)
boxMANA(comb_boot,yunit=20,ymax=100,cex_ylab=1.3,leg="topleft")

The recommended follow-up vignette is the Advanced Normal Data Example, covering the standard model with the options of including additional random effects for one position (e.g. tank) and/or one block effect (e.g. several blocks of 2 \(\times\) 2 factorial matings).

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.