c
: centered
sc
: centered and scaled
A
: additive model; same slope for all species/lifestage combinations
X
: different slope per species/lifestage combination

Hawaiian contemporary data, supplied by Peter Follett, are used to demonstrate the functions in the qra package. Several different styles of model are compared. This turns out to be a challenging dataset to work with.
We start by using GLM models to check out how well lines fit to the individual species/lifestage and species/lifestage/replicate combinations.
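As a minimal sketch of this kind of fit, the following uses simulated stand-in data rather than the Hawaiian data. The grouping column `trtGp`, the group labels, and all numeric values are assumptions made for illustration; `TrtTime`, `Dead` and `Live` follow names used elsewhere in this document.

```r
## Simulated stand-in data: 2 groups, 6 exposure times each (all values hypothetical)
set.seed(1)
dat <- expand.grid(TrtTime = c(2, 4, 6, 8, 10, 12), trtGp = c("g1", "g2"))
dat$Total <- 200
p <- plogis(-3 + 0.6 * dat$TrtTime + ifelse(dat$trtGp == "g2", 0.8, 0))
dat$Dead <- rbinom(nrow(dat), dat$Total, p)
dat$Live <- dat$Total - dat$Dead

## One intercept and one slope per group, on the logit scale.
## The quasibinomial family allows the dispersion to differ from 1.
fit <- glm(cbind(Dead, Live) ~ 0 + trtGp / TrtTime,
           family = quasibinomial(link = "logit"), data = dat)
coef(fit)
```

The `0 + trtGp / TrtTime` formula gives the intercepts and slopes directly, one pair per group, which makes it easy to compare fitted lines.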
The graphs now shown use a logistic transform for the \(y\)-scale. Responses appear acceptably linear, after the first one or two observations.
In what follows, where at least 7 times with < 100% mortality are available, observations are restricted to days 6 or later. Otherwise, where at least 5 times with < 100% mortality are available, observations are restricted to days 4 or later. The points that remain then appear, apart from clear outliers, to be acceptably linear on a logit mortality scale.
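The restriction rule can be sketched as a small helper that is applied to one species/lifestage/replicate combination at a time. The function name `restrict_days` and the example values are hypothetical, and keeping all observations when fewer than 5 such times are available is an assumption, as that case is not spelled out above.

```r
## Return a logical vector flagging the observations to keep for one group.
restrict_days <- function(day, dead, total) {
  n_partial <- sum(dead < total)   # number of times with < 100% mortality
  min_day <- if (n_partial >= 7) 6 else if (n_partial >= 5) 4 else -Inf
  day >= min_day
}

## Hypothetical group with 7 times at < 100% mortality
day   <- c(1, 2, 4, 6, 8, 10, 12, 14)
dead  <- c(10, 30, 60, 120, 160, 185, 197, 200)
total <- rep(200, 8)
keep  <- restrict_days(day, dead, total)
day[keep]
```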
Alternatives that in principle might be used are:
fieller() in this package could be fairly straightforwardly adapted to handle this case.
Alternative 1 will handle a wider range of cases than 2, which models the very specific form of nonlinearity that results from Abbott’s formula type control mortality effects.
Both these alternatives require the estimation of additional parameters. For models that fit curves, a strategy is required for deciding on the family of curves that will be used, and on how the change of curve between treatment groups will be parameterized. For zero-inflated models, a strategy is required for deciding on whether zero-inflation parameters can be assumed common across some treatment groups.
Even where sample-based control mortality estimates are available, these will in general be too inaccurate to use as known, fixed, control mortalities.
The following shows diagnostic plots, after fitting one line for each species/lifestage/replicate combination. Fitted lines are very clearly different between replicates. As replicates are treated as fixed effects, the models fitted here are not suitable for generalization beyond the data used to fit the model. The models will be used for diagnostic purposes only.
We wish to check:
Two types of model will be fitted: a generalised linear model (GLM), and a linear model with \(y = \log\left(\dfrac{\text{Dead}+1/6}{\text{Live}+1/6}\right)\).
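The transformed response for the linear model can be computed directly from the counts. The helper name `emplogit` is hypothetical; the offset of 1/6 is the one given in the formula above, and it keeps the value finite when there are no dead or no live insects.

```r
## Empirical logit with an offset, so that 0% and 100% mortality stay finite
emplogit <- function(dead, live, eps = 1/6) log((dead + eps) / (live + eps))

## Illustrative counts: none dead, half dead, all dead
y <- emplogit(dead = c(0, 50, 100), live = c(100, 50, 0))
y
```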
Warning: not plotting observations with leverage one:
70, 71, 76, 77
Diagnostic plots — GLM X model
The “Residuals vs Fitted” plot suggests that a systematic pattern of variation may remain after fitting the line. Note, however, that the fitted smooth may be misleading, given some large outliers and the strong indication in the “Scale-Location” plot that the GLM model is giving too much weight to points at midrange mortalities, and too little weight to high mortality points. The assumption of a constant dispersion is clearly seriously wrong. One answer to the changes in dispersion may be to adjust the GLM weightings.
(The scale-location plot will be close to a horizontal line if points are being correctly weighted. It shows reduced variation about the line as mortalities increase.)
We investigate adjusting the weights by reversing or partially reversing the effect of the weighting that is implicit in the use of a generalized linear model with quasibinomial errors.
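One way such a reweighting might be set up is sketched below, on simulated data. With a logit link, the fitting algorithm in effect weights each point in proportion to \(n p (1-p)\). Supplying prior weights proportional to \(1/\{p(1-p)\}^k\) then reverses that weighting fully (\(k = 1\)) or partially (\(0 < k < 1\)). The exponent `power`, the use of fitted values from an initial unweighted fit, and all data values are assumptions for illustration, not the code used for the plots that follow.

```r
## Simulated data (hypothetical values)
set.seed(2)
time  <- rep(c(2, 4, 6, 8, 10, 12), 2)
total <- rep(200, 12)
dead  <- rbinom(12, total, plogis(-3 + 0.6 * time))
live  <- total - dead

## Step 1: unweighted quasibinomial fit, to get fitted mortalities
fit0 <- glm(cbind(dead, live) ~ time, family = quasibinomial)
p0 <- fitted(fit0)

## Step 2: prior weights that partially undo the implicit p(1-p) weighting
power <- 0.5                       # assumed value; 1 would reverse it fully
w <- 1 / (p0 * (1 - p0))^power
w <- w / mean(w)                   # rescale so that the weights average 1

## Step 3: refit with the adjusted weights
fit1 <- glm(cbind(dead, live) ~ time, family = quasibinomial, weights = w)
coef(fit1)
```

The effect is to give relatively more weight to the high mortality points than the default fit does.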
Warning: not plotting observations with leverage one:
70, 71, 76, 77
GLM model, logit link, adjusted weights
Now omit point 52 and repeat the plots:
Warning: not plotting observations with leverage one:
37, 38, 69, 70, 75, 76
GLM model, logit link, adjusted weights, omitting points 51 and 52
The “Residuals vs Fitted” plot is not as flat as one would like. The span for the smooth may however be set too small for this dataset.
Again, we fit one line per replicate, using a robust fit in order to downweight the influence of outliers.
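The text does not say which robust fitting function is used. As an illustration only, the sketch below uses `rlm()` from the MASS package with its default Huber M-estimator, on simulated data with one planted outlier. The weights that `rlm()` returns show how strongly each point has been downweighted.

```r
library(MASS)

## Simulated line with one planted outlier (all values hypothetical)
set.seed(3)
time <- seq(4, 22, by = 2)
y <- -2 + 0.4 * time + rnorm(10, sd = 0.2)
y[5] <- y[5] + 4                       # outlier at the fifth point

fit_ls  <- lm(y ~ time)                # ordinary least squares, for comparison
fit_rob <- rlm(y ~ time, maxit = 50)   # robust fit, Huber weights by default

## Final robustness weights: values below 1 mark downweighted points
round(fit_rob$w, 2)
```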
Warning: not plotting observations with leverage one:
70, 71, 76, 77
lm model; y = log((Dead+1/6)/(Live+1/6))
Several points are identified as outliers. The following checks the diagnostic plots that result when they are omitted:
Warning: not plotting observations with leverage one:
37, 38, 39, 40, 66, 67, 72, 73
lm model; y = log((Dead+1/6)/(Live+1/6))
In the mixed model context, the main effect of differences in the way that individual lines are fitted is in the efficiency of use of the data.
rlmer
We fit robust versions of the linear mixed model. This allows, however, for a random intercept only. The model that allowed also for a random slope generated an error. (A guess is that there were too few points to allow a satisfactory fit.)
Loading required package: lme4
Loading required package: Matrix
The largest random effect is a clear outlier.
glmer
The following applies the relative weightings that were used for the fit to trtGpRep with adjusted weights:
The comparison is for X models, with slopes that vary between species/lifestage combinations. Models work either with cTime (centered version of TrtTime), or with scTime (centered and scaled).
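The two time variables can be illustrated as follows. The `TrtTime` values are made up, and scaling by the sample standard deviation (what `scale()` does by default) is an assumption about how `scTime` was formed.

```r
TrtTime <- c(4, 6, 8, 10, 12, 14)      # hypothetical exposure times, in days

cTime  <- TrtTime - mean(TrtTime)      # centered: mean 0
scTime <- as.numeric(scale(TrtTime))   # centered and scaled: mean 0, SD 1

cbind(TrtTime, cTime, scTime)
```

Centering changes only the meaning of the intercept, which then refers to the mean time. Scaling also changes the units of the slope, which can help the numerical optimization.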
scTime modXW.glmer; work with scTime
scTime modXWi.glmer; work with scTime
My preference, following on from my experience in working with disinfestation data in the past year, is Model 1. In work that I undertook prior to around 2000, the choice was to work with some equivalent of Model 2. The preference may depend somewhat on the individual dataset. Additionally, it may be that we were not at that time very practiced with the use of diagnostic plots.
A further possibility is to fit one LT value for each replicate, then use those values as the basis for further analysis. Results may be unsatisfactory if the accuracy of the LT99 estimates varies widely between replicates.
The following assume 16 degrees of freedom for the variance-covariance matrix. There are 8 \(\times\) 3 replicates in all. For each species/lifestage combination, two of the three degrees of freedom are left over after estimating the species/lifestage specific intercept, i.e., there are 8 \(\times\) 2 = 16 degrees of freedom. The same is the case for the (random) estimate. The value 16 is then the smallest of the degrees of freedom for the sources of variability that contribute to the variance-covariance matrix.
The confidence interval calculations have not taken account of the omission of outliers in the analyses, or of the use of robust methods in the analyses. The confidence intervals are on this account likely to be anti-conservative. In principle, one can bootstrap the calculations, but a likely roadblock for the present data is that calculations will fail for at least some of the bootstrap samples.
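For a fitted line \(a + b\,t\) on the logit scale, LT99 is the ratio \((\mathrm{logit}(0.99) - a)/b\), and Fieller's theorem gives a confidence interval for such a ratio. The sketch below is a generic base R version written for illustration; it is not the `fieller()` function from this package. The function name `lt_fieller`, the coefficient values, and the variance-covariance matrix `V` are all hypothetical. The default of 16 degrees of freedom follows the argument above.

```r
## Fieller interval for the time at which fitted mortality reaches p.
## a, b: intercept and slope on the logit scale; V: their 2 x 2 covariance matrix.
lt_fieller <- function(a, b, V, p = 0.99, df = 16, level = 0.95) {
  num <- qlogis(p) - a                 # numerator of the ratio
  est <- num / b
  v11 <- V[1, 1]; v22 <- V[2, 2]
  v12 <- -V[1, 2]                      # cov(num, b) = -cov(a, b)
  tq <- qt(1 - (1 - level) / 2, df)
  g <- tq^2 * v22 / b^2                # g >= 1: slope not clearly nonzero
  if (g >= 1) return(c(est = est, lower = NA, upper = NA))
  half <- (tq / abs(b)) *
    sqrt(v11 - 2 * est * v12 + est^2 * v22 - g * (v11 - v12^2 / v22))
  centre <- est - g * v12 / v22
  c(est = est, lower = (centre - half) / (1 - g), upper = (centre + half) / (1 - g))
}

## Hypothetical coefficients and covariance matrix
V <- matrix(c(0.09, -0.01, -0.01, 0.0025), nrow = 2)
res <- lt_fieller(a = -3, b = 0.8, V = V)
round(res, 2)
```

The interval is not symmetric about the estimate. When `g` is 1 or more, the limits do not form a finite interval, and the sketch returns `NA` for both.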
The following are LT99 and approximate confidence intervals:
MedFlyEgg:
Model                                estval     var   lower   upper
lmm (rlmer), random intercept         10.02   1.167    7.92    12.8
glmer, uncorrected wts                 8.89   1.855    6.69    13.5
glmer, corrected weights               9.38   1.445    7.31    12.9
glmer, uncorr wts, random intercept   10.19   0.740    8.42    12.1
glmer, corr wts, random intercept      9.98   0.897    8.15    12.3
GLM: Corr weights                     14.83  12.250   10.55    48.1
MedFlyL1:
Model                                estval     var   lower   upper
lmm (rlmer), random intercept          6.11   2.208    2.75   10.73
glmer, uncorrected wts                 5.36   1.356    3.32    9.75
glmer, corrected weights               5.63   1.597    3.48   10.82
glmer, uncorr wts, random intercept    5.36   0.905    3.51    7.78
glmer, corr wts, random intercept      5.63   1.383    3.53    9.61
GLM: Corr weights                      5.63   2.950      NA      NA
MedFlyL2:
Model                                estval     var   lower   upper
lmm (rlmer), random intercept          7.58   0.425    6.27    9.14
glmer, uncorrected wts                 7.92   0.831    6.35   10.56
glmer, corrected weights               7.68   0.538    6.38    9.75
glmer, uncorr wts, random intercept    7.99   0.304    6.86    9.22
glmer, corr wts, random intercept      7.62   0.373    6.48    9.21
GLM: Corr weights                      7.54   0.890    6.24   12.61
MedFlyL3:
Model                                estval     var   lower   upper
lmm (rlmer), random intercept          6.14   0.254    5.14    7.35
glmer, uncorrected wts                 7.09   0.555    5.78    9.20
glmer, corrected weights               7.01   0.520    5.77    9.15
glmer, uncorr wts, random intercept    6.84   0.212    5.91    7.89
glmer, corr wts, random intercept      6.97   0.395    5.84    8.70
GLM: Corr weights                      6.67   0.757    5.49   11.68
MelonFlyEgg:
Model                                estval     var   lower   upper
lmm (rlmer), random intercept          3.58  0.3327    2.25    5.05
glmer, uncorrected wts                 3.33  0.2173    2.54    5.07
glmer, corrected weights               2.74  0.0546    2.32    3.46
glmer, uncorr wts, random intercept    3.32  0.1910    2.60    4.91
glmer, corr wts, random intercept      2.74  0.0522    2.33    3.43
GLM: Corr weights                      2.74  0.1274    2.29   34.57
MelonFlyL1:
Model                                estval     var   lower   upper
lmm (rlmer), random intercept          7.89    2.78    3.76    13.5
glmer, uncorrected wts                 7.99    6.00    4.55    41.2
glmer, corrected weights               7.84    3.91    4.76    24.8
glmer, uncorr wts, random intercept    7.99    2.32    5.07    12.3
glmer, corr wts, random intercept      7.84    2.76    4.94    15.9
GLM: Corr weights                      7.84    6.18      NA      NA
MelonFlyL2:
Model                                estval     var   lower   upper
lmm (rlmer), random intercept          11.1    1.88    8.41    14.7
glmer, uncorrected wts                 12.3   11.58    7.84    37.1
glmer, corrected weights               12.2    5.84    8.52    22.1
glmer, uncorr wts, random intercept    12.5    1.78    9.74    15.5
glmer, corr wts, random intercept      12.5    2.37    9.62    16.5
GLM: Corr weights                      14.0    7.12   10.58    34.2
MelonFlyL3:
Model                                estval     var   lower   upper
lmm (rlmer), random intercept          8.55   0.619    7.00    10.5
glmer, uncorrected wts                 9.65   2.285    7.25    14.9
glmer, corrected weights               9.73   1.505    7.64    13.4
glmer, uncorr wts, random intercept    9.68   0.591    8.10    11.4
glmer, corr wts, random intercept      9.62   0.798    7.93    11.9
GLM: Corr weights                      9.71   1.629    7.93    15.9
The GLM model (with corrected weights) is clearly unsatisfactory. The fitting of a random slope does in some cases, for the model fitted using glmer, increase the width of the confidence interval. This happens both with and without the weighting. The same would almost certainly be the case for the rlmer model.
Use of glmer with “corrected” weights does lead to more consistent confidence intervals than for glmer with uncorrected weights.
Models 1 (rlmer with random intercepts only) and 5 (glmer with random intercepts only, corrected weights) give very similar confidence intervals. It is reasonable to expect that rlmer with random intercepts and slopes would give similar results to the weighted glmer model with random intercepts and slopes.
Model 6 is clearly unsatisfactory.
Attention is limited to the linear mixed model. 90% confidence ellipses are added.
Slope versus intercept plot. The intercept is a difference from the mean. Contour lines have been added for constant LT99=11 days (labeled with Es), LT99=9 days (labeled with 9s), and LT99=7 days (labeled with 7s).
The following uses a Hotelling \(T^2\) test to compare the means in the case where the two ellipses (MedFly L1 and MedFly L3) are closest to touching. This calculation requires careful checking. The \(p\)-value should be adjusted upwards for the number of comparisons made.
[1] 5e-04
On Hotelling T2, see, e.g., https://en.wikipedia.org/wiki/Hotelling%27s_T-squared_distribution#Statistic
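A generic form of the calculation can be sketched as follows; this is not necessarily the code that produced the \(p\)-value above. It assumes that `d` is the difference between two (intercept, slope) mean vectors and that `V` is an estimate, with `df` degrees of freedom, of the covariance matrix of that difference. Then \(T^2 = d^\top V^{-1} d\), and \(\frac{df - p + 1}{df \cdot p} T^2\) is referred to an \(F\) distribution with \(p\) and \(df - p + 1\) degrees of freedom. The function name `hotelling_p`, the numeric values, and the count of 28 comparisons (all pairs among 8 groups) are hypothetical.

```r
## p-value for a Hotelling T^2 test of a zero mean difference
hotelling_p <- function(d, V, df) {
  p <- length(d)
  T2 <- drop(t(d) %*% solve(V, d))          # T^2 = d' V^{-1} d
  Fstat <- (df - p + 1) / (df * p) * T2     # convert T^2 to an F statistic
  pf(Fstat, p, df - p + 1, lower.tail = FALSE)
}

## Hypothetical difference in (intercept, slope) and its covariance matrix
d <- c(0.9, -0.15)
V <- matrix(c(0.10, -0.01, -0.01, 0.004), nrow = 2)
pval <- hotelling_p(d, V, df = 16)

## Bonferroni adjustment, assuming 28 pairwise comparisons
p.adjust(pval, method = "bonferroni", n = 28)
```

The Bonferroni adjustment is the simplest way to allow for the number of comparisons, and is conservative.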