Most studies examine the effect of a certain factor on a measure. However, very rarely do we have hypotheses stating what would be the expected result. Instead, we have a control group to which the results of the treatment group will be compared.
Imagine a study in a school examining the impact of playing collaborative games before beginning the classes. This study most likely will have two groups, one where the students are playing collaborative games and one where the students will have non- structured activities prior to classes. Then, the objective of the study is to compare the two groups.
Consider the results obtained, where 100 is the usual score obtained by students prior to the study.
Figure 1: Mean scores along with 95% confidence interval for two groups of students on the quality of learning behavior.
As seen, there seems to be a better score for students playing collaborative games. Taking into account the confidence interval, the manipulation seems to improve significantly the learning behavior as the lower end of the Collaborative games group is above the Unstructured activity mean (and vice versa).
What a surprise to discover that a t-test will NOT confirm this impression! (t(48) = 1.76, p = .085) Here, I use the schoRsch t_out
function only to get human-readable outputs (Pfister & Janczyk, 2016).
library(schoRsch)
t_out(t.test(dataFigure1$score[dataFigure1$grp==1],
dataFigure1$score[dataFigure1$grp==2],
var.equal=T))
## Test Results
## 1 Two Sample t-test: t(48) = 1.76, p = .085, d = 0.50
The reason is that the confidence intervals used are “stand-alone”: They can be used to examine, say, the first group to the value 100. As this value is outside the interval, we are correct in concluding that 100 is significantly different (with level \(\alpha\) of .05) from 100, as confirmed by a single-group t-test:
t_out(t.test(dataFigure1$score[dataFigure1$grp==1], mu=100))
## Test Results
## 1 One Sample t-test: t(24) = 2.50, p = .020, d = 0.50
##
## NOTE: Reporting unadjusted estimate for Cohen's d.
Likewise, the second group’ mean is significantly different from 105 (which happens to be the first group’s mean):
t_out(t.test(dataFigure1$score[dataFigure1$grp==2], mu=105))
## Test Results
## 1 One Sample t-test: t(24) = -2.48, p = .021, d = -0.50
##
## NOTE: Reporting unadjusted estimate for Cohen's d.
This is precisely the purpose of stand-alone intervals: to compare a single result to a fix value. The fix value (here 100 for the first group and 105 for the second group) has no uncertainty, it is a constant.
In contrast, the two-group t-test compares two means, the two of which are uncertain. Therefore, in making a confidence interval, it is necessary that the basic, stand-alone, confidence interval be informed that it is going to be compared —not to fix values— to a second quantity which is itself uncertain.
In the language of analyses of variance, we can say that when the purpose of the plot is to compare means to other means, there is more variances in the comparisons than there is in single groups in isolation.
Assuming that the variances are roughly homogeneous between group (an assumption made by the t-test), there is a simple adjustment that can be brought to the error bars: just increase their length by \(\sqrt(2)\). As \(\sqrt(2) \approx 1.41\) it means increasing their length by 41%.
With superbPlot
, this so-called difference adjustment (Baguley, 2012) is obtained easily by adding an adjustment to the list of adjustments with adjustments = list(purpose = "difference")
, as seen below.
superbPlot(dataFigure1,
BSFactor = "grp",
adjustments=list(purpose = "difference"), # the only new thing here
variables = "score",
plotStyle="line" ) +
xlab("Group") + ylab("Score") +
labs(title="Difference-adjusted\n95% confidence intervals") +
coord_cartesian( ylim = c(85,115) ) +
theme_gray(base_size=16) +
scale_x_discrete(labels=c("1" = "Collaborative games", "2" = "Unstructured activity"))
Figure 2: Mean scores along with difference-adjusted 95% confidence interval for two groups of students on the quality of learning behavior.
This is where the usefulness of superb
is apparent: it only required an option to mutate the plot from a plot showing means with stand-alone confidence intervals to a plot showing means with difference-adjusted confidence intervals.
Just for comparison purposes, I show both plots side-by-side.
library(gridExtra)
plt1 <- superbPlot(dataFigure1,
BSFactor = "grp",
variables = "score",
plotStyle="line" ) +
xlab("Group") + ylab("Score") +
labs(title="(stand-alone)\n95% confidence intervals") +
coord_cartesian( ylim = c(85,115) ) +
theme_gray(base_size=16) +
scale_x_discrete(labels=c("1" = "Collaborative games", "2" = "Unstructured activity"))
plt2 <- superbPlot(dataFigure1,
BSFactor = "grp",
adjustments=list(purpose = "difference"),
variables = "score",
plotStyle="line" ) +
xlab("Group") + ylab("Score") +
labs(title="Difference-adjusted\n95% confidence intervals") +
coord_cartesian( ylim = c(85,115) ) +
theme_gray(base_size=16) +
scale_x_discrete(labels=c("1" = "Collaborative games", "2" = "Unstructured activity"))
plt <- grid.arrange(plt1, plt2, ncol=2)
Figure 3: Two representation of the data with unadjusted (left) and adjusted (right) 95% confidence intervals
A second way to compare the two plots is to superimpose them, as in Figure 4:
# generate the two plots, nudging the error bars, using distinct colors, and having the background transparent
plt1 <- superbPlot(dataFigure1,
BSFactor = "grp",
variables = "score",
errorbarParams = list(color="blue",position = position_nudge(-0.05) ),
plotStyle="line" ) +
xlab("Group") + ylab("Score") +
labs(title="(red) Difference-adjusted 95% confidence intervals\n(blue) (stand-alone) 95% confidence intervals") +
coord_cartesian( ylim = c(85,115) ) +
theme_gray(base_size=16) +
scale_x_discrete(labels=c("1" = "Collaborative games", "2" = "Unstructured activity")) +
theme(panel.background = element_rect(fill = "transparent"),
plot.background = element_rect(fill = "transparent", color = "white"))
plt2 <- superbPlot(dataFigure1,
BSFactor = "grp",
adjustments=list(purpose = "difference"),
variables = "score",
errorbarParams = list(color="red",position = position_nudge(0.05) ),
plotStyle="line" ) +
xlab("Group") + ylab("Score") +
labs(title="(red) Difference-adjusted 95% confidence intervals\n(blue) (stand-alone) 95% confidence intervals") +
coord_cartesian( ylim = c(85,115) ) +
theme_gray(base_size=16) +
scale_x_discrete(labels=c("1" = "Collaborative games", "2" = "Unstructured activity")) +
theme(panel.background = element_rect(fill = "transparent"),
plot.background = element_rect(fill = "transparent", color = "white"))
# transform the ggplots into "grob" so that they can be manipulated
plt1g <- ggplotGrob(plt1)
plt2g <- ggplotGrob(plt2)
# put the two grob onto an empty ggplot (as the positions are the same, they will be overlayed)
ggplot() +
annotation_custom(grob=plt1g) +
annotation_custom(grob=plt2g)
Figure 4: Two representations of the results with adjusted and unadjusted error bars on the same plot
Adjusting the confidence intervals is important to have coherence between the text and the figure. If you are to claim that there is no difference but show Figure 1, an examinator (you know, a reviewer) may raise a red flag and cast doubt on your conclusions (opening the door to many rounds of reviews or rejection if this is a submitted work).
Having coherence between the figures and the tests reported in your document is one way to improve the clarity of your work. Coherence here comes cheap: You just need to add in the figure caption “Difference-adjusted” before “95% confidence intervals.”