Calculating the average nLTT plot of multiple phylogenies is not a trivial task.
The function get_nltt_values collects the nLTT values of a collection of phylogenies as tidy data.
This allows for a good interplay with ggplot2.
Create two easy trees:
newick1 <- "((A:1,B:1):2,C:3);"
newick2 <- "((A:2,B:2):1,C:3);"
phylogeny1 <- ape::read.tree(text = newick1)
phylogeny2 <- ape::read.tree(text = newick2)
phylogenies <- c(phylogeny1, phylogeny2)
They are very similar. phylogeny1 has short tips:
This can be observed in the nLTT plot:
As a collection of timepoints:
time | N |
---|---|
0.0000000 | 0.3333333 |
0.6666667 | 0.6666667 |
1.0000000 | 1.0000000 |
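The timepoints in this table can be reproduced by hand. The sketch below uses base R only and assumes the branching times 3 and 1 (in time units before the present), read off the Newick string of phylogeny1. The variable names are illustrative, and this is not the nLTT package's own implementation.

```r
# Branching times of "((A:1,B:1):2,C:3);", read off the Newick string by hand:
# the root lies 3 time units before the present, the split of A and B lies 1.
branching_times <- c(3, 1)
n_tips <- length(branching_times) + 1
crown_age <- max(branching_times)

# Rescale time so that the root is at 0 and the present is at 1
time <- c((crown_age - sort(branching_times, decreasing = TRUE)) / crown_age, 1)
# Following the table above, the i-th timepoint carries the value i / n_tips
N <- seq_len(n_tips) / n_tips

nltt_matrix <- cbind(time, N)
print(nltt_matrix)
```

Both columns end at 1, which is what makes nLTT values of trees with different ages and sizes comparable.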
Plotting those timepoints:
df <- as.data.frame(nLTT::get_phylogeny_nltt_matrix(phylogeny1))
ggplot2::qplot(
  time, N, data = df, geom = "step", ylim = c(0, 1), direction = "vh",
  main = "NLTT plot of phylogeny 1"
)
## Warning: `qplot()` was deprecated in ggplot2 3.4.0.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
phylogeny2 has longer tips:
This can also be observed in the nLTT plot:
As a collection of timepoints:
time | N |
---|---|
0.0000000 | 0.3333333 |
0.3333333 | 0.6666667 |
1.0000000 | 1.0000000 |
Plotting those timepoints:
df <- as.data.frame(nLTT::get_phylogeny_nltt_matrix(phylogeny2))
ggplot2::qplot(
  time, N, data = df, geom = "step", ylim = c(0, 1), direction = "vh",
  main = "NLTT plot of phylogeny 2"
)
The average nLTT plot should be somewhere in the middle.
It is constructed from stretched nLTT matrices.
Here is the nLTT matrix of the first phylogeny:
t <- nLTT::stretch_nltt_matrix(
  nLTT::get_phylogeny_nltt_matrix(phylogeny1), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
t | nltt |
---|---|
0.0 | 0.6666667 |
0.2 | 0.6666667 |
0.4 | 0.6666667 |
0.6 | 0.6666667 |
0.8 | 1.0000000 |
1.0 | 1.0000000 |
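How the "upper" step type arrives at these values can be sketched in base R. The helper stretch_upper below is hypothetical and is not the code of nLTT::stretch_nltt_matrix. It assumes that each grid point takes the N of the first timepoint that lies strictly later, and the final N once no later timepoint exists. Under that assumption it reproduces the table above.

```r
# Hypothetical helper, written for illustration only
stretch_upper <- function(nltt_matrix, dt) {
  grid <- seq(0, 1, by = dt)
  values <- vapply(grid, function(t) {
    later <- which(nltt_matrix[, 1] > t)
    if (length(later) == 0) {
      # Nothing lies later than this grid point: use the final value
      nltt_matrix[nrow(nltt_matrix), 2]
    } else {
      nltt_matrix[later[1], 2]
    }
  }, numeric(1))
  cbind(t = grid, nltt = values)
}

# nLTT timepoints of phylogeny1, copied from the table shown earlier
m <- cbind(time = c(0, 2 / 3, 1), N = c(1 / 3, 2 / 3, 1))
stretched <- stretch_upper(m, dt = 0.2)
print(stretched)
```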
Here is the nLTT matrix of the second phylogeny:
t <- nLTT::stretch_nltt_matrix(
  nLTT::get_phylogeny_nltt_matrix(phylogeny2), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
t | nltt |
---|---|
0.0 | 0.6666667 |
0.2 | 0.6666667 |
0.4 | 1.0000000 |
0.6 | 1.0000000 |
0.8 | 1.0000000 |
1.0 | 1.0000000 |
Here is the average nLTT matrix of both phylogenies:
t | nltt |
---|---|
0.0 | 0.6666667 |
0.2 | 0.6666667 |
0.4 | 0.8333333 |
0.6 | 0.8333333 |
0.8 | 1.0000000 |
1.0 | 1.0000000 |
Observe how the numbers get averaged.
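The averaging is a row-wise mean of the stretched nLTT values. A minimal base R sketch, with the two value columns copied from the stretched matrices above (the variable names are illustrative):

```r
t <- seq(0, 1, by = 0.2)
nltt_1 <- c(2 / 3, 2 / 3, 2 / 3, 2 / 3, 1, 1) # stretched phylogeny1
nltt_2 <- c(2 / 3, 2 / 3, 1, 1, 1, 1)         # stretched phylogeny2

# Both matrices share the same time grid, so the values are averaged per row
average <- cbind(t = t, nltt = rowMeans(cbind(nltt_1, nltt_2)))
print(average)
```

At t = 0.4, for example, the result is (0.6666667 + 1) / 2 = 0.8333333, as in the average matrix.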
The same, now shown as a plot:
Here is a demo of how the new function works:
id | t | nltt |
---|---|---|
1 | 0.0 | 0.6666667 |
1 | 0.2 | 0.6666667 |
1 | 0.4 | 0.6666667 |
1 | 0.6 | 0.6666667 |
1 | 0.8 | 1.0000000 |
1 | 1.0 | 1.0000000 |
2 | 0.0 | 0.6666667 |
2 | 0.2 | 0.6666667 |
2 | 0.4 | 1.0000000 |
2 | 0.6 | 1.0000000 |
2 | 0.8 | 1.0000000 |
2 | 1.0 | 1.0000000 |
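The tidy layout stores one row per phylogeny per timepoint. As a rough base R sketch of that layout (this is not the implementation of get_nltt_values; the values are copied from the stretched matrices above and the variable names are illustrative):

```r
t <- seq(0, 1, by = 0.2)
stretched <- list(
  c(2 / 3, 2 / 3, 2 / 3, 2 / 3, 1, 1), # phylogeny1
  c(2 / 3, 2 / 3, 1, 1, 1, 1)          # phylogeny2
)

# Stack the per-phylogeny values, labelling each block with its index
tidy <- do.call(rbind, lapply(seq_along(stretched), function(i) {
  data.frame(id = i, t = t, nltt = stretched[[i]])
}))
print(tidy)
```

Because every observation is a row of its own, a data frame of this shape can be handed to ggplot2 directly, with id mapped to color.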
Plotting options. First, create a data frame:
Here we see an averaged nLTT plot, where the original nLTT values are still visible:
ggplot2::qplot(
  t, nltt, data = df, geom = "point", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)
Here we see an averaged nLTT plot, with the original nLTT values omitted:
ggplot2::qplot(
  t, nltt, data = df, geom = "blank", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies"
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)
Create two harder trees:
newick1 <- "((A:1,B:1):1,(C:1,D:1):1);"
newick2 <- paste0(
  "((((XD:1,ZD:1):1,CE:2):1,(FE:2,EE:2):1):4,((AE:1,BE:1):1,",
  "(WD:1,YD:1):1):5);"
)
phylogeny1 <- ape::read.tree(text = newick1)
phylogeny2 <- ape::read.tree(text = newick2)
phylogenies <- c(phylogeny1, phylogeny2)
They are different. phylogeny1 is relatively simple,
with two branching events happening at the same time:
This can be observed in the nLTT plot:
As a collection of timepoints:
time | N |
---|---|
0.0 | 0.25 |
0.5 | 0.50 |
0.5 | 0.75 |
1.0 | 1.00 |
phylogeny2 is more elaborate:
This can also be observed in the nLTT plot:
As a collection of timepoints:
time | N |
---|---|
0.0000000 | 0.1111111 |
0.5714286 | 0.2222222 |
0.7142857 | 0.3333333 |
0.7142857 | 0.4444444 |
0.7142857 | 0.5555556 |
0.8571429 | 0.6666667 |
0.8571429 | 0.7777778 |
0.8571429 | 0.8888889 |
1.0000000 | 1.0000000 |
The average nLTT plot should be somewhere in the middle.
It is constructed from stretched nLTT matrices.
Here is the nLTT matrix of the first phylogeny:
t <- nLTT::stretch_nltt_matrix(
  nLTT::get_phylogeny_nltt_matrix(phylogeny1), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
t | nltt |
---|---|
0.0 | 0.5 |
0.2 | 0.5 |
0.4 | 0.5 |
0.6 | 1.0 |
0.8 | 1.0 |
1.0 | 1.0 |
Here is the nLTT matrix of the second phylogeny:
t <- nLTT::stretch_nltt_matrix(
  nLTT::get_phylogeny_nltt_matrix(phylogeny2), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
t | nltt |
---|---|
0.0 | 0.2222222 |
0.2 | 0.2222222 |
0.4 | 0.2222222 |
0.6 | 0.3333333 |
0.8 | 0.6666667 |
1.0 | 1.0000000 |
Here is the average nLTT matrix of both phylogenies:
t | nltt |
---|---|
0.0 | 0.3611111 |
0.2 | 0.3611111 |
0.4 | 0.3611111 |
0.6 | 0.6666667 |
0.8 | 0.8333333 |
1.0 | 1.0000000 |
Observe how the numbers get averaged.
Here is a demo of how the new function works:
id | t | nltt |
---|---|---|
1 | 0.0 | 0.5000000 |
1 | 0.2 | 0.5000000 |
1 | 0.4 | 0.5000000 |
1 | 0.6 | 1.0000000 |
1 | 0.8 | 1.0000000 |
1 | 1.0 | 1.0000000 |
2 | 0.0 | 0.2222222 |
2 | 0.2 | 0.2222222 |
2 | 0.4 | 0.2222222 |
2 | 0.6 | 0.3333333 |
2 | 0.8 | 0.6666667 |
2 | 1.0 | 1.0000000 |
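From tidy data, the average nLTT value per timepoint is a grouped mean. The base R sketch below rebuilds the table above by hand and averages it with aggregate. It is an illustration only: it yields the mean values, not the bootstrapped confidence band that mean_cl_boot computes for the plots.

```r
t <- seq(0, 1, by = 0.2)
tidy <- data.frame(
  id = rep(1:2, each = length(t)),
  t = rep(t, times = 2),
  nltt = c(
    0.5, 0.5, 0.5, 1, 1, 1,              # id 1
    2 / 9, 2 / 9, 2 / 9, 3 / 9, 6 / 9, 1 # id 2
  )
)

# Mean nLTT value per timepoint, taken across phylogenies
average <- aggregate(nltt ~ t, data = tidy, FUN = mean)
print(average)
```

The resulting means agree with the average nLTT matrix of both phylogenies, for example (0.5 + 0.2222222) / 2 = 0.3611111 at t = 0.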
Plotting options. First, create a data frame:
Here we see an averaged nLTT plot, where the original nLTT values are still visible:
ggplot2::qplot(
  t, nltt, data = df, geom = "point", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)
Here we see an averaged nLTT plot, with the original nLTT values omitted:
ggplot2::qplot(
  t, nltt, data = df, geom = "blank", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies"
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)
Create seven random trees:
set.seed(42)
phylogeny1 <- ape::rcoal(10)
phylogeny2 <- ape::rcoal(20)
phylogeny3 <- ape::rcoal(30)
phylogeny4 <- ape::rcoal(40)
phylogeny5 <- ape::rcoal(50)
phylogeny6 <- ape::rcoal(60)
phylogeny7 <- ape::rcoal(70)
phylogenies <- c(
  phylogeny1, phylogeny2, phylogeny3,
  phylogeny4, phylogeny5, phylogeny6, phylogeny7
)
Here is a demo of how the new function works:
id | t | nltt |
---|---|---|
1 | 0.0 | 0.2000000 |
1 | 0.2 | 0.2000000 |
1 | 0.4 | 0.2000000 |
1 | 0.6 | 0.2000000 |
1 | 0.8 | 0.3000000 |
1 | 1.0 | 1.0000000 |
2 | 0.0 | 0.1000000 |
2 | 0.2 | 0.1000000 |
2 | 0.4 | 0.1000000 |
2 | 0.6 | 0.1000000 |
2 | 0.8 | 0.2000000 |
2 | 1.0 | 1.0000000 |
3 | 0.0 | 0.0666667 |
3 | 0.2 | 0.0666667 |
3 | 0.4 | 0.1000000 |
3 | 0.6 | 0.1333333 |
3 | 0.8 | 0.2333333 |
3 | 1.0 | 1.0000000 |
4 | 0.0 | 0.0500000 |
4 | 0.2 | 0.0500000 |
4 | 0.4 | 0.0500000 |
4 | 0.6 | 0.1000000 |
4 | 0.8 | 0.2750000 |
4 | 1.0 | 1.0000000 |
5 | 0.0 | 0.0400000 |
5 | 0.2 | 0.0600000 |
5 | 0.4 | 0.0600000 |
5 | 0.6 | 0.0600000 |
5 | 0.8 | 0.1000000 |
5 | 1.0 | 1.0000000 |
6 | 0.0 | 0.0333333 |
6 | 0.2 | 0.0333333 |
6 | 0.4 | 0.0666667 |
6 | 0.6 | 0.0666667 |
6 | 0.8 | 0.0833333 |
6 | 1.0 | 1.0000000 |
7 | 0.0 | 0.0285714 |
7 | 0.2 | 0.0285714 |
7 | 0.4 | 0.0285714 |
7 | 0.6 | 0.0428571 |
7 | 0.8 | 0.1000000 |
7 | 1.0 | 1.0000000 |
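The first value of every phylogeny in this table follows a simple pattern. Assuming the same "upper" step type as in the stretched matrices shown earlier, the value at t = 0 is that of the second timepoint, which is 2 divided by the number of tips. A quick base R check of that reading, using the tip counts passed to ape::rcoal above:

```r
# Number of tips of phylogeny1 to phylogeny7
n_tips <- seq(10, 70, by = 10)

# Two lineages out of n_tips, normalized
first_values <- 2 / n_tips
print(round(first_values, 7))
```

These values match the rows with t = 0.0 for ids 1 to 7, and show why larger trees start lower on the normalized axis.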
Here we see an averaged nLTT plot, where the original nLTT values are still visible:
ggplot2::qplot(
  t, nltt, data = df, geom = "point", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)
Here we see an averaged nLTT plot, with the original nLTT values omitted:
ggplot2::qplot(
  t, nltt, data = df, geom = "blank", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies"
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)