This tutorial gives a basic introduction on constructing phylogenetic networks and to add parameter to trees or networx using phangorn (Schliep 2011) in R. Splits graph or phylogenetic networks are a nice way to display conflict data or summarize different trees. Here we present to popular networks, consensus networks (Holland et al. 2004) and neighborNet (Bryant and Moulton 2004).
Often trees or networks are missing either edge weights or support values about the edges. We show how to improve a tree/networx by adding support values or estimating edge weights using non-negative Least-Squares (nnls).
We first load the phangorn package and a few data sets we use in this vignette.
library(phangorn)
data(Laurasiatherian)
data(yeast)
A consensusNet (Holland et al. 2004) is a generalization of a consensus tree. Instead only representing splits with at least 50% in a bootstrap or MCMC sample one can use a lower threshold. However of important competing splits are left out.
The input for consensusNet
is a list of trees i.e. an object of class multiPhylo
.
set.seed(1)
bs <- bootstrap.phyDat(yeast, FUN = function(x)nj(dist.hamming(x)),
bs=100)
tree <- nj(dist.hamming(yeast))
par("mar" = rep(1, 4))
tree <- plotBS(tree, bs, "phylogram")
cnet <- consensusNet(bs, .3)
plot(cnet, "2D", show.edge.label=TRUE)
Often consensusNet
will return incompatible splits, which cannot plotted as a planar graph. A nice way to get still a good impression of the network is to plot it in 3 dimensions.
plot(cnet)
# rotate 3d plot
play3d(spin3d(axis=c(0,1,0), rpm=6), duration=10)
# create animated gif file
movie3d(spin3d(axis=c(0,1,0), rpm=6), duration=10)
which will result in a spinning graph similar to this
The function neighborNet
implements the popular method of Bryant and Moulton (2004). The Neighbor-Net algorithm extends the Neighbor joining allowing again algorithm is computed in 2 parts, the first computes a circular ordering. The second step involves estimation of edge weights using non-negative Least-Squares (nnls).
dm <- dist.hamming(yeast)
nnet <- neighborNet(dm)
par("mar" = rep(1, 4))
plot(nnet, "2D")
The advantage of Neighbor-Net is that it returns a circular split system which can be always displayed in a planar (2D) graph. The plots displayed in phangorn
may not planar, but re-plotting may gives you a planar graph. This unwanted behavior will be improved in future version. The rendering of the networx
is done using the the fantastic igraph package (Csardi and Nepusz 2006).
We can use the generic function addConfidences
to add support values from a tree, i.e. an object of class phylo
to a networx
, splits
or phylo
object. The Neighbor-Net object we computed above contains no support values. We can add the support values fro the tree we computed to the splits these two objects share.
nnet <- addConfidences(nnet, tree)
par("mar" = rep(1, 4))
plot(nnet, "2D", show.edge.label=TRUE)
We can also add support values to a tree:
tree2 <- rNNI(tree, 2)
tree2 <- addConfidences(tree2, tree)
# several support values are missing
par("mar" = rep(1, 4))
plot(tree2, show.node.label=TRUE)
Consensus networks on the other hand have information about support values corresponding to a split, but are generally without edge weights. Given a distance matrix we can estimate edge weights using non-negative Least-Squares.
cnet <- nnls.networx(cnet, dm)
par("mar" = rep(1, 4))
plot(cnet, "2D", show.edge.label=TRUE)
Bryant, David, and Vincent Moulton. 2004. “Neighbor-Net: An Agglomerative Method for the Construction of Phylogenetic Networks.” Molecular Biology and Evolution 21 (2): 255–65. doi:10.1093/molbev/msh018.
Csardi, Gabor, and Tamas Nepusz. 2006. “The Igraph Software Package for Complex Network Research.” InterJournal Complex Systems: 1695. http://igraph.org.
Holland, Barbara R., Katharina T. Huber, Vincent Moulton, and Peter J. Lockhart. 2004. “Using Consensus Networks to Visualize Contradictory Evidence for Species Phylogeny.” Molecular Biology and Evolution 21 (7): 1459–61. doi:10.1093/molbev/msh145.
Schliep, Klaus Peter. 2011. “Phangorn: Phylogenetic Analysis in R.” Bioinformatics 27 (4): 592–93. doi:10.1093/bioinformatics/btq706.