The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
dynafluxr
library(dynafluxr)
#> Loading required package: bspline
#> Loading required package: nlsic
#> Loading required package: nnls
#> Loading required package: dotty
#pkgload::load_all("~/projs/dynafluxr/dev/dynafluxr")
This vignette shows how dynafluxr
R package can be used to unravel metabolic flux dynamics from metabolite concentration time courses and a stoichiometric model. No regulation model (like Michaelis-Menten) is required for this task. Our method is heavily relying on B-splines as implemented in bspline
package. We show how to prepare measurement data, edit stoichiometric reactions, use various package options and interpret/exploit the results.
Even if graphical user interface (GUI) cans be called and used with function gui()
, this vignette is focused on command line interface (CLI) which is more suitable for reproducible research and automation. The CLI can be used from R or from a shell environment.
The data used for this vignette were obtained by NMR measurements on MetaToul-FluxoMet platform, Toulouse Biotechnology Institute (TBI), Toulouse, France. Credits for making these data available go to Svetlana Dubiley, Pauline Rouane, Cyril Charlier, Guy Lippens and Pierre Millard. Pierre Millard has also shared stoichiometric model and supervised the whole project. All cited persons are (or were at the moment of the project) from TBI. The case presented in this vignette is also presented in an article entitled “Estimating flux dynamics from metabolite concentration time courses with dynafluxr” by Serguei Sokol, Svetlana Dubiley, Pauline Rouane, Cyril Charlier, Guy Lippens, and Pierre Millard.
Data must be stored in a .tsv
(Tabulation Separated Values) file (.tsv
extension is not mandatory, it can be .txt
, .csv
, whatever, but the field separator must be the tabulation character), one column per chemical specie. The first row describes column names. Time points, at which measurements are done, must be stored in a column whose name starts with ‘Time’. No specie name can start with ‘Time’. Decimal separator must be a point .
character. A head of data file used as an example is looking like
fmeas=system.file("dataglyco/data.tsv", package="dynafluxr")
cat(head(readLines(fmeas)), sep="\n")
#> Time GLC G6P F6P FBP
#> 10 5.539973094 0.511379021 0.019053081 0.227663976
#> 10.66666667 5.463888781 0.552186749 0.015203465 0.256477013
#> 11.33333333 5.45243488 0.620074144 0.020073931 0.221628305
#> 12 5.386348531 0.665020538 0.026775166 0.261749199
#> 12.66666667 5.409419851 0.711166169 0.020894807 0.202719737
The raw data may be difficult to read because of tab misalignment and row wrapping. Here are parsed data:
mf=read.delim(fmeas, comment.char="#")
print(head(mf))
#> Time GLC G6P F6P FBP
#> 1 10.00000 5.539973 0.5113790 0.01905308 0.2276640
#> 2 10.66667 5.463889 0.5521867 0.01520346 0.2564770
#> 3 11.33333 5.452435 0.6200741 0.02007393 0.2216283
#> 4 12.00000 5.386349 0.6650205 0.02677517 0.2617492
#> 5 12.66667 5.409420 0.7111662 0.02089481 0.2027197
#> 6 13.33333 5.393365 0.7319101 0.02205172 0.1638058
The reactions are written as in following example:
fsto=system.file("dataglyco/network.txt", package="dynafluxr")
cat(readLines(fsto), sep="\n")
#> HXK GLC + ATP -> G6P + ADP
#> PGI G6P -> F6P
#> PFK F6P + ATP -> FBP + ADP
The first field separated by tabulation is a reaction name then reaction itself where species are separated by “ + ” sign and possibly preceded by a stoichiometric coefficient with “ * ” symbol, e.g. 2*NAD
. The symbol “*” is manadtory as a metabolite name can start with a number, e.g. 3PG
is a molecule of 3-phosphoglycerate and not 3 * PG
. The reaction can also be a degradation or washing out one, e.g. A ->
which means that A is degraded to nothing or transported outside of the considered system. The reaction can also be a synthesis one, e.g. -> A
which means that A is synthesized from nothing or brought from external environment.
We start by gathering example files in a fresh temporary created working directory (it will no longer exist after ending the R session):
ddir=tempfile(pattern="glyco")
dir.create(ddir)
file.copy(c(fmeas, fsto), ddir)
#> [1] TRUE TRUE
Let fit available data with B-splines of order 4 and with 5 internal knots (default in dynafluxr
), then plot them to have an idea how data looks like:
fmeas=file.path(ddir, "data.tsv")
fsto=file.path(ddir, "network.txt")
mf=read.delim(fmeas, comment.char="#")
nm=colnames(mf)[-1L]
msp=fitsmbsp(mf$Time, mf[, nm], n=4, nki=5)
matplot(mf$Time, msp(mf$Time), t="l", ylab="Concentration [mM]", xlab="Time [min]", lwd=2)
coltr=apply(col2rgb(1:6)/255, 2, function(v) do.call(rgb, c(as.list(v), alpha=0.25)))
matpoints(mf$Time, mf[, nm], pch="o", cex=0.75, col=coltr)
legend("topright", legend=nm, lty=1:5, col=1:6, cex=0.75, lwd=2)
Data treatment can make that measured concentration are getting negative.
As concentrations cannot get negative by definition, dynafluxr
constraints B-splines to non negative values.
Another non desirable effect can be that consumed metabolite is sometimes increasing which has no physical meaning. In our case, Glucose is only consumed so in dynafluxr
, it can be signaled in options --decresing
while --increasing
is reserved for accumulated products like FBP. All options can be abbreviated till unambiguous prefix, e.g. --decr
for --decreasing
.
We can see also in the previous plot that FBP has non zero values at the beginning of the experiment. To skip these nonphysical data, we can use an option --skip
which indicates the number of first time points to skip. In our case, we will skip 24 first time points.
dynafluxr
We try dynafluxr::cli()
(Command Line Interface) function with minimal options, just indicating measurement data, stoichiometric model and, as seen before, that GLC/FBP must be monotonously decreasing/increasing respectively and that 24 first time points must be skipped. The cli()
function is the main entry point to dynafluxr
package and it can be used from R or from a shell environment. It is a wrapper around dynafluxr::fdyn()
function which does the real work of flux estimation.
res=cli(c("-m", fmeas, "-s", fsto, "--decr", "GLC", "--incr", "FBP", "--skip", "24"))
The same command could be run from a shell (not R) environment if variables $fmeas and $fsto are defined. In this case it would look like:
$ Rscript -e "dynafluxr::cli()" -m $fmeas -s $fsto --decr GLC --incr FBP --skip 24
The cli()
function accepts a vector of command line arguments. The first argument is -m
or --meas
which indicates the measurement data file, then -s
or --stoich
for stoichiometric model file, then options for decreasing and increasing species, skipping time points and so on. The result of the function is a list with several elements.
The full list of available options can be viewed with dynafluxr::cli("-h")
command. The result of cli()
function (in case of no error in run-time), is a series of data and pdf files written in a directory glyco/data
which is the base name of measurement data file:
list.files("glyco/data/")
#> character(0)
The file Readme.md
describes the content of the result directory. These files can be explored by system tools like pdf-viewers and spreadsheet software but for needs of this vignette we will reproduce some results by R means. As we won’t need result files in this demo page, we’ll cancel their writing with empty option -o ""
which normally indicates result directory/archive name if it is different from the default one:
res=cli(c("-m", fmeas, "-s", fsto, "--decr", "GLC", "--incr", "FBP", "--skip", "24", "-o", ""))
Let now glance on fit quality, i.e. results of \(\chi^2\)-test:
print(res$chi2tab)
#> rss var_ref df chi2 pval
#> GLC 0.2140855 3.282197e-04 990 652.2628 1.000000e+00
#> G6P 0.3940253 2.982900e-04 990 1320.9474 7.145778e-12
#> F6P 0.1586435 3.712298e-05 990 4273.4582 0.000000e+00
#> FBP 2.7157317 2.610422e-03 990 1040.3420 1.297346e-01
#> Total 3.4824861 3.274055e-03 3960 7287.0105 9.723348e-201
We can see that \(p\)-values are almost 0 for G6P and FBP which usually means a poor fit quality. Let see what is going on by plotting integrated species with measurements:
nm=colnames(res$mf)[-1]
matplot(res$tpp, res$isp(res$tpp, nm), t="l", xlab="Time [min]", ylab="Concentration [mM]", lwd=2)
matpoints(res$tp, res$mf[,nm], pch="o", cex=0.5, col=coltr)
legend("topright", legend=nm, lty=1:5, col=1:6, cex=0.75, lwd=2)
For a human eye is does not look bad but for a computer, it is not good enough. The problem is that the uses underestimated reference variance visible in the column RefVar
of the \(\chi^2\)-test table.
The reference variance is used to estimate the noise in measurements and it is used to weight the residuals in least squares. If it is underestimated, the residuals are too large and the fit is considered as poor. The reference variance can be estimated automatically like it was done till now. But It can also be set to a some more realistic value manually. In our case, we can set it to from 0.05 mM$^2$ to 0.1 mM$^2$ which are reasonable values for NMR measurements. This can be done with --sderr
option: --sderr=GLC=0.05,G6P=0.05,FBP=0.1,F6P=0.05
From experiment setup, we know that FBP and F6P are not measured in absolute concentration as there is a doubt about proton attribution in the measured signal. Let apply scaling factors to FBP and F6P:
res=cli(c("-m", fmeas, "-s", fsto, "--decr", "GLC", "--incr", "FBP", "--skip", "24", "--sf=FBP,F6P", "-o", "", "--sderr=GLC=0.05,G6P=0.05,FBP=0.1,F6P=0.05"))
print(res$chi2tab)
#> rss var_ref df chi2 pval
#> GLC 0.2339998 0.002500000 990 93.59992 1
#> G6P 0.3708691 0.002500000 990 148.34765 1
#> F6P 0.1402564 0.002821918 990 49.70250 1
#> FBP 2.6265546 0.009745498 990 269.51466 1
#> Total 3.3716799 0.017567416 3960 561.16473 1
Now, all \(p\)-values are well above 0.05 usual threshold.
We are ready to see the main result: estimated reaction rates.
matplot(res$tpp, res$vsp(res$tpp), t="l", ylab="Rate [1/min]", xlab="Time [min]", lwd=2)
legend("topright", colnames(bsppar(res$vsp)$qw), lty=1:5, col=1:6, cex=0.75, lwd=2)
Having estimated rates, we can now explore details of metabolic fluxes. Not only we have the total flux \(\frac{dM_i}{dt}\) but also all its components coming from involved reactions \(S_{ij}v_j\). For G6P, it gives:
nm="G6P"
jnz=names(which(res$stofull[nm,] != 0)) # non-zero coeffs, i.e. reactions involved in G6P mass balance
fl=t(res$stofull[nm,jnz]*t(res$vsp(res$tpp, jnz))) # each rate v_j is multiplied by S_ij
fl=cbind(res$fsp(res$tpp, nm), fl) # add total flux
matplot(res$tpp, fl, t="l", ylab="Flux [mM/min]", xlab="Time [min]", main="G6P flux components", lwd=2)
abline(h=0)
legend("topright", legend=c("Total", jnz), lty=1:5, col=1:6, lwd=2)
Examples of plotting B-spline curves with smooth error intervals are given in a notebook available at
In this vignette, we have shown a step-by-step procedure for reaction rate estimation from metabolite concentration time courses and a stoichiometric model. We have seen how package parameters can be chosen, how fit quality can be evaluated and how data can be presented and exploited. Based on this vignette, user should be ready to start working with his own data set and stoichiometric model.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.