The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Towards Confidence Estimates in Cascade Networks using the SelectBoost Package

Frédéric Bertrand and Myriam Maumy-Bertrand

Université de Strasbourg and CNRSIRMA, labex IRMIA
frederic.bertrand@utt.fr

2022-11-29

#Introduction

Aims of the vignette

Extending results from the Cascade package: reverse engineering with selectboost to compute confidence indices for a fitted model. We first fit a model to a cascade network using the Cascade package inference function then we compute confidence indices for the inferred links using the Selecboost algorithm.

If you are a Linux/Unix or a Macos user, you can install a version of SelectBoost with support for doMC from github with:

devtools::install_github("fbertran/SelectBoost", ref = "doMC")

References

Code

Code to reproduce the datasets saved with the package and some the figures of the article Bertrand et al. (2020), Bioinformatics. https://doi.org/10.1093/bioinformatics/btaa855

Data simulation

library(Cascade)

We define the F array for the simulations.

T<-4
F<-array(0,c(T-1,T-1,T*(T-1)/2))

for(i in 1:(T*(T-1)/2)){diag(F[,,i])<-1}
F[,,2]<-F[,,2]*0.2
F[2,1,2]<-1
F[3,2,2]<-1
F[,,4]<-F[,,2]*0.3
F[3,1,4]<-1
F[,,5]<-F[,,2]

We set the seed to make the results reproducible

set.seed(1)
Net<-Cascade::network_random(
  nb=100,
  time_label=rep(1:4,each=25),
  exp=1,
  init=1,
  regul=round(rexp(100,1))+1,
  min_expr=0.1,
  max_expr=2,
  casc.level=0.4
)
Net@F<-F

We simulate gene expression according to the network Net

M <- Cascade::gene_expr_simulation(
  network=Net,
  time_label=rep(1:4,each=25),
  subject=5,
  level_peak=200)

Network inference

We infer the new network.

Net_inf_C <- Cascade::inference(M,cv.subjects=TRUE)
#> We are at step :  1
#> The convergence of the network is (L1 norm) : 0.0068
#> We are at step :  2
#> The convergence of the network is (L1 norm) : 0.00121
#> We are at step :  3
#> The convergence of the network is (L1 norm) : 0.00096

Heatmap of the coefficients of the Omega matrix of the network. Run the code to get the graph.

stats::heatmap(Net_inf_C@network, Rowv = NA, Colv = NA, scale="none", revC=TRUE)
Fab_inf_C <- Net_inf_C@F

Compute the confidence indices for the inference

library(SelectBoost)
set.seed(1)

By default the crossvalidation is made subjectwise according to a leave one out scheme and the resampling analysis is made at the .95 c0 level. To pass CRAN tests, use.parallel = FALSE is required. Set use.parallel = TRUE and select the number of cores using ncores = 4.

net_pct_selected <- selectboost(M, Fab_inf_C, use.parallel = FALSE)
#> 
#> We are at peak :  2
#> .........................
#> We are at peak :  3
#> .........................
#> We are at peak :  4
#> .........................
net_pct_selected_.5 <- selectboost(M, Fab_inf_C, c0value = .5, use.parallel = FALSE)
#> 
#> We are at peak :  2
#> .........................
#> We are at peak :  3
#> .........................
#> We are at peak :  4
#> .........................
net_pct_selected_thr <- selectboost(M, Fab_inf_C, group = group_func_1, use.parallel = FALSE)
#> 
#> We are at peak :  2
#> .........................
#> We are at peak :  3
#> .........................
#> We are at peak :  4
#> .........................

Use cv.subject=FALSE to use default crossvalidation

net_pct_selected_cv <- selectboost(M, Fab_inf_C, cv.subject=FALSE, use.parallel = FALSE)
#> 
#> We are at peak :  2
#> .........................
#> We are at peak :  3
#> .........................
#> We are at peak :  4
#> .........................

Analysis of the confidence indices

Use plot to display the result of the confidence analysis.

plot(net_pct_selected)

Run the code to plot the other results.

plot(net_pct_selected_.5)
plot(net_pct_selected_thr)
plot(net_pct_selected_cv)

Run the code to plot the remaning graphs.

Distribution of non-zero (absolute value > 1e-5) coefficients

hist(Net_inf_C@network[abs(Net_inf_C@network)>1e-5])

Plot of confidence at .95 resampling level versus coefficient value for non-zero (absolute value > 1e-5) coefficients

plot(Net_inf_C@network[abs(Net_inf_C@network)>1e-5],net_pct_selected@network.confidence[abs(Net_inf_C@network)>1e-5])
hist(net_pct_selected@network.confidence[abs(Net_inf_C@network)>1e-5])

Plot of confidence at .5 resampling level versus coefficient value for non-zero (absolute value > 1e-5) coefficients

plot(Net_inf_C@network[abs(Net_inf_C@network)>1e-5],net_pct_selected_.5@network.confidence[abs(Net_inf_C@network)>1e-5])
hist(net_pct_selected_.5@network.confidence[abs(Net_inf_C@network)>1e-5])

Plot of confidence at .95 resamling level with groups created by thresholding the correlation matrix versus coefficient value for non-zero (absolute value > 1e-5) coefficients.

plot(Net_inf_C@network[abs(Net_inf_C@network)>1e-5],net_pct_selected_thr@network.confidence[abs(Net_inf_C@network)>1e-5])
hist(net_pct_selected_thr@network.confidence[abs(Net_inf_C@network)>1e-5])

Plot of confidence at .95 resampling level versus coefficient value for non-zero (absolute value > 1e-5) coefficients using standard cross-validation.

plot(Net_inf_C@network[abs(Net_inf_C@network)>1e-5],net_pct_selected_cv@network.confidence[abs(Net_inf_C@network)>1e-5])
hist(net_pct_selected_cv@network.confidence[abs(Net_inf_C@network)>1e-5])

Further improvements

For further recommandations on the choice of the c0 parameter, for instance as a quantile, please consult the SI of Bertrand et al. 2020.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.