The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
There are two prevaling methods for using HTS-SIP data to estimate the amount of isotope that each OTU incorporated:
In this vignette, we are going to show how to run both analyses and also compare the results a bit.
First, let's load some packages including HTSSIP
.
library(dplyr)
library(ggplot2)
library(HTSSIP)
OK. We're going to be using 2 data files:
We'll be using the dataset that we simulated in the HTSSIP_sim vignette.
The phyloseq object is similar to the dataset in the other vignettes.
# HTS-SIP data
physeq_rep3
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 6 taxa and 144 samples ]
## sample_data() Sample Data: [ 144 samples by 5 sample variables ]
The associated qPCR data is a list of length = 2.
# qPCR data (list object)
physeq_rep3_qPCR %>% names
## [1] "raw" "summary"
For the analyses in this vignette, we only need the 'summary' table.
# qPCR data (list object)
physeq_rep3_qPCR_sum = physeq_rep3_qPCR$summary
physeq_rep3_qPCR_sum %>% head(n=4)
## IS_CONTROL Sample Buoyant_density qPCR_tech_rep_mean
## 1 FALSE 13C-Glu_rep1_1.671712_2 1.671712 46598172
## 2 FALSE 13C-Glu_rep1_1.671722_1 1.671722 42299449
## 3 FALSE 13C-Glu_rep1_1.680311_3 1.680311 53536519
## 4 FALSE 13C-Glu_rep1_1.683540_4 1.683540 51449609
## qPCR_tech_rep_sd Gradient Fraction Treatment Replicate
## 1 9055833 13C-Glu_rep1 2 13C-Glu 1
## 2 16369298 13C-Glu_rep1 1 13C-Glu 1
## 3 18265667 13C-Glu_rep1 3 13C-Glu 1
## 4 3796576 13C-Glu_rep1 4 13C-Glu 1
OK. Let's quantify isotope incorporation witht the q-SIP method.
# transforming OTU counts
physeq_rep3_t = OTU_qPCR_trans(physeq_rep3, physeq_rep3_qPCR_sum)
# calculating atom fraction excess
atomX = qSIP_atom_excess(physeq_rep3_t,
control_expr='Treatment=="12C-Con"',
treatment_rep='Replicate')
atomX %>% names
## [1] "W" "A"
The resulting list object contains 2 data.frames. We are interested in the 'A' table, which contains estimated BD shifts (Z) and atom fraction excess (A).
atomX$A %>% head(n=4)
## # A tibble: 4 x 9
## OTU Wlab Wlight Z Gi Mlight Mheavymax Mlab A
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 OTU.1 1.72 1.70 0.0222 0.636 308. 318. 312. 0.412
## 2 OTU.2 1.71 1.69 0.0117 0.586 308. 318. 310. 0.218
## 3 OTU.3 1.73 1.70 0.0347 0.608 308. 318. 314. 0.644
## 4 OTU.4 1.73 1.70 0.0261 0.705 308. 318. 313. 0.484
Next, let's calculate bootstrap confidence intervales for the atom fraction excess estimations.
# calculating bootstrapped CI values
df_atomX_boot = qSIP_bootstrap(atomX, n_boot=100)
df_atomX_boot %>% head(n=4)
## # A tibble: 4 x 11
## OTU Wlab Wlight Z Gi Mlight Mheavymax Mlab A A_CI_low
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 OTU.1 1.72 1.70 0.0222 0.636 308. 318. 312. 0.412 0.0252
## 2 OTU.2 1.71 1.69 0.0117 0.586 308. 318. 310. 0.218 0.126
## 3 OTU.3 1.73 1.70 0.0347 0.608 308. 318. 314. 0.644 0.431
## 4 OTU.4 1.73 1.70 0.0261 0.705 308. 318. 313. 0.484 0.366
## # … with 1 more variable: A_CI_high <dbl>
Now for delta_BD. The setup is easier because we are not using qPCR data, just relative abundances from 16S rRNA sequence data.
df_dBD = delta_BD(physeq_rep3, control_expr='Treatment=="12C-Con"')
df_dBD %>% head(n=4)
## # A tibble: 4 x 4
## OTU CM_control CM_treatment delta_BD
## <chr> <dbl> <dbl> <dbl>
## 1 OTU.1 1.69 1.72 0.0260
## 2 OTU.2 1.69 1.70 0.0130
## 3 OTU.3 1.69 1.74 0.0562
## 4 OTU.4 1.71 1.73 0.0208
Let's plot the data and compare all of the results. First, let's join all of the data into one table for plotting. We'll also format it for plotting.
# checking & joining data
stopifnot(nrow(df_atomX_boot) == nrow(df_dBD))
df_j = dplyr::inner_join(df_atomX_boot, df_dBD, c('OTU'='OTU'))
stopifnot(nrow(df_atomX_boot) == nrow(df_j))
# formatting data for plotting
df_j = df_j %>%
dplyr::mutate(OTU = reorder(OTU, -delta_BD))
OK. Time to plot!
# plotting BD shift (Z)
ggplot(df_j, aes(OTU)) +
geom_point(aes(y=Z), color='blue') +
geom_point(aes(y=delta_BD), color='red') +
geom_hline(yintercept=0, linetype='dashed', alpha=0.5) +
labs(x='OTU', y='BD shift (Z)') +
theme_bw() +
theme(
axis.text.x = element_blank()
)
In the figure, red points are delta_BD and blue points are q-SIP. It's easy to see that delta_BD is a lot more variable than q-SIP. This is likely due to a high influence of compositional data artifacts on delta_BD versus q-SIP.
Let's make a boxplot to show the difference in estimation variance between the two methods.
# plotting BD shift (Z): boxplots
## formatting the table
df_j_g = df_j %>%
dplyr::select(OTU, Z, delta_BD) %>%
tidyr::gather(Method, BD_shift, Z, delta_BD) %>%
mutate(Method = ifelse(Method == 'Z', 'q-SIP', 'delta-BD'))
## plotting
ggplot(df_j_g, aes(Method, BD_shift)) +
geom_boxplot() +
geom_hline(yintercept=0, linetype='dashed', alpha=0.5) +
labs(x='Method', y='BD shift (Z)') +
theme_bw()
The boxplot helps to summarize how much more variance delta_BD produces versus q-SIP.
sessionInfo()
## R version 3.4.3 (2017-11-30)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
##
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] phyloseq_1.22.3 HTSSIP_1.4.1 ggplot2_3.2.0 tidyr_0.8.3
## [5] dplyr_0.8.0.1
##
## loaded via a namespace (and not attached):
## [1] nlme_3.1-131 bitops_1.0-6
## [3] matrixStats_0.54.0 bit64_0.9-7
## [5] doParallel_1.0.14 RColorBrewer_1.1-2
## [7] GenomeInfoDb_1.14.0 tools_3.4.3
## [9] backports_1.1.4 utf8_1.1.4
## [11] R6_2.4.0 coenocliner_0.2-2
## [13] vegan_2.5-1 rpart_4.1-15
## [15] Hmisc_4.2-0 DBI_1.0.0
## [17] lazyeval_0.2.2 BiocGenerics_0.24.0
## [19] mgcv_1.8-28 colorspace_1.4-1
## [21] permute_0.9-5 ade4_1.7-13
## [23] nnet_7.3-12 withr_2.1.2
## [25] tidyselect_0.2.5 gridExtra_2.3
## [27] DESeq2_1.18.1 bit_1.1-14
## [29] compiler_3.4.3 cli_1.1.0
## [31] Biobase_2.38.0 htmlTable_1.13.1
## [33] DelayedArray_0.4.1 labeling_0.3
## [35] scales_1.0.0 checkmate_1.9.3
## [37] genefilter_1.60.0 stringr_1.4.0
## [39] digest_0.6.19 foreign_0.8-71
## [41] XVector_0.18.0 base64enc_0.1-3
## [43] pkgconfig_2.0.2 htmltools_0.3.6
## [45] highr_0.8 htmlwidgets_1.3
## [47] rlang_0.4.0 RSQLite_2.1.1
## [49] rstudioapi_0.10 jsonlite_1.6
## [51] BiocParallel_1.12.0 acepack_1.4.1
## [53] RCurl_1.95-4.12 magrittr_1.5
## [55] GenomeInfoDbData_1.0.0 Formula_1.2-3
## [57] biomformat_1.6.0 Matrix_1.2-17
## [59] fansi_0.4.0 Rcpp_1.0.1
## [61] munsell_0.5.0 S4Vectors_0.16.0
## [63] ape_5.3 stringi_1.4.3
## [65] MASS_7.3-51.4 SummarizedExperiment_1.8.1
## [67] zlibbioc_1.24.0 rhdf5_2.22.0
## [69] plyr_1.8.4 blob_1.1.1
## [71] grid_3.4.3 parallel_3.4.3
## [73] crayon_1.3.4 lattice_0.20-38
## [75] Biostrings_2.46.0 splines_3.4.3
## [77] multtest_2.34.0 annotate_1.56.2
## [79] locfit_1.5-9.1 zeallot_0.1.0
## [81] knitr_1.18 pillar_1.4.1
## [83] igraph_1.2.4 GenomicRanges_1.30.3
## [85] markdown_0.9 geneplotter_1.56.0
## [87] reshape2_1.4.3 codetools_0.2-16
## [89] stats4_3.4.3 XML_3.98-1.19
## [91] glue_1.3.1 evaluate_0.14
## [93] latticeExtra_0.6-28 data.table_1.10.4-3
## [95] vctrs_0.1.0 foreach_1.4.4
## [97] gtable_0.3.0 purrr_0.3.2
## [99] assertthat_0.2.1 mime_0.6
## [101] xtable_1.8-4 survival_2.44-1.1
## [103] tibble_2.1.1 iterators_1.0.10
## [105] memoise_1.1.0 AnnotationDbi_1.40.0
## [107] IRanges_2.12.0 cluster_2.0.6
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.