This short vignette shows how to correct overloaded signals in (i) an artificial test case and (ii) a provided real data set. To do so, we need to load the package functions as well as a small example data set in xcmsRaw format.
library(CorrectOverloadedPeaks)
data("xcmsRaw_data")
Let’s model a typical overloaded signal, as frequently occurs in GC-APCI-MS, using the provided function ModelGaussPeak.
pk <- CorrectOverloadedPeaks::ModelGaussPeak(height=10^7, width=3, scan_rate=10, e=0, ds=8*10^6, base_line=10^2)
plot(pk, main="Gaussian peak of true intensity 10^7 but cut off at 8*10^6")
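To make explicit what "overloaded" means here, the following is a minimal base-R sketch of a Gaussian peak clipped at a detector limit. It is an illustration only, not the implementation of ModelGaussPeak: the helper name `clipped_gauss` and its parameters (in particular `sigma` and `limit`) are assumptions made for this example and do not correspond to the package arguments.

```r
# Hypothetical helper, not part of the CorrectOverloadedPeaks package.
# Simulates a Gaussian chromatographic peak and clips it at a detector limit.
clipped_gauss <- function(height = 1e7, sigma = 1, rt_apex = 0,
                          scan_rate = 10, limit = 8e6, base_line = 1e2) {
  # sample retention times at 'scan_rate' scans per time unit
  rt <- seq(rt_apex - 5 * sigma, rt_apex + 5 * sigma, by = 1 / scan_rate)
  # true (unobservable) signal: baseline plus Gaussian
  true_int <- base_line + height * exp(-(rt - rt_apex)^2 / (2 * sigma^2))
  # observed signal: everything above the limit is cut off
  data.frame(rt = rt, true_int = true_int, int = pmin(true_int, limit))
}

toy <- clipped_gauss()
n_saturated <- sum(toy$int >= 8e6)  # scans that carry no height information
```

The scans counted in `n_saturated` all report the same intensity, which is why the true apex has to be inferred from the unsaturated flanks of the peak.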
Now we roughly estimate the peak borders before applying the provided function FitGaussPeak to correct the peak data.
idx <- pk[,"int"]>0.005 * max(pk[,"int"])
tmp <- CorrectOverloadedPeaks::FitGaussPeak(x=pk[idx,"rt"], y=pk[idx,"int"], silent=FALSE, xlab="RT", ylab="Intensity")
## [1] "Number of converging sollutions: 10, keeping 1"
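The principle behind this kind of correction can be sketched in a few lines of base R. The sketch below is deliberately simpler than the package: instead of the optimization that FitGaussPeak reports on (several converging solutions), it uses an ordinary least-squares fit of a parabola to the log-intensities of the unsaturated scans, which works because the logarithm of a Gaussian is a parabola. The toy data are noise-free and baseline-free, and all variable names are assumptions of this example.

```r
# Toy data: Gaussian peak (true height 1e7) clipped at a detector limit of 8e6.
limit <- 8e6
rt    <- seq(-5, 5, by = 0.1)
truth <- 1e7 * exp(-rt^2 / 2)
obs   <- pmin(truth, limit)

# Keep scans above 0.5% of the observed maximum (rough peak borders)
# that are also below the detector limit (non-saturated, trustworthy).
in_peak <- obs > 0.005 * max(obs)
ok      <- in_peak & obs < limit

# log(Gaussian) = a + b*rt + c*rt^2, so a linear model is sufficient here.
d   <- data.frame(rt = rt[ok], y = log(obs[ok]))
fit <- lm(y ~ rt + I(rt^2), data = d)
cf  <- unname(coef(fit))  # a (intercept), b (linear), c (quadratic)

# Apex position and height follow from the vertex of the parabola.
apex_rt <- -cf[2] / (2 * cf[3])
max_int <- exp(cf[1] - cf[2]^2 / (4 * cf[3]))

# Substitute only the saturated scans with the fitted values.
fitted_all <- exp(cf[1] + cf[2] * rt + cf[3] * rt^2)
corrected  <- ifelse(obs >= limit, fitted_all, obs)
```

On real data with noise, a baseline, or peak tailing, this log-parabola shortcut is much less robust than a dedicated peak model, which is one reason to prefer the package functions.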
The QC plot generated by FitGaussPeak shows the optimal solution found (green line), the substituted intensity values (grey circles) and the obtained parameters (blue text), including the probable peak height (max_int=9.7*10^6), which is within about 3% of the true peak height (10^7).
Now let’s extend this simplified process to peaks from a real data set. The following function call will generate (i) a PDF in the working directory with QC plots for 10 peaks from 5 chromatographic regions, (ii) processing information printed to the console and (iii) a new file “cor_df_all.RData” in the working directory containing all extracted but non-corrected mass traces.
tmp <- CorrectOverloadedPeaks::CorrectOverloadedPeaks(data=xcmsRaw_data, method="EMG", testing=TRUE)
##
## Processing... S5_35_01_2241_Int+LM.mzXML
##
## Trying to correct 5 overloaded regions.
## [1] "Processing Region/Mass: 1 / 1"
## [1] "Number of converging sollutions: 157, keeping 5"
## [1] "Processing Region/Mass: 1 / 2"
## [1] "Number of converging sollutions: 110, keeping 63"
## [1] "Processing Region/Mass: 2 / 1"
## [1] "Number of converging sollutions: 128, keeping 11"
## [1] "Processing Region/Mass: 2 / 2"
## [1] "Number of converging sollutions: 110, keeping 6"
## [1] "Processing Region/Mass: 3 / 1"
## [1] "Number of converging sollutions: 126, keeping 6"
## [1] "Processing Region/Mass: 4 / 1"
## [1] "Number of converging sollutions: 120, keeping 66"
## [1] "Processing Region/Mass: 4 / 2"
## [1] "Number of converging sollutions: 112, keeping 12"
## [1] "Processing Region/Mass: 4 / 3"
## [1] "Number of converging sollutions: 119, keeping 64"
## [1] "Processing Region/Mass: 5 / 1"
## [1] "Number of converging sollutions: 58, keeping 4"
## [1] "Processing Region/Mass: 5 / 2"
## [1] "Number of converging sollutions: 161, keeping 37"
## [1] "Storing non-corrected data information in 'cor_df_all.RData'"
Let’s load these non-corrected mass traces to further illustrate the package capabilities. For instance, we can reprocess peak 2 from region 4 using the isotopic ratio approach:
load("cor_df_all.RData")
head(cor_df_all[[4]][[2]])
## Scan RT mz0 int0 mz1 int1 mz2 int2 modified
## 188 188 599.514 350.1646 7374 351.1684 5589 352.1663 1277 FALSE
## 189 189 599.623 350.1636 19565 351.1668 7864 352.1631 3842 FALSE
## 190 190 599.732 350.1627 50418 351.1650 19183 352.1616 9141 FALSE
## 191 191 599.840 350.1646 118553 351.1664 38646 352.1633 18278 FALSE
## 192 192 599.951 350.1635 260899 351.1651 86333 352.1620 41024 FALSE
## 193 193 600.060 350.1637 528827 351.1651 167749 352.1619 77910 FALSE
tmp <- CorrectOverloadedPeaks::FitPeakByIsotopicRatio(cor_df=cor_df_all[[4]][[2]], silent=FALSE)
The extracted data contain RT and intensity information for the overloaded mass trace (mz=350.164) as well as for its isotopes, up to the first isotope which is not itself overloaded (M+2, green triangles). The ratio of this isotope to M+0 is determined in the peak front (15.9%) and in turn used to scale up the overloaded data points of M+0 (grey circles), as indicated by the black line. The data could of course alternatively be processed using the Gauss method, as shown previously for the artificial data.
tmp <- CorrectOverloadedPeaks::FitGaussPeak(x=cor_df_all[[4]][[2]][,"RT"], y=cor_df_all[[4]][[2]][,"int0"], silent=FALSE, xlab="RT", ylab="Intensity")
## [1] "Number of converging sollutions: 10, keeping 2"
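For comparison, the isotopic ratio approach described above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the code of FitPeakByIsotopicRatio: the toy data frame only mimics the column names seen in the extracted traces (RT, int0, int2, modified), the detector limit of 8e6 is assumed to be known, and the ratio of 15.9% is taken from the example above.

```r
# Toy mass traces: M+0 saturates at the limit, M+2 stays well below it.
limit   <- 8e6
rt      <- seq(-5, 5, by = 0.1)
m0_true <- 1e7 * exp(-rt^2 / 2)
toy <- data.frame(RT   = rt,
                  int0 = pmin(m0_true, limit),  # overloaded M+0 trace
                  int2 = 0.159 * m0_true)       # M+2 at 15.9% of M+0

overloaded <- toy$int0 >= limit

# Peak front: scans before the first overloaded scan with usable signal.
front <- toy$RT < min(toy$RT[overloaded]) & toy$int0 > 0.005 * max(toy$int0)

# Isotope ratio M+2 / M+0, estimated where neither trace is distorted.
ratio <- median(toy$int2[front] / toy$int0[front])

# Scale the intact M+2 trace up to replace the overloaded M+0 points.
toy$int0_cor <- ifelse(overloaded, toy$int2 / ratio, toy$int0)
toy$modified <- overloaded
```

Unlike the Gauss method, this approach makes no assumption about peak shape, but it requires at least one isotope trace that stays below the detector limit.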
Finally, we clean up the temporary files stored on the hard drive.
if(file.exists("cor_df_all.RData")) file.remove("cor_df_all.RData")
## [1] TRUE
if(file.exists("S5_35_01_2241_Int+LM.mzXML.pdf")) file.remove("S5_35_01_2241_Int+LM.mzXML.pdf")
## [1] TRUE