Qualitative and quantitative analysis of contaminants is the core of environmental science, and GC/LC-MS is among the most popular instruments for such analytical methods. Previous packages such as xcms were developed for GC-MS data. However, such packages offer limited functions for environmental analysis. In this package, I added functions for the various GC/LC-MS data analysis purposes encountered in environmental analysis. These functions not only reveal certain problems in the data but also help the user find unknown patterns in a GC/LC-MS dataset.
In a GC/LC-MS system, GC/LC is used for separation and MS is used for detection. The collected data are the intensities of certain masses at different retention times. When we analyze a sample on a given column in full-scan mode, the counts of different masses are collected in each scan. The dwell time for each scan might last only 500 ms or less. Then the next scan begins at a different retention time. We can use a matrix to represent such data: each column stands for a mass and each row stands for the retention time of a scan. Such a matrix can be treated as time series data. In this package, we treat such data as matrix type.
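As a rough illustration of this layout, the base R sketch below builds a small simulated full-scan matrix with one row per scan and one column per mass. The values, dimensions, and object names (rt, mz, scan_matrix) are made up for illustration and are not produced by enviGCMS.

```r
# Toy full-scan data: 5 scans (rows) x 4 masses (columns); values are simulated.
set.seed(1)
rt <- seq(60, 62, by = 0.5)          # retention time of each scan
mz <- c(100, 101, 102, 103)          # nominal masses recorded in every scan
scan_matrix <- matrix(rpois(length(rt) * length(mz), lambda = 50),
                      nrow = length(rt), ncol = length(mz),
                      dimnames = list(rt = rt, mz = mz))

# One row is the mass spectrum of one scan.
scan_matrix["61", ]
# One column is the intensity of one mass over time, i.e. a time series.
scan_matrix[, "101"]
```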
For high-resolution MS, building such a matrix is tricky. We might need to bin the raw data so that different scans can be aligned into one matrix. Such work can be done by xcms.
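The idea of binning can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not how xcms implements it: the peak list is invented, and bin_scans is a hypothetical helper that rounds each m/z to the nearest multiple of the bin step and sums intensities per scan and bin.

```r
# Hypothetical centroided peaks from two scans: scan index, measured m/z, intensity.
peaks <- data.frame(
  scan      = c(1, 1, 1, 2, 2),
  mz        = c(100.012, 100.048, 250.331, 100.021, 250.338),
  intensity = c(200, 150, 900, 180, 950)
)

# Assign every m/z to a bin of width mzstep and sum intensities per scan and bin.
bin_scans <- function(peaks, mzstep = 0.1) {
  peaks$bin <- round(peaks$mz / mzstep) * mzstep
  # xtabs sums intensity for each scan x bin combination; empty cells become 0.
  unclass(xtabs(intensity ~ scan + bin, data = peaks))
}

binned <- bin_scans(peaks, mzstep = 0.1)
binned   # rows are scans, columns are m/z bins
```

A smaller bin step keeps nearby masses apart but produces a wider and sparser matrix, so the step is a trade-off between mass resolution and matrix size.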
When you perform a selected ion monitoring (SIM) mode analysis, only a few masses are collected, and each mass has counts and retention times as time series data. In this package, we treat such data as data.frame type.
You can use getmd() to import mass spectrum data in the formats supported by xcms and get the profile of the GC-MS data as a matrix. The mzstep argument is the bin step for mass:
data <- getmd('data/data1.CDF', mzstep = 0.1)
You can also subset the data by mass (m/z 100-1000) or retention time range (40-100 s) in the getmd() function:
data <- getmd(data, mzrange = c(100, 1000), rtrange = c(40, 100))
You can also combine full-scan data that cover the same retention time range with cbmd():
data <- cbmd(data1, data2, data3)
You can plot the total ion chromatogram (TIC) for a certain retention time (RT) and mass range:
plottic(data, rt = c(3.1, 25), ms = c(100, 1000))
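For readers who want to see what a TIC is numerically, here is a minimal base R sketch. It assumes the layout described earlier (rows are scans, columns are masses) and uses simulated data; tic_from_matrix is a hypothetical helper, not the implementation of plottic.

```r
# Simulated full-scan matrix: rows are retention times, columns are m/z values.
set.seed(42)
rt <- seq(3, 26, by = 0.5)
mz <- seq(50, 1050, by = 50)
m <- matrix(runif(length(rt) * length(mz), 0, 1000),
            nrow = length(rt), dimnames = list(rt, mz))

# Sum the intensities of all masses inside the m/z window for each scan
# inside the retention time window.
tic_from_matrix <- function(m, rt, ms) {
  scan_rt <- as.numeric(rownames(m))
  scan_mz <- as.numeric(colnames(m))
  keep_rt <- scan_rt >= rt[1] & scan_rt <= rt[2]
  keep_mz <- scan_mz >= ms[1] & scan_mz <= ms[2]
  rowSums(m[keep_rt, keep_mz, drop = FALSE])
}

tic <- tic_from_matrix(m, rt = c(3.1, 25), ms = c(100, 1000))
plot(as.numeric(names(tic)), tic, type = "l",
     xlab = "Retention time", ylab = "Total ion current")
```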
You can also plot the mass spectrum for a certain RT range. The returned MSP files can be used for a NIST search:
plotrtms(data, rt = c(3.1, 25), ms = c(100, 1000))
The extracted ion chromatogram (EIC) is also supported by enviGCMS, and the returned data can be analyzed for molecular isotopes:
plotmsrt(data, ms = 500, rt = c(3.1, 25))
You can use plotms() or plotmz() to show a heatmap or scatter plot of LC/GC-MS data, which is very useful for exploratory data analysis:
plotms(data)
plotmz(data)
You can convert retention time into temperature if the temperature rises at a constant rate during the run. You need to supply the temperature range:
plott(data, temp = c(100, 320))
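With a constant heating rate, each retention time maps to a temperature by linear interpolation between the start and end temperatures. A minimal sketch in base R, where rt_to_temp is a hypothetical helper and not necessarily how plott computes its axis:

```r
# Map retention times onto a linear temperature ramp.
# temp = c(start, end) are the temperatures at the first and last scan.
rt_to_temp <- function(rt, temp) {
  temp[1] + (rt - min(rt)) / (max(rt) - min(rt)) * (temp[2] - temp[1])
}

rt <- seq(0, 22, by = 2)                # retention times of the scans
rt_to_temp(rt, temp = c(100, 320))      # 100, 120, ..., 320: 10 degrees per time unit
```

The mapping is only meaningful for the part of the run where the oven program is a single linear ramp; isothermal holds would need a piecewise function.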
enviGCMS supplies many functions for reducing noise during the analysis. findline() fits the boundary regression line of the noise. comparems() makes a point-to-point subtraction of two full-scan mass spectrum datasets. plotgroup() converts the data matrix into a 0-1 heatmap according to a threshold. plotsub() shows the self background subtraction of full-scan data. plotsms() shows the relative standard deviation (RSD) of the intensities of full-scan data. plothist() shows the histogram of the intensity distribution of full-scan data.
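To make three of these ideas concrete, the base R sketch below shows point-to-point subtraction, thresholding to a 0-1 matrix, and an RSD per mass on simulated matrices. It is an illustration only and does not reproduce the internals of the package functions; details such as the dimension along which the RSD is computed are assumptions and should be checked in the package documentation.

```r
set.seed(7)
# Simulated blank and sample runs with identical scans (rows) and masses (columns).
blank <- matrix(rnorm(20, mean = 100, sd = 5), nrow = 5)
smp   <- blank + matrix(c(rep(0, 10), rep(500, 5), rep(0, 5)), nrow = 5)

# Point-to-point subtraction; negative differences are set to zero.
diff_matrix <- pmax(smp - blank, 0)

# 0-1 matrix: 1 where the intensity exceeds a chosen threshold.
threshold <- 250
binary <- (diff_matrix > threshold) * 1

# Relative standard deviation (%) of each mass across scans.
rsd <- apply(smp, 2, function(x) sd(x) / mean(x) * 100)
```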
Some functions can be used to calculate molecular isotope ratios. EIC data can be passed to GetIntegration(), which returns information about the peaks found. Getisotopologues() calculates the molecular isotope ratio of a given molecule. Shortcut functions such as batch() and qbatch() calculate molecular isotope ratios for multiple molecules or a single molecule in EIC data.
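As an illustration of the calculation, an isotope ratio can be estimated by integrating the same chromatographic peak in the EICs of two isotopologues and dividing the areas. The sketch below uses simulated Gaussian peaks and trapezoidal integration over a fixed window; peak_area is a hypothetical helper, and the peak detection and baseline handling in the package are likely more involved.

```r
# Simulated EICs of two isotopologues sharing one peak at rt = 10.
rt <- seq(8, 12, by = 0.01)
eic_light <- 1000 * dnorm(rt, mean = 10, sd = 0.1) + 5   # constant baseline of 5
eic_heavy <-  650 * dnorm(rt, mean = 10, sd = 0.1) + 5

# Trapezoidal peak area inside a retention time window after subtracting
# a baseline estimated as the median signal outside the window.
peak_area <- function(rt, intensity, window) {
  inside <- rt >= window[1] & rt <= window[2]
  baseline <- median(intensity[!inside])
  x <- rt[inside]
  y <- intensity[inside] - baseline
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

ratio <- peak_area(rt, eic_heavy, c(9.5, 10.5)) /
  peak_area(rt, eic_light, c(9.5, 10.5))
ratio   # close to the simulated ratio of 0.65
```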
In environmental non-target analysis, specific functions for comparison, batch correction, and visualization are useful for handling data pre-processed by xcms. Isotope extraction for a single group of samples with a certain mass difference can be done with getmassdiff(). The other functions are listed below:
getdata() gets the xcmsSet object in one step with optimized methods.
getbgremove() subtracts the background group from an xcmsSet object with two groups (the first group is the background group).
getupload() produces the csv files to be submitted to MetaboAnalyst.
gettechrep() gets the report for technical replicates.
getbiotechrep() gets the report for all technical replicates of biologically replicated samples in a single group.
getgrouprep() gets the report for samples with biological and technical replicates in different groups.
gettimegrouprep() gets the report for time series or two-factor groups for samples with biological and technical replicates in different groups.
plotmr() draws a scatter plot for one or two xcmsSet objects with a threshold.
plote() plots the EIC and boxplot for all peaks and returns the diffreport.
getfeaturest() gets the features from a t test, with p value, q value, RSD, and power restrictions.
getfeaturesanova() gets the features from ANOVA, with p value, q value, RSD, and power restrictions.
svacor() performs surrogate variable analysis (SVA) to correct batch effects.
svapca() and svaplot() visualize the batch effects and their influence on each peak.
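The kind of filtering described for getfeaturest can be sketched in base R. This is a generic illustration on a simulated peak table, not the implementation of the package function: the thresholds, the use of Benjamini-Hochberg adjusted p values as q values, and all object names are assumptions, and the power restriction is omitted.

```r
set.seed(123)
# Simulated peak table: 100 features (rows), 5 control and 5 treated samples.
group <- factor(rep(c("control", "treated"), each = 5))
peaks <- matrix(rlnorm(100 * 10, meanlog = 10, sdlog = 0.1), nrow = 100)
peaks[1:10, group == "treated"] <- peaks[1:10, group == "treated"] * 3  # true changes

# Welch t test per feature, then Benjamini-Hochberg adjusted p values.
p <- apply(peaks, 1, function(x) t.test(x ~ group)$p.value)
q <- p.adjust(p, method = "BH")

# Largest within-group RSD (%) per feature as a reproducibility filter.
rsd <- apply(peaks, 1, function(x) {
  max(tapply(x, group, function(g) sd(g) / mean(g) * 100))
})

selected <- which(p < 0.05 & q < 0.05 & rsd < 30)
```

Filtering on the adjusted values controls the false discovery rate across all features, while the RSD filter removes peaks that are not reproducible within a group.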
In general, enviGCMS can be used to explore data from GC/LC-MS and to extract patterns from the data for various purposes.