NewmanOmics: Tools for Personalized Transcriptomics

Kevin R. Coombes

In this vignette, we describe how to use the NewmanOmics Paired and Banked tests to analyze gene expression data from a single sample.

Getting Started

As usual, we start by loading the package:

library(NewmanOmics)

The package contains paired tumor and normal samples from patients with head and neck cancer. These data came from a study that was submitted to the Gene Expression Omnibus (GEO).

data(GSE6631)
dim(GSE6631)
## [1] 2000   44
GSE6631[1:5, 1:4]
##            Normal.mucosa.1  Cancer.1 Normal.mucosa.2  Cancer.2
## 34155_s_at        26.42586  22.19725        22.13673  18.66223
## 34281_at         334.29232 382.92879       393.40014 509.30754
## 39125_at         258.62695 290.06060       268.97994 220.16837
## 37276_at          45.65556  38.86692        34.77368  33.40627
## 1519_at          423.26690 366.40731       308.62338 550.34888

As we can see, this data set consists of (normalized) Affymetrix microarray data, with 2000 probe sets in the rows and 44 samples in the columns. The odd-numbered columns are derived from normal mucosa, and the even-numbered columns are derived from the paired tumor samples.
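Because everything that follows depends on this alternating layout, it can be worth checking it from the column names rather than trusting column positions. The following is a minimal sketch in base R. It uses a small made-up matrix (`toy`, a hypothetical stand-in) whose column names follow the same `Normal.mucosa.N` / `Cancer.N` pattern shown above; the same checks could be applied to `colnames(GSE6631)`.

```r
# Toy matrix that mimics the alternating normal/tumor column layout.
# The values are made up; only the column names matter here.
toy <- matrix(1:12, nrow = 2,
              dimnames = list(c("geneA", "geneB"),
                              c("Normal.mucosa.1", "Cancer.1",
                                "Normal.mucosa.2", "Cancer.2",
                                "Normal.mucosa.3", "Cancer.3")))

isNormal <- grepl("^Normal", colnames(toy))
isTumor  <- grepl("^Cancer", colnames(toy))

# Normals should occupy the odd positions and tumors the even positions.
oddCols  <- seq_len(ncol(toy)) %% 2 == 1
layoutOK <- all(isNormal == oddCols) && all(isTumor == !oddCols)

# The numeric suffixes should match pair by pair (1 with 1, 2 with 2, ...).
normalID <- sub("^Normal\\.mucosa\\.", "", colnames(toy)[isNormal])
tumorID  <- sub("^Cancer\\.", "", colnames(toy)[isTumor])
pairsOK  <- identical(normalID, tumorID)

layoutOK
pairsOK
```

If either check fails on real data, the odd/even indexing used later in this vignette would silently pair the wrong samples.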

Before proceeding, we are going to log-transform the data.

HN <- log2(1 + GSE6631)
boxplot(HN, col=c("forestgreen", "dodgerblue"))

Box plot of log-transformed data. The figure suggests that the data have been reasonably normalized, and that they are unlikely to be overwhelmed by artifacts.
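The reason for adding 1 before taking logarithms is easy to see on a few made-up values. This is only an illustration; the vector `intensities` is hypothetical and is not part of the package data.

```r
# Made-up intensities, including a zero.
intensities <- c(0, 1, 3, 7, 1023)

# With the offset, zero maps to zero and all results are finite.
shifted <- log2(1 + intensities)

# Without the offset, a zero intensity becomes -Inf.
unshifted <- log2(intensities)

shifted
is.finite(unshifted[1])
```

The offset has almost no effect on large intensities but keeps small ones from producing infinite or wildly negative values.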

Paired Statistic

To illustrate the Newman Paired test, we are going to use only one pair of samples: the normal and tumor samples from the first patient.

HN1 <- HN[, 1:2]
result1 <- pairedStat(HN1, pairing = c(-1,1))
summary(result1@nu.statistics)
##     Cancer.1        
##  Min.   : 0.000584  
##  1st Qu.: 0.415834  
##  Median : 0.988511  
##  Mean   : 1.417227  
##  3rd Qu.: 1.834846  
##  Max.   :17.437729
summary(result1@p.values)
##     Cancer.1     
##  Min.   :0.0000  
##  1st Qu.:0.3002  
##  Median :0.5768  
##  Mean   :0.5462  
##  3rd Qu.:0.8142  
##  Max.   :0.9998

We can create a histogram of the per-gene (empirical) p-values:

hist(result1)

Histogram of empirical p-values. We can also produce an “M-versus-A” (Bland-Altman) plot of the data.

plot(result1)

Bland-Altman plot.
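For readers unfamiliar with the terminology, the coordinates of an M-versus-A plot can be computed by hand. The sketch below uses the usual definitions on log-scale data (M is the difference between the two samples, A is their average) with three made-up genes. It is meant to explain the axes, and is not necessarily identical to what the package's plot method draws.

```r
# Made-up log-scale expression values for three genes in one pair.
normal <- c(g1 = 2, g2 = 4, g3 = 6)
tumor  <- c(g1 = 3, g2 = 4, g3 = 10)

# M: log difference (tumor minus normal); A: average log intensity.
M <- tumor - normal
A <- (tumor + normal) / 2

rbind(A = A, M = M)
```

Calling `plot(A, M)` on these vectors would give a bare-bones version of the figure; genes with no change between the two samples lie on the horizontal line M = 0.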

Alternate Inputs

The pairedStat function has flexible inputs, allowing you to store the data in various ways. Here we run the algorithm for three pairs, with an explicit pairing vector.

result2 <- pairedStat(HN[, 1:6], pairing=c(-1, 1, -2, 2, -3, 3))
summary(result2@nu.statistics)
##     Cancer.1            Cancer.2            Cancer.3        
##  Min.   : 0.000584   Min.   : 0.000984   Min.   : 0.001682  
##  1st Qu.: 0.415834   1st Qu.: 0.459064   1st Qu.: 0.369775  
##  Median : 0.988511   Median : 1.020574   Median : 0.878206  
##  Mean   : 1.417227   Mean   : 1.417032   Mean   : 1.419570  
##  3rd Qu.: 1.834846   3rd Qu.: 1.779653   3rd Qu.: 1.765563  
##  Max.   :17.437729   Max.   :15.264180   Max.   :16.572431
summary(result2@p.values)
##     Cancer.1         Cancer.2         Cancer.3     
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.3010   1st Qu.:0.3157   1st Qu.:0.3196  
##  Median :0.5770   Median :0.5646   Median :0.6202  
##  Mean   :0.5462   Mean   :0.5375   Mean   :0.5629  
##  3rd Qu.:0.8145   3rd Qu.:0.7955   3rd Qu.:0.8347  
##  Max.   :0.9997   Max.   :0.9995   Max.   :0.9992
plot(result2)

Bland-Altman plots.

hist(result2)

P-value histograms.
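The pairing vector used above does not have to be typed by hand. The sketch below builds it for any number of pairs, assuming the convention suggested by the example: a negative integer marks the normal sample, the matching positive integer marks the tumor from the same patient, and the columns alternate normal, tumor. That reading is our assumption; consult the `pairedStat` help page for the authoritative definition.

```r
nPairs <- 3

# Interleave -1, -2, -3 with 1, 2, 3 to get -1 1 -2 2 -3 3.
pairing <- as.vector(rbind(-seq_len(nPairs), seq_len(nPairs)))

# Recover which columns hold the normals and which hold the tumors.
normalCols <- match(-seq_len(nPairs), pairing)
tumorCols  <- match(seq_len(nPairs), pairing)

pairing
normalCols
tumorCols
```

Using `match` to look up the column positions means the same code still finds the right columns if the samples are not stored in strictly alternating order, provided the pairing vector describes them correctly.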

We can also input the same data as a pair of matrices.

normals <- HN[, c(1,3,5)]
tumors <- HN[, c(2,4,6)]
result3 <- pairedStat(normals, tumors)
summary(result3@nu.statistics)
##     Cancer.1            Cancer.2            Cancer.3        
##  Min.   : 0.000584   Min.   : 0.000984   Min.   : 0.001682  
##  1st Qu.: 0.415834   1st Qu.: 0.459064   1st Qu.: 0.369775  
##  Median : 0.988511   Median : 1.020574   Median : 0.878206  
##  Mean   : 1.417227   Mean   : 1.417032   Mean   : 1.419570  
##  3rd Qu.: 1.834846   3rd Qu.: 1.779653   3rd Qu.: 1.765563  
##  Max.   :17.437729   Max.   :15.264180   Max.   :16.572431
summary(result3@p.values)
##     Cancer.1         Cancer.2         Cancer.3     
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.3001   1st Qu.:0.3147   1st Qu.:0.3187  
##  Median :0.5775   Median :0.5652   Median :0.6206  
##  Mean   :0.5465   Mean   :0.5377   Mean   :0.5632  
##  3rd Qu.:0.8154   3rd Qu.:0.7964   3rd Qu.:0.8358  
##  Max.   :0.9997   Max.   :0.9996   Max.   :0.9993

Or we can input the same data as a list of paired samples.

listOfPairs <- list(HN[,1:2], HN[,3:4], HN[,5:6])
result4 <- pairedStat(listOfPairs)
summary(result4@nu.statistics)
##     Cancer.1            Cancer.2            Cancer.3        
##  Min.   : 0.000584   Min.   : 0.000984   Min.   : 0.001682  
##  1st Qu.: 0.415834   1st Qu.: 0.459064   1st Qu.: 0.369775  
##  Median : 0.988511   Median : 1.020574   Median : 0.878206  
##  Mean   : 1.417227   Mean   : 1.417032   Mean   : 1.419570  
##  3rd Qu.: 1.834846   3rd Qu.: 1.779653   3rd Qu.: 1.765563  
##  Max.   :17.437729   Max.   :15.264180   Max.   :16.572431
summary(result4@p.values)
##     Cancer.1         Cancer.2         Cancer.3     
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.3004   1st Qu.:0.3152   1st Qu.:0.3190  
##  Median :0.5769   Median :0.5645   Median :0.6203  
##  Mean   :0.5463   Mean   :0.5375   Mean   :0.5630  
##  3rd Qu.:0.8149   3rd Qu.:0.7959   3rd Qu.:0.8350  
##  Max.   :0.9998   Max.   :0.9996   Max.   :0.9993

Banked Statistic

A completely different approach to personalized transcriptomics is to compare individual samples to a “bank” of known normals.

normals <- HN[, seq(1, ncol(HN), 2)] # odds are normal
tumors <- HN[, seq(2, ncol(HN), 2)] # evens are tumor
bank <- createBank(normals)
result5 <- bankStat(bank, tumors[,1,drop=FALSE])
summary(result5$nu.statistics)
##     Cancer.1      
##  Min.   :-9.6255  
##  1st Qu.:-0.5540  
##  Median : 0.2797  
##  Mean   : 0.1027  
##  3rd Qu.: 0.9911  
##  Max.   : 6.7524
summary(result5$p.values)
##     Cancer.1     
##  Min.   :0.0000  
##  1st Qu.:0.2898  
##  Median :0.6101  
##  Mean   :0.5569  
##  3rd Qu.:0.8392  
##  Max.   :1.0000
hist(result5$p.values, breaks=101)
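To give some intuition for what comparing one sample to a bank of normals means, here is a deliberately simplified base-R sketch: it summarizes simulated normals by a per-gene mean and standard deviation, then scores a new sample as a z-score against that summary. This is a stand-in for illustration only; it is not the statistic computed by `createBank` and `bankStat`, and all object names (`toyNormals`, `toyMean`, `toySD`, `newSample`) are hypothetical.

```r
set.seed(1)
nGenes <- 5

# Simulated log-scale "bank" of 10 normal samples.
toyNormals <- matrix(rnorm(nGenes * 10, mean = 8, sd = 1), nrow = nGenes,
                     dimnames = list(paste0("gene", 1:nGenes),
                                     paste0("Normal.", 1:10)))

# Per-gene summary of the bank.
toyMean <- rowMeans(toyNormals)
toySD   <- apply(toyNormals, 1, sd)

# A new sample that equals the bank mean, except that gene1
# is shifted upward by five bank standard deviations.
newSample <- toyMean
newSample[1] <- newSample[1] + 5 * toySD[1]

# Per-gene z-score of the new sample relative to the bank.
z <- (newSample - toyMean) / toySD
round(z, 3)
```

Only the shifted gene stands out, which is the behavior one hopes for from any bank-based comparison: the bank defines what normal variation looks like, and a single sample is judged against it without needing its own matched control.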
