In this vignette, we describe how to use the NewmanOmics Paired and Banked tests to analyze gene expression data from a single sample.
As usual, we start by loading the package:
library(NewmanOmics)
The package contains paired tumor and normal samples from patients with head and neck cancer. These came from a study that was submitted to the Gene Expression Omnibus.
data(GSE6631)
dim(GSE6631)
## [1] 2000 44
GSE6631[1:5, 1:4]
## Normal.mucosa.1 Cancer.1 Normal.mucosa.2 Cancer.2
## 34155_s_at 26.42586 22.19725 22.13673 18.66223
## 34281_at 334.29232 382.92879 393.40014 509.30754
## 39125_at 258.62695 290.06060 268.97994 220.16837
## 37276_at 45.65556 38.86692 34.77368 33.40627
## 1519_at 423.26690 366.40731 308.62338 550.34888
As we can see, this consists of (normalized) Affymetrix microarray data. The odd-numbered columns are derived from normal mucosa, and the even-numbered columns are derived from the paired tumor samples.
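The column layout described above can be checked from the column names. The following is a minimal sketch in base R. It uses a hypothetical vector `cn` that mimics the four names printed above; on the real data one would use `colnames(GSE6631)` instead, assuming all 44 names follow the same `Normal.mucosa.N` / `Cancer.N` pattern, which the printed excerpt does not guarantee.

```r
# Hypothetical column names copied from the excerpt above;
# replace with colnames(GSE6631) to check the real data set.
cn <- c("Normal.mucosa.1", "Cancer.1", "Normal.mucosa.2", "Cancer.2")

# TRUE where the name marks a normal sample
isNormal <- grepl("^Normal", cn)

# TRUE for odd-numbered column positions
oddCols <- seq_along(cn) %% 2 == 1

# The two logical vectors agree if odd columns are normal
# and even columns are tumor.
stopifnot(identical(isNormal, oddCols))
```

If the check fails on real data, the `seq(1, ncol(...), 2)` style of indexing used later in this vignette would pick the wrong samples.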
Before proceeding, we are going to log-transform the data.
HN <- log2(1 + GSE6631)
boxplot(HN, col=c("forestgreen", "dodgerblue"))
The figure suggests that the data have been reasonably normalized, and that they are unlikely to be overwhelmed by artifacts.
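As a side note on the transformation, adding 1 before taking logarithms keeps zero intensities finite. A small base R illustration on made-up values (the vector `x` is hypothetical, not taken from GSE6631):

```r
# Made-up intensities, including a zero
x <- c(0, 1, 3, 1023)

# A plain log2 sends the zero to -Inf
plainLog <- log2(x)

# Shifting by 1 maps 0 to 0 and keeps everything finite
shiftedLog <- log2(1 + x)

print(plainLog)
print(shiftedLog)
```

The shift has little effect on large intensities but avoids infinite values that would break downstream summaries.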
To illustrate the Newman Paired test, we are going to use only one tumor sample together with its paired normal sample.
HN1 <- HN[, 1:2]
result1 <- pairedStat(HN1, pairing = c(-1,1))
summary(result1@nu.statistics)
## Cancer.1
## Min. : 0.000584
## 1st Qu.: 0.415834
## Median : 0.988511
## Mean : 1.417227
## 3rd Qu.: 1.834846
## Max. :17.437729
summary(result1@p.values)
## Cancer.1
## Min. :0.0000
## 1st Qu.:0.3002
## Median :0.5768
## Mean :0.5462
## 3rd Qu.:0.8142
## Max. :0.9998
We can create a histogram of the per-gene (empirical) p-values.
hist(result1)
We can also produce an “M-versus-A” plot of the data.
plot(result1)
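For readers unfamiliar with the term, an M-versus-A plot conventionally shows, for each gene, the difference of the two log intensities (M) against their average (A). The sketch below computes both on made-up values in base R. The vectors `base` and `perturbed` are hypothetical, and this is the standard textbook definition, not necessarily the exact quantities that the package's `plot` method draws.

```r
# Made-up log2 intensities for three genes in one pair of samples
base      <- c(g1 = 4.0, g2 = 6.0, g3 = 8.0)   # e.g. normal
perturbed <- c(g1 = 5.0, g2 = 6.0, g3 = 6.0)   # e.g. tumor

# M: per-gene log difference; A: per-gene mean log intensity
M <- perturbed - base
A <- (perturbed + base) / 2

# Genes far from the horizontal line M = 0 differ most within the pair
plot(A, M, xlab = "A (mean log2 intensity)", ylab = "M (log2 difference)")
abline(h = 0, lty = 2)
```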
The pairedStat function has flexible inputs, allowing you to store the data in various ways. Here we run the algorithm for three pairs, with an explicit pairing vector.
result2 <- pairedStat(HN[, 1:6], pairing=c(-1, 1, -2, 2, -3, 3))
summary(result2@nu.statistics)
## Cancer.1 Cancer.2 Cancer.3
## Min. : 0.000584 Min. : 0.000984 Min. : 0.001682
## 1st Qu.: 0.415834 1st Qu.: 0.459064 1st Qu.: 0.369775
## Median : 0.988511 Median : 1.020574 Median : 0.878206
## Mean : 1.417227 Mean : 1.417032 Mean : 1.419570
## 3rd Qu.: 1.834846 3rd Qu.: 1.779653 3rd Qu.: 1.765563
## Max. :17.437729 Max. :15.264180 Max. :16.572431
summary(result2@p.values)
## Cancer.1 Cancer.2 Cancer.3
## Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.3010 1st Qu.:0.3157 1st Qu.:0.3196
## Median :0.5770 Median :0.5646 Median :0.6202
## Mean :0.5462 Mean :0.5375 Mean :0.5629
## 3rd Qu.:0.8145 3rd Qu.:0.7955 3rd Qu.:0.8347
## Max. :0.9997 Max. :0.9995 Max. :0.9992
plot(result2)
hist(result2)
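The explicit pairing vector used above can be generated rather than typed by hand when the columns alternate between normal and tumor. This is a base R sketch. It assumes the convention suggested by the examples in this vignette, namely that a negative entry marks the baseline member of a pair and the positive entry with the same absolute value marks its partner; the variable names are illustrative.

```r
nPairs <- 3

# Interleave -k and k for k = 1..nPairs: -1 1 -2 2 -3 3
pairing <- as.vector(rbind(-seq_len(nPairs), seq_len(nPairs)))
print(pairing)

# Recover which column positions hold each member of every pair
baseCols <- match(-seq_len(nPairs), pairing)  # positions of -1, -2, -3
pertCols <- match(seq_len(nPairs), pairing)   # positions of  1,  2,  3
print(baseCols)
print(pertCols)
```

For columns stored in a different order, the same `match` calls show which column would be treated as the partner of which, which is a quick sanity check before running the test.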
We can also input the same data as a pair of matrices.
normals <- HN[, c(1,3,5)]
tumors <- HN[, c(2,4,6)]
result3 <- pairedStat(normals, tumors)
summary(result3@nu.statistics)
## Cancer.1 Cancer.2 Cancer.3
## Min. : 0.000584 Min. : 0.000984 Min. : 0.001682
## 1st Qu.: 0.415834 1st Qu.: 0.459064 1st Qu.: 0.369775
## Median : 0.988511 Median : 1.020574 Median : 0.878206
## Mean : 1.417227 Mean : 1.417032 Mean : 1.419570
## 3rd Qu.: 1.834846 3rd Qu.: 1.779653 3rd Qu.: 1.765563
## Max. :17.437729 Max. :15.264180 Max. :16.572431
summary(result3@p.values)
## Cancer.1 Cancer.2 Cancer.3
## Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.3001 1st Qu.:0.3147 1st Qu.:0.3187
## Median :0.5775 Median :0.5652 Median :0.6206
## Mean :0.5465 Mean :0.5377 Mean :0.5632
## 3rd Qu.:0.8154 3rd Qu.:0.7964 3rd Qu.:0.8358
## Max. :0.9997 Max. :0.9996 Max. :0.9993
Or we can input the same data as a list of paired samples.
listOfPairs <- list(HN[,1:2], HN[,3:4], HN[,5:6])
result4 <- pairedStat(listOfPairs)
summary(result4@nu.statistics)
## Cancer.1 Cancer.2 Cancer.3
## Min. : 0.000584 Min. : 0.000984 Min. : 0.001682
## 1st Qu.: 0.415834 1st Qu.: 0.459064 1st Qu.: 0.369775
## Median : 0.988511 Median : 1.020574 Median : 0.878206
## Mean : 1.417227 Mean : 1.417032 Mean : 1.419570
## 3rd Qu.: 1.834846 3rd Qu.: 1.779653 3rd Qu.: 1.765563
## Max. :17.437729 Max. :15.264180 Max. :16.572431
summary(result4@p.values)
## Cancer.1 Cancer.2 Cancer.3
## Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.3004 1st Qu.:0.3152 1st Qu.:0.3190
## Median :0.5769 Median :0.5645 Median :0.6203
## Mean :0.5463 Mean :0.5375 Mean :0.5630
## 3rd Qu.:0.8149 3rd Qu.:0.7959 3rd Qu.:0.8350
## Max. :0.9998 Max. :0.9996 Max. :0.9993
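The three input styles shown so far (one interleaved matrix, a pair of matrices, and a list of two-column matrices) are just different arrangements of the same columns. The following base R sketch demonstrates this on a small made-up matrix; `toy` and its column names are hypothetical stand-ins for `HN[, 1:6]`.

```r
set.seed(1)

# Made-up 4-gene data with interleaved normal (N) and tumor (T) columns
toy <- matrix(rnorm(24), nrow = 4,
              dimnames = list(paste0("gene", 1:4),
                              c("N1", "T1", "N2", "T2", "N3", "T3")))

# Style 2: separate matrices of normals and tumors
toyNormals <- toy[, c(1, 3, 5)]
toyTumors  <- toy[, c(2, 4, 6)]

# Style 3: a list with one two-column matrix per pair
toyPairs <- list(toy[, 1:2], toy[, 3:4], toy[, 5:6])

# Binding the list back together recovers the interleaved matrix
rebuilt <- do.call(cbind, toyPairs)
stopifnot(isTRUE(all.equal(rebuilt, toy)))

# The second pair holds the second normal and the second tumor
stopifnot(isTRUE(all.equal(toyPairs[[2]][, 1], toyNormals[, 2])))
stopifnot(isTRUE(all.equal(toyPairs[[2]][, 2], toyTumors[, 2])))
```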
A completely different approach to personalized transcriptomics is to compare individual samples to a “bank” of known normals.
normals <- HN[, seq(1, ncol(HN), 2)] # odds are normal
tumors <- HN[, seq(2, ncol(HN), 2)] # evens are tumor
bank <- createBank(normals)
result5 <- bankStat(bank, tumors[,1,drop=FALSE])
summary(result5$nu.statistics)
## Cancer.1
## Min. :-9.6255
## 1st Qu.:-0.5540
## Median : 0.2797
## Mean : 0.1027
## 3rd Qu.: 0.9911
## Max. : 6.7524
summary(result5$p.values)
## Cancer.1
## Min. :0.0000
## 1st Qu.:0.2898
## Median :0.6101
## Mean :0.5569
## 3rd Qu.:0.8392
## Max. :1.0000
hist(result5$p.values, breaks=101)
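To give some intuition for comparing one sample against a bank of normals, the sketch below computes a plain per-gene z-score in base R on simulated data. This is a deliberately simplified stand-in and not the NewmanOmics algorithm: `createBank` and `bankStat` may estimate the per-gene variability differently, and all names and values here are hypothetical.

```r
set.seed(42)
nGenes <- 5

# Simulated bank: 10 normal samples on a log2-like scale
bankData <- matrix(rnorm(nGenes * 10, mean = 8), nrow = nGenes,
                   dimnames = list(paste0("gene", 1:nGenes),
                                   paste0("normal", 1:10)))

# One simulated new sample to compare against the bank
newSample <- rnorm(nGenes, mean = 8)
names(newSample) <- rownames(bankData)

# Per-gene centre and spread estimated from the bank alone
bankMean <- rowMeans(bankData)
bankSD   <- apply(bankData, 1, sd)

# Signed deviation of the new sample from the bank, in SD units
z <- (newSample - bankMean) / bankSD
print(round(z, 2))
```

As in the summary of the bank statistics above, a score of this kind is signed: negative values indicate genes below the bank average and positive values indicate genes above it.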