
NACHO Analysis

A NAnostring quality Control dasHbOard

Mickaël Canouil, Ph.D., Gerard A. Bouland and Roderick C. Slieker, Ph.D.

January 12, 2024

1 Installation

# Install NACHO from CRAN:
install.packages("NACHO")

# Or the development version from GitHub:
# install.packages("remotes")
remotes::install_github("mcanouil/NACHO")

2 Overview

NACHO (NAnostring quality Control dasHbOard) is developed for NanoString nCounter data.
NanoString nCounter is a messenger-RNA/micro-RNA (mRNA/miRNA) expression assay that works with fluorescent barcodes.
Each barcode is assigned to an mRNA/miRNA and can be counted after binding to its target.
As a result, each count of a specific barcode represents the presence of its target mRNA/miRNA.

NACHO loads, visualises and normalises the exported NanoString nCounter data and helps the user perform quality control.
It does this by visualising quality-control metrics, expression of control genes, principal components and sample-specific size factors in an interactive web application.

Two functions summarise and visualise RCC files: load_rcc() and visualise().

The load_rcc() function is used to preprocess the data.
The visualise() function initiates a Shiny-based dashboard that visualises all relevant QC plots.

NACHO also includes a function normalise(), which (re)calculates sample-specific size factors and normalises the data.

The normalise() function returns a list in which your settings, the raw counts and the normalised counts are stored.

In addition, since v0.6.0, NACHO includes two additional functions (three when counting print(), which render() relies on):

The render() function renders a full quality-control report (HTML) based on the results of a call to load_rcc() or normalise() (using print() in an R Markdown chunk).
The autoplot() function draws any of the quality-control metrics shown by visualise() and render().

For more details, see vignette("NACHO") and vignette("NACHO-analysis").
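The functions described above could be chained roughly as follows. This is a minimal sketch, not the authors' reference workflow: it assumes the example dataset `GSE74821` ships with the installed NACHO version and that the argument names shown match that version. The object names `gse_norm` and `qc_plot` are illustrative only.

```r
# Sketch only: the body runs when NACHO is installed.
if (requireNamespace("NACHO", quietly = TRUE)) {
  library(NACHO)
  data(GSE74821) # example dataset, assumed to ship with the installed version

  # (Re)calculate size factors and normalised counts with explicit settings.
  gse_norm <- normalise(
    nacho_object = GSE74821,
    housekeeping_predict = FALSE,
    housekeeping_norm = TRUE,
    normalisation_method = "GEO",
    remove_outliers = FALSE
  )

  # Draw a single QC metric ("BD", binding density) as a static plot.
  qc_plot <- autoplot(gse_norm, x = "BD")

  # Launch the Shiny dashboard only in an interactive session.
  if (interactive()) visualise(gse_norm)
}
```

Guarding the call to visualise() with interactive() keeps the sketch usable in scripts and reports, where a Shiny application cannot be opened.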

Canouil M, Bouland GA, Bonnefond A, Froguel P, Hart L, Slieker R (2019). “NACHO: an R package for quality control of NanoString nCounter data.” Bioinformatics. ISSN 1367-4803, doi:10.1093/bioinformatics/btz647.

@Article{,
  title = {{NACHO}: an {R} package for quality control of {NanoString} {nCounter} data},
  author = {Mickaël Canouil and Gerard A. Bouland and Amélie Bonnefond and Philippe Froguel and Leen Hart and Roderick Slieker},
  journal = {Bioinformatics},
  address = {Oxford, England},
  year = {2019},
  month = {aug},
  issn = {1367-4803},
  doi = {10.1093/bioinformatics/btz647},
}

3 Analyse NanoString data

3.1 Load packages

library(NACHO)
# GEOquery (and limma, used below) come from Bioconductor: BiocManager::install(c("GEOquery", "limma"))
library(GEOquery, quietly = TRUE, warn.conflicts = FALSE)
library(data.table) # provides dcast() and as.data.table(), used below

3.2 Download `GSE70970` from GEO (or use your own data)

data_directory <- file.path(tempdir(), "GSE70970", "Data")

# Download data
gse <- getGEO("GSE70970")
getGEOSuppFiles(GEO = "GSE70970", baseDir = tempdir())
# Unzip data
untar(
  tarfile = file.path(tempdir(), "GSE70970", "GSE70970_RAW.tar"),
  exdir = data_directory
)
# Get phenotypes and add IDs
targets <- pData(phenoData(gse[[1]]))
targets$IDFILE <- list.files(data_directory)
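Assigning list.files() output directly to the phenotype table relies on the files and the phenotype rows being in the same order. A more defensive variant, sketched here in base R on made-up data, matches each file to its sample by accession. It assumes that file names start with the sample accession followed by an underscore; the accessions, file names and the `targets_toy` object are hypothetical.

```r
# Hypothetical phenotype table: one row per sample, keyed by accession.
targets_toy <- data.frame(
  geo_accession = c("GSM0000001", "GSM0000002", "GSM0000003"),
  age = c(54, 61, 47),
  stringsAsFactors = FALSE
)

# Hypothetical file listing, deliberately in a different order.
rcc_files <- c("GSM0000003_c.RCC.gz", "GSM0000001_a.RCC.gz", "GSM0000002_b.RCC.gz")

# Extract the accession prefix (everything before the first underscore).
file_accession <- sub("_.*$", "", rcc_files)

# Match each phenotype row to its file; stop if a sample has no file
# or if two files share the same accession.
idx <- match(targets_toy$geo_accession, file_accession)
stopifnot(!anyNA(idx), !anyDuplicated(file_accession))
targets_toy$IDFILE <- rcc_files[idx]
```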

3.3 Import RCC files

GSE70970 <- load_rcc(data_directory, targets, id_colname = "IDFILE")

3.4 Perform the analyses using `limma`

library(limma)

3.4.1 Get the phenotypes

selected_pheno <- GSE70970[["nacho"]][
  j = lapply(unique(.SD), function(x) ifelse(x == "NA", NA, x)),
  .SDcols = c("IDFILE", "age:ch1", "gender:ch1", "chemo:ch1", "disease.event:ch1")
]
selected_pheno <- na.exclude(selected_pheno)
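The phenotype step converts the literal string "NA" into a real missing value before dropping incomplete rows. The same idea is sketched below in base R on a made-up table; `pheno_raw` and its values are hypothetical.

```r
# Hypothetical phenotypes in which missing values are stored as the text "NA".
pheno_raw <- data.frame(
  IDFILE = c("a", "b", "c"),
  age = c("54", "NA", "61"),
  stringsAsFactors = FALSE
)

# Replace the string "NA" with a real missing value, column by column.
pheno_clean <- pheno_raw
pheno_clean[] <- lapply(pheno_raw, function(x) ifelse(x == "NA", NA, x))

# Drop rows with at least one missing value.
pheno_clean <- na.exclude(pheno_clean)
```

Note that the columns stay character after this step, so numeric phenotypes such as age still need as.numeric() before they are used as covariates.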

3.4.2 Get the normalised counts

expr_counts <- GSE70970[["nacho"]][
  i = grepl("Endogenous", CodeClass),
  j = as.matrix(
    dcast(.SD, Name ~ IDFILE, value.var = "Count_Norm"),
    "Name"
  ),
  .SDcols = c("IDFILE", "Name", "Count_Norm")
]

Alternatively, the "Accession" number can be used instead of "Name" as the row identifier.

GSE70970[["nacho"]][
  i = grepl("Endogenous", CodeClass),
  j = as.matrix(
    dcast(.SD, Accession ~ IDFILE, value.var = "Count_Norm"),
    "Accession"
  ),
  .SDcols = c("IDFILE", "Accession", "Count_Norm")
]
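Both casts above reshape a long table (one row per feature and sample) into a features-by-samples matrix. A base R sketch of the same reshaping on made-up data is shown below. It uses tapply() rather than dcast(), assumes each feature/sample pair occurs exactly once, and all names and values are hypothetical.

```r
# Hypothetical long-format counts: one row per (sample, gene) pair.
long_toy <- data.frame(
  IDFILE = rep(c("s1", "s2"), each = 3),
  Name = rep(c("geneA", "geneB", "geneC"), times = 2),
  Count_Norm = c(10, 20, 30, 11, 21, 31),
  stringsAsFactors = FALSE
)

# Rows become genes, columns become samples. sum() is harmless here because
# each pair occurs once; duplicated pairs would be silently added together.
wide_toy <- tapply(
  long_toy$Count_Norm,
  list(long_toy$Name, long_toy$IDFILE),
  FUN = sum
)
```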

3.4.3 Select phenotypes and counts

Make sure the count matrix and the phenotypes contain the same samples.

samples_kept <- intersect(selected_pheno[["IDFILE"]], colnames(expr_counts))
expr_counts <- expr_counts[, samples_kept]
selected_pheno <- selected_pheno[IDFILE %in% c(samples_kept)]

Build the numeric design matrix

design <- model.matrix(~ `disease.event:ch1`, selected_pheno)
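The alignment and design-matrix steps can be checked on a small made-up example. The sketch below uses base R only; `expr_toy`, `pheno_toy` and the `event` column are hypothetical stand-ins for the count matrix, the phenotypes and `disease.event:ch1`.

```r
set.seed(1)

# Hypothetical count matrix: 3 genes (rows) by 4 samples (columns).
expr_toy <- matrix(
  rnorm(12),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), c("s1", "s2", "s3", "s4"))
)

# Hypothetical phenotypes: different order, and one sample without counts.
pheno_toy <- data.frame(
  IDFILE = c("s4", "s2", "s1", "s5"),
  event = c("yes", "no", "yes", "no"),
  stringsAsFactors = FALSE
)

# Keep the shared samples, in the order of the phenotype table.
kept <- intersect(pheno_toy$IDFILE, colnames(expr_toy))
expr_toy <- expr_toy[, kept, drop = FALSE]
pheno_toy <- pheno_toy[pheno_toy$IDFILE %in% kept, ]

# Columns of the matrix and rows of the phenotypes must now line up.
stopifnot(identical(colnames(expr_toy), pheno_toy$IDFILE))

# Intercept plus one indicator column for event == "yes".
design_toy <- model.matrix(~ event, data = pheno_toy)
```

The explicit stopifnot() is worth keeping in real analyses: a design matrix whose rows are in a different order from the matrix columns does not raise an error, it just gives wrong results.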

Fit the model with limma

eBayes(lmFit(expr_counts, design))
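The fitted object is usually summarised with limma's topTable(). The sketch below shows one way to do that on simulated data; it runs only when limma is installed, and `expr_sim`, `design_sim` and the `group` variable are hypothetical. limma's documentation describes lmFit() as working on log-expression values, so a log2 transform of the normalised counts may be appropriate before fitting; the simulated values here are treated as already on that scale.

```r
# Sketch only: the body runs when limma is installed.
if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(123)

  # Simulated log-scale expression: 10 genes by 6 samples.
  expr_sim <- matrix(
    rnorm(60, mean = 8),
    nrow = 10,
    dimnames = list(paste0("gene", 1:10), paste0("s", 1:6))
  )

  # Two groups of three samples each.
  design_sim <- model.matrix(
    ~ group,
    data = data.frame(group = rep(c("no", "yes"), each = 3))
  )

  # Gene-wise linear models with empirical Bayes moderation.
  fit <- limma::eBayes(limma::lmFit(expr_sim, design_sim))

  # One row per gene for the "yes" versus "no" coefficient.
  top <- limma::topTable(fit, coef = "groupyes", number = Inf)
  print(head(top))
}
```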

3.5 Perform the analyses using `lm` (or any other model)

GSE70970[["nacho"]][
  i = grepl("Endogenous", CodeClass),
  j = lapply(unique(.SD), function(x) ifelse(x == "NA", NA, x)),
  .SDcols = c(
    "IDFILE", "Name", "Accession", "Count", "Count_Norm",
    "age:ch1", "gender:ch1", "chemo:ch1", "disease.event:ch1"
  )
][
  Name %in% head(unique(Name), 10)
][
  j = as.data.table(
    coef(summary(lm(
      formula = Count_Norm ~ `disease.event:ch1`,
      data = na.exclude(.SD)
    ))),
    "term"
  ),
  by = c("Name", "Accession")
]
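The chained call above fits one linear model per gene and stacks the coefficient tables. The same pattern is sketched below in base R, using split() and lapply() instead of data.table's `by`, on simulated data; `long_sim`, `event` and the simulated values are hypothetical.

```r
set.seed(42)

# Simulated long-format data: 2 genes, 6 samples each.
long_sim <- data.frame(
  Name = rep(c("geneA", "geneB"), each = 6),
  Count_Norm = c(rnorm(6, mean = 10), rnorm(6, mean = 20)),
  event = rep(c("no", "yes"), times = 6),
  stringsAsFactors = FALSE
)

# Fit one model per gene and keep its coefficient table.
fits <- lapply(split(long_sim, long_sim$Name), function(d) {
  est <- coef(summary(lm(Count_Norm ~ event, data = d)))
  data.frame(
    Name = d$Name[1],
    term = rownames(est),
    est,
    row.names = NULL,
    check.names = FALSE
  )
})

# Stack the per-gene tables: one row per gene and model term.
results_sim <- do.call(rbind, fits)
```

Each gene contributes two rows, one for the intercept and one for the `event` effect, with the estimate, standard error, t value and p-value as columns.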
