This page gives a compact mental model for misha. Use it as the first
quick read before the full Manual vignette.
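Most analyses follow the same pattern: choose a scope (where to work), choose an iterator (in what chunks to step through it), and evaluate a track expression over each chunk.
In misha this is usually one call to gextract,
gscreen, or gsummary.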
You are not limited to raw track names. You can pass full
expressions, for example log(dense_track + 1),
dense_track / (chip.sum + 1e-6) (where chip.sum is a virtual track), or
pmin(dense_track, 2).
All examples below assume the bundled examples database:
library(misha)
gdb.init_examples()
A track is genomic signal organized over coordinates.
The examples DB includes dense_track and sparse_track.
Useful starter commands:
gtrack.ls() # list tracks in the examples DB
gtrack.info("dense_track") # inspect type/metadata
gtrack.info("sparse_track")
For intuition, you can think of dense_track as a
ChIP-seq-like coverage signal.
An interval set defines genomic regions
(chrom, start, end) where you
want to work.
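As a minimal sketch, assuming the examples database used elsewhere on this page, a scope can be built by hand with gintervals or taken as whole chromosomes with gintervals.all. The coordinates below are arbitrary and chosen only for illustration.

```r
library(misha)
gdb.init_examples()

# One region: the first 50 kb of chromosome 1 (coordinates chosen arbitrarily).
regions <- gintervals(1, 0, 50000)

# Several regions at once: starts and ends are given as parallel vectors.
two_regions <- gintervals(1, c(1000, 2000), c(1020, 2020))

# Every chromosome of the current database, from start to end.
genome <- gintervals.all()

# Interval sets are plain data frames with chrom, start, and end columns.
print(regions)
```

Because the result is an ordinary data frame, it can be filtered or combined with the usual R tools before being passed as the scope.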
The iterator is the stepping policy inside the scope.
iterator = 100 -> fixed 100 bp bins
iterator = "some_sparse_track" -> iterate over that track’s intervals
iterator = some_intervals_df -> iterate over explicit regions
iterator = "my_intervals_set" -> iterate directly over an intervals set
Think of it as: scope says where, iterator says in what chunks.
regions <- gintervals(1, 0, 50000) # scope: first 50 kb of chromosome 1
out <- gextract("dense_track", regions, iterator = 100)
log_out <- gextract("log(dense_track + 1)", regions, iterator = 100)
Create and use an intervals set as an iterator:
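One way this could look, as a sketch rather than the vignette's own example: save a few explicit regions under a name with gintervals.save, then pass that name as the iterator. The set name my_intervals_set and the coordinates are hypothetical, and the sketch assumes the examples database is writable in the current session.

```r
library(misha)
gdb.init_examples()

regions <- gintervals(1, 0, 50000)

# Hypothetical set name and three arbitrary regions on chromosome 1.
set_name <- "my_intervals_set"
my_regions <- gintervals(1, c(1000, 5000, 20000), c(2000, 6000, 21000))

# Remove any leftover set from a previous run, then save it to the database.
if (gintervals.exists(set_name)) gintervals.rm(set_name, force = TRUE)
gintervals.save(set_name, my_regions)

# Evaluate dense_track once per interval of the saved set within the scope.
out <- gextract("dense_track", regions, iterator = set_name)

# Clean up so the examples database is left unchanged.
gintervals.rm(set_name, force = TRUE)
```

Unlike a data frame iterator, a saved intervals set persists in the database until it is removed, so it can be reused across sessions.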
A virtual track is a named on-the-fly transformation, not stored as a physical track file.
Examples:
gvtrack.create("chip.sum", "dense_track", "sum")
out <- gextract("chip.sum", regions, iterator = 200)
You can also shift the iterator window used by the virtual track:
gvtrack.create("chip.shifted", "dense_track", "sum")
gvtrack.iterator("chip.shifted", sshift = -100, eshift = 100)
out <- gextract("chip.shifted", regions, iterator = 200)
Here, each iterator interval is expanded by 100 bp on both sides
before evaluating dense_track, so the 200 bp bin [200, 400) is summed over [100, 500).
Virtual tracks are session objects (easy to list with
gvtrack.ls and delete with gvtrack.rm).
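A short sketch of that lifecycle, assuming the examples database. The name chip.tmp is a hypothetical placeholder.

```r
library(misha)
gdb.init_examples()

# Define a virtual track for this session only (hypothetical name).
gvtrack.create("chip.tmp", "dense_track", "sum")

# List the virtual tracks currently defined in the session.
before <- gvtrack.ls()

# Remove it again; no track file is deleted because none was ever written.
gvtrack.rm("chip.tmp")
after <- gvtrack.ls()
```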
library(misha)
gdb.init_examples()
# 1) pick scope
regions <- gintervals(1, 0, 50000)
# 2) inspect available tracks
print(gtrack.ls())
# 3) extract signal with a chosen iterator
chip <- gextract("dense_track", regions, iterator = 100)
# 4) screen high-signal bins (as a simple peak-like filter)
hi_chip <- gscreen("dense_track > 0.6", regions, iterator = 100)
# 5) summarize distribution/coverage
stats <- gsummary("dense_track", regions, iterator = 100)
A PWM/PSSM is a motif model over A/C/G/T. In misha, a common pattern is:
regions <- gintervals(1, c(1000, 2000), c(1020, 2020))
seqs <- gseq.extract(regions)
pssm <- matrix(c(
  0.80, 0.05, 0.10, 0.05,
  0.10, 0.10, 0.70, 0.10,
  0.05, 0.80, 0.05, 0.10,
  0.10, 0.10, 0.10, 0.70
), ncol = 4, byrow = TRUE)
colnames(pssm) <- c("A", "C", "G", "T")
scores <- gseq.pwm(seqs, pssm, mode = "lse")
If your database has motif files under pssms/, you can
create a genome-wide PWM-energy track with
gtrack.create_pwm_energy(...).
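To build intuition for what a PWM score aggregates, here is a base-R sketch that does not call misha. It assumes that "lse" stands for log-sum-exp over all motif start positions, scans the forward strand only, and uses a made-up sequence. The helpers window_scores and log_sum_exp are hypothetical names, and gseq.pwm may differ in details such as strand handling or priors.

```r
# Toy 4-position motif: rows are positions, columns are A/C/G/T probabilities.
pssm <- matrix(c(
  0.80, 0.05, 0.10, 0.05,
  0.10, 0.10, 0.70, 0.10,
  0.05, 0.80, 0.05, 0.10,
  0.10, 0.10, 0.10, 0.70
), ncol = 4, byrow = TRUE)
colnames(pssm) <- c("A", "C", "G", "T")

# Log-likelihood of the motif at every start position (forward strand only).
window_scores <- function(seq, pssm) {
  bases <- strsplit(toupper(seq), "")[[1]]
  w <- nrow(pssm)
  n_starts <- length(bases) - w + 1
  if (n_starts < 1) return(numeric(0))
  vapply(seq_len(n_starts), function(s) {
    win <- bases[s:(s + w - 1)]
    # Pick the probability of each observed base at its motif position.
    sum(log(pssm[cbind(seq_len(w), match(win, colnames(pssm)))]))
  }, numeric(1))
}

# Numerically stable log-sum-exp.
log_sum_exp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

ws <- window_scores("TTAGCTAA", pssm)  # made-up sequence containing AGCT
best <- max(ws)            # score of the single best window
total <- log_sum_exp(ws)   # soft aggregate over all windows
```

The log-sum-exp aggregate is never smaller than the best single window, and it grows when several windows match reasonably well, which is why it is often read as a binding "energy" rather than a single-site score.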