This is the location for the HTRX tool, which was first proposed by
Barrie, W., Yang, Y., Irving-Pease, E.K. et al. Elevated genetic risk for multiple sclerosis emerged in steppe pastoralist populations. Nature 625, 321–328 (2024).
and then described in detail by
Yang Y, Lawson DJ. HTRX: an R package for learning non-contiguous haplotypes associated with a phenotype. Bioinformatics Advances 3.1 (2023): vbad038.
Authors:
Yaoling Yang (yaoling.yang@bristol.ac.uk)
Daniel Lawson (dan.lawson@bristol.ac.uk)
License: GPL-3
Haplotype Trend Regression with eXtra flexibility (HTRX) searches a pre-defined set of single nucleotide polymorphisms (SNPs) for haplotype patterns, which include both single SNPs and non-contiguous haplotypes.
We search over all possible templates. A template assigns each SNP either a ‘0’ or ‘1’, reflecting whether the reference allele of that SNP is present or absent, or an ‘X’, meaning that either value is allowed.
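The template idea can be sketched in a few lines of base R. This is an illustration only, not the package's implementation: `enumerate_templates` and `matches_template` are hypothetical helper names, templates are written as strings such as "1X0", and the sketch assumes that the single excluded template is the all-‘X’ one, which constrains no SNP.

```r
# Enumerate every template over k SNPs (illustration only, not HTRX code).
enumerate_templates <- function(k) {
  # Each SNP position takes one of three symbols: "0", "1" or "X".
  grid <- expand.grid(rep(list(c("0", "1", "X")), k), stringsAsFactors = FALSE)
  templates <- unname(apply(grid, 1, paste0, collapse = ""))
  # Assumption: drop the all-"X" template, which places no constraint at all.
  templates[templates != strrep("X", k)]
}

# Check whether a 0/1 haplotype vector is compatible with a template string.
matches_template <- function(haplotype, template) {
  symbols <- strsplit(template, "")[[1]]
  constrained <- symbols != "X"
  all(as.character(haplotype[constrained]) == symbols[constrained])
}

templates <- enumerate_templates(3)
length(templates)                      # number of templates for 3 SNPs
matches_template(c(1, 0, 0), "1X0")    # SNP 2 is free, SNPs 1 and 3 must match
```

A template with ‘X’ in the middle, such as "1X0", is what makes a haplotype non-contiguous: the first and third SNPs are constrained while the second is ignored.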
We use a two-stage procedure to select the best HTRX model (function “do_cv”).
Stage 1: select candidate models;
Stage 2: select the best model using 10-fold cross-validation.
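As a rough illustration of the second stage, the base R sketch below scores a handful of candidate models by 10-fold cross-validated variance explained and keeps the best one. Everything in it is an assumption made for the example: the data are simulated, the candidate list simply stands in for whatever stage 1 would produce, and a linear model is used for simplicity. What “do_cv” does internally may differ.

```r
set.seed(1)
n <- 200

# Simulated data for illustration only; not one of the HTRX example datasets.
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1.5 * dat$x1 - dat$x2 + rnorm(n)

# Hypothetical candidate models, standing in for the output of stage 1.
candidates <- list(c("x1"), c("x1", "x2"), c("x1", "x2", "x3"))

# Randomly assign each row to one of 10 equally sized folds.
folds <- sample(rep(1:10, length.out = n))

# Out-of-sample variance explained for one candidate feature set.
cv_r2 <- function(features) {
  form <- reformulate(features, response = "y")
  pred <- numeric(n)
  for (f in 1:10) {
    fit <- lm(form, data = dat[folds != f, ])          # train on 9 folds
    pred[folds == f] <- predict(fit, newdata = dat[folds == f, ])  # test on 1
  }
  1 - sum((dat$y - pred)^2) / sum((dat$y - mean(dat$y))^2)
}

scores <- sapply(candidates, cv_r2)
best <- candidates[[which.max(scores)]]
best
```

Scoring on held-out folds, rather than on the data used for fitting, penalises candidates that only fit noise.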
Longer haplotypes are important for discovering interactions. However, there are \(3^k-1\) haplotypes in HTRX if the region contains \(k\) SNPs, which makes an exhaustive search unrealistic for regions with large numbers of SNPs. To address this issue, we proposed “cumulative HTRX” (function “do_cumulative_htrx”), which enables HTRX to run on longer haplotypes; we recommend it for haplotypes that include at least 7 SNPs. In addition, we provide a parameter “max_int” which controls the maximum number of SNPs that can interact.
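The growth in the number of haplotypes is easy to check numerically. The sketch below tabulates \(3^k-1\) and, under the assumption that capping interactions means allowing at most “max_int” SNPs per template to take a non-‘X’ value, counts how many templates would remain. Both helper names are hypothetical, and the package's exact treatment of “max_int” may differ.

```r
# Number of templates over k SNPs, using the 3^k - 1 formula from the text.
n_templates <- function(k) 3^k - 1

# Hypothetical helper: templates that constrain at most max_int SNPs.
# Choose which j SNPs are constrained (choose(k, j)), each set to 0 or 1 (2^j).
n_templates_capped <- function(k, max_int) {
  j <- seq_len(min(k, max_int))
  sum(choose(k, j) * 2^j)
}

# The count roughly triples with every additional SNP.
data.frame(k = 1:10, templates = n_templates(1:10))

# With no effective cap, the two counts agree.
n_templates_capped(8, 8) == n_templates(8)

# Capping at 3 interacting SNPs leaves far fewer templates for 8 SNPs.
n_templates_capped(8, 3)
```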
devtools::install_github("https://github.com/YaolingYang/HTRX")
This package is also available from CRAN. You can install it with
install.packages("HTRX")
A tutorial for the HTRX package can be found in vignettes/HTRX_vignette.pdf
library(HTRX)
## use dataset "example_hap1", "example_hap2" and "example_data_nosnp"
## "example_hap1" and "example_hap2" are both genomes of 8 SNPs for 5,000 individuals (diploid data)
## "example_data_nosnp" is an example dataset which contains the outcome (binary), sex, age and 18 PCs
## visualise the covariates data
head(HTRX::example_data_nosnp)
## visualise the genotype data for the first genome
head(HTRX::example_hap1)
## we perform HTRX on the first 4 SNPs
## we first generate all the haplotype data, as defined by HTRX
HTRX_matrix=make_htrx(HTRX::example_hap1[,1:4],HTRX::example_hap2[,1:4])
## If the data is haploid, please set
## HTRX_matrix=make_htrx(HTRX::example_hap1[,1:4],HTRX::example_hap1[,1:4])
## next compute the maximum number of independent features
featurecap=htrx_max(nsnp=4)
## then perform HTRX using 2-step cross-validation
## to compute additional variance explained by haplotypes
## If you want to compute total variance explained, please set gain=FALSE
htrx_results <- do_cv(HTRX::example_data_nosnp,
                      HTRX_matrix,train_proportion=0.5,
                      sim_times=3,featurecap=featurecap,usebinary=1,
                      method="stratified",criteria="BIC",
                      gain=TRUE,runparallel=FALSE)
## If we want to compute the total variance explained
## we can set gain=FALSE in the above example
## we perform cumulative HTRX on all the 8 SNPs using 2-step cross-validation
## to compute additional variance explained by haplotypes
## If the data is haploid, please set hap2=HTRX::example_hap1
## If you want to compute total variance explained, please set gain=FALSE
## For Linux/macOS users, we strongly encourage you to set runparallel=TRUE
cumu_htrx_results <- do_cumulative_htrx(data_nosnp=HTRX::example_data_nosnp,
                                        hap1=HTRX::example_hap1,
                                        hap2=HTRX::example_hap2,
                                        train_proportion=0.5,sim_times=1,
                                        featurecap=6,usebinary=1,
                                        randomorder=TRUE,method="stratified",
                                        criteria="BIC",runparallel=FALSE)