The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

DYNATE

The DYNATE package accompanies the paper “Localizing Rare-Variant Association Regions via Multiple Testing Embedded in an Aggregation Tree”. The package is designed to pinpoint the disease-associated rare variant sets with a controlled false discovery rate in case-control study.

library(DYNATE)

Here, we show how to apply DYNATE.

We require the input data is a data frame with a long format: each row is a rare variant (SNP) per sample. Specifically, the input data should contains 6 variables with name Sample.Name, Sample.Type, snpID, domainID, X1, X2. Where variables Sample.Name, snpID and domainID indicate the Sample ID, SNP ID, and domain ID, respectively; Variable Sample.Type indicates the case/control status of each sample; Variables X1 and X2 are covariates that could be considered in the analysis. The snp_dat below is a toy simulated data with 6 variables and 210,454 rows. The data contains 2,000 samples (1,000 cases and 1,000 controls). In total 16,281 SNPs reside in 2,000 domains are considered in snp_dat.

str(snp_dat)
#> 'data.frame':    54282 obs. of  6 variables:
#>  $ snpID      : num  1 1 1 1 1 1 2 2 2 2 ...
#>  $ Sample.Type: chr  "case" "case" "ctrl" "ctrl" ...
#>  $ Sample.Name: int  58 695 1373 1504 1898 1938 110 199 292 326 ...
#>  $ domainID   : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ X1         : num  0.38242 -0.46531 -0.23738 -0.0889 0.00982 ...
#>  $ X2         : int  1 1 1 0 1 1 1 0 1 0 ...

First, we set the tuning parameters as follows. Please refer to the paper for detailed tuning parameters selection procedure.

M <- 5 # leaf size
L <- 3 # layer number
alpha <- 0.05 # desired FDR

Second, we use Test_Leaf function to construct leaves and generate leaf P-values for the case-control study.

# Model consider covariates effect:
t1 <- Sys.time()
p_leaf <- Test_Leaf(snp_dat=snp_dat,thresh_val=M)
t2 <- Sys.time()
t2-t1
#> Time difference of 1.350046 secs

In the output data frames p_leaf, each row links to a rare variant (SNPs), and the number of rows equals the number of rare variants (SNPs) we considered (SNPs that link to a leaf with p-value=1 are excluded for maintaining the algorithm stability). The data frame includes 5 variables. In the data frame, variable L1 is leaf ID; variable pvals is the leaf level p values; variable Test indicates the name of the statistical test to generate the leaf level p values (FET or score).

str(p_leaf)
#> Classes 'data.table' and 'data.frame':   4214 obs. of  5 variables:
#>  $ L1      : int  1 2 3 3 4 5 5 6 7 8 ...
#>  $ pvals   : num  0.453 0.887 0.548 0.548 0.895 ...
#>  $ snpID   : num  1 2 3 4 5 6 7 8 9 10 ...
#>  $ domainID: int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ Test    : chr  "FET" "FET" "FET" "FET" ...
#>  - attr(*, ".internal.selfref")=<externalptr>

Finally, we use the function DYNATE to conduct dynamic and hierarchical testing based on the leaf level p values.

out <- DYNATE(struct_map=p_leaf,L=L,alpha=alpha)

In the output data frames out, each row links to a unique SNP that is detected by DYNATE. The variables snpID, L1, and domainID link to the detected SNP ID, leaf ID, and domain ID, respectively; Variable Test links to the name of the statistical test we applied (FET or score); Variable pvals1 links to the leaf level p-values; Variable Layer indicates in which layer the SNP is detected.

str(out)
#> tibble [9 × 6] (S3: tbl_df/tbl/data.frame)
#>  $ L1      : int [1:9] 205 1357 1357 1462 2085 2479 3178 546 547
#>  $ snpID   : num [1:9] 277 1811 1812 1957 2785 ...
#>  $ domainID: int [1:9] 20 168 168 181 312 376 493 66 66
#>  $ Test    : chr [1:9] "FET" "FET" "FET" "FET" ...
#>  $ pvals1  : num [1:9] 1.25e-08 4.76e-34 4.76e-34 6.65e-05 2.09e-13 ...
#>  $ Layer   : int [1:9] 1 1 1 1 1 1 1 2 2

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.