The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

diemr: Genome polarisation in R

CRAN version CRAN downloads Monthly downloads

diemr implements the Diagnostic Index Expectation Maximization (diem) algorithm for genome polarization in R. It estimates which alleles of single nucleotide variant (SNV) sites belong to either side of a barrier to gene flow, co-estimates individual assignment, and infers barrier strength and divergence. These tools are designed for studies of hybridization, speciation, and population divergence, and extend the methods described in Baird et al. (2023) Genome polarisation for detecting barriers to geneflow. Methods in Ecology and Evolution 14, 512-528 doi:10.1111/2041-210X.14010. For the original algorithm description and implementations in Python and Mathematica, see the diem repository at https://github.com/StuartJEBaird/diem.

Installation

To start using diemr, load the package or install it from CRAN if it is not yet available:

if(!require("diemr", character.only = TRUE)){
    install.packages("diemr", dependencies = TRUE)
    library("diemr", character.only = TRUE)
}
# Loading required package: diemr

Set working directory to a location with read and write privileges.

Check data format

Next, assemble paths to all files containing the data to be used by diemr. Here, we will use a tiny example dataset for illustration that is included in the package. A good practice is to check that all files contain data in correct format for all individuals and markers.

filepaths <- system.file("extdata", "data7x3.txt",
                         package = "diemr")
CheckDiemFormat(filepaths, ploidy = list(rep(2, 6)), ChosenInds = 1:6)
# File check passed: TRUE
# Ploidy check passed: TRUE

If the CheckDiemFormat() function fails, work through the error messages and fix the stored input files accordingly. The algorithm repeatedly accesses data from the harddisk, so seeing the passed file check prior to analysis is critical.

diem.res <- diem(files = filepaths,
                 ploidy = list(rep(2, 6)), 
                 ChosenInds = 1:6,
                 verbose = TRUE,
                 nCores = 1)

The results including marker polarisation, marker diagnostic index and its support will be included in the list element diem.res$DI. Additional elements in the results list contain basic tracking information about the expectation maximisation iterations, with details provided in the folder diagnostics in the working directory.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.