
Introduction to baseq

library(baseq)

Introduction

baseq is a basic sequence processing tool for biological data. It provides simple and efficient functions for common tasks in molecular biology, such as cleaning sequences, translating DNA/RNA to protein, and calculating GC content.

Sequence Cleaning

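You can clean DNA or RNA sequences by removing any non-standard characters. The universal clean_seq() function detects automatically whether the input is DNA or RNA. As the examples show, IUPAC ambiguity codes such as N, R, Y, M, and K are dropped, and lowercase bases are converted to uppercase.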

dna_seq <- "ATGCnNryMK"
clean_seq(dna_seq)
#> [1] "ATGC"

rna_seq <- "AUGGCuuNnRYMK"
clean_seq(rna_seq)
#> [1] "AUGGCUU"
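
To make the cleaning rule concrete, here is a minimal base R sketch that reproduces the two outputs above. It is not baseq's implementation: the function name clean_seq_sketch() is hypothetical, and the detection rule (treat the input as RNA when it contains U but no T) is an assumption made for illustration.

```r
# Hypothetical helper, not part of baseq: keep only the four standard bases.
clean_seq_sketch <- function(x) {
  x <- toupper(x)
  # Assumed detection rule: a sequence with U and no T is treated as RNA.
  is_rna <- grepl("U", x, fixed = TRUE) && !grepl("T", x, fixed = TRUE)
  allowed <- if (is_rna) "ACGU" else "ACGT"
  # Drop every character outside the allowed alphabet.
  gsub(paste0("[^", allowed, "]"), "", x)
}

clean_seq_sketch("ATGCnNryMK")
#> [1] "ATGC"
clean_seq_sketch("AUGGCuuNnRYMK")
#> [1] "AUGGCUU"
```

A sequence that mixes T and U falls through to the DNA branch in this sketch; how clean_seq() itself treats such input is not covered here.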

Translation

baseq can translate DNA and RNA sequences into protein sequences in all six reading frames.

dna_seq <- "ATCGAGCTAGCTAGCTAGCTAGCT"
proteins <- dna_to_protein(dna_seq)
proteins[["Frame F1"]]
#> [1] "IELAS"
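
The result above can be followed by hand: frame 1 reads ATC GAG CTA GCT AGC and then reaches the stop codon TAG. The base R sketch below illustrates six-frame translation with the standard genetic code under that reading. All function names and the frame labels F1 to R3 are hypothetical, and stopping at the first stop codon is an assumption chosen to match the output shown; baseq's own behavior may differ in detail.

```r
# Standard genetic code; codons are ordered with the third base varying fastest.
bases <- c("T", "C", "A", "G")
grid <- expand.grid(third = bases, second = bases, first = bases,
                    stringsAsFactors = FALSE)
codon_table <- strsplit(
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", ""
)[[1]]
names(codon_table) <- paste0(grid$first, grid$second, grid$third)

# Translate one frame; offset is 0, 1, or 2 bases from the start.
translate_frame_sketch <- function(dna, offset = 0) {
  dna <- chartr("U", "T", toupper(dna))        # accept RNA input as well
  n_codons <- (nchar(dna) - offset) %/% 3
  if (n_codons < 1) return("")
  starts <- offset + seq(1, by = 3, length.out = n_codons)
  aa <- unname(codon_table[substring(dna, starts, starts + 2)])
  aa[is.na(aa)] <- "X"                         # unknown codons become X
  stop_at <- which(aa == "*")
  # Assumption: translation ends at the first stop codon.
  if (length(stop_at) > 0) aa <- aa[seq_len(stop_at[1] - 1)]
  paste(aa, collapse = "")
}

# Reverse complement, needed for the three reverse frames.
reverse_complement_sketch <- function(dna) {
  comp <- chartr("ACGT", "TGCA", chartr("U", "T", toupper(dna)))
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

six_frames_sketch <- function(dna) {
  rc <- reverse_complement_sketch(dna)
  c(F1 = translate_frame_sketch(dna, 0),
    F2 = translate_frame_sketch(dna, 1),
    F3 = translate_frame_sketch(dna, 2),
    R1 = translate_frame_sketch(rc, 0),
    R2 = translate_frame_sketch(rc, 1),
    R3 = translate_frame_sketch(rc, 2))
}

frames <- six_frames_sketch("ATCGAGCTAGCTAGCTAGCTAGCT")
frames[["F1"]]
#> [1] "IELAS"
```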

GC Content

gc_content() returns the GC content of a DNA sequence as a percentage.

dna_seq <- "ATGCATGC"
gc_content(dna_seq)
#> [1] 50
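
The value above follows from the usual definition: the number of G and C bases divided by the sequence length, times 100. Here 4 of the 8 bases are G or C, which gives 50. A minimal base R sketch of that calculation, using the hypothetical name gc_content_sketch() and not taken from baseq's source:

```r
# Hypothetical helper, not part of baseq: percentage of G and C bases.
gc_content_sketch <- function(dna) {
  chars <- strsplit(toupper(dna), "")[[1]]
  if (length(chars) == 0) return(NA_real_)   # empty input has no GC content
  100 * sum(chars %in% c("G", "C")) / length(chars)
}

gc_content_sketch("ATGCATGC")
#> [1] 50
```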

Reading and Writing Files

baseq provides universal functions to read and write FASTA and FASTQ files.

# Read a FASTA file into a data frame
# df <- read_seq("path/to/file.fasta")

# Write a data frame to a FASTA file
# write_seq(df, "output.fasta")
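
For readers unfamiliar with the format, the base R sketch below shows what a FASTA round trip through a data frame can look like. It covers FASTA only, not FASTQ. The function names and the column names header and sequence are hypothetical; the columns that read_seq() actually returns are not documented here, so treat this as an illustration of the idea and not of baseq's data layout.

```r
# Hypothetical helper: parse a FASTA file into a data frame.
read_fasta_sketch <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]               # ignore blank lines
  is_header <- startsWith(lines, ">")
  record_id <- cumsum(is_header)              # which record each line belongs to
  headers <- sub("^>", "", lines[is_header])
  # Join the (possibly wrapped) sequence lines of each record.
  groups <- split(lines[!is_header],
                  factor(record_id[!is_header], levels = seq_along(headers)))
  seqs <- vapply(groups, paste, character(1), collapse = "")
  data.frame(header = headers, sequence = unname(seqs),
             stringsAsFactors = FALSE)
}

# Hypothetical helper: write one header line and one sequence line per record.
write_fasta_sketch <- function(df, path) {
  out <- as.vector(rbind(paste0(">", df$header), df$sequence))
  writeLines(out, path)
}

# Round trip through a temporary file.
tmp <- tempfile(fileext = ".fasta")
df <- data.frame(header = c("seq1", "seq2"),
                 sequence = c("ATGC", "GGCC"),
                 stringsAsFactors = FALSE)
write_fasta_sketch(df, tmp)
read_fasta_sketch(tmp)
```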

For more details, see the documentation for individual functions.
