Demo

library(pureseqtmr)
library(testthat)
library(knitr)

pureseqtmr is an R package for calling PureseqTM. PureseqTM predicts the topology of a membrane protein, that is, which parts of the protein are inside the membrane and which are not.

PureseqTM must be installed before it can be called:

pureseqtmr::install_pureseqtm()

Note that this code is not actually run, to comply with CRAN guidelines.

PureseqTM supplies some example files. Use get_example_filenames to get the full paths to all of these files:

if (is_pureseqtm_installed()) {
  get_example_filenames()
}

In this example, 1bhaA.fasta will be used. To obtain the full path, use get_example_filename. get_example_filename will give an error if the file is not found.

if (is_pureseqtm_installed()) {
  fasta_filename <- get_example_filename("1bhaA.fasta")
  head(readLines(fasta_filename))
}

Getting the topology of this protein:

if (is_pureseqtm_installed()) {
  topology <- predict_topology(fasta_filename)
  kable(topology)
}

Or show the topology as a plot:

if (is_pureseqtm_installed()) {
  plot_topology(topology)
}

The exact same code works for a full proteome. Here we use a pureseqtmr example file, the SARS-CoV-2 reference proteome, as downloaded from https://www.uniprot.org/proteomes/UP000464024.

fasta_filename <- system.file(
  "extdata",
  "UP000464024.fasta",
  package = "pureseqtmr"
)
expect_true(file.exists(fasta_filename))

Show the (top of the) proteome:

head(readLines(fasta_filename))
#> [1] ">sp|P0DTC7|NS7A_SARS2 Protein 7a OS=Severe acute respiratory syndrome coronavirus 2 OX=2697049 GN=7a PE=3 SV=1"                
#> [2] "MKIILFLALITLATCELYHYQECVRGTTVLLKEPCSSGTYEGNSPFHPLADNKFALTCFS"                                                                  
#> [3] "TQFAFACPDGVKHVYQLRARSVSPKLFIRQEEVQELYSPIFLIVAAIVFITLCFTLKRKT"                                                                  
#> [4] "E"                                                                                                                             
#> [5] ">sp|P0DTD1|R1AB_SARS2 Replicase polyprotein 1ab OS=Severe acute respiratory syndrome coronavirus 2 OX=2697049 GN=rep PE=1 SV=1"
#> [6] "MESLVPGFNEKTHVQLSLPVLQVRDVLVRGFGDSVEEVLSEARQHLKDGTCGLVEVEKGV"
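
To make the file layout more concrete, here is a minimal sketch that writes the first record shown above to a temporary file and parses it with base R. The helper read_fasta_records is hypothetical and not part of pureseqtmr. It assumes that the file starts with a header line, that header lines start with ">", and that every record has at least one sequence line.

```r
# Hypothetical helper, not part of pureseqtmr: parse a FASTA file into
# a data frame with one row per record.
read_fasta_records <- function(filename) {
  lines <- readLines(filename)
  lines <- lines[nzchar(lines)] # drop empty lines
  is_header <- startsWith(lines, ">")
  # Every header starts a new record; sequence lines inherit its index
  record_index <- cumsum(is_header)
  # Join the sequence lines that belong to the same record
  sequences <- tapply(
    lines[!is_header],
    record_index[!is_header],
    paste,
    collapse = ""
  )
  data.frame(
    name = sub("^>", "", lines[is_header]),
    sequence = as.vector(sequences),
    stringsAsFactors = FALSE
  )
}

# The first record of the proteome, copied from the output above
fasta_lines <- c(
  ">sp|P0DTC7|NS7A_SARS2 Protein 7a OS=Severe acute respiratory syndrome coronavirus 2 OX=2697049 GN=7a PE=3 SV=1",
  "MKIILFLALITLATCELYHYQECVRGTTVLLKEPCSSGTYEGNSPFHPLADNKFALTCFS",
  "TQFAFACPDGVKHVYQLRARSVSPKLFIRQEEVQELYSPIFLIVAAIVFITLCFTLKRKT",
  "E"
)
tmp_filename <- tempfile(fileext = ".fasta")
writeLines(fasta_lines, tmp_filename)

records <- read_fasta_records(tmp_filename)
nrow(records) # number of proteins in the file
nchar(records$sequence) # sequence length per protein
```

The three sequence lines are joined into one sequence, so a protein that spans multiple lines still counts as a single record.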

Getting the topology of this proteome:

if (is_pureseqtm_installed()) {
  topology <- predict_topology(fasta_filename)
}

Instead of directly showing the raw data, the protein names are shortened first:

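if (is_pureseqtm_installed()) {
  # Keep only the entry name between the second "|" and "_SARS2",
  # which is the first (and only) capture group of the pattern
  topology$name <- stringr::str_match(
    string = topology$name,
    pattern = "..\\|.*\\|(.*)_SARS2"
  )[, 2]
}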
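
As an illustration of what this pattern extracts, the sketch below applies it to the two header lines shown earlier, without the leading ">". It uses base R's regexec and regmatches instead of stringr::str_match so that it runs without extra packages. Whether topology$name holds names in exactly this form is an assumption.

```r
# Two names, copied from the FASTA headers shown earlier (assumed form)
protein_names <- c(
  "sp|P0DTC7|NS7A_SARS2 Protein 7a OS=Severe acute respiratory syndrome coronavirus 2 OX=2697049 GN=7a PE=3 SV=1",
  "sp|P0DTD1|R1AB_SARS2 Replicase polyprotein 1ab OS=Severe acute respiratory syndrome coronavirus 2 OX=2697049 GN=rep PE=1 SV=1"
)

# Same pattern as above: two characters, "|", the accession, "|",
# then the captured entry name, followed by "_SARS2"
pattern <- "..\\|.*\\|(.*)_SARS2"

# regmatches returns, per name, the full match followed by the capture group
matches <- regmatches(protein_names, regexec(pattern, protein_names))
short_names <- vapply(matches, function(m) m[2], character(1))
print(short_names)
#> [1] "NS7A" "R1AB"
```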

Show the topology as a plot:

if (is_pureseqtm_installed()) {
  plot_topology(topology)
}

And tally the number of transmembrane helices per protein:

if (is_pureseqtm_installed()) {
  kable(tally_tmhs(topology))
}
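
To give an idea of what such a tally involves, here is a minimal sketch that counts transmembrane helices in topology strings. It assumes that a topology is a string of 0s and 1s in which each uninterrupted run of 1s is one transmembrane helix. The function count_tmhs and the example data are hypothetical, and this is not necessarily how tally_tmhs is implemented.

```r
# Hypothetical helper, not part of pureseqtmr: count the runs of "1"
# in each topology string
count_tmhs <- function(topology) {
  runs <- gregexpr("1+", topology)
  # gregexpr returns -1 when there is no match at all
  n <- vapply(
    runs,
    function(m) if (m[1] == -1L) 0L else length(m),
    integer(1)
  )
  unname(n)
}

# Made-up topologies, for illustration only
example_topology <- data.frame(
  name = c("protein_a", "protein_b", "protein_c"),
  topology = c("0001111000", "0000000000", "1100111001"),
  stringsAsFactors = FALSE
)

example_tally <- data.frame(
  name = example_topology$name,
  n_tmhs = count_tmhs(example_topology$topology),
  stringsAsFactors = FALSE
)
print(example_tally)
```

A protein without any 1s in its topology gets a count of zero, so proteins that are not predicted to span the membrane remain in the table.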
