maldipickr

Quickstart

The {maldipickr} package helps microbiologists reduce duplicate/clonal bacteria from their cultures and, eventually, exclude previously selected bacteria. {maldipickr} achieves this by grouping together data from the MALDI Biotyper and helping to choose representative bacteria from each group using user-relevant metadata, a process known as cherry-picking.

{maldipickr} cherry-picks bacterial isolates with MALDI Biotyper:

using taxonomic identification report
using spectra data

Using taxonomic identification report

First, make sure {maldipickr} is installed and loaded; otherwise, follow the instructions to install the package.

Cherry-picking four isolates based on their taxonomic identification by the MALDI Biotyper is done in a few steps with {maldipickr}.

Get example data

We import an example Biotyper CSV report and glimpse at the table.

report_tbl <- read_biotyper_report(
  system.file("biotyper_unknown.csv", package = "maldipickr")
)
report_tbl %>%
  dplyr::select(name, bruker_species, bruker_log) %>%
  knitr::kable()

name	bruker_species	bruker_log
unknown_isolate_1	not reliable identification	1.33
unknown_isolate_2	not reliable identification	1.40
unknown_isolate_3	Faecalibacterium prausnitzii	1.96
unknown_isolate_4	Faecalibacterium prausnitzii	2.07

Delineate clusters and cherry-pick

Delineate clusters from the identifications after keeping only the reliable ones, then cherry-pick one representative spectrum per cluster.

Identifications with a log-score below 2 are considered unreliable here and are replaced by “not reliable identification”. Note that isolates sharing this label do not necessarily represent the same isolate, so they are not clustered together, as the results below show.

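# Keep the species name only when the log-score is at least 2;
# otherwise label the identification as not reliable.
report_tbl <- report_tbl %>%
  dplyr::mutate(
    bruker_species = dplyr::if_else(bruker_log >= 2, bruker_species,
                                    "not reliable identification")
  )
knitr::kable(report_tbl)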

name	sample_name	hit_rank	bruker_quality	bruker_species	bruker_taxid	bruker_hash	bruker_log
unknown_isolate_1	NA	1	-	not reliable identification	NA	3e920566-2734-43dd-85d0-66cf23a2d6ef	1.33
unknown_isolate_2	NA	1	-	not reliable identification	NA	88a85875-eeb5-4858-966e-98a077325dc3	1.40
unknown_isolate_3	NA	1	+	not reliable identification	137408536	2d266f20-5428-428d-96ec-ddd40200794b	1.96
unknown_isolate_4	NA	1	+++	Faecalibacterium prausnitzii	137408536	2d266f20-5428-428d-96ec-ddd40200794b	2.07

The chosen isolates are indicated by the to_pick column.

report_tbl %>%
  delineate_with_identification() %>%
  pick_spectra(report_tbl, criteria_column = "bruker_log") %>%
  dplyr::relocate(name, to_pick, bruker_species) %>% 
  knitr::kable()
#> Generating clusters from single report

name	to_pick	bruker_species	membership	cluster_size	sample_name	hit_rank	bruker_quality	bruker_taxid	bruker_hash	bruker_log
unknown_isolate_1	TRUE	not reliable identification	2	1	NA	1	-	NA	3e920566-2734-43dd-85d0-66cf23a2d6ef	1.33
unknown_isolate_2	TRUE	not reliable identification	3	1	NA	1	-	NA	88a85875-eeb5-4858-966e-98a077325dc3	1.40
unknown_isolate_3	TRUE	not reliable identification	4	1	NA	1	+	137408536	2d266f20-5428-428d-96ec-ddd40200794b	1.96
unknown_isolate_4	TRUE	Faecalibacterium prausnitzii	1	1	NA	1	+++	137408536	2d266f20-5428-428d-96ec-ddd40200794b	2.07
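
The logic behind the table above can be illustrated with a minimal base R sketch. It is not the implementation of delineate_with_identification() or pick_spectra(); it only assumes that isolates with the same reliable species share a cluster, that each unreliable isolate forms its own cluster, and that the isolate with the highest value in the criteria column is picked. The data frame and the cluster_key helper are hypothetical and typed in by hand from the example report, and the cluster numbering may differ from the package output.

```r
# Toy copy of the filtered report (values taken from the table above)
report <- data.frame(
  name = paste0("unknown_isolate_", 1:4),
  bruker_species = c(
    "not reliable identification",
    "not reliable identification",
    "not reliable identification",
    "Faecalibacterium prausnitzii"
  ),
  bruker_log = c(1.33, 1.40, 1.96, 2.07),
  stringsAsFactors = FALSE
)

# Assumption: unreliable isolates must not be grouped together,
# so each one gets a unique key; reliable ones are keyed by species.
unreliable <- report$bruker_species == "not reliable identification"
cluster_key <- ifelse(
  unreliable,
  paste("unreliable", report$name),
  report$bruker_species
)

# Number the clusters in order of first appearance
report$membership <- match(cluster_key, unique(cluster_key))
report$cluster_size <- tabulate(report$membership)[report$membership]

# Assumption: within each cluster, pick the isolate with the highest log-score
best_log <- ave(report$bruker_log, report$membership, FUN = max)
report$to_pick <- report$bruker_log == best_log

print(report[, c("name", "to_pick", "bruker_species", "membership", "cluster_size")])
```

Because every cluster contains a single isolate in this example, all four isolates end up picked, which matches the to_pick column shown above.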

Using spectra data

In parallel to taxonomic identification reports, {maldipickr} processes spectra data. Make sure {maldipickr} is installed and loaded; otherwise, follow the instructions to install the package.

Cherry-picking six isolates from three species based on their spectra data obtained from the MALDI Biotyper is done in a few steps with {maldipickr}.

Get example data

We set up the directory location of our example spectra data; adjust it to your requirements. We then import and process the spectra, which gives us a named list of three objects: spectra, peaks and metadata (more details in the Value section of process_spectra()).

spectra_dir <- system.file("toy-species-spectra", package = "maldipickr")

processed <- spectra_dir %>%
  import_biotyper_spectra() %>%
  process_spectra()

Delineate clusters and cherry-pick

Delineate spectra clusters using cosine similarity and cherry-pick one representative spectrum per cluster. The chosen isolates are indicated by the to_pick column.

processed %>%
  list() %>%
  merge_processed_spectra() %>%
  coop::tcosine() %>%
  delineate_with_similarity(threshold = 0.92) %>%
  set_reference_spectra(processed$metadata) %>%
  pick_spectra() %>%
  dplyr::relocate(name, to_pick) %>% 
  knitr::kable()

name	to_pick	membership	cluster_size	SNR	peaks	is_reference
species1_G2	FALSE	1	4	5.089590	21	FALSE
species2_E11	FALSE	2	2	5.543735	22	FALSE
species2_E12	TRUE	2	2	5.633540	23	TRUE
species3_F7	FALSE	1	4	4.889949	26	FALSE
species3_F8	TRUE	1	4	5.558884	25	TRUE
species3_F9	FALSE	1	4	5.398429	25	FALSE
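
To make the similarity step more concrete, here is a small base R sketch of what a row-wise cosine similarity followed by a threshold amounts to. The toy intensity matrix is hypothetical, and the single-linkage grouping is a stand-in chosen for illustration; it is not necessarily how delineate_with_similarity() groups spectra internally.

```r
# Hypothetical toy matrix: one row per spectrum, one column per peak bin
intensities <- rbind(
  a = c(1, 2, 0, 4),
  b = c(2, 4, 0, 8), # same profile as "a", twice as intense
  c = c(0, 1, 5, 0)  # a different profile
)

# Cosine similarity between rows: dot products divided by the norms
row_norms <- sqrt(rowSums(intensities^2))
cosine_sim <- tcrossprod(intensities) / outer(row_norms, row_norms)

# Same threshold as in the pipeline above
threshold <- 0.92

# Stand-in grouping: single-linkage clustering cut at 1 - threshold,
# so spectra linked by a similarity >= threshold share a cluster
distances <- as.dist(pmax(1 - cosine_sim, 0))
membership <- cutree(hclust(distances, method = "single"), h = 1 - threshold)

print(round(cosine_sim, 3))
print(membership)
```

Cosine similarity ignores overall intensity, so spectra "a" and "b" are grouped together while "c" stays on its own.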

This provides only a brief overview of the features of {maldipickr}; browse the other vignettes to learn more about additional features.

Session information

sessionInfo()
#> R version 4.3.1 (2023-06-16)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04.6 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=C              
#>  [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Europe/Berlin
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] maldipickr_1.3.1
#> 
#> loaded via a namespace (and not attached):
#>  [1] vctrs_0.6.4              cli_3.6.1                knitr_1.48              
#>  [4] rlang_1.1.4              xfun_0.44                coop_0.6-3              
#>  [7] purrr_1.0.2              generics_0.1.3           jsonlite_1.8.7          
#> [10] glue_1.6.2               htmltools_0.5.6.1        sass_0.4.7              
#> [13] fansi_1.0.5              rmarkdown_2.28           tibble_3.2.1            
#> [16] evaluate_0.22            jquerylib_0.1.4          fastmap_1.1.1           
#> [19] yaml_2.3.7               lifecycle_1.0.4          compiler_4.3.1          
#> [22] dplyr_1.1.4              pkgconfig_2.0.3          tidyr_1.3.0             
#> [25] readBrukerFlexData_1.9.1 rstudioapi_0.15.0        digest_0.6.33           
#> [28] R6_2.5.1                 tidyselect_1.2.1         utf8_1.2.3              
#> [31] pillar_1.9.0             parallel_4.3.1           magrittr_2.0.3          
#> [34] bslib_0.5.1              withr_2.5.1              tools_4.3.1             
#> [37] MALDIquant_1.22.1        cachem_1.0.8
