Type: Package
Title: Spectral Entropy for Mass Spectrometry Data
Version: 0.1.4
Date: 2023-08-07
Description: Clean the MS/MS spectrum, calculate spectral entropy, unweighted entropy similarity, and entropy similarity for mass spectrometry data. The entropy similarity is a novel similarity measure for MS/MS spectra which outperforms the widely used dot product similarity in compound identification. For more details, please refer to the paper: Yuanyue Li et al. (2021) "Spectral entropy outperforms MS/MS dot product similarity for small-molecule compound identification" <doi:10.1038/s41592-021-01331-z>.
License: Apache License (== 2.0)
Depends: R (≥ 3.5.0), Rcpp (≥ 1.0.10)
Suggests: testthat
LinkingTo: Rcpp
RoxygenNote: 7.2.3
Encoding: UTF-8
URL: https://github.com/YuanyueLi/MSEntropy
NeedsCompilation: yes
Packaged: 2023-08-07 22:58:36 UTC; yli
Author: Yuanyue Li [aut, cre]
Maintainer: Yuanyue Li <liyuanyue@gmail.com>
Repository: CRAN
Date/Publication: 2023-08-07 23:10:02 UTC

Entropy similarity between two spectra

Description

Calculate the entropy similarity between two spectra

Usage

calculate_entropy_similarity(
  peaks_a,
  peaks_b,
  ms2_tolerance_in_da,
  ms2_tolerance_in_ppm,
  clean_spectra,
  min_mz,
  max_mz,
  noise_threshold,
  max_peak_num
)

Arguments

peaks_a

A matrix of spectral peaks, with two columns: mz and intensity

peaks_b

A matrix of spectral peaks, with two columns: mz and intensity

ms2_tolerance_in_da

The MS2 tolerance in Da, set to -1 to disable

ms2_tolerance_in_ppm

The MS2 tolerance in ppm, set to -1 to disable

clean_spectra

Whether to clean the spectra before calculating the entropy similarity, see clean_spectrum

min_mz

The minimum mz value to keep, set to -1 to disable

max_mz

The maximum mz value to keep, set to -1 to disable

noise_threshold

The noise threshold, set to -1 to disable. All peaks with intensity < noise_threshold * max_intensity will be removed.

max_peak_num

The maximum number of peaks to keep, set to -1 to disable

Value

The entropy similarity

Examples

mz_a <- c(169.071, 186.066, 186.0769)
intensity_a <- c(7.917962, 1.021589, 100.0)
mz_b <- c(120.212, 169.071, 186.066)
intensity_b <- c(37.16, 66.83, 999.0)
peaks_a <- matrix(c(mz_a, intensity_a), ncol = 2, byrow = FALSE)
peaks_b <- matrix(c(mz_b, intensity_b), ncol = 2, byrow = FALSE)
calculate_entropy_similarity(peaks_a, peaks_b,
                             ms2_tolerance_in_da = 0.02, ms2_tolerance_in_ppm = -1,
                             clean_spectra = TRUE, min_mz = 0, max_mz = 1000,
                             noise_threshold = 0.01,
                             max_peak_num = 100)
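
What separates this function from the unweighted variant is, following Li et al. (2021), an entropy-dependent re-weighting of the intensities before the spectra are compared. The base R sketch below illustrates that weighting only. The helper name sketch_weight_intensity is hypothetical, the rule (exponent 0.25 + 0.25 * S when the spectral entropy S is below 3) is taken from the cited paper as an assumption about what the package does internally, and the input is assumed to contain strictly positive intensities.

```r
# Hypothetical helper, not part of the package API.
# Assumes every intensity is strictly positive.
sketch_weight_intensity <- function(intensity) {
  p <- intensity / sum(intensity)   # normalize so the intensities sum to 1
  s <- -sum(p * log(p))             # spectral entropy, natural logarithm
  if (s < 3) {
    # Assumed rule from Li et al. (2021): spectra with low entropy are
    # flattened by raising the intensities to a power below 1.
    p <- p^(0.25 + 0.25 * s)
    p <- p / sum(p)                 # renormalize after weighting
  }
  p
}

# Intensities of peaks_a from the example above.
sketch_weight_intensity(c(7.917962, 1.021589, 100.0))
```

The weighted intensities still sum to 1, but the dominant peak takes a smaller share than before, so minor fragment peaks contribute more to the comparison.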


Calculate spectral entropy of a spectrum

Description

Calculate spectral entropy of a spectrum

Usage

calculate_spectral_entropy(peaks)

Arguments

peaks

A matrix of peaks, with two columns: m/z and intensity.

Value

A double value of spectral entropy.

Examples

mz <- c(100.212, 300.321, 535.325)
intensity <- c(37.16, 66.83, 999.0)
peaks <- matrix(c(mz, intensity), ncol = 2, byrow = FALSE)
calculate_spectral_entropy(peaks)
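
For orientation, spectral entropy in Li et al. (2021) is the Shannon entropy of the intensities after they are normalized to sum to 1. The base R sketch below computes that quantity. The helper name sketch_spectral_entropy is hypothetical, and the use of the natural logarithm and the dropping of non-positive intensities are assumptions, so results may differ from calculate_spectral_entropy() if the package preprocesses the peaks differently.

```r
# Hypothetical helper, not part of the package API.
sketch_spectral_entropy <- function(peaks) {
  intensity <- peaks[, 2]
  intensity <- intensity[intensity > 0]  # assumed: ignore empty peaks
  p <- intensity / sum(intensity)        # normalize to a probability vector
  -sum(p * log(p))                       # Shannon entropy, natural logarithm
}

mz <- c(100.212, 300.321, 535.325)
intensity <- c(37.16, 66.83, 999.0)
peaks <- matrix(c(mz, intensity), ncol = 2, byrow = FALSE)
sketch_spectral_entropy(peaks)
```

A spectrum with a single peak has entropy 0, and a spectrum with n equally intense peaks has entropy log(n).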


Unweighted entropy similarity between two spectra

Description

Calculate the unweighted entropy similarity between two spectra

Usage

calculate_unweighted_entropy_similarity(
  peaks_a,
  peaks_b,
  ms2_tolerance_in_da,
  ms2_tolerance_in_ppm,
  clean_spectra,
  min_mz,
  max_mz,
  noise_threshold,
  max_peak_num
)

Arguments

peaks_a

A matrix of spectral peaks, with two columns: mz and intensity

peaks_b

A matrix of spectral peaks, with two columns: mz and intensity

ms2_tolerance_in_da

The MS2 tolerance in Da, set to -1 to disable

ms2_tolerance_in_ppm

The MS2 tolerance in ppm, set to -1 to disable

clean_spectra

Whether to clean the spectra before calculating the entropy similarity, see clean_spectrum

min_mz

The minimum mz value to keep, set to -1 to disable

max_mz

The maximum mz value to keep, set to -1 to disable

noise_threshold

The noise threshold, set to -1 to disable. All peaks with intensity < noise_threshold * max_intensity will be removed.

max_peak_num

The maximum number of peaks to keep, set to -1 to disable

Value

The unweighted entropy similarity

Examples

mz_a <- c(169.071, 186.066, 186.0769)
intensity_a <- c(7.917962, 1.021589, 100.0)
mz_b <- c(120.212, 169.071, 186.066)
intensity_b <- c(37.16, 66.83, 999.0)
peaks_a <- matrix(c(mz_a, intensity_a), ncol = 2, byrow = FALSE)
peaks_b <- matrix(c(mz_b, intensity_b), ncol = 2, byrow = FALSE)
calculate_unweighted_entropy_similarity(peaks_a, peaks_b,
                                       ms2_tolerance_in_da = 0.02, ms2_tolerance_in_ppm = -1,
                                       clean_spectra = TRUE, min_mz = 0, max_mz = 1000,
                                       noise_threshold = 0.01,
                                       max_peak_num = 100)
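
The quantity behind this function can be sketched in base R once peak matching is taken out of the picture. Following Li et al. (2021), the unweighted entropy similarity is 1 - (2 * S_AB - S_A - S_B) / log(4), where S_A and S_B are the spectral entropies of the two spectra and S_AB is the entropy of their 1:1 mixture. The sketch assumes the two spectra have already been aligned onto a shared m/z grid (equal-length intensity vectors, 0 where a peak is absent), which the package handles through the tolerance arguments. The helper names are hypothetical.

```r
# Hypothetical helpers, not part of the package API.
sketch_entropy <- function(x) {
  x <- x[x > 0]          # zero entries contribute nothing to the entropy
  p <- x / sum(x)
  -sum(p * log(p))
}

# a, b: intensity vectors on the same m/z grid (assumed pre-aligned).
sketch_unweighted_similarity <- function(a, b) {
  a <- a / sum(a)
  b <- b / sum(b)
  mix <- (a + b) / 2     # 1:1 mixture of the two normalized spectra
  1 - (2 * sketch_entropy(mix) - sketch_entropy(a) - sketch_entropy(b)) / log(4)
}

# Illustrative, made-up intensities on a four-position grid.
a <- c(0, 8, 1, 100)
b <- c(37, 67, 999, 0)
sketch_unweighted_similarity(a, b)
```

Under this definition identical spectra score 1 and spectra with no shared peaks score 0.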


Clean a spectrum

Description

Clean a spectrum

This function cleans the peaks in the following steps:

1. Remove empty peaks (mz <= 0 or intensity <= 0).
2. Remove peaks with mz >= max_mz or mz < min_mz.
3. Centroid the spectrum by merging peaks within min_ms2_difference_in_da or min_ms2_difference_in_ppm.
4. Remove peaks with intensity < noise_threshold * max_intensity.
5. Keep only the top max_peak_num peaks.
6. Normalize the intensity to sum to 1.

Note: Only one of min_ms2_difference_in_da and min_ms2_difference_in_ppm should be positive.

Usage

clean_spectrum(
  peaks,
  min_mz,
  max_mz,
  noise_threshold,
  min_ms2_difference_in_da,
  min_ms2_difference_in_ppm,
  max_peak_num,
  normalize_intensity
)

Arguments

peaks

A matrix of spectral peaks, with two columns: mz and intensity

min_mz

The minimum mz value to keep, set to -1 to disable

max_mz

The maximum mz value to keep, set to -1 to disable

noise_threshold

The noise threshold, set to -1 to disable. All peaks with intensity < noise_threshold * max_intensity will be removed.

min_ms2_difference_in_da

The minimum mz difference in Da to merge peaks, set to -1 to disable. Any two peaks with mz difference < min_ms2_difference_in_da will be merged.

min_ms2_difference_in_ppm

The minimum mz difference in ppm to merge peaks, set to -1 to disable. Any two peaks with mz difference < min_ms2_difference_in_ppm will be merged.

max_peak_num

The maximum number of peaks to keep, set to -1 to disable

normalize_intensity

Whether to normalize the intensity to sum to 1

Value

A matrix of spectral peaks, with two columns: mz and intensity

Examples

mz <- c(100.212, 169.071, 169.078, 300.321)
intensity <- c(0.3716, 7.917962, 100., 66.83)
peaks <- matrix(c(mz, intensity), ncol = 2, byrow = FALSE)
clean_spectrum(peaks, min_mz = 0, max_mz = 1000, noise_threshold = 0.01,
               min_ms2_difference_in_da = 0.02, min_ms2_difference_in_ppm = -1,
               max_peak_num = 100, normalize_intensity = TRUE)
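
The filtering and normalization steps listed in the Description can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: it omits the centroiding step (merging nearby peaks) entirely, the helper name sketch_clean_spectrum is hypothetical, and treating any non-positive argument as "disabled" is an assumption based on the "set to -1 to disable" convention.

```r
# Hypothetical helper, not part of the package API. Centroiding is omitted.
sketch_clean_spectrum <- function(peaks, min_mz = -1, max_mz = -1,
                                  noise_threshold = -1, max_peak_num = -1) {
  mz <- peaks[, 1]
  intensity <- peaks[, 2]
  # Remove empty peaks.
  keep <- mz > 0 & intensity > 0
  # Keep min_mz <= mz < max_mz; a non-positive bound is treated as disabled.
  if (min_mz > 0) keep <- keep & mz >= min_mz
  if (max_mz > 0) keep <- keep & mz < max_mz
  peaks <- peaks[keep, , drop = FALSE]
  # Remove peaks below noise_threshold * max_intensity.
  if (noise_threshold > 0 && nrow(peaks) > 0) {
    cutoff <- noise_threshold * max(peaks[, 2])
    peaks <- peaks[peaks[, 2] >= cutoff, , drop = FALSE]
  }
  # Keep the max_peak_num most intense peaks, preserving the row order.
  if (max_peak_num > 0 && nrow(peaks) > max_peak_num) {
    top <- order(peaks[, 2], decreasing = TRUE)[seq_len(max_peak_num)]
    peaks <- peaks[sort(top), , drop = FALSE]
  }
  # Normalize the intensities to sum to 1.
  if (nrow(peaks) > 0) peaks[, 2] <- peaks[, 2] / sum(peaks[, 2])
  peaks
}

# Made-up spectrum: one peak below the noise cutoff, one above max_mz.
mz <- c(100.212, 169.071, 300.321, 1200.5)
intensity <- c(0.3716, 7.917962, 100, 66.83)
peaks <- matrix(c(mz, intensity), ncol = 2, byrow = FALSE)
cleaned <- sketch_clean_spectrum(peaks, min_mz = 0, max_mz = 1000,
                                 noise_threshold = 0.01, max_peak_num = 100)
cleaned
```

In this made-up spectrum the peak at 1200.5 is dropped by the max_mz bound and the peak at 100.212 falls below 1% of the maximum intensity, so two peaks remain.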


Calculate spectral entropy similarity between two spectra

Description

msentropy_similarity calculates the spectral entropy similarity between two spectra (Li et al. 2021). It is a wrapper function that defines defaults for the parameters and calls calculate_entropy_similarity() or calculate_unweighted_entropy_similarity() to perform the calculation.

Usage

msentropy_similarity(
  peaks_a,
  peaks_b,
  ms2_tolerance_in_da = 0.02,
  ms2_tolerance_in_ppm = -1,
  clean_spectra = TRUE,
  min_mz = 0,
  max_mz = 1000,
  noise_threshold = 0.01,
  max_peak_num = 100,
  weighted = TRUE,
  ...
)

Arguments

peaks_a

A two-column numeric matrix with the m/z and intensity values for peaks of one spectrum.

peaks_b

A two-column numeric matrix with the m/z and intensity values for peaks of the other spectrum.

ms2_tolerance_in_da

The MS2 tolerance in Da, set to -1 to disable. Defaults to ms2_tolerance_in_da = 0.02.

ms2_tolerance_in_ppm

The MS2 tolerance in ppm, set to -1 to disable. Defaults to ms2_tolerance_in_ppm = -1.

clean_spectra

Whether to clean the spectra before calculating the entropy similarity, see clean_spectrum().

min_mz

The minimum mz value to keep, set to -1 to disable. Defaults to min_mz = 0.

max_mz

The maximum mz value to keep, set to -1 to disable. Defaults to max_mz = 1000.

noise_threshold

The noise threshold, set to -1 to disable. All peaks with intensity < noise_threshold * max_intensity will be removed. Defaults to noise_threshold = 0.01; thus, by default, all peaks with an intensity less than 1% of the maximum intensity of a spectrum will be removed.

max_peak_num

The maximum number of peaks to keep, set to -1 to disable. Defaults to max_peak_num = 100.

weighted

logical(1) whether the weighted or unweighted entropy similarity should be calculated. Defaults to weighted = TRUE, thus calculate_entropy_similarity() is used for the calculation. For weighted = FALSE calculate_unweighted_entropy_similarity() is used instead.

...

Optional additional parameters (currently ignored)

Value

The entropy similarity

References

Li, Y., Kind, T., Folz, J. et al. (2021) Spectral entropy outperforms MS/MS dot product similarity for small-molecule compound identification. Nat Methods 18, 1524-1531. doi: 10.1038/s41592-021-01331-z.

Examples


peaks_a <- cbind(mz = c(169.071, 186.066, 186.0769),
    intensity = c(7.917962, 1.021589, 100.0))
peaks_b <- cbind(mz = c(120.212, 169.071, 186.066),
    intensity = c(37.16, 66.83, 999.0))
msentropy_similarity(peaks_a, peaks_b, ms2_tolerance_in_da = 0.02)
