The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
library(PnT)
library(plotly)
<- readRDS('../inst/extdata/mylar_data.rds') mylar
This data represents a spectrum of light released by analyzing the elements in a sample. The first peak represents hydrogen in the sample, and the second peak represents other elements.
plot_ly(mylar, x = ~Ch, y = ~CPS,
type = 'scatter', mode = 'lines')
To find these peaks, we can just use the following function:
find_peaks(mylar, 'Ch', 'CPS')
#> # A tibble: 2 × 2
#> X Y
#> <int> <dbl>
#> 1 217 1352
#> 2 323 3000
Next, let’s see how to find peaks in more noisy data
<- seq(0, 10, 0.01)
X <- sin(X) + runif(length(X), min=-0.1, max=0.1)
Y <- data.frame(X=X, Y=Y)
data plot_ly(data, x = ~X, y = ~Y,
type = 'scatter', mode = 'lines')
To find the peaks, we can just use the same function
find_peaks(data, 'X', 'Y')
#> # A tibble: 2 × 2
#> X Y
#> <dbl> <dbl>
#> 1 1.55 1.10
#> 2 7.73 1.09
Now, if we have even more noisy data, the peak finder might need better filter parameters.
<- Y + runif(length(X), min=-0.1, max=0.1)
Y <- data.frame(X=X, Y=Y)
data find_peaks(data, 'X', 'Y')
#> # A tibble: 38 × 2
#> X Y
#> <dbl> <dbl>
#> 1 0.31 0.435
#> 2 0.71 0.830
#> 3 0.82 0.880
#> 4 1.26 1.07
#> 5 1.33 1.11
#> 6 1.42 1.13
#> 7 1.65 1.17
#> 8 1.9 1.08
#> 9 1.99 1.02
#> 10 2.22 0.902
#> # ℹ 28 more rows
This data has too much noise for the filter. So, we can increase the amount of noise blocked by the filter using the parameter minYerror. This parameter has default value 0.1.
find_peaks(data, 'X', 'Y', minYerror=0.2)
#> # A tibble: 2 × 2
#> X Y
#> <dbl> <dbl>
#> 1 1.65 1.17
#> 2 7.65 1.14
Another possibility is that we only want the thin, sharp peaks.
<- (7/(5+(X-5)^2)) + (0.05/(0.02+(X-8)^2)) + (0.03/(0.03+(X-1)^2))
Y <- data.frame(X=X, Y=Y)
data plot_ly(data, x = ~X, y = ~Y,
type = 'scatter', mode = 'lines')
For example, say we only want the sharp peaks (the first and last), not the wide peak in the middle.
find_peaks(data, 'X', 'Y')
#> # A tibble: 3 × 2
#> X Y
#> <dbl> <dbl>
#> 1 1 1.33
#> 2 5 1.41
#> 3 8 3.00
The usual function call does not only get those peaks. Instead, we use the parameter minSlope, which controls a filter testing the slope around peaks relative to the dimensions of the data. This argument is 0 by default. Increasing minSlope allows us to filter out smooth peaks in favor of sharp ones.
find_peaks(data, 'X', 'Y', minSlope = 1)
#> # A tibble: 2 × 2
#> X Y
#> <dbl> <dbl>
#> 1 1 1.33
#> 2 8 3.00
If, instead, the outer peaks are just edge effects, and we only want the middle peak, we can use edgeFilter. In this case, setting edgeFilter to 0.25 filters out the first and last quarter (0.25) of the data.
find_peaks(data, 'X', 'Y', edgeFilter = 0.25)
#> # A tibble: 1 × 2
#> X Y
#> <dbl> <dbl>
#> 1 5 1.41
<- function(figure){
logy layout(figure, yaxis=list(type="log"))
}
Here, we see some real world examples where this package can be used.
Here, we have a data set to analyze: air quality data from a school.
<- plot_ly(school_data, x=~Energy, y=~Counts, type="scatter", mode="lines"); fig fig
To find the prominent peaks, we can use the basic function call:
find_peaks(school_data, 'Energy', 'Counts')
#> # A tibble: 6 × 2
#> X Y
#> <dbl> <dbl>
#> 1 -0.0022 10642
#> 2 0.278 12839
#> 3 0.528 7317
#> 4 1.03 1748
#> 5 1.49 2128
#> 6 1.75 5361
However, if we plot the data using a log plot, we see that many of the other peaks are not noise.
logy(fig)
The main example of this is the peak at x = 6.4 keV. In the previous
plot, this peak was barely visible, but in this one, it is obviously not
just noise. By hovering over the peak with your mouse, you can see that
this peak has a height of about 60 from the rest of the graph. Using the
argument asFraction
in conjunction with
minYerror
, you can make sure this is the smallest peak
found.
find_peaks(school_data, 'Energy', 'Counts', asFraction = FALSE, minYerror = 60)
#> # A tibble: 13 × 2
#> X Y
#> <dbl> <dbl>
#> 1 -0.0022 10642
#> 2 0.278 12839
#> 3 0.528 7317
#> 4 0.718 595
#> 5 1.03 1748
#> 6 1.26 785
#> 7 1.49 2128
#> 8 1.75 5361
#> 9 2.31 494
#> 10 2.63 208
#> 11 3.31 362
#> 12 3.69 242
#> 13 6.40 81
In this example, setting asFraction = FALSE
allows you
to input the height of 60 you measured for the peak directly into the
parameter minYerror
without having to divide it by the
total height of the data.
<- plot_ly(jadeite_control, x=~Ch, y=~CPS, type="scatter", mode="lines"); fig fig
Again, we need to use a log plot to see the smaller peaks clearly.
logy(fig)
We can see that the small peak at x = 1000 has a height of about 400. So, we can try what we did before to find the peaks of the graph.
find_peaks(jadeite_control, 'Ch', 'CPS', asFraction = FALSE, minYerror = 400)
#> # A tibble: 19 × 2
#> X Y
#> <dbl> <dbl>
#> 1 129 32702
#> 2 155 45298
#> 3 184 382405
#> 4 214 1048170
#> 5 285 7100
#> 6 323 8754
#> 7 366 9347
#> 8 403 27282
#> 9 425 24060
#> 10 452 58156
#> 11 491 12029
#> 12 551 9306
#> 13 608 4596
#> 14 662 10285
#> 15 722 3632
#> 16 781 26212
#> 17 862 4895
#> 18 915 1601
#> 19 990 1462
However, what if the peaks at 285, 323, and 366 are actually noise? Say you want to filter out these peaks, and other small peaks in the area where most of the peaks are larger. In this case, we have to break up the peak finding into two separate ranges. First, we find the peaks in the range where the peaks are smaller:
<- find_peaks(jadeite_control, 'Ch', 'CPS', asFraction = FALSE, minYerror = 400, ROI = c(500, 1050))
smallerPeaks
smallerPeaks#> # A tibble: 8 × 2
#> X Y
#> <dbl> <dbl>
#> 1 551 9306
#> 2 608 4596
#> 3 662 10285
#> 4 722 3632
#> 5 781 26212
#> 6 862 4895
#> 7 915 1601
#> 8 990 1462
Now, we can find the peaks in the other range:
<- find_peaks(jadeite_control, 'Ch', 'CPS', asFraction = FALSE, minYerror = 10000, ROI = c(0, 500))
largerPeaks
largerPeaks#> # A tibble: 5 × 2
#> X Y
#> <dbl> <dbl>
#> 1 129 32702
#> 2 155 45298
#> 3 184 382405
#> 4 214 1048170
#> 5 452 58156
Here, the ROI
argument determines the region searched by
the peak finder. If we want to combine these peaks into one data frame,
we can just use bind_rows
.
::bind_rows(largerPeaks, smallerPeaks)
dplyr#> # A tibble: 13 × 2
#> X Y
#> <dbl> <dbl>
#> 1 129 32702
#> 2 155 45298
#> 3 184 382405
#> 4 214 1048170
#> 5 452 58156
#> 6 551 9306
#> 7 608 4596
#> 8 662 10285
#> 9 722 3632
#> 10 781 26212
#> 11 862 4895
#> 12 915 1601
#> 13 990 1462
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.