The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
MatchingPursuit package## R version: 4.5.1
## Generated on: 03-kwiecień-2026
The below function downloads Enhanced Matching Pursuit Implementation external program (or EMPI for short), see Różański (2024), and stores it in the cache directory. The function downloads the EMPI program in a version compatible with the operating system used (Windows, Linux, MacOS-x64, MacOS-arm64). The code in the below chunk has been commented out because CRAN’s verification rules prohibit automatic binary downloads. Therefore, this function cannot be executed while generating this vignette.
User can check whether the EMPI program is installed; if not, an error message is displayed indicating that installation is required.
The presented package enables the analysis of time-series signals using the Matching Pursuit (MP) algorithm (see Mallat and Zhang (1993), Pati, Rezaiifar, and Krishnaprasad (1993), Durka (2007), Elad (2010)). Additionally, it supports working with EEG (electroencephalogram) signals by:
Let us construct an example signal by combining seven non-stationary components. The resulting signal (highlighted in blue below) will be used to demonstrate the basic functionality of the package.
fs <- 1024
T <- 1
t <- seq(0, T - 1 / fs, 1 / fs)
N <- length(t)
# 7 non-stationary signals.
x1 <- sin(2 * pi * (10 + 40 * t) * t) # linear chirp
x2 <- sin(2 * pi * (20 * t^2) * t) # nonlinear chirp
x3 <- (1 + 0.5 * sin(2 * pi * 2 * t)) * sin(2 * pi * 30 * t) # AM
x4 <- sin(2 * pi * 50 * t + 5 * sin(2 * pi * 3 * t)) # FM
x5 <- exp(-2 * t) * sin(2 * pi * 60 * t) # decreasing amplitude
x6 <- sin(2 * pi * (5 + 20 * sin(2 * pi * t)) * t) # frequency modulated sine wave
x7 <- t * sin(2 * pi * 40 * t) # increasing amplitude
signal <- data.frame(x = x1 + x2 + x3 + x4 + x5 + x6 + x7)Data must be stored in a data frame: rows represent samples for all
channels, and columns represent channels. Our first demo dataset
consists of only one channel (one column). The
read.csv.signals() function checks whether the data has the
correct structure. The first line of the file must contain two numbers:
the sampling rate in Hz (freq) and the signal length in
seconds (sec). This allows verification that the file
actually contains freq*sec samples.
file <- system.file("extdata", "sample1.csv", package = "MatchingPursuit")
out <- read.csv(file, header = FALSE)
head(out)
## V1
## 1 1024 1
## 2 0.00000000
## 3 1.02492083
## 4 1.93756420
## 5 2.63875099
## 6 3.05949332
out <- read.csv.signals(file)
str(out)
## List of 2
## $ signal :'data.frame': 1024 obs. of 1 variable:
## ..$ v1: num [1:1024] 0 1.02 1.94 2.64 3.06 ...
## $ sampling.rate: num 1024
head(out$signal)
## v1
## 1 0.000000
## 2 1.024921
## 3 1.937564
## 4 2.638751
## 5 3.059493
## 6 3.170315
out$sampling.rate
## [1] 1024The input data (signal) is passed as an argument to the
empi.execute() function, which generates the final output
file in SQLite format (sample1.db) containing
all atom parameters.
Important note: The code in the chunk below has been
commented out because CRAN’s verification rules prohibit automatic
binary downloads. empi.execute() function requires that
EMPI is installed, otherwise it terminates with an error message.
Therefore, this function can not be executed here. That is why the
sample1.db file has been generated in advance and included
in the package. In the next chunk, the empi2ft() function
uses this file as input (the db.file parameter).
Notice the empi.options parameter in the
empi.execute() function. You can specify NULL
for this parameter, and the EMPI program will run with the default
values set in the function ("-o local --gabor -i 50"). It
is also worth noting that the program offers a wide range of
configuration options. Details can be found in the
README.md file located in the directory where the EMPI
program was installed. In our example, parameters were set to instruct
the program to find 25 atoms.
It is now time to generate the final time-frequency (T-F) map for the selected channel. The centers of the atoms (in terms of time and frequency coordinates) are marked with the numbers of successive atoms, sorted from highest to lowest energy.
By comparing the two signal waveforms below the T-F map, it can be seen that the original signal and the reconstructed signal differ only minimally.
Below the plot, basic signal parameters are displayed, along with information about the number of atoms into which the input signal was decomposed.
Additionally, the energy of the input signal and the reconstructed signal (from the atoms) is calculated. The results show that 94.05% of the original signal’s energy is “explained” by the generated atoms. Naturally, increasing the number of generated atoms will likely bring the energy of the reconstructed signal closer to 100%.
# Reading a SQLite file in which all generated atom parameters are stored.
file <- system.file("extdata", "sample1.db", package = "MatchingPursuit")
# Create time-frequency map based on MP atoms.
out <- empi2tf(
db.file = file,
channel = 1,
mode = "sqrt",
freq.divide = 4,
increase.factor = 4,
display.crosses = FALSE,
display.atom.numbers = TRUE,
out.mode = "plot"
)
## Channel #: 1
## Number of atoms: 25
## Sampling rate: 1024
## Epoch size (in points): 1024
## Signal length (in seconds): 1
##
## Energy of the original signal: 2746.16
## nEnergy of the reconstructed signal: 2582.88
## reconstruction / original %: 94.05In this section, we demonstrate how to analyze electroencephalography (EEG) signals using the Matching Pursuit algorithm. The package provides a dedicated function for reading files in EDF and EDF+ (European Data Format). It also supports three types of EEG montages and allows for signal filtering.
Reading an example EEG signal (EDF file). The signal is 10 seconds long and consists of 20 channels (19 plus a special channel called the annotation channel that does not contain EEG signal data). The sampling rate is 256 Hz for each channel with EEG data. The channels have standard names.
file <- system.file("extdata", "EEG.edf", package = "MatchingPursuit")
# Read signal parameters and display them in a tabular form.
read.edf.params(file)
## channel.name frequency no.of.samples length.sec
## 1 Fp1 256 2560 10
## 2 Fp2 256 2560 10
## 3 F3 256 2560 10
## 4 F4 256 2560 10
## 5 F7 256 2560 10
## 6 F8 256 2560 10
## 7 Fz 256 2560 10
## 8 C3 256 2560 10
## 9 C4 256 2560 10
## 10 Cz 256 2560 10
## 11 T3 256 2560 10
## 12 T5 256 2560 10
## 13 T4 256 2560 10
## 14 T6 256 2560 10
## 15 P3 256 2560 10
## 16 P4 256 2560 10
## 17 Pz 256 2560 10
## 18 O1 256 2560 10
## 19 O2 256 2560 10
## 20 EDF_Annotations 57 627 11Sometimes it is necessary to change the sampling frequency (increase
— upsampling or decrease — downsampling). The
read.edf.signals() function provides this functionality. In
the example below, the original sampling frequency is reduced from 256
Hz to 64 Hz.
# Original signal
out1 <- read.edf.signals(file, resampling = FALSE, from = 0, to = 10)
signal <- out1$signals
sampling.rate <- out1$sampling.rate
sampling.rate
## [1] 256
# Resampled signal
out2 <- read.edf.signals(file, resampling = TRUE, f.new = 64, from = 0, to = 10)
signal.resampled <- out2$signals
sampling.rate.resampled <- out2$sampling.rate
sampling.rate.resampled
## [1] 64A bipolar montage is created (the classical double banana
montage), where each channel compares two adjacent electrodes. In the
first step, you define the pairs of electrodes to be connected using the
pairs list. In the second step, the
eeg.montage() function generates the required montage.
# Pairs of signals for bipolar montage (so called "double banana").
pairs <- list(
c("Fp2", "F4"), c("F4", "C4"), c("C4", "P4"), c("P4", "O2"), c("Fp1", "F3"), c("F3", "C3"),
c("C3", "P3"), c("P3", "O1"), c("Fp2", "F8"), c("F8", "T4"), c("T4", "T6"), c("T6", "O2"),
c("Fp1", "F7"), c("F7", "T3"), c("T3", "T5"), c("T5", "O1"), c("Fz", "Cz"), c("Cz", "Pz")
)
# Make the bipolar montage.
bip.montage <- eeg.montage(signal, montage.type = c("bipolar"), bipolar.pairs = pairs)
# Original signal.
head(signal[, 1:5])
## Fp1 Fp2 F3 F4 F7
## 1 -22.71201 -25.55728 -23.71532 -24.88713 -20.368393
## 2 -12.67505 -18.02861 -14.51318 -16.68830 -7.823155
## 3 -15.18333 -19.86674 -16.68830 -18.69876 -10.833093
## 4 -13.84686 -18.86342 -15.51650 -17.52695 -9.328124
## 5 -14.51318 -19.53358 -16.18665 -18.19710 -10.162942
## 6 -14.01153 -19.20041 -15.85349 -17.86011 -9.661285
# Signal after banana montage.
head(bip.montage[, 1:5])
## Fp2_F4 F4_C4 C4_P4 P4_O2 Fp1_F3
## 1 -0.6701516 -0.1684953 -1.167979 0.3331611 1.003313
## 2 -1.3403032 -0.3331611 -2.006625 0.6663222 1.838130
## 3 -1.1679785 -0.3369905 -1.838130 0.5016563 1.504969
## 4 -1.3364738 -0.3369905 -1.838130 0.6701516 1.669635
## 5 -1.3364738 -0.1684953 -2.006625 0.5016563 1.673464
## 6 -1.3403032 -0.3331611 -2.010455 0.5054858 1.841960EEG signals are rarely analyzed without prior filtering. Using the
filters.coeff() function, you can define the filter
parameters and then apply the filter to the signal. The filter
parameters listed below use typical values recommended in the literature
for EEG signal analysis.
# Filter parameters that will be used (quite typical in filtering EEG signals).
fc <- filters.coeff(
fs = sampling.rate,
notch = c(49, 51),
lowpass = 40,
highpass = 1,
)
# Filtering input signals.
bip.montage.filt <- bip.montage
for (m in 1:ncol(bip.montage)) {
bip.montage.filt[, m] = signal::filtfilt(fc$notch, bip.montage[, m]) # 50Hz notch filter
bip.montage.filt[, m] = signal::filtfilt(fc$lowpass, bip.montage.filt[, m]) # Low pass IIR Butterworth
bip.montage.filt[, m] = signal::filtfilt(fc$highpass, bip.montage.filt[, m]) # High pass IIR Butterwoth
}For the selected channel (after the bipolar montage, it is named Fp2_F4), both the original and filtered signals are displayed. The effect of filtering is clearly beneficial, as noise—typically without diagnostic significance—has been removed from the signal.
Important note: The code below has been commented out. See the explanation given here.
# The empi.options parameter is NULL, so the EMPI program is
# run with the parameters "-o local --gabor -i 50"
# sig <- list(bip.montage.filt, out1$sampling.rate)
# empi.out <- empi.execute (
# signal = sig,
# empi.options = NULL,
# write.to.file = TRUE,
# path = NULL,
# file.name = "EEG_bipolar_filtered.db"
# ) Generating the final time-frequency map for the selected channel (Fp2_F4). The canters of atoms (in the sense of time and frequency coordinates) are now marked with white crosses.
Comparing the two signal waveforms under the T-F map, it can be seen that the original and reconstructed signals differ only minimally.
Below the plot, basic signal parameters are displayed, along with information about the number of atoms into which the input signal was decomposed.
Additionally, the energy of the input signal and the reconstructed signal (from the generated atoms) is calculated. The results show that nearly all of the energy of the original signal (97.75%) is “explained” by the generated atoms.
# Reading a SQLite file where all the generated atom's parameters are stored.
file <- system.file("extdata", "EEG_bipolar_filtered.db", package = "MatchingPursuit")
# Generate time-frequency map based on MP atoms.
out <- empi2tf(
db.file = file,
channel = 1,
mode = "sqrt",
freq.divide = 6,
increase.factor= 4,
display.crosses = TRUE,
display.atom.numbers = FALSE,
out.mode = "plot"
)
## Channel #: 1
## Number of atoms: 50
## Sampling rate: 256
## Epoch size (in points): 2560
## Signal length (in seconds): 10
##
## Energy of the original signal: 499028.09
## nEnergy of the reconstructed signal: 487807.11
## reconstruction / original %: 97.75atom.params()Using the atom.params() function, all atom parameters
can be displayed in a tabular form. Note that the
sample1.db file contains only one channel. The signal was
decomposed into 25 atoms. By analyzing the position and
frequency parameters, you can easily identify the so-called
blobs corresponding to these atoms in the T-F maps.
file <- system.file("extdata", "sample1.db", package = "MatchingPursuit")
atom.params(file)
## channel_id atom_number amplitude energy frequency phase scale position
## 1 1 1 1.396 0.560 30.129 1.032 0.812 0.515
## 2 1 2 1.187 0.188 41.009 -2.675 0.378 0.668
## 3 1 3 3.051 0.162 19.161 2.909 0.049 0.056
## 4 1 4 1.612 0.158 52.881 0.835 0.173 0.525
## 5 1 5 1.859 0.143 59.245 0.580 0.117 0.055
## 6 1 6 1.063 0.139 62.278 2.493 0.349 0.557
## 7 1 7 1.367 0.128 27.042 0.227 0.194 0.220
## 8 1 8 0.590 0.107 37.685 2.694 0.872 0.676
## 9 1 9 1.903 0.088 58.038 -1.588 0.069 0.967
## 10 1 10 0.541 0.084 0.717 0.510 0.742 0.268
## 11 1 11 1.461 0.075 24.971 -0.558 0.100 0.635
## 12 1 12 0.614 0.067 17.111 0.696 0.506 0.430
## 13 1 13 0.917 0.066 69.964 3.105 0.223 0.702
## 14 1 14 1.108 0.057 10.347 1.641 0.131 0.414
## 15 1 15 0.668 0.049 34.951 -2.350 0.307 0.725
## 16 1 16 2.238 0.048 93.506 1.292 0.027 0.937
## 17 1 17 1.001 0.048 75.322 1.867 0.134 0.839
## 18 1 18 1.573 0.042 51.072 1.628 0.048 0.244
## 19 1 19 0.862 0.039 55.968 1.787 0.150 0.721
## 20 1 20 0.521 0.036 4.740 -1.830 0.380 0.156
## 21 1 21 2.066 0.036 30.172 2.292 0.024 0.992
## 22 1 22 0.549 0.034 44.676 1.637 0.317 0.491
## 23 1 23 1.708 0.031 36.527 -1.958 0.030 0.129
## 24 1 24 1.886 0.026 109.382 1.142 0.021 0.972
## 25 1 25 0.319 0.026 65.308 -2.085 0.733 0.366In the following example, note that the
EEG_bipolar_filtered.db file provides data for 18 channels.
To keep the output concise, only the first 5 and last 5 atoms are
displayed below. The total number of atoms generated is 900 (number of
atoms per channel = 50, number of channels = 18, so 50 × 18 = 900).
Important: It should be emphasized that, in general, the number of atoms generated for each channel does not need to be the same. However, in our example, the number of atoms in each channel is exactly the same.
file <- system.file("extdata", "EEG_bipolar_filtered.db", package = "MatchingPursuit")
ap <- atom.params(file)
head(ap, 5)
## channel_id atom_number amplitude energy frequency phase scale position
## 1 1 1 45.703 464.841 1.910 -2.885 0.629 7.783
## 2 1 2 72.222 450.535 4.457 -0.791 0.244 4.023
## 3 1 3 48.706 273.616 4.250 -0.913 0.326 7.470
## 4 1 4 16400.149 200.846 0.012 -1.571 0.132 7.101
## 5 1 5 33.193 169.123 2.224 -2.751 0.433 3.616
tail(ap, 5)
## channel_id atom_number amplitude energy frequency phase scale position
## 896 18 46 13.936 2.506 4.878 1.501 0.070 0.237
## 897 18 47 1.961 2.352 5.405 0.713 1.730 8.763
## 898 18 48 2.790 2.332 5.247 2.728 0.847 2.956
## 899 18 49 3.134 2.028 14.356 -1.172 0.584 4.155
## 900 18 50 7.788 1.918 8.337 0.351 0.087 6.058sig2bin()The sig2bin() function can be used to convert input data
(in plain text file format) to a binary file. The binary data consist of
floating-point values in the byte order of the current machine (no
byte-order conversion is performed). For multichannel signals, the
samples are arranged such that first come all channels at
t = 0, then all channels at t = Δt, and so
on.
Note: Users do not work directly with
.bin files. Binary files are used only internally by the
empi.execute() function. The external EMPI program executed
within this function requires binary data as input. Additionally, the
ability to convert text files to binary form can be useful for those who
wish to use EMPI independently of the R environment.
file <- system.file("extdata", "sample3.csv", package = "MatchingPursuit")
out <- read.csv.signals(file)
signal.bin <- sig2bin(data = out$signal, write.to.file = FALSE)
# We have 3 channels. The first 4 time points.
head(out$signal, 4)
## v1 v2 v3
## 1 0.8162 0.6454 0.8467
## 2 0.8167 0.3514 0.2683
## 3 0.3689 0.5621 0.2096
## 4 0.4414 0.1426 0.4842
# The same elements of the signal in binary (floats are stored in 4 bytes).
head(signal.bin, 48)
## [1] 7c f2 50 3f ef 38 25 3f 55 c1 58 3f 40 13 51 3f b3 ea b3 3e 9e 5e 89 3e 76
## [26] e0 bc 3e c9 e5 0f 3f 62 a1 56 3e 2e ff e1 3e bc 05 12 3e 10 e9 f7 3eread.empi.db.file()Data from the generated SQLite file can be easily read
using the read.empi.db.file() function. Below, data from a
sample sample1.db file is read. The original and
reconstructed signals are then extracted and displayed. The two signals
are virtually indistinguishable, indicating that the atoms are highly
effective at reconstructing the analyzed signal.
file <- system.file("extdata", "sample1.db", package = "MatchingPursuit")
out <- read.empi.db.file(file)
str(out)
## List of 6
## $ atoms :'data.frame': 25 obs. of 9 variables:
## ..$ channel_id : int [1:25] 1 1 1 1 1 1 1 1 1 1 ...
## ..$ atom_number: int [1:25] 1 2 3 4 5 6 7 8 9 10 ...
## ..$ amplitude : num [1:25] 1.4 1.19 3.05 1.61 1.86 ...
## ..$ energy : num [1:25] 0.56 0.188 0.162 0.158 0.143 ...
## ..$ envelope : chr [1:25] "gauss" "gauss" "gauss" "gauss" ...
## ..$ frequency : num [1:25] 30.1 41 19.2 52.9 59.2 ...
## ..$ phase : num [1:25] 1.032 -2.675 2.909 0.835 0.58 ...
## ..$ scale : num [1:25] 0.8117 0.3781 0.0491 0.1726 0.1168 ...
## ..$ position : num [1:25] 0.5152 0.6685 0.0563 0.5253 0.0552 ...
## $ original.signal: num [1:1024, 1] 0 1.02 1.94 2.64 3.06 ...
## $ reconstruction : num [1:1024, 1] 0.834 1.249 1.575 1.768 1.803 ...
## $ gabors :List of 1
## ..$ : num [1:1024, 1:25] -0.01048 -0.00795 -0.00512 -0.00209 0.00104 ...
## $ t : num [1:1024] 0 0.000977 0.001953 0.00293 0.003906 ...
## $ f : num 1024In this chapter, we present a specific data example adapted from the work of Durka (2007). The signal consists of a mixture of seven components: (a) four Gabor functions with different parameters, (b) a unit impulse, (c) a sinusoidal waveform, and (d) a chirp signal.
In the T–F map, all signal components—except for the chirp—are represented clearly and accurately (i.e., blobs for Gabor functions, a horizontal line for the sine wave, and a vertical line for the unit impulse). However, the chirp signal is decomposed into several separate blobs. This behavior arises from the discrete nature of the atom dictionary used in the MP algorithm, which prevents a continuous representation of a signal with smoothly varying frequency. This limitation (and, in some respects, a drawback) of the MP algorithm should be taken into account.
## Channel #: 1
## Number of atoms: 32
## Sampling rate: 128
## Epoch size (in points): 1280
## Signal length (in seconds): 10
##
## Energy of the original signal: 903.48
## nEnergy of the reconstructed signal: 910.88
## reconstruction / original %: 100.82
The Matching Pursuit algorithm is well-known and described in the literature. Its purpose is to approximate the analyzed signal using so-called atoms. (the text below is adapted from Kunik and Gramacki (2025)).
Given a signal \(f \in \mathbb{R}^n\), and a (possibly overcomplete) large redundant dictionary \(D =\{g_{\gamma}\}_{\gamma \in \Gamma}\) of normalized atoms \(\|g_{\gamma}\|=1\) MP finds a sparse signal representation
\[ f \approx \sum_{n = 0}^{N} a_n g_{\gamma_n} \tag{1} \] where \(a_n \in \mathbb{R}\) are coefficients, \(g_{\gamma_n} \in D\) are atoms selected from the dictionary and \(N\) is the desired number of iterations (or stopping threshold). In most practical cases $ N size(D)$. Also, \(g_{\gamma}\) is the dictionary atom indexed by \(\gamma\) and \(\Gamma\) is the set of all indices in the directory.
In the ideal case, the linear expansion (1) should include all atoms \(g_{\gamma_n}\) that represent the relevant structures of the signal \(f\). For real signals, such an ideal scenario is rarely possible, and some form of approximation is required. This task can be accomplished elegantly using the MP algorithm, which was first proposed by Mallat and Zhang (1993) in the context of signal processing.
Each atom \(g_{\gamma}\) is typically a time-frequency shifted, scaled version of a prototype function, such as the Gabor function (often called a Gaussian-windowed sinusoid). The dictionary is constructed to cover a wide range of time and frequency characteristics. The real-valued Gabor function has the following form:
\[ g_{\gamma}(t) = K(\gamma) e^{- \pi \left( \frac{t-\mu}{\sigma} \right) ^2} \cos(\omega (t - \mu) + \phi) \]
where \(\gamma = (\mu, \omega, \sigma, \phi)\) constitute a four-dimmensional space and \(K(\gamma)\) is such that \(||g_{\gamma}|| = 1\). It is easy to see that Gabor functions are constructed by multiplying Gaussian envelopes with cosine oscillations of different frequencies \(\omega\) and phases offset \(\phi\). By multiplying these two functions, we can obtain a wide variety of shapes depending on their parameters. A few examples of Gabor function are presented in figure below (in blue). The sinusoidal plane wave (in gray) is modulated by a Gaussian envelope (in red).
In the MP algorithm, the decomposition process is iterative. At each step, the algorithm selects an atom \(g_{\gamma_n}\) from the dictionary \(D\) that best matches the current residual signal \(R\). Formally, starting with the signal \(f\) at iteration \(n=0\) the initial residual and initial function approximation are
\[ R^0 = f \\ f^0 = 0 \] For each iteration \(n = \{0,1,\ldots, N\}\) we find \(g_{\gamma_n} \in D\) such that the following inner product \(\langle \cdot, \cdot \rangle\) is maximized
\[ g_{\gamma_n} = \arg \max_{\gamma \in \Gamma} | \langle R^{n}, g_{\gamma} \rangle | \] The coefficients \(a_n\) in (1) are
\[ a_n = \langle R^{n}, g_{\gamma_n} \rangle \] and updated function approximation is defined as
\[ f^{n+1} = f^{n} + a_n g_{\gamma_n} \] Similarly, updated residual is defined as
\[ R^{n+1} = R^{n} - a_n g_{\gamma_n} \] After \(N\) iterations, the signal \(f\) is approximated as
\[ f \approx \sum_{n = 0}^{N} \langle R^n, g_{\gamma_n} \rangle g_{\gamma_n} = \sum_{n = 0}^{N} a_n g_{\gamma_n} \] or equivalently
\[ f = \sum_{n = 0}^{N} \langle R^n g_{\gamma_n} \rangle g_{\gamma_n} + R^{N+1} \] The procedure stops when \(\|R^{n+1}\|\) is below a threshold or after a fixed number of iterations.
It should be noted that finding an optimal approximation (1) is an NP-hard problem. A suboptimal solution can be obtained using an iterative procedure, such as the MP algorithm. Another key property of MP is energy conservation: the total energy of the signal is preserved in the MP decomposition.
\[ ||f||^2 = \sum_{n = 0}^{N} |\langle R^n, g_{\gamma_n} \rangle |^2 + ||R^{N+1}||^2 \] Back to menu
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.