Peak Detection
The detection module identifies peaks in processed spectra, computes areas by integration, and aligns peaks across multiple samples.
Quick Example
from mioXpektron import detect_peaks_with_area
peaks_df = detect_peaks_with_area(
mz_values=mz,
intensities=corrected,
sample_name="sample_01",
group="control",
min_snr=3.0,
noise_model="mz_binned",
)
Detection Algorithms
Local Maximum Detection
The default method. Finds peaks as local maxima above a noise-based threshold:
from mioXpektron import detect_peaks_with_area, detect_peaks_with_area_v2
# Standard version
peaks = detect_peaks_with_area(
mz_values=mz,
intensities=corrected,
sample_name="sample_01",
group="control",
min_snr=3.0,
)
# Enhanced version with additional peak properties
peaks = detect_peaks_with_area_v2(
mz_values=mz,
intensities=corrected,
sample_name="sample_01",
group="control",
min_snr=3.0,
noise_model="mz_binned",
noise_bins=20,
)
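To make the thresholding idea concrete, here is a minimal, dependency-free sketch of local-maximum detection against an SNR threshold. It is not the mioXpektron implementation: the function name `local_maxima_above_snr` and its `noise_median` / `noise_std` arguments are hypothetical, and the library's detectors additionally apply prominence, width, and distance filters that are left out here.

```python
def local_maxima_above_snr(intensities, noise_median, noise_std, min_snr=3.0):
    """Return indices of local maxima above noise_median + min_snr * noise_std.

    Hypothetical helper for illustration only; not part of mioXpektron.
    """
    threshold = noise_median + min_snr * noise_std
    peaks = []
    for i in range(1, len(intensities) - 1):
        # A point is a local maximum if it rises above its left neighbour
        # and is not exceeded by its right neighbour.
        is_local_max = intensities[i - 1] < intensities[i] >= intensities[i + 1]
        if is_local_max and intensities[i] > threshold:
            peaks.append(i)
    return peaks


signal = [1, 2, 1, 2, 1, 30, 1, 2, 1, 2, 1, 2, 12, 2, 1]
print(local_maxima_above_snr(signal, noise_median=2.0, noise_std=1.5))  # [5, 12]
```

The small ripples of height 2 are local maxima too, but they fall below the threshold of 2.0 + 3.0 * 1.5 = 6.5 and are discarded.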
CWT-Based Detection
Uses the Continuous Wavelet Transform for multi-scale peak detection, which is more robust to varying peak widths:
from mioXpektron import detect_peaks_cwt_with_area
peaks = detect_peaks_cwt_with_area(
mz_values=mz,
intensities=corrected,
sample_name="sample_01",
group="control",
min_snr=3.0,
)
Noise Estimation
Noise is estimated robustly with the Median Absolute Deviation (MAD) after excluding peak regions, so that peaks do not inflate the background estimate:
from mioXpektron import robust_noise_estimation
median_noise, std_noise = robust_noise_estimation(corrected)
The default global thresholding path uses a Gaussian-equivalent MAD estimate on positive intensities after masking the measured width of detected peaks plus an additional point margin. This is a robust heuristic for thresholding, not a full physical Poisson detector model.
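The masking-plus-MAD idea can be sketched in a few lines of plain Python. This is an illustrative approximation, not the library code: `mad_noise`, `peak_spans`, and `margin` are hypothetical names, and the peak extents are assumed to be known already as inclusive index pairs. The factor 1.4826 converts a MAD into the standard deviation of a Gaussian with the same MAD.

```python
import statistics

MAD_TO_SIGMA = 1.4826  # Gaussian-equivalent scaling of the MAD


def mad_noise(intensities, peak_spans=(), margin=2):
    """Estimate (median, robust std) of the noise outside masked peak regions.

    peak_spans: iterable of (left, right) inclusive index pairs (assumed input).
    margin: extra points masked on each side of every span.
    """
    masked = set()
    for left, right in peak_spans:
        start = max(0, left - margin)
        stop = min(len(intensities), right + margin + 1)
        masked.update(range(start, stop))

    # Keep only positive, unmasked points as noise samples.
    noise = [y for i, y in enumerate(intensities) if i not in masked and y > 0]
    if not noise:
        raise ValueError("no positive noise points left after masking")

    median = statistics.median(noise)
    mad = statistics.median([abs(y - median) for y in noise])
    return median, MAD_TO_SIGMA * mad
```

A threshold of the form `median + min_snr * robust_std` can then be compared against candidate peak heights.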
For spectra whose background varies across the mass range, the detection entry
points also support noise_model="mz_binned". This estimates local
background statistics in m/z bins and interpolates them back to a per-point
threshold profile:
peaks = detect_peaks_with_area_v2(
mz_values=mz,
intensities=corrected,
sample_name="sample_01",
group="control",
noise_model="mz_binned",
noise_bins=20,
noise_min_points=25,
)
Available noise models:
"global": one threshold for the full spectrum
"mz_binned": interpolated m/z-dependent thresholds
For spectra with strong mass-dependent background changes, "mz_binned" is
the preferred choice. The global model remains useful as a fast default, but
its SNR interpretation should be treated as heuristic.
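The following sketch shows one way a binned noise profile could be built: estimate a median and a Gaussian-equivalent MAD in equal-width m/z bins, then interpolate linearly between bin centres. It is written under stated assumptions and is not the library's implementation: the names `mad_stats` and `binned_noise_profile` are hypothetical, peak masking is omitted for brevity, and the fallback to global statistics for sparse bins is an assumption about how `noise_min_points` might be used.

```python
import statistics

MAD_TO_SIGMA = 1.4826  # Gaussian-equivalent scaling of the MAD


def mad_stats(values):
    """Return (median, Gaussian-equivalent robust std) of a list of values."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    return median, MAD_TO_SIGMA * mad


def binned_noise_profile(mz, intensities, n_bins=20, min_points=25):
    """Per-point (median_profile, std_profile) from equal-width m/z bins."""
    positive = [(m, y) for m, y in zip(mz, intensities) if y > 0]
    global_stats = mad_stats([y for _, y in positive])

    lo, hi = min(mz), max(mz)
    width = (hi - lo) / n_bins
    if width <= 0:
        raise ValueError("m/z range must be non-empty")

    # Assign each positive point to its bin.
    buckets = [[] for _ in range(n_bins)]
    for m, y in positive:
        buckets[min(int((m - lo) / width), n_bins - 1)].append(y)

    # Sparse bins fall back to the global estimate (assumption).
    stats = [mad_stats(b) if len(b) >= min_points else global_stats for b in buckets]
    centers = [lo + (k + 0.5) * width for k in range(n_bins)]

    def interpolate(x, column):
        # Hold the edge values constant outside the first/last bin centre.
        if x <= centers[0]:
            return stats[0][column]
        if x >= centers[-1]:
            return stats[-1][column]
        k = min(int((x - centers[0]) / width), n_bins - 2)
        t = (x - centers[k]) / width
        return stats[k][column] + t * (stats[k + 1][column] - stats[k][column])

    median_profile = [interpolate(m, 0) for m in mz]
    std_profile = [interpolate(m, 1) for m in mz]
    return median_profile, std_profile
```

A per-point threshold then follows as `median_profile[i] + min_snr * std_profile[i]`, so a peak in a quiet region of the spectrum is not judged against noise measured in a busy one.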
Area Integration
Peak areas are computed from peak widths and corrected baselines. The current integration path:
handles empty or invalid background regions defensively
integrates on the true floating peak boundaries
reports the area definition and integration method in the output table
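Integrating on floating boundaries means the peak edges may fall between samples, so the end points are interpolated instead of being snapped to the nearest index. A minimal sketch using the trapezoid rule is shown below. The helper names are hypothetical, the boundaries are assumed to be fractional sample positions (as produced by half-height width measurements), and baseline subtraction is assumed to have been applied already.

```python
import math


def _value_at(values, position):
    """Linearly interpolate values at a fractional sample position."""
    i = min(int(math.floor(position)), len(values) - 2)
    t = position - i
    return values[i] + t * (values[i + 1] - values[i])


def integrate_between(mz, intensities, left_ip, right_ip):
    """Trapezoid area between fractional positions left_ip <= right_ip.

    Requires 0 <= left_ip <= right_ip <= len(mz) - 1.
    """
    xs = [_value_at(mz, left_ip)]
    ys = [_value_at(intensities, left_ip)]
    # Add every sample that lies strictly inside the boundaries.
    for i in range(math.floor(left_ip) + 1, math.ceil(right_ip)):
        xs.append(mz[i])
        ys.append(intensities[i])
    xs.append(_value_at(mz, right_ip))
    ys.append(_value_at(intensities, right_ip))
    return sum(
        0.5 * (ys[k] + ys[k + 1]) * (xs[k + 1] - xs[k]) for k in range(len(xs) - 1)
    )


mz = [0.0, 1.0, 2.0, 3.0, 4.0]
intensities = [0.0, 2.0, 4.0, 2.0, 0.0]
print(integrate_between(mz, intensities, 1.5, 2.5))  # 3.5
```

Snapping the same boundaries outward to indices 1 and 3 would give an area of 6.0, which shows how much the boundary convention can change a narrow peak's area.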
Batch Peak Collection
collect_peak_properties_batch() runs the full preprocessing and peak
collection workflow across many spectra and forwards the detector-specific
options consistently:
from mioXpektron.detection import collect_peak_properties_batch
peaks_df = collect_peak_properties_batch(
files=file_list,
method="Gaussian",
min_intensity=5,
min_snr=3.0,
noise_model="mz_binned",
noise_bins=20,
)
For analytic fit methods that enable overlapping-peak deconvolution, the implementation applies a conservative two-stage acceptance rule:
nearby peaks must overlap according to an adaptive, width-based spacing criterion
the two-Gaussian fit must improve the Bayesian information criterion (BIC) over a single-Gaussian window fit by at least deconvolution_min_bic_delta (default 10)
Component widths are also checked against the configured peak-width bounds before the deconvoluted peaks are accepted.
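The two-stage rule can be illustrated with a short sketch. Everything here is an assumption made for illustration rather than the library's code: the function names are hypothetical, widths are taken to be in m/z units, the spacing gate is taken to be `overlap_factor` times the mean of the two measured widths, and the BIC uses the common least-squares form n·ln(RSS/n) + k·ln(n) with 3 parameters for one Gaussian and 6 for two.

```python
import math


def bic_least_squares(rss, n_points, n_params):
    """BIC for a least-squares fit with Gaussian errors (up to a constant)."""
    return n_points * math.log(rss / n_points) + n_params * math.log(n_points)


def accept_deconvolution(mz_a, mz_b, width_a, width_b,
                         rss_single, rss_double, n_points,
                         overlap_factor=0.75, min_bic_delta=10.0):
    """Return True if a two-Gaussian fit should replace the single-peak fit."""
    # Stage 1: the peaks must be closer than the adaptive spacing gate.
    spacing_gate = overlap_factor * 0.5 * (width_a + width_b)
    if abs(mz_b - mz_a) > spacing_gate:
        return False

    # Stage 2: the extra parameters must pay for themselves in BIC.
    bic_single = bic_least_squares(rss_single, n_points, n_params=3)
    bic_double = bic_least_squares(rss_double, n_points, n_params=6)
    return (bic_single - bic_double) >= min_bic_delta


# Close peaks with a large drop in residuals are accepted.
print(accept_deconvolution(100.00, 100.05, 0.1, 0.1,
                           rss_single=100.0, rss_double=20.0, n_points=40))  # True
# A marginal drop in residuals does not justify three extra parameters.
print(accept_deconvolution(100.00, 100.05, 0.1, 0.1,
                           rss_single=100.0, rss_double=95.0, n_points=40))  # False
```

Because BIC penalizes each added parameter by ln(n), a two-component model is accepted only when it reduces the residuals substantially, which guards against splitting a single noisy peak.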
Cross-Sample Alignment
Align peaks across multiple samples by m/z tolerance:
from mioXpektron import align_peaks, PeakAlignIntensityArea
# Align peak lists from multiple samples
aligned = align_peaks(peak_list, mz_tolerance=0.2)
# Full alignment with intensity and area matrices
aligner = PeakAlignIntensityArea(
mz_tolerance=0.2,
method="Gaussian",
noise_model="mz_binned",
noise_bins=20,
deconvolution_min_bic_delta=10.0,
)
# csv_files: list of paths to normalized spectrum CSV files
intensity_table, area_table, peaks_df = aligner.run(csv_files)
PeakAlignIntensityArea now exposes the underlying peak-detection method
and the same noise-model options as the batch collector, so alignment runs can
be compared on equal footing.
Overlapping Peak Analysis
Detect and visualize overlapping peaks:
from mioXpektron import check_overlapping_peaks, check_overlapping_peaks2
# Basic overlap check
overlaps = check_overlapping_peaks(peaks, resolution_threshold=0.5)
# Enhanced analysis with visualization
check_overlapping_peaks2(peaks, data, resolution_threshold=0.5)
API Reference
- mioXpektron.detection.detect_peaks_with_area(mz_values, intensities, sample_name, group, min_intensity=1, min_snr=3, min_distance=2, window_size=10, peak_height=50, prominence=10, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, noise_model='global', noise_bins=20, noise_min_points=25, verbose=False)[source]
Fast peak detection in ToF-SIMS or similar spectra, including peak area.
Returns:
peak_indices (np.ndarray) – Indices of detected peaks.
peak_properties (dict) – Contains: mz, intensities, widths, prominences, heights, areas.
- mioXpektron.detection.detect_peaks_with_area_v2(mz, intens, sample_name, group, *, min_intensity=1, min_snr=3, min_distance=2, prominence=10, min_peak_width=1, max_peak_width=75, rel_height=0.5, noise_model='global', noise_bins=20, noise_min_points=25, noise_window=10, verbose=False)[source]
- mioXpektron.detection.detect_peaks_cwt_with_area(mz_values, intensities, sample_name, group, min_intensity=1, min_snr=3, min_distance=2, window_size=10, peak_height=50, prominence=10, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, noise_model='global', noise_bins=20, noise_min_points=25, verbose=False)[source]
Peak detection using Continuous Wavelet Transform (CWT) for ToF-SIMS spectra.
Returns:
peak_properties (pd.DataFrame) – Contains: mz, intensities, widths (approx), amplitudes, areas.
- mioXpektron.detection.robust_peak_detection(mz_values, intensities, sample_name, group, method='Gaussian', min_intensity=1, min_snr=3, min_distance=2, window_size=10, peak_height=50, prominence=10, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, distance_threshold=0.1, combined=False, use_cwt=False, noise_model='global', noise_bins=20, noise_min_points=25, deconvolution_min_bic_delta=10.0, deconvolution_overlap_factor=0.75, deconvolution_replace_singles=True, verbose=False)[source]
Fast peak detection in ToF-SIMS or similar spectra, including peak area.
Returns:
peak_indices (np.ndarray) – Indices of detected peaks.
peak_properties (dict) – Contains: mz, intensities, widths, prominences, heights, areas.
Notes
Overlapping-peak deconvolution now requires both geometric overlap and a BIC improvement over a single-Gaussian window fit. Fitted component widths must also remain within the user-specified peak-width bounds.
- mioXpektron.detection.robust_noise_estimation(intensities, peak_indices=None, window=2, peak_height=None, peak_prominence=None, min_peak_width=1, max_peak_width=75)[source]
Robust noise estimation by excluding regions near detected peaks.
- Parameters:
intensities (np.ndarray) – Denoised, baseline-corrected intensities.
peak_indices (np.ndarray or None) – Indices of detected peaks. If None, function will detect peaks automatically.
window (int) – Extra number of data points to exclude on each side of the detected peak width. The measured peak extent is always masked first.
peak_height (float or None) – Minimum height for peak detection. If None, defaults to the median of positive intensities (data-adaptive).
peak_prominence (float or None) – Minimum prominence for peak detection. If None, defaults to 3x the MAD of positive intensities (data-adaptive).
- Returns:
median_intensity (float) – Median intensity of noise region.
robust_std (float) – Robust standard deviation (Gaussian-equivalent MAD) of noise region.
- mioXpektron.detection.robust_noise_estimation_mz_dependent(mz_values, intensities, peak_indices=None, window=2, peak_height=None, peak_prominence=None, min_peak_width=1, max_peak_width=75, n_bins=20, min_points_per_bin=25)[source]
Estimate local noise as piecewise-constant m/z bins interpolated over the spectrum.
- Returns:
median_profile (np.ndarray) – Per-point local median noise estimate.
std_profile (np.ndarray) – Per-point local Gaussian-equivalent robust std estimate.
- mioXpektron.detection.collect_peak_properties_batch(files, mz_min=None, mz_max=None, baseline_method='airpls', noise_method='wavelet', missing_value_method='interpolation', normalization_target=100000000.0, method='Gaussian', min_intensity=1, min_snr=3, min_distance=5, window_size=10, peak_height=50, prominence=50, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, distance_threshold=0.01, combined=False, noise_model='global', noise_bins=20, noise_min_points=25, deconvolution_min_bic_delta=10.0, deconvolution_overlap_factor=0.75, deconvolution_replace_singles=True)[source]
Collect peak properties from a batch of ToF-SIMS files.
- Parameters:
mz_min (float or None) – m/z window for data import (if supported).
mz_max (float or None) – m/z window for data import (if supported).
baseline_method (str) – Method for baseline correction.
noise_method (str) – Noise filtering method.
missing_value_method (str) – Method for handling missing values.
normalization_target (float) – Target TIC normalization value.
min_snr (int or float) – Minimum signal-to-noise ratio for peak detection.
min_distance (int) – Minimum distance between peaks (in data points).
prominence (int or float or None) – Minimum peak prominence for detection.
width_rel_height (float) – Relative height for width calculation (e.g., 0.5 = FWHM).
noise_model ({"global", "mz_binned"}) – Noise model used to derive peak thresholds.
noise_bins (int) – Number of m/z bins for noise_model="mz_binned".
noise_min_points (int) – Minimum positive noise points per bin before using local estimates.
- Returns:
peaks_df – DataFrame with all peak properties for all files.
- Return type:
pd.DataFrame
- mioXpektron.detection.align_peaks(peaks_df, mz_tolerance=0.2, mz_rounding_precision=1, output='intensity')[source]
Cluster peaks by m/z and return an aligned feature matrix.
Uses a greedy sorted-bin algorithm that guarantees every aligned bin spans at most mz_tolerance in m/z.
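The greedy sorted-bin idea can be sketched as follows. This is an illustrative reading of the description above, not the library's code: `greedy_mz_bins` is a hypothetical name, and it works on bare m/z values rather than on a peak table. Each value is compared with the first (lowest) m/z in the current bin, which is what bounds the bin's span by the tolerance.

```python
def greedy_mz_bins(mz_values, mz_tolerance=0.2):
    """Group m/z values into bins whose span never exceeds mz_tolerance."""
    bins = []
    for mz in sorted(mz_values):
        # Compare against the bin's first element, not its last, so the
        # total span stays within the tolerance (no chaining drift).
        if bins and mz - bins[-1][0] <= mz_tolerance:
            bins[-1].append(mz)
        else:
            bins.append([mz])
    return bins


peaks = [100.00, 100.05, 100.15, 100.25, 101.00, 100.30]
print(greedy_mz_bins(peaks, mz_tolerance=0.2))
# [[100.0, 100.05, 100.15], [100.25, 100.3], [101.0]]
```

Comparing against the last element instead would let a chain of closely spaced peaks grow a bin far wider than the tolerance. The trade-off of the greedy rule is that bin edges depend on where the first peak of each bin happens to fall.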
- class mioXpektron.detection.PeakAlignIntensityArea(mz_tolerance=0.2, mz_rounding_precision=1, min_intensity=1, min_snr=3, min_distance=2, peak_height=50, prominence=10, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, noise_model='global', noise_bins=20, noise_min_points=25, method=None, deconvolution_min_bic_delta=10.0, deconvolution_overlap_factor=0.75, deconvolution_replace_singles=True, output_dir=None, verbose=False, group_patterns=None, group_fn=None)[source]
Bases: object
Process normalized ToF-SIMS spectra from CSV files, detect peaks, align them across samples, and calculate both intensity and area tables for each aligned m/z value.
- Parameters:
mz_tolerance (float, optional (default=0.2)) – Maximum distance (in m/z units) for clustering peaks across samples.
mz_rounding_precision (int, optional (default=1)) – Number of decimal places for rounding aligned m/z values in output tables.
min_intensity (float, optional (default=1)) – Minimum intensity threshold for considering data points.
min_snr (float, optional (default=3)) – Minimum signal-to-noise ratio for peak detection.
min_distance (int, optional (default=2)) – Minimum distance (in data points) between peaks.
peak_height (float, optional (default=50)) – Minimum peak height for initial peak detection.
prominence (float, optional (default=10)) – Minimum prominence for peak detection.
min_peak_width (int, optional (default=1)) – Minimum peak width (in data points).
max_peak_width (int, optional (default=75)) – Maximum peak width (in data points).
width_rel_height (float, optional (default=0.5)) – Relative height for peak width calculation (0.5 = FWHM).
noise_model ({"global", "mz_binned"}, optional (default="global")) – Noise model used for threshold estimation.
noise_bins (int, optional (default=20)) – Number of m/z bins when using noise_model="mz_binned".
noise_min_points (int, optional (default=25)) – Minimum positive noise points per bin for the local model.
method (str or None, optional (default=None)) – Peak-detection / fitting method. None uses simple local-max detection (detect_peaks_with_area_v2), 'cwt' uses CWT detection, and 'Gaussian' / 'Lorentzian' / 'Voigt' use curve-fit detection via robust_peak_detection.
deconvolution_min_bic_delta (float, optional (default=10.0)) – Minimum BIC improvement required before accepting a two-Gaussian deconvolution over a single-peak fit.
deconvolution_overlap_factor (float, optional (default=0.75)) – Scale factor applied to the mean measured peak width when deriving the adaptive deconvolution spacing gate.
deconvolution_replace_singles (bool, optional (default=True)) – If True, replace overlapping single-peak fits with the accepted deconvoluted components in the output table.
output_dir (str or None, optional) – Directory to save output CSV files. If None, files are not saved.
verbose (bool, optional (default=False)) – If True, print progress information.
Examples
>>> from mioXpektron.detection import PeakAlignIntensityArea
>>> import glob
>>>
>>> # Get all normalized spectra
>>> csv_files = glob.glob('output_files/normalized_spectra/*.csv')
>>>
>>> # Create analyzer instance
>>> analyzer = PeakAlignIntensityArea(
...     mz_tolerance=0.1,
...     min_snr=3,
...     output_dir='output_files/peak_analysis'
... )
>>>
>>> # Process with m/z cutoff
>>> intensity_table, area_table, peaks_df = analyzer.run(
...     csv_files,
...     mz_min=50,
...     mz_max=500
... )
>>>
>>> print(f"Detected {len(peaks_df)} peaks across {len(csv_files)} samples")
>>> print(f"Aligned to {intensity_table.shape[1]} unique m/z values")
- __init__(mz_tolerance=0.2, mz_rounding_precision=1, min_intensity=1, min_snr=3, min_distance=2, peak_height=50, prominence=10, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, noise_model='global', noise_bins=20, noise_min_points=25, method=None, deconvolution_min_bic_delta=10.0, deconvolution_overlap_factor=0.75, deconvolution_replace_singles=True, output_dir=None, verbose=False, group_patterns=None, group_fn=None)[source]
Initialize the PeakAlignIntensityArea analyzer with default parameters.
- run(csv_files, mz_min=None, mz_max=None)[source]
Process CSV files and perform peak detection, alignment, and quantification.
- Parameters:
csv_files (list of str) – List of paths to normalized spectrum CSV files. Each CSV should have the columns 'channel', 'mz', and 'intensity'.
mz_min (float or None, optional) – Minimum m/z value to consider for peak detection. If None, use full range.
mz_max (float or None, optional) – Maximum m/z value to consider for peak detection. If None, use full range.
- Returns:
intensity_table (pd.DataFrame) – DataFrame with samples as rows and aligned m/z values as columns, containing peak intensities (amplitudes). Missing peaks are filled with 0.
area_table (pd.DataFrame) – DataFrame with samples as rows and aligned m/z values as columns, containing peak areas. Missing peaks are filled with 0.
peaks_df (pd.DataFrame) – DataFrame containing all detected peaks with their properties before alignment.