mioXpektron.detection.detection

Functions

align_peaks(peaks_df[, mz_tolerance, ...])

Cluster peaks by m/z and return an aligned feature matrix.

collect_peak_properties_batch(files[, ...])

Collect peak properties from a batch of ToF-SIMS files.

detect_peaks_cwt_with_area(mz_values, ...[, ...])

Peak detection using Continuous Wavelet Transform (CWT) for ToF-SIMS spectra.

detect_peaks_with_area(mz_values, ...[, ...])

Fast peak detection in ToF-SIMS or similar spectra, including peak area.

detect_peaks_with_area_v2(mz, intens, ...[, ...])

gaussian(x, amp, cen, sigma)

Gaussian lineshape function.

handle_missing_values(mz_values, intensities)

Fill missing intensity values using the requested strategy.

lorentzian(x, A, x0, gamma)

Lorentzian lineshape function.

robust_noise_estimation(intensities[, ...])

Robust noise estimation by excluding regions near detected peaks.

robust_noise_estimation_mz(mz_values, ...)

Estimate noise from a user-specified m/z baseline region.

robust_noise_estimation_mz_dependent(...[, ...])

Estimate local noise as piecewise-constant m/z bins interpolated over the spectrum.

robust_peak_detection(mz_values, ...[, ...])

Fast peak detection in ToF-SIMS or similar spectra, including peak area.

two_gaussians(x, amp1, cen1, wid1, amp2, ...)

voigt(x, A, x0, sigma, gamma)

Voigt profile (convolution of Gaussian and Lorentzian).

Classes

PeakAlignIntensityArea([mz_tolerance, ...])

Process normalized ToF-SIMS spectra from CSV files, detect peaks, align them across samples, and calculate both intensity and area tables for each aligned m/z value.

mioXpektron.detection.detection.handle_missing_values(mz_values, intensities, method='interpolation')[source]

Fill missing intensity values using the requested strategy.
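As a rough illustration of what an interpolation strategy can look like, the sketch below fills NaN intensities by linear interpolation over m/z. It assumes that missing values are encoded as NaN and that method='interpolation' means linear interpolation; the documentation above confirms neither. fill_missing_linear is a hypothetical helper, not part of mioXpektron.

```python
import numpy as np

def fill_missing_linear(mz_values, intensities):
    """Hypothetical helper: linearly interpolate NaN intensities over m/z."""
    mz = np.asarray(mz_values, dtype=float)
    y = np.asarray(intensities, dtype=float).copy()
    missing = np.isnan(y)
    if missing.any() and (~missing).any():
        # np.interp holds the nearest valid value flat at the spectrum edges.
        y[missing] = np.interp(mz[missing], mz[~missing], y[~missing])
    return y

mz = np.array([10.0, 10.1, 10.2, 10.3, 10.4])
intens = np.array([5.0, np.nan, 9.0, np.nan, np.nan])
filled = fill_missing_linear(mz, intens)
```

Interior gaps are bridged linearly, while trailing gaps repeat the last valid value. Whether the library treats edges the same way is not stated.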

mioXpektron.detection.detection.robust_noise_estimation(intensities, peak_indices=None, window=2, peak_height=None, peak_prominence=None, min_peak_width=1, max_peak_width=75)[source]

Robust noise estimation by excluding regions near detected peaks.

Parameters:
  • intensities (np.ndarray) – Denoised, baseline-corrected intensities.

  • peak_indices (np.ndarray or None) – Indices of detected peaks. If None, function will detect peaks automatically.

  • window (int) – Extra number of data points to exclude on each side of the detected peak width. The measured peak extent is always masked first.

  • peak_height (float or None) – Minimum height for peak detection. If None, defaults to the median of positive intensities (data-adaptive).

  • peak_prominence (float or None) – Minimum prominence for peak detection. If None, defaults to 3x the MAD of positive intensities (data-adaptive).

Returns:

  • median_intensity (float) – Median intensity of noise region.

  • robust_std (float) – Robust standard deviation (Gaussian-equivalent MAD) of noise region.
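The idea of excluding peak regions before measuring noise can be sketched as follows. This is a simplified stand-in, not the library code: it masks a fixed half_width around each supplied peak index (the real function masks the measured peak extent plus window), and it uses the usual 1.4826 factor to turn the MAD into a Gaussian-equivalent standard deviation. noise_outside_peaks and half_width are hypothetical names.

```python
import numpy as np

def noise_outside_peaks(intensities, peak_indices, half_width=5):
    """Hypothetical sketch: median and MAD-based std of points away from peaks."""
    y = np.asarray(intensities, dtype=float)
    keep = np.ones(y.size, dtype=bool)
    for p in peak_indices:
        # Mask a fixed number of points on each side of every peak apex.
        keep[max(p - half_width, 0):p + half_width + 1] = False
    noise = y[keep]
    median = np.median(noise)
    mad = np.median(np.abs(noise - median))
    # 1.4826 rescales the MAD to the standard deviation of a Gaussian.
    return median, 1.4826 * mad

rng = np.random.default_rng(0)
signal = rng.normal(10.0, 2.0, size=2000)   # synthetic baseline noise
signal[500] += 400.0                        # two artificial peaks
signal[1200] += 250.0
median, robust_std = noise_outside_peaks(signal, [500, 1200])
```

On this synthetic input the two estimates should land close to the simulated baseline level (10) and noise level (2), largely unaffected by the injected peaks.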

mioXpektron.detection.detection.robust_noise_estimation_mz(mz_values, intensities, min_mz, max_mz)[source]

Estimate noise from a user-specified m/z baseline region.

Parameters:
  • mz_values (np.ndarray) – m/z axis.

  • intensities (np.ndarray) – Corresponding intensity values.

  • min_mz (float) – Lower bound of the m/z window that defines the baseline region.

  • max_mz (float) – Upper bound of the m/z window that defines the baseline region.

Returns:

  • median_intensity (float) – Median intensity of the baseline region.

  • robust_std (float) – Robust standard deviation (MAD-scaled) of the baseline region.

mioXpektron.detection.detection.robust_noise_estimation_mz_dependent(mz_values, intensities, peak_indices=None, window=2, peak_height=None, peak_prominence=None, min_peak_width=1, max_peak_width=75, n_bins=20, min_points_per_bin=25)[source]

Estimate local noise as piecewise-constant m/z bins interpolated over the spectrum.

Returns:

  • median_profile (np.ndarray) – Per-point local median noise estimate.

  • std_profile (np.ndarray) – Per-point local Gaussian-equivalent robust std estimate.
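One way to read "piecewise-constant m/z bins interpolated over the spectrum" is sketched below: compute a robust std per m/z bin, then interpolate the per-bin values between bin centres. This is an assumption about the approach, not the library implementation. It skips the peak-masking step, returns only the std profile, and falls back to a global estimate for sparse bins. binned_noise_profile is a hypothetical name.

```python
import numpy as np

def binned_noise_profile(mz_values, intensities, n_bins=20, min_points_per_bin=25):
    """Hypothetical sketch: per-bin robust std, interpolated onto the m/z axis."""
    mz = np.asarray(mz_values, dtype=float)
    y = np.asarray(intensities, dtype=float)
    edges = np.linspace(mz.min(), mz.max(), n_bins + 1)
    # Global estimate, used as a fallback for sparsely populated bins.
    global_med = np.median(y)
    global_std = 1.4826 * np.median(np.abs(y - global_med))
    centers, stds = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        chunk = y[(mz >= lo) & (mz <= hi)]
        centers.append(0.5 * (lo + hi))
        if chunk.size >= min_points_per_bin:
            med = np.median(chunk)
            stds.append(1.4826 * np.median(np.abs(chunk - med)))
        else:
            stds.append(global_std)
    # Interpolate between bin centres; np.interp holds the end values flat.
    return np.interp(mz, centers, stds)

rng = np.random.default_rng(1)
mz = np.linspace(1.0, 500.0, 5000)
true_sigma = 1.0 + mz / 100.0            # synthetic noise that grows with m/z
intens = rng.normal(0.0, true_sigma)
std_profile = binned_noise_profile(mz, intens)
```

The returned array has one value per data point, so it can be used directly as a per-point threshold scale.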

mioXpektron.detection.detection.detect_peaks_with_area(mz_values, intensities, sample_name, group, min_intensity=1, min_snr=3, min_distance=2, window_size=10, peak_height=50, prominence=10, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, noise_model='global', noise_bins=20, noise_min_points=25, verbose=False)[source]

Fast peak detection in ToF-SIMS or similar spectra, including peak area.

Returns:

  • peak_indices (np.ndarray) – Indices of detected peaks.

  • peak_properties (dict) – Contains: mz, intensities, widths, prominences, heights, areas.
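The parameter names in the signature (prominence, min_distance, width_rel_height) resemble those of scipy.signal.find_peaks, so a minimal sketch along those lines is given here. It is not the library implementation: the threshold rule (median plus min_snr times a MAD-based noise estimate) and the area rule (trapezoid over the span where the width was measured) are assumptions, and detect_peaks_sketch is a hypothetical name.

```python
import numpy as np
from scipy.signal import find_peaks

def detect_peaks_sketch(mz, intens, min_snr=3, min_distance=2,
                        prominence=10, rel_height=0.5):
    """Hypothetical sketch: SNR-thresholded local maxima with trapezoidal areas."""
    mz = np.asarray(mz, dtype=float)
    y = np.asarray(intens, dtype=float)
    med = np.median(y)
    noise = 1.4826 * np.median(np.abs(y - med))
    idx, props = find_peaks(y, height=med + min_snr * noise,
                            distance=min_distance, prominence=prominence,
                            width=1, rel_height=rel_height)
    areas = []
    for left, right in zip(props["left_ips"], props["right_ips"]):
        lo, hi = int(np.floor(left)), int(np.ceil(right)) + 1
        seg_x, seg_y = mz[lo:hi], y[lo:hi]
        # Trapezoidal rule over the span where the width was measured.
        areas.append(float(np.sum(0.5 * (seg_y[1:] + seg_y[:-1]) * np.diff(seg_x))))
    return idx, {"mz": mz[idx], "heights": props["peak_heights"],
                 "widths": props["widths"], "prominences": props["prominences"],
                 "areas": np.array(areas)}

# Noise-free synthetic spectrum with two Gaussian peaks at m/z 23 and 27.
mz = np.linspace(20.0, 30.0, 2001)
intens = (200.0 * np.exp(-0.5 * ((mz - 23.0) / 0.02) ** 2)
          + 120.0 * np.exp(-0.5 * ((mz - 27.0) / 0.02) ** 2))
peak_idx, peak_props = detect_peaks_sketch(mz, intens)
```

Widths reported by find_peaks are in data points, which matches the "in data points" units used for min_peak_width and max_peak_width elsewhere on this page.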

mioXpektron.detection.detection.detect_peaks_with_area_v2(mz, intens, sample_name, group, *, min_intensity=1, min_snr=3, min_distance=2, prominence=10, min_peak_width=1, max_peak_width=75, rel_height=0.5, noise_model='global', noise_bins=20, noise_min_points=25, noise_window=10, verbose=False)[source]

mioXpektron.detection.detection.detect_peaks_cwt_with_area(mz_values, intensities, sample_name, group, min_intensity=1, min_snr=3, min_distance=2, window_size=10, peak_height=50, prominence=10, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, noise_model='global', noise_bins=20, noise_min_points=25, verbose=False)[source]

Peak detection using Continuous Wavelet Transform (CWT) for ToF-SIMS spectra.

Returns:

  • peak_properties (pd.DataFrame) – Contains: mz, intensities, widths (approx), amplitudes, areas.

mioXpektron.detection.detection.gaussian(x, amp, cen, sigma)[source]

Gaussian lineshape function.

Parameters:
  • amp – Peak height at cen.

  • cen – Peak centre (mean).

  • sigma – Standard deviation of the Gaussian.

Returns:

  • The Gaussian function evaluated at x.

mioXpektron.detection.detection.lorentzian(x, A, x0, gamma)[source]

Lorentzian lineshape function.

Parameters:
  • A – Area under the peak (scaling factor).

  • x0 – Peak centre.

  • gamma – Half-width at half-maximum (HWHM).

mioXpektron.detection.detection.voigt(x, A, x0, sigma, gamma)[source]

Voigt profile (convolution of Gaussian and Lorentzian).

Parameters:
  • A – Area under the peak (scaling factor).

  • x0 – Peak centre.

  • sigma – Standard deviation of the Gaussian component.

  • gamma – HWHM of the Lorentzian component.
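The three lineshapes use different scaling conventions: the Gaussian's amp is a peak height, whereas A in the Lorentzian and Voigt forms is an area. The sketch below writes out forms consistent with those descriptions so the difference is concrete. They are reconstructions from the parameter descriptions, not the library source, and the _sketch names are hypothetical. The Voigt form relies on scipy.special.voigt_profile, which is normalized to unit area.

```python
import numpy as np
from scipy.special import voigt_profile

def gaussian_sketch(x, amp, cen, sigma):
    # amp is the peak height, so the area is amp * sigma * sqrt(2 * pi).
    return amp * np.exp(-0.5 * ((x - cen) / sigma) ** 2)

def lorentzian_sketch(x, A, x0, gamma):
    # A is the area, so the peak height is A / (pi * gamma).
    return (A / np.pi) * gamma / ((x - x0) ** 2 + gamma ** 2)

def voigt_sketch(x, A, x0, sigma, gamma):
    # voigt_profile integrates to 1, so A is again the area.
    return A * voigt_profile(x - x0, sigma, gamma)

x0, sigma, gamma = 100.0, 0.02, 0.01
# Half the Gaussian FWHM is sigma * sqrt(2 ln 2); the value there is amp / 2.
g_half = gaussian_sketch(x0 + sigma * np.sqrt(2.0 * np.log(2.0)), 50.0, x0, sigma)
# One HWHM away from the centre the Lorentzian falls to half its peak height.
l_peak = lorentzian_sketch(x0, 1.0, x0, gamma)
l_half = lorentzian_sketch(x0 + gamma, 1.0, x0, gamma)
# With gamma = 0 the Voigt profile reduces to a unit-area Gaussian.
v_as_gauss = voigt_sketch(x0, 1.0, x0, sigma, 0.0)
```

When comparing fitted parameters across methods, the Gaussian amp therefore needs to be multiplied by sigma * sqrt(2π) before it is comparable to a Lorentzian or Voigt A.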

mioXpektron.detection.detection.two_gaussians(x, amp1, cen1, wid1, amp2, cen2, wid2)[source]

mioXpektron.detection.detection.robust_peak_detection(mz_values, intensities, sample_name, group, method='Gaussian', min_intensity=1, min_snr=3, min_distance=2, window_size=10, peak_height=50, prominence=10, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, distance_threshold=0.1, combined=False, use_cwt=False, noise_model='global', noise_bins=20, noise_min_points=25, deconvolution_min_bic_delta=10.0, deconvolution_overlap_factor=0.75, deconvolution_replace_singles=True, verbose=False)[source]

Fast peak detection in ToF-SIMS or similar spectra, including peak area.

Returns:

  • peak_indices (np.ndarray) – Indices of detected peaks.

  • peak_properties (dict) – Contains: mz, intensities, widths, prominences, heights, areas.

Notes

Overlapping-peak deconvolution requires both geometric overlap and a BIC improvement over a single-Gaussian window fit. Fitted component widths must also remain within the user-specified peak-width bounds.
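The BIC criterion in the note can be illustrated with a small, self-contained comparison of a one-Gaussian and a two-Gaussian fit on synthetic data. This is a sketch under stated assumptions, not the library's code path: it uses the least-squares form BIC = n ln(RSS/n) + k ln(n), takes starting values by hand, and leaves out the geometric-overlap and width-bound checks. All function names are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def one_gauss(x, amp, cen, sig):
    return amp * np.exp(-0.5 * ((x - cen) / sig) ** 2)

def two_gauss(x, a1, c1, s1, a2, c2, s2):
    return one_gauss(x, a1, c1, s1) + one_gauss(x, a2, c2, s2)

def bic(y, y_fit, n_params):
    # Gaussian-error BIC up to an additive constant: n*ln(RSS/n) + k*ln(n).
    n = y.size
    rss = float(np.sum((y - y_fit) ** 2))
    return n * np.log(rss / n) + n_params * np.log(n)

# Synthetic window with two overlapping peaks plus unit-variance noise.
rng = np.random.default_rng(2)
x = np.linspace(99.0, 101.0, 200)
y = two_gauss(x, 100.0, 99.9, 0.05, 80.0, 100.1, 0.05) + rng.normal(0.0, 1.0, x.size)

p1, _ = curve_fit(one_gauss, x, y, p0=[100.0, 100.0, 0.1])
p2, _ = curve_fit(two_gauss, x, y, p0=[90.0, 99.88, 0.06, 90.0, 100.12, 0.06])

# Positive delta means the two-peak model is preferred despite its extra parameters.
delta_bic = bic(y, one_gauss(x, *p1), 3) - bic(y, two_gauss(x, *p2), 6)
accept_two_peaks = delta_bic >= 10.0   # threshold taken from deconvolution_min_bic_delta
```

Raising deconvolution_min_bic_delta makes the split harder to accept, which guards against fitting two components to what is really one noisy peak. Whether the library compares with >= or > is not stated above.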

mioXpektron.detection.detection.collect_peak_properties_batch(files, mz_min=None, mz_max=None, baseline_method='airpls', noise_method='wavelet', missing_value_method='interpolation', normalization_target=100000000.0, method='Gaussian', min_intensity=1, min_snr=3, min_distance=5, window_size=10, peak_height=50, prominence=50, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, distance_threshold=0.01, combined=False, noise_model='global', noise_bins=20, noise_min_points=25, deconvolution_min_bic_delta=10.0, deconvolution_overlap_factor=0.75, deconvolution_replace_singles=True)[source]

Collect peak properties from a batch of ToF-SIMS files.

Parameters:
  • files (list of str) – List of file paths to process.

  • mz_min (float or None) – Lower bound of the m/z window for data import (if supported).

  • mz_max (float or None) – Upper bound of the m/z window for data import (if supported).

  • baseline_method (str) – Method for baseline correction.

  • noise_method (str) – Noise filtering method.

  • missing_value_method (str) – Method for handling missing values.

  • normalization_target (float) – Target TIC normalization value.

  • min_snr (int or float) – Minimum signal-to-noise ratio for peak detection.

  • min_distance (int) – Minimum distance between peaks (in data points).

  • prominence (int or float or None) – Minimum peak prominence for detection.

  • width_rel_height (float) – Relative height for width calculation (e.g., 0.5 = FWHM).

  • noise_model ({"global", "mz_binned"}) – Noise model used to derive peak thresholds.

  • noise_bins (int) – Number of m/z bins for noise_model="mz_binned".

  • noise_min_points (int) – Minimum positive noise points per bin before using local estimates.

Returns:

peaks_df – DataFrame with all peak properties for all files.

Return type:

pd.DataFrame
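For orientation, normalization_target can be read as the value the total ion count (TIC) is rescaled to. The snippet below is a minimal sketch of that reading, assuming TIC is the plain sum of intensities. normalize_tic is a hypothetical helper, not part of mioXpektron; its default matches the documented default of 100000000.0.

```python
import numpy as np

def normalize_tic(intensities, target=1e8):
    """Hypothetical helper: rescale so the summed intensity equals `target`."""
    y = np.asarray(intensities, dtype=float)
    total = y.sum()
    if total <= 0:
        raise ValueError("total ion count must be positive")
    return y * (target / total)

scaled = normalize_tic([2.0, 3.0, 5.0])
```

Because absolute thresholds such as peak_height and prominence are applied after normalization, their useful ranges depend on the chosen target.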

mioXpektron.detection.detection.align_peaks(peaks_df, mz_tolerance=0.2, mz_rounding_precision=1, output='intensity')[source]

Cluster peaks by m/z and return an aligned feature matrix.

Uses a greedy sorted-bin algorithm that guarantees every aligned bin spans at most mz_tolerance in m/z.
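A greedy sorted-bin pass of this kind can be sketched as follows: sort the m/z values, open a bin at the first one, and start a new bin whenever a value lies more than mz_tolerance above the current bin's first member. This is an illustration of the stated guarantee rather than the library's code, and greedy_bins is a hypothetical name.

```python
import numpy as np

def greedy_bins(mz_values, mz_tolerance=0.2):
    """Hypothetical sketch: label each m/z with a bin no wider than mz_tolerance."""
    mz = np.asarray(mz_values, dtype=float)
    order = np.argsort(mz, kind="stable")
    labels = np.empty(mz.size, dtype=int)
    current, start = -1, None
    for i in order:
        # Open a new bin when the peak lies too far from the bin's first member.
        if start is None or mz[i] - start > mz_tolerance:
            current += 1
            start = mz[i]
        labels[i] = current
    return labels

labels = greedy_bins([100.30, 100.00, 250.00, 100.15, 100.05, 100.32])
# labels -> [1, 0, 2, 0, 0, 1]
```

Measuring the distance from the bin's first member, not from the previous peak, is what bounds the bin width. Chaining on nearest neighbours could let a bin drift well past the tolerance.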

class mioXpektron.detection.detection.PeakAlignIntensityArea(mz_tolerance=0.2, mz_rounding_precision=1, min_intensity=1, min_snr=3, min_distance=2, peak_height=50, prominence=10, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, noise_model='global', noise_bins=20, noise_min_points=25, method=None, deconvolution_min_bic_delta=10.0, deconvolution_overlap_factor=0.75, deconvolution_replace_singles=True, output_dir=None, verbose=False, group_patterns=None, group_fn=None)[source]

Bases: object

Process normalized ToF-SIMS spectra from CSV files, detect peaks, align them across samples, and calculate both intensity and area tables for each aligned m/z value.

Parameters:
  • mz_tolerance (float, optional (default=0.2)) – Maximum distance (in m/z units) for clustering peaks across samples.

  • mz_rounding_precision (int, optional (default=1)) – Number of decimal places for rounding aligned m/z values in output tables.

  • min_intensity (float, optional (default=1)) – Minimum intensity threshold for considering data points.

  • min_snr (float, optional (default=3)) – Minimum signal-to-noise ratio for peak detection.

  • min_distance (int, optional (default=2)) – Minimum distance (in data points) between peaks.

  • peak_height (float, optional (default=50)) – Minimum peak height for initial peak detection.

  • prominence (float, optional (default=10)) – Minimum prominence for peak detection.

  • min_peak_width (int, optional (default=1)) – Minimum peak width (in data points).

  • max_peak_width (int, optional (default=75)) – Maximum peak width (in data points).

  • width_rel_height (float, optional (default=0.5)) – Relative height for peak width calculation (0.5 = FWHM).

  • noise_model ({"global", "mz_binned"}, optional (default="global")) – Noise model used for threshold estimation.

  • noise_bins (int, optional (default=20)) – Number of m/z bins when using noise_model="mz_binned".

  • noise_min_points (int, optional (default=25)) – Minimum positive noise points per bin for the local model.

  • method (str or None, optional (default=None)) – Peak-detection / fitting method. None uses simple local-max detection (detect_peaks_with_area_v2), 'cwt' uses CWT detection, and 'Gaussian' / 'Lorentzian' / 'Voigt' use curve-fit detection via robust_peak_detection.

  • deconvolution_min_bic_delta (float, optional (default=10.0)) – Minimum BIC improvement required before accepting a two-Gaussian deconvolution over a single-peak fit.

  • deconvolution_overlap_factor (float, optional (default=0.75)) – Scale factor applied to the mean measured peak width when deriving the adaptive deconvolution spacing gate.

  • deconvolution_replace_singles (bool, optional (default=True)) – If True, replace overlapping single-peak fits with the accepted deconvoluted components in the output table.

  • output_dir (str or None, optional) – Directory to save output CSV files. If None, files are not saved.

  • verbose (bool, optional (default=False)) – If True, print progress information.

Examples

>>> from mioXpektron.detection import PeakAlignIntensityArea
>>> import glob
>>>
>>> # Get all normalized spectra
>>> csv_files = glob.glob('output_files/normalized_spectra/*.csv')
>>>
>>> # Create analyzer instance
>>> analyzer = PeakAlignIntensityArea(
...     mz_tolerance=0.1,
...     min_snr=3,
...     output_dir='output_files/peak_analysis'
... )
>>>
>>> # Process with m/z cutoff
>>> intensity_table, area_table, peaks_df = analyzer.run(
...     csv_files,
...     mz_min=50,
...     mz_max=500
... )
>>>
>>> print(f"Detected {len(peaks_df)} peaks across {len(csv_files)} samples")
>>> print(f"Aligned to {intensity_table.shape[1]} unique m/z values")

__init__(mz_tolerance=0.2, mz_rounding_precision=1, min_intensity=1, min_snr=3, min_distance=2, peak_height=50, prominence=10, min_peak_width=1, max_peak_width=75, width_rel_height=0.5, noise_model='global', noise_bins=20, noise_min_points=25, method=None, deconvolution_min_bic_delta=10.0, deconvolution_overlap_factor=0.75, deconvolution_replace_singles=True, output_dir=None, verbose=False, group_patterns=None, group_fn=None)[source]

Initialize the PeakAlignIntensityArea analyzer. Every parameter is optional and falls back to the default shown in the signature.

run(csv_files, mz_min=None, mz_max=None)[source]

Process CSV files and perform peak detection, alignment, and quantification.

Parameters:
  • csv_files (list of str) – List of paths to normalized spectrum CSV files. Each CSV should have the columns ‘channel’, ‘mz’, and ‘intensity’.

  • mz_min (float or None, optional) – Minimum m/z value to consider for peak detection. If None, use full range.

  • mz_max (float or None, optional) – Maximum m/z value to consider for peak detection. If None, use full range.

Returns:

  • intensity_table (pd.DataFrame) – DataFrame with samples as rows and aligned m/z values as columns, containing peak intensities (amplitudes). Missing peaks are filled with 0.

  • area_table (pd.DataFrame) – DataFrame with samples as rows and aligned m/z values as columns, containing peak areas. Missing peaks are filled with 0.

  • peaks_df (pd.DataFrame) – DataFrame containing all detected peaks with their properties before alignment.
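The shape of the two returned tables can be pictured with a small pandas pivot. The column names below (sample, mz_aligned, intensity, area) and the use of max to resolve duplicate peaks in a bin are assumptions made for illustration; the actual columns of peaks_df are not listed above.

```python
import pandas as pd

# Hypothetical long-format peak list; the column names are illustrative only.
peaks = pd.DataFrame({
    "sample": ["s1", "s1", "s2"],
    "mz_aligned": [100.1, 250.0, 250.0],
    "intensity": [500.0, 120.0, 90.0],
    "area": [40.0, 9.5, 7.0],
})

def wide_table(df, value):
    # One row per sample, one column per aligned m/z, 0 where no peak was found.
    return df.pivot_table(index="sample", columns="mz_aligned",
                          values=value, aggfunc="max", fill_value=0)

intensity_table = wide_table(peaks, "intensity")
area_table = wide_table(peaks, "area")
```

Here sample s2 has no peak at m/z 100.1, so that cell is 0 in both tables, matching the "missing peaks are filled with 0" behaviour described for run.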