Pipeline Reference
The mioXpektron pipeline provides end-to-end batch processing of ToF-SIMS spectra. It chains recalibration, denoising, baseline correction, normalization, peak detection, and cross-sample alignment into a single call.
Pipeline Steps
The pipeline executes these steps in order:
1. Recalibration (optional): convert channel numbers to m/z values using reference masses and a TOF model.
2. Denoising: reduce noise with wavelet, Gaussian, median, or Savitzky-Golay filters.
3. Baseline correction: remove broad background with AirPLS, AsLS, or other pybaselines methods.
4. TIC normalization: scale spectra to a common Total Ion Current.
5. Peak detection: find peaks using local maximum or CWT algorithms, with automatic noise estimation.
6. Alignment: align detected peaks across samples by m/z tolerance, producing unified intensity and area matrices.
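As a rough illustration of the TIC normalization step, the sketch below rescales one spectrum so that its summed intensity equals a target value. The function name `normalize_tic` is hypothetical and not part of mioXpektron; the library's own implementation may differ in detail.

```python
def normalize_tic(intensities, target=1e6):
    """Scale intensities so that their sum (the TIC) equals `target`.

    Hypothetical helper for illustration; not the mioXpektron implementation.
    """
    tic = sum(intensities)
    if tic <= 0:
        raise ValueError("TIC must be positive to normalize")
    scale = target / tic
    return [value * scale for value in intensities]


# Toy spectrum with a raw TIC of 100.
spectrum = [10.0, 30.0, 60.0]
normalized = normalize_tic(spectrum, target=1e6)
```

After scaling, relative peak heights are unchanged and only the overall scale is shared across spectra, which is what makes intensities comparable between samples.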
Configuration
from mioXpektron import PipelineConfig

config = PipelineConfig(
    # Recalibration
    use_recalibration=True,
    reference_masses=[1.008, 22.99, 38.96, 58.07],
    output_folder_calibrated="calibrated_spectra",
    # Denoising
    denoise_method="wavelet",  # wavelet | gaussian | median | savitzky_golay | none
    denoise_params=None,       # dict of method-specific keyword arguments
    # Baseline
    baseline_method="airpls",
    baseline_params=None,
    clip_negative_after_baseline=True,
    # Normalization
    normalization_target=1e6,
    # Peak alignment
    mz_min=None,               # optional m/z range filter
    mz_max=None,
    mz_tolerance=0.2,          # Da tolerance for cross-sample alignment
    mz_rounding_precision=1,
    # Parallelism
    max_workers=None,          # None = use all available cores
    # Adaptive parameterization (opt-in)
    auto_tune=False,           # True = derive mz_tolerance and normalization_target from data
)
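To give a feel for what `mz_tolerance` and `mz_rounding_precision` control, here is a minimal sketch of tolerance-based peak grouping. It assumes a simple greedy scheme (a peak joins the current group if it lies within the tolerance of that group's first member) and labels each group by its rounded mean. This is an illustrative assumption, not a description of the algorithm mioXpektron actually uses, and `group_peaks` is a hypothetical name.

```python
def group_peaks(mz_values, tolerance=0.2, precision=1):
    """Greedily group sorted m/z values and label each group by its rounded mean.

    A value joins the current group when it is within `tolerance` of the
    group's first (smallest) member; otherwise it starts a new group.
    Hypothetical helper; not the mioXpektron alignment routine.
    """
    groups = []
    for mz in sorted(mz_values):
        if groups and mz - groups[-1][0] <= tolerance:
            groups[-1].append(mz)
        else:
            groups.append([mz])
    return [round(sum(g) / len(g), precision) for g in groups]


# Peak positions pooled from several hypothetical samples.
peaks = [22.98, 23.01, 23.05, 38.95, 38.99, 58.07]
labels = group_peaks(peaks, tolerance=0.2, precision=1)
# labels -> [23.0, 39.0, 58.1]
```

A larger tolerance merges more peaks into one row of the aligned matrices, while a smaller one risks splitting the same ion across rows.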
Running the Pipeline
from mioXpektron import run_pipeline
files = ["sample_01.txt", "sample_02.txt", "sample_03.txt"]
intensity_df, area_df = run_pipeline(files, config=config)
With calibration data:
calib_channels = {
    "sample_01.txt": [100, 500, 1000, 2000],
    "sample_02.txt": [101, 502, 998, 2001],
}
intensity_df, area_df = run_pipeline(
    files,
    calib_channels_dict=calib_channels,
    config=config,
)
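In the example above each channel list has four entries, matching the four values in `reference_masses`, and `sample_03.txt` has no entry at all. How the pipeline treats files without calibration channels is not stated here, so a small pre-flight check can make such gaps explicit before a long batch run. The helper below is hypothetical, and its rule of one channel per reference mass is an assumption inferred from the example.

```python
def check_calibration_inputs(files, calib_channels, reference_masses):
    """Return (missing, mismatched) lists of file names.

    Assumption: each file needs exactly one channel per reference mass.
    Hypothetical helper; not part of mioXpektron.
    """
    missing = [f for f in files if f not in calib_channels]
    mismatched = [
        f for f in files
        if f in calib_channels
        and len(calib_channels[f]) != len(reference_masses)
    ]
    return missing, mismatched


files = ["sample_01.txt", "sample_02.txt", "sample_03.txt"]
reference_masses = [1.008, 22.99, 38.96, 58.07]
calib_channels = {
    "sample_01.txt": [100, 500, 1000, 2000],
    "sample_02.txt": [101, 502, 998, 2001],
}
missing, mismatched = check_calibration_inputs(
    files, calib_channels, reference_masses
)
```

Here `missing` flags `sample_03.txt`, which is a prompt to confirm whether that file should be recalibrated or left as is.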
Output Format
The pipeline returns two pandas.DataFrame objects:
- intensity_df
Rows = m/z values (aligned across samples), columns = sample names. Values are peak intensities after processing.
- area_df
Same structure as intensity_df, but values are integrated peak areas.
Both DataFrames share the same m/z index, making them ready for downstream statistical analysis.
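To make the layout concrete, the sketch below builds two small synthetic DataFrames with the structure described above (m/z values as the index, sample names as columns) and keeps only the m/z rows detected in every sample. All values are invented, and representing an undetected peak as NaN is an assumption; check how the pipeline fills missing peaks before relying on this filter.

```python
import pandas as pd

mz_index = pd.Index([23.0, 39.0, 58.1], name="mz")
samples = ["sample_01", "sample_02", "sample_03"]

# Synthetic stand-ins for the pipeline outputs.
intensity_df = pd.DataFrame(
    [[1200.0, 1100.0, 1300.0],
     [800.0, float("nan"), 750.0],   # assumed: NaN marks an undetected peak
     [400.0, 420.0, 390.0]],
    index=mz_index,
    columns=samples,
)
area_df = intensity_df * 0.5  # placeholder areas on the same index

# Both matrices share one index, so a mask computed on one applies to the other.
detected_everywhere = intensity_df.notna().all(axis=1)
intensity_common = intensity_df[detected_everywhere]
area_common = area_df.loc[intensity_common.index]
```

Because the index is shared, row selections stay consistent between the intensity and area matrices without any re-alignment step.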
Adaptive Parameterization
Set auto_tune=True to let the pipeline derive mz_tolerance and
normalization_target from the data instead of using hardcoded defaults:
config = PipelineConfig(auto_tune=True)
intensity_df, area_df = run_pipeline(files, config=config)
When auto_tune is active the pipeline:
- Estimates mz_tolerance from median m/z spacing across a pilot sample.
- Estimates normalization_target from median raw TIC across the batch.
All other parameters keep their defaults and can still be overridden manually. See Adaptive Parameterization for details on each estimator.
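As a rough picture of the two quantities named above, the following sketch computes a median m/z spacing for one pilot spectrum and a median raw TIC over a batch. The function names are hypothetical, and how the pipeline turns the spacing into a tolerance (for example, any multiplier or clamping) is not specified here, so the sketch returns the raw medians only.

```python
from statistics import median


def median_mz_spacing(mz_axis):
    """Median difference between consecutive m/z points of one spectrum."""
    return median(b - a for a, b in zip(mz_axis, mz_axis[1:]))


def median_tic(spectra):
    """Median of the summed raw intensities across a batch of spectra."""
    return median(sum(intensities) for intensities in spectra)


# Toy data: one pilot m/z axis and three raw intensity vectors.
pilot_mz = [1.0, 1.5, 2.0, 3.0, 3.5]
batch = [[10.0, 20.0], [5.0, 5.0], [30.0, 30.0]]

spacing = median_mz_spacing(pilot_mz)  # consecutive gaps: 0.5, 0.5, 1.0, 0.5
target = median_tic(batch)             # per-spectrum TICs: 30, 10, 60
```

Medians are a natural choice for this kind of estimate because a single unusually sparse region or one very bright sample does not dominate the result.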
Reference Masses
The package provides a canonical reference mass list, DEFAULT_REFERENCE_MASSES (18 ions), which is used as the fallback when reference_masses is not provided. Import it directly:
from mioXpektron import DEFAULT_REFERENCE_MASSES
PipelineConfig Reference
- class mioXpektron.PipelineConfig(use_recalibration=True, reference_masses=None, output_folder_calibrated='calibrated_spectra', denoise_method='wavelet', denoise_params=None, baseline_method='airpls', baseline_params=None, clip_negative_after_baseline=True, normalization_target=1000000.0, mz_min=None, mz_max=None, mz_tolerance=0.2, mz_rounding_precision=1, max_workers=None, auto_tune=False)
Bases: object
High-level pipeline configuration for batch ToF-SIMS processing.
- Parameters:
use_recalibration (bool)
output_folder_calibrated (str)
denoise_method (str)
denoise_params (Dict | None)
baseline_method (str)
baseline_params (Dict | None)
clip_negative_after_baseline (bool)
normalization_target (float)
mz_min (float | None)
mz_max (float | None)
mz_tolerance (float)
mz_rounding_precision (int)
max_workers (int | None)
auto_tune (bool)