Quick Start
This guide walks through a typical mioXpektron workflow: loading data, processing it step by step, and extracting peaks.
Importing Data
mioXpektron reads tab- or comma-separated ToF-SIMS spectra with columns for m/z values and intensities:
import mioXpektron as mx
# Load a single spectrum
mz, intensity, sample_name, group = mx.import_data("path/to/spectrum.txt")
The loader automatically detects separators, skips comment lines, and infers sample names from the filename.
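To make the loader's behaviour concrete, the following is a minimal standard-library sketch of separator detection, comment skipping, and filename-based naming. It is not mioXpektron's implementation: the helper names `read_two_column_spectrum` and `sample_name_from_path`, the `#` comment prefix, and the header handling are all assumptions made for illustration.

```python
import csv
import io
from pathlib import Path


def read_two_column_spectrum(text, comment_prefix="#"):
    """Parse tab- or comma-separated text into (mz, intensity) lists."""
    # Drop blank lines and comment lines before looking for the separator.
    rows = [ln for ln in text.splitlines()
            if ln.strip() and not ln.lstrip().startswith(comment_prefix)]
    # Use a tab if the first remaining row contains one, otherwise a comma.
    delimiter = "\t" if "\t" in rows[0] else ","
    mz, intensity = [], []
    for fields in csv.reader(io.StringIO("\n".join(rows)), delimiter=delimiter):
        try:
            x, y = float(fields[0]), float(fields[1])
        except (ValueError, IndexError):
            continue  # skip a header row or a malformed line
        mz.append(x)
        intensity.append(y)
    return mz, intensity


def sample_name_from_path(path):
    """Use the filename without its extension as the sample name."""
    return Path(path).stem


demo = "# positive polarity\nmz\tintensity\n1.0073\t120\n27.0229\t45\n"
mz, intensity = read_two_column_spectrum(demo)
name = sample_name_from_path("path/to/spectrum.txt")
```

If a file fails to load, checking it against these three steps (separator, comment lines, header row) is a reasonable first diagnostic.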
Step-by-Step Processing
Denoising
Reduce noise while preserving peak shapes:
denoised = mx.noise_filtering(intensity, method="wavelet")
Available methods: "wavelet", "gaussian", "median",
"savitzky_golay", "none".
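As an illustration of what the simplest of these filters does, here is a pure-Python sliding median. It is a sketch of the general technique only; the function name `median_filter` and its edge handling (shrinking windows) are assumptions and may differ from what `mx.noise_filtering` does internally.

```python
from statistics import median


def median_filter(values, window=5):
    """Replace each point with the median of a centred window (edges shrink)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(values)
    return [median(values[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


# A one-point spike at index 2 is removed; the broader feature near the end survives.
noisy = [0, 0, 9, 0, 0, 1, 5, 6, 5, 1]
smoothed = median_filter(noisy, window=3)
```

A median filter removes isolated spikes well but flattens peak apexes slightly, which is one reason to compare it against the wavelet or Savitzky-Golay options on real spectra.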
Baseline Correction
Remove broad background signals:
corrected = mx.baseline_correction(denoised, method="airpls")
Over 20 methods are available from the pybaselines library, including
"airpls", "asls", "mor", "snip", and more.
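To show the idea behind one of these methods, the sketch below implements a simplified SNIP (iterative peak clipping) in pure Python. It omits the log-log-square-root transform that fuller SNIP implementations often apply, and it is not the pybaselines code; `snip_baseline` and `max_half_window` are hypothetical names used only for this example.

```python
def snip_baseline(intensity, max_half_window=10):
    """Estimate a baseline by iterative peak clipping (simplified SNIP)."""
    baseline = [float(v) for v in intensity]
    n = len(baseline)
    for p in range(1, max_half_window + 1):
        previous = baseline[:]  # clip against the previous iteration only
        for i in range(p, n - p):
            neighbour_mean = 0.5 * (previous[i - p] + previous[i + p])
            baseline[i] = min(previous[i], neighbour_mean)
    return baseline


# Synthetic demo: a sloping background with one narrow triangular peak.
background = [10.0 + 0.1 * i for i in range(101)]
signal = background[:]
for offset, height in ((-1, 25.0), (0, 50.0), (1, 25.0)):
    signal[50 + offset] += height

baseline = snip_baseline(signal)
corrected = [s - b for s, b in zip(signal, baseline)]
```

The half-window should exceed the half-width of the widest peak; otherwise part of the peak is absorbed into the baseline.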
Normalization
Normalize spectra using any of 14 available methods:
from mioXpektron import normalize
# TIC normalization (default)
normalized = normalize(corrected, method="tic", target_tic=1e6)
# Poisson scaling (recommended before PCA)
scaled = normalize(corrected, method="poisson")
# Or use the direct function
from mioXpektron import tic_normalization
normalized = tic_normalization(corrected, target_tic=1e6)
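For reference, total ion count (TIC) normalization rescales a spectrum so that its intensities sum to a chosen target. The sketch below shows that arithmetic in isolation; `tic_normalize` is a hypothetical stand-in and not the library's `tic_normalization`, whose handling of edge cases may differ.

```python
import math


def tic_normalize(intensity, target_tic=1e6):
    """Scale intensities so that their sum equals target_tic."""
    total = sum(intensity)
    if total <= 0:
        raise ValueError("total ion count must be positive")
    scale = target_tic / total
    return [v * scale for v in intensity]


# Two spectra that differ only by overall signal level normalize to the same result.
weak = tic_normalize([1.0, 3.0, 6.0], target_tic=100.0)
strong = tic_normalize([10.0, 30.0, 60.0], target_tic=100.0)
```

Because only the overall scale changes, relative peak ratios within a spectrum are preserved.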
Peak Detection
Detect peaks with area integration:
peaks_df = mx.detect_peaks_with_area(corrected, snr_threshold=3.0)
For continuous wavelet transform (CWT) based detection:
peaks_df = mx.detect_peaks_cwt_with_area(corrected, min_snr=3.0)
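The following sketch illustrates what an SNR threshold and area integration can mean in practice. It is a simplified stand-in, not the library's detector: the noise estimate (scaled median absolute deviation), the downhill walk used for integration bounds, and the name `detect_peaks` are all assumptions chosen for clarity.

```python
from statistics import median


def detect_peaks(mz, intensity, snr_threshold=3.0):
    """Return local maxima whose height above the median level passes the SNR cut."""
    level = median(intensity)
    # Median absolute deviation, scaled to approximate a Gaussian sigma.
    noise = 1.4826 * median(abs(v - level) for v in intensity)
    if noise == 0:
        raise ValueError("noise estimate is zero; cannot compute SNR")
    peaks = []
    last = len(intensity) - 1
    for i in range(1, last):
        is_local_max = intensity[i - 1] < intensity[i] >= intensity[i + 1]
        snr = (intensity[i] - level) / noise
        if not is_local_max or snr < snr_threshold:
            continue
        # Walk downhill on both sides to find the integration bounds.
        left = i
        while left > 0 and intensity[left - 1] < intensity[left]:
            left -= 1
        right = i
        while right < last and intensity[right + 1] < intensity[right]:
            right += 1
        # Trapezoidal area between the bounds.
        area = sum(0.5 * (intensity[j] + intensity[j + 1]) * (mz[j + 1] - mz[j])
                   for j in range(left, right))
        peaks.append({"mz": mz[i], "intensity": intensity[i],
                      "snr": snr, "area": area})
    return peaks


demo_intensity = [2, 1, 2, 3, 2, 1, 2, 10, 40, 10, 2, 1, 2, 3, 2, 1, 2]
demo_mz = [float(i) for i in range(len(demo_intensity))]
peaks = detect_peaks(demo_mz, demo_intensity, snr_threshold=3.0)
```

Raising `snr_threshold` trades sensitivity for fewer false positives; the small bumps of height 3 in the demo fall below the cut and are not reported.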
Visualization
Plot spectra with annotated peaks:
mx.PlotPeak(corrected, peaks_df)
Automated Pipeline
For batch processing, use the built-in pipeline that chains all steps:
from mioXpektron import run_pipeline, PipelineConfig
config = PipelineConfig(
    denoise_method="wavelet",
    baseline_method="airpls",
    normalization_target=1e6,
    mz_tolerance=0.2,
)
files = ["sample_01.txt", "sample_02.txt", "sample_03.txt"]
intensity_df, area_df = run_pipeline(files, config=config)
The pipeline returns two DataFrames: an intensity matrix and an area matrix, both aligned by m/z across all samples.
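To clarify what "aligned by m/z" involves, here is a sketch of tolerance-based alignment: peaks from all samples are pooled, grouped into bins whose members lie within a tolerance of the bin's first member, and each sample is then summarised per bin. This is one plausible reading of `mz_tolerance`, not a description of how `run_pipeline` works; `align_peaks` and its input layout are hypothetical.

```python
def align_peaks(samples, mz_tolerance=0.2):
    """Align per-sample (mz, value) peak lists onto shared m/z bins."""
    pooled = sorted(mz for peaks in samples.values() for mz, _ in peaks)
    # Greedy grouping: open a new bin when a peak is too far from the bin start.
    bins = []
    for mz in pooled:
        if bins and mz - bins[-1][0] <= mz_tolerance:
            bins[-1].append(mz)
        else:
            bins.append([mz])
    centres = [sum(b) / len(b) for b in bins]
    edges = [(b[0], b[-1]) for b in bins]
    matrix = {}
    for name, peaks in samples.items():
        row = [0.0] * len(bins)  # 0.0 marks a peak that is absent in this sample
        for mz, value in peaks:
            for j, (low, high) in enumerate(edges):
                if low <= mz <= high:
                    row[j] += value
                    break
        matrix[name] = row
    return centres, matrix


samples = {
    "sample_01": [(27.02, 5.0), (41.04, 2.0)],
    "sample_02": [(27.05, 7.0), (57.07, 1.0)],
}
centres, matrix = align_peaks(samples, mz_tolerance=0.2)
```

A tolerance that is too wide merges distinct ions into one column, while one that is too narrow splits the same ion across several columns.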
Adaptive Parameterization
Set auto_tune=True to derive thresholds from your data instead of
relying on fixed defaults. The calibrator accepts the flag in its configuration:
from mioXpektron import FlexibleCalibrator, FlexibleCalibConfig
config = FlexibleCalibConfig(
    reference_masses=[1.0073, 27.0229, 29.0386, 41.0386, 57.0699, 104.1075],
    calibration_method="quad_sqrt",
    auto_tune=True,
)
calibrator = FlexibleCalibrator(config)
# file_list: the list of spectrum files to calibrate
summary = calibrator.calibrate(file_list)
The pipeline also supports this flag:
config = PipelineConfig(auto_tune=True)
intensity_df, area_df = run_pipeline(files, config=config)
When auto_tune is active, parameters like calibration tolerance,
outlier threshold, screening thresholds, normalization target, and
alignment tolerance are estimated from the spectra. See
Adaptive Parameterization for details on each estimator.
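As a rough illustration of what estimating a parameter from the spectra can look like, the sketch below picks a normalization target as the median total ion count of the input spectra. This is purely an assumed example of a data-driven estimator; the estimators that mioXpektron actually uses are described in the Adaptive Parameterization page, and `estimate_normalization_target` is a hypothetical name.

```python
from statistics import median


def estimate_normalization_target(spectra):
    """Hypothetical estimator: use the median total ion count as the target."""
    tics = [sum(spectrum) for spectrum in spectra]
    return median(tics)


spectra = [
    [1.0, 2.0, 3.0],
    [10.0, 20.0, 30.0],  # an unusually intense acquisition
    [2.0, 4.0, 6.0],
]
target = estimate_normalization_target(spectra)
```

Using a median rather than a mean keeps a single unusually intense spectrum from dominating the estimate.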
Mass Calibration
Calibrate channel-based spectra to m/z:
from mioXpektron import AutoCalibrator, AutoCalibConfig
config = AutoCalibConfig(
    reference_masses=[12.0, 28.0, 56.0],
    model="quadratic",
)
calibrator = AutoCalibrator(config)
calibrated_data = calibrator.calibrate(data)
For more control, use FlexibleCalibrator with explicit channel-to-mass
mappings. See Calibration for details.
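For intuition about what a channel-to-mass calibration fits: in a time-of-flight analyser the flight time grows with the square root of m/z, so the simplest model is linear in sqrt(m/z). The sketch below fits that model by ordinary least squares. It is not the library's "quadratic" or "quad_sqrt" model, and the function names and synthetic channel numbers are assumptions for demonstration.

```python
import math


def fit_sqrt_calibration(channels, reference_masses):
    """Fit sqrt(m/z) = slope * channel + intercept by least squares."""
    roots = [math.sqrt(m) for m in reference_masses]
    n = len(channels)
    mean_c = sum(channels) / n
    mean_r = sum(roots) / n
    sxx = sum((c - mean_c) ** 2 for c in channels)
    sxy = sum((c - mean_c) * (r - mean_r) for c, r in zip(channels, roots))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_c
    return slope, intercept


def channel_to_mz(channel, slope, intercept):
    """Convert a channel index to m/z with the fitted model."""
    return (slope * channel + intercept) ** 2


# Synthetic reference peaks generated from a known slope and intercept.
true_slope, true_intercept = 0.001, 0.05
reference_masses = [12.0, 28.0, 56.0]
channels = [(math.sqrt(m) - true_intercept) / true_slope for m in reference_masses]

slope, intercept = fit_sqrt_calibration(channels, reference_masses)
recovered = channel_to_mz(channels[1], slope, intercept)
```

Higher-order models add terms to absorb instrument non-linearity, at the cost of needing more reference masses to constrain the fit.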
Batch Processing
Process entire directories of spectra:
from mioXpektron import BatchDenoising, batch_tic_norm
# Batch denoising
denoiser = BatchDenoising(method="savgol", window_length=11)
denoised_files = denoiser.process_directory("data/")
# Batch normalization
normalized = batch_tic_norm("data/", output_dir="normalized/")
Method Comparison
Compare denoising strategies on your data:
from mioXpektron import compare_denoising_methods
results = compare_denoising_methods(
    data,
    methods=["wavelet", "gaussian", "savgol"],
    metric="snr",
)
Evaluate baseline correction approaches:
import glob
import random
from mioXpektron import BaselineMethodEvaluator, ScanForFlatRegion
files = sorted(glob.glob("output_files/denoised_spectrums_*/*.txt"))
sample = sorted(random.Random(42).sample(files, min(10, len(files))))
windows = ScanForFlatRegion(files=sample).run()
param_grid = {
    "pspline_lsrpls": [{"lam": 1e6}],
    "pspline_drpls": [{"lam": 1e6}],
    "aspls": [{"lam": 1e6}],
    "imodpoly": [{"poly_order": 3}],
}
evaluator = BaselineMethodEvaluator(
    files=sample,
    methods=list(param_grid),
    param_grid=param_grid,
    flat_windows=windows,
)
summary = evaluator.evaluate()
best_method = summary["overall_best_spec"]
Evaluate normalization strategies:
from mioXpektron import NormalizationEvaluator
evaluator = NormalizationEvaluator(
    files=["spectra/"],
    methods=["tic", "robust_snv", "pqn", "mass_stratified_pqn", "log"],
    method_kwargs_map={
        "mass_stratified_pqn": {
            "strata": [(0.0, 100.0), (100.0, 400.0), (400.0, float("inf"))],
        },
    },
)
results = evaluator.evaluate()
evaluator.print_summary()
evaluator.plot()
For baseline-corrected CSV cohorts, use the repository notebook
NoteBooks/_06_Normalization.ipynb to resample spectra onto a shared m/z
grid before ranking methods. The notebook includes mass_stratified_pqn by
default and can enable multi_ion_reference when reference ions are known.
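Resampling onto a shared m/z grid can be sketched with plain linear interpolation, as below. This shows the general idea only and is not the notebook's code; `resample_linear` is a hypothetical helper, and filling out-of-range grid points with zero is an assumption.

```python
from bisect import bisect_left


def resample_linear(mz, intensity, grid):
    """Linearly interpolate a spectrum (sorted by m/z) onto grid points."""
    resampled = []
    for g in grid:
        if g < mz[0] or g > mz[-1]:
            resampled.append(0.0)  # outside the measured range
            continue
        j = bisect_left(mz, g)  # first index with mz[j] >= g
        if mz[j] == g:
            resampled.append(float(intensity[j]))
            continue
        x0, x1 = mz[j - 1], mz[j]
        y0, y1 = intensity[j - 1], intensity[j]
        resampled.append(y0 + (y1 - y0) * (g - x0) / (x1 - x0))
    return resampled


grid = [0.5, 1.0, 1.5, 3.0, 4.0, 5.0]
on_grid = resample_linear([1.0, 2.0, 4.0], [10.0, 20.0, 0.0], grid)
```

Once every spectrum shares the same grid, intensities can be compared column by column, which ranking normalization methods across a cohort requires.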
Next Steps
Pipeline Reference: detailed pipeline configuration options
Adaptive Parameterization: how auto_tune estimates parameters from the data
Module Overview: in-depth module documentation
Calibration: calibration models and strategies
Denoising: denoising algorithms and parameter tuning