Baseline Correction
The baseline module removes broad background signals from ToF-SIMS spectra. It wraps the pybaselines library and adds batch processing, method evaluation, and flexible column name handling.
Quick Example
from mioXpektron import baseline_correction

# `intensity` is a 1-D array-like of raw intensity values from one spectrum.
corrected, baseline = baseline_correction(
    intensity,
    method="airpls",
    lam=1e6,
    return_baseline=True,
)
Available Methods
mioXpektron supports the 1-D baseline methods exposed by pybaselines plus
two lightweight filters:
"median_filter"
"adaptive_window"
"poly" as a convenience alias
the available pybaselines methods returned by baseline_method_names()
For the current method list in your environment:
from mioXpektron import baseline_method_names
print(baseline_method_names())
Each method accepts its own keyword arguments, which are passed through to the
underlying implementation. Parameterized evaluator labels such as
"aspls(lam=1000000.0)" can also be passed back into the baseline utilities.
Batch Baseline Correction
Process multiple files in parallel:
from mioXpektron import BaselineBatchCorrector

corrector = BaselineBatchCorrector(
    in_dir="denoised_spectra",
    pattern="*.txt",
    method="airpls",
    method_kwargs={"lam": 1e6},
    n_jobs=4,
)
out_dir = corrector.run(out_root="output_files")
Method Evaluation
Systematically compare baseline methods using quality metrics:
import glob
import random

from mioXpektron import BaselineMethodEvaluator, ScanForFlatRegion

# Draw a reproducible random subset of up to 10 spectra.
files = sorted(glob.glob("output_files/denoised_spectrums_*/*.txt"))
sample = sorted(random.Random(42).sample(files, min(10, len(files))))

# Detect flat (signal-free) windows on the same subset.
windows = ScanForFlatRegion(files=sample).run()

# One list of keyword-argument dicts per candidate method.
param_grid = {
    "pspline_lsrpls": [{"lam": 1e6}],
    "pspline_drpls": [{"lam": 1e6}],
    "pspline_iarpls": [{"lam": 1e6}],
    "pspline_arpls": [{"lam": 1e6}],
    "pspline_airpls": [{"lam": 1e6}],
    "aspls": [{"lam": 1e6}],
    "imodpoly": [{"poly_order": 3}],
}

evaluator = BaselineMethodEvaluator(
    files=sample,
    methods=list(param_grid),
    param_grid=param_grid,
    flat_windows=windows,
)
summary = evaluator.evaluate(n_jobs=4)

# Inspect the best-ranked candidate.
best = summary["overall_best_spec"]
print(best["label"], best["method"], best["kwargs"])

# Overlay the three best-ranked candidates on one spectrum.
evaluator.preview_overlay(
    file=sample[0],
    methods=[spec["label"] for spec in summary["overall_order_specs"][:3]],
)
If param_grid is provided and methods is omitted, the evaluator uses
the grid keys as the candidate set. For large cohorts, evaluating a
representative random subset of spectra first is usually much faster than
scoring every file.
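Each candidate in the evaluation summary carries a label, a method name, and its keyword arguments. As a rough sketch of how a grid could be expanded into labelled candidates and how a label such as "aspls(lam=1000000.0)" could be parsed back, consider the following. Both helpers (expand_param_grid and parse_method_label) are hypothetical and written only for illustration; the label format is assumed from the example label shown earlier and may differ from what mioXpektron does internally.

```python
import ast
import re

# Hypothetical helpers for illustration only; not part of mioXpektron's API.
_LABEL_RE = re.compile(r"^\s*([A-Za-z_]\w*)\s*(?:\((.*)\))?\s*$")


def expand_param_grid(param_grid):
    """Return one {'label', 'method', 'kwargs'} dict per grid entry."""
    specs = []
    for method, kwargs_list in param_grid.items():
        # An empty list means "run the method with its defaults".
        for kwargs in kwargs_list or [{}]:
            args = ", ".join(f"{key}={value!r}" for key, value in kwargs.items())
            label = f"{method}({args})" if args else method
            specs.append({"label": label, "method": method, "kwargs": dict(kwargs)})
    return specs


def parse_method_label(label):
    """Split 'name(k=v, ...)' into (name, kwargs); a bare name gives {}."""
    match = _LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognised method label: {label!r}")
    name, arg_text = match.group(1), match.group(2)
    kwargs = {}
    if arg_text and arg_text.strip():
        # Parse the arguments as a call expression, then read literals only.
        call = ast.parse(f"f({arg_text})", mode="eval").body
        if call.args:
            raise ValueError("only keyword arguments are supported")
        for keyword in call.keywords:
            kwargs[keyword.arg] = ast.literal_eval(keyword.value)
    return name, kwargs


grid = {"aspls": [{"lam": 1e6}], "imodpoly": [{"poly_order": 3}]}
for spec in expand_param_grid(grid):
    print(spec["label"], parse_method_label(spec["label"]))
```

Using ast.literal_eval keeps the parser restricted to plain literals, so a label cannot execute arbitrary code.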
The evaluator computes six metrics:
RFZN — Residual Flatness in Zero-signal regions (Noise)
NAR — Negative Area Ratio (how much correction goes below zero)
SNR — Signal-to-Noise Ratio improvement
BBI — Baseline-Below-Input indicator
BR — Baseline Roughness
NBC — Negative Bin Count
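The exact formulas are not spelled out here. As a rough illustration under assumed definitions (not necessarily the ones mioXpektron uses), three of the metrics could be computed along these lines. The function names and formulas below are assumptions: NAR as the share of absolute area below zero, NBC as the number of negative bins, and BR as the mean squared second difference of the baseline.

```python
# Assumed definitions for illustration; the library's formulas may differ.

def negative_area_ratio(corrected):
    """Share of the total absolute area that lies below zero."""
    total = sum(abs(value) for value in corrected)
    negative = sum(-value for value in corrected if value < 0)
    return negative / total if total else 0.0


def negative_bin_count(corrected):
    """Number of bins where the corrected signal is negative."""
    return sum(1 for value in corrected if value < 0)


def baseline_roughness(baseline):
    """Mean squared second difference; 0.0 for a straight line."""
    if len(baseline) < 3:
        return 0.0
    second = [
        baseline[i - 1] - 2 * baseline[i] + baseline[i + 1]
        for i in range(1, len(baseline) - 1)
    ]
    return sum(d * d for d in second) / len(second)


corrected = [2.0, -1.0, 3.0, -2.0, 4.0]
print(negative_area_ratio(corrected))   # 3 / 12
print(negative_bin_count(corrected))
print(baseline_roughness([0.0, 1.0, 2.0, 3.0]))
```

Metrics of this kind are only informative on an unclipped correction, since clipping negative values to zero would hide the overshoot they measure.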
Flat Region Detection
Identify flat (signal-free) regions for baseline evaluation:
from mioXpektron import ScanForFlatRegion

# `sample` is the list of spectrum file paths from the Method Evaluation example.
scanner = ScanForFlatRegion(files=sample)
flat_regions = scanner.run()
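To make the idea of a flat region concrete, here is a minimal sketch that marks stretches of a single trace whose rolling standard deviation stays under a threshold. It is not the algorithm behind ScanForFlatRegion; the function name, the width and max_std parameters, and the thresholding rule are all assumptions chosen for illustration.

```python
from statistics import pstdev

# Illustrative only; ScanForFlatRegion may use a different criterion.

def find_flat_windows(y, width=5, max_std=0.5):
    """Return merged (start, stop) index ranges, stop exclusive,
    where every width-sized window has a standard deviation <= max_std."""
    flat = [pstdev(y[i:i + width]) <= max_std for i in range(len(y) - width + 1)]
    windows, start = [], None
    for i, is_flat in enumerate(flat):
        if is_flat and start is None:
            start = i
        elif not is_flat and start is not None:
            # The last flat window began at i - 1 and spans `width` samples.
            windows.append((start, i - 1 + width))
            start = None
    if start is not None:
        windows.append((start, len(y)))
    return windows


# Flat background with one peak in the middle.
trace = [1.0] * 10 + [1.0, 50.0, 200.0, 50.0, 1.0] + [1.0] * 10
print(find_flat_windows(trace))
```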
Column Name Flexibility
The baseline module automatically recognizes common column naming conventions:
Channel: channel, chan, ch, index, idx
m/z: m/z, mz, mass, moverz, m_over_z
Intensity: intensity, counts, signal, y, ion_counts
Matching is case-insensitive.
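A case-insensitive lookup over the aliases listed above can be sketched as follows. The resolve_columns helper and the ALIASES table layout are hypothetical, and stripping surrounding whitespace is an added assumption; only the alias names themselves come from the list above.

```python
# Hypothetical helper; alias names are taken from the list above.
ALIASES = {
    "channel": {"channel", "chan", "ch", "index", "idx"},
    "mz": {"m/z", "mz", "mass", "moverz", "m_over_z"},
    "intensity": {"intensity", "counts", "signal", "y", "ion_counts"},
}


def resolve_columns(header):
    """Map each role to the first header name that matches one of its aliases."""
    resolved = {}
    for name in header:
        key = name.strip().lower()  # case-insensitive comparison
        for role, aliases in ALIASES.items():
            if key in aliases and role not in resolved:
                resolved[role] = name
    return resolved


print(resolve_columns(["Chan", "M/Z", "Counts"]))
```

Returning the original header spelling, rather than the lowercased key, lets the caller index the file's columns directly.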
API Reference
- mioXpektron.baseline.baseline_correction(intensities, method='airpls', window_size=101, poly_order=4, clip_negative=True, return_baseline=False, **kwargs)
Baseline-correct a 1‑D spectrum with pybaselines or custom filters.
- Parameters:
intensities (array-like) – Raw y values.
method (str) – Algorithm name; see baseline_method_names().
window_size (int) – Kernel width for the two custom filters.
poly_order (int) – Polynomial order for the ‘poly’ alias.
clip_negative (bool) – If True, negative corrected values are set to 0.
return_baseline (bool) – If True, also return the estimated baseline.
**kwargs – Forwarded to the chosen algorithm (e.g. lam=1e6, p=0.01).
- Return type:
corrected or (corrected, baseline)
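The effect of clip_negative and return_baseline can be illustrated with a small stand-in. The sketch below uses a plain running median as the baseline estimate; it is not the library's "median_filter" implementation, and the function names are hypothetical. Only the subtract, clip, and return-shape behaviour mirrors the parameter descriptions above.

```python
from statistics import median

# Stand-in baseline estimator for illustration; not mioXpektron's code.

def running_median(y, window_size=5):
    """Median over a centred window that shrinks at the edges."""
    half = window_size // 2
    return [median(y[max(0, i - half): i + half + 1]) for i in range(len(y))]


def correct(intensities, window_size=5, clip_negative=True, return_baseline=False):
    baseline = running_median(intensities, window_size)
    corrected = [value - base for value, base in zip(intensities, baseline)]
    if clip_negative:
        # Negative corrected values are set to 0.
        corrected = [max(value, 0.0) for value in corrected]
    return (corrected, baseline) if return_baseline else corrected


y = [10.0, 10.0, 9.0, 60.0, 10.0, 11.0, 10.0]
corrected, baseline = correct(y, window_size=3, return_baseline=True)
print(corrected)
print(baseline)
```

With clip_negative=False the small dips below the estimated baseline stay negative, which is what an evaluation of overshoot would need to see.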
- class mioXpektron.baseline.BaselineMethodEvaluator(files=<factory>, methods=None, param_grid=None, use_small_param_preset=False, auto_scale_window_size=True, eval_clip_negative=False, topk_for_snr=5, raw_noise_quantile=0.2, flat_windows=None, metrics_for_composite=('rfzn', 'nar', 'snr', 'bbi', 'br', 'nbc'), n_jobs=-1)
Bases: object
Evaluate baseline algorithms on ToF‑SIMS files supplied as paths or globs.
- preview_overlay(file, methods=None, max_methods=5, save_to='baseline_selection_output', show_errors=True)
Plot raw, baseline and corrected overlays for a few methods on a single file.
- Parameters:
file (str or Path) – Path to a single spectrum file (not a list!)
methods (list of str, optional) – Method names to plot. If None, uses top methods from evaluation.
max_methods (int) – Maximum number of methods to plot (default: 5)
save_to (str or Path, optional) – Directory to save plots. Set to None to skip saving.
show_errors (bool) – If True (default), print errors when methods fail instead of silently ignoring them.
- class mioXpektron.baseline.BaselineBatchCorrector(in_dir: 'Union[str, Path]', pattern: 'str' = '*.csv', recursive: 'bool' = False, method: 'str' = 'airpls', method_kwargs: 'Dict' = <factory>, clip_negative: 'bool' = True, per_file_best: 'bool' = False, best_method_map: 'Optional[Dict[str, str]]'=None, n_jobs: 'int' = -1, save_plots: 'bool' = False)
Bases: object
- class mioXpektron.baseline.ScanForFlatRegion(files: 'List[Union[str, Path]]'=<factory>, out_dir: 'Union[str, Path]'='flat_windows_out', n_jobs: 'int' = -1, flat_params: 'FlatParams' = <factory>, agg_params: 'AggregateParams' = <factory>, auto_tune: 'bool' = False)
Bases: object
- Parameters:
n_jobs (int)
flat_params (FlatParams)
agg_params (AggregateParams)
auto_tune (bool)
- flat_params: FlatParams
- agg_params: AggregateParams