mioXpektron.baseline

Baseline correction utilities for the Xpektron toolkit.

class mioXpektron.baseline.AggregateParams(bin_width: float = 0.1, coverage_threshold: float = 0.5, top_k: int = 6)

Bases: object

Parameters:
bin_width: float = 0.1
coverage_threshold: float = 0.5
top_k: int = 6
class mioXpektron.baseline.BaselineBatchCorrector(in_dir: Union[str, Path], pattern: str = '*.csv', recursive: bool = False, method: str = 'airpls', method_kwargs: Dict = <factory>, clip_negative: bool = True, per_file_best: bool = False, best_method_map: Optional[Dict[str, str]] = None, n_jobs: int = -1, save_plots: bool = False)

Bases: object

Parameters:
in_dir: str | Path
pattern: str = '*.csv'
recursive: bool = False
method: str = 'airpls'
method_kwargs: Dict
clip_negative: bool = True
per_file_best: bool = False
best_method_map: Dict[str, str] | None = None
n_jobs: int = -1
save_plots: bool = False
run(out_root=None)
Parameters:

out_root (str | Path | None)

Return type:

Path
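
The pattern, recursive, and best_method_map parameters suggest a file-discovery step followed by a per-file method lookup. The sketch below illustrates that idea using only the standard library. The helpers discover_files and pick_method are hypothetical, and keying best_method_map by file name is an assumption; the class may resolve files and methods differently.

```python
import tempfile
from pathlib import Path


def discover_files(in_dir, pattern="*.csv", recursive=False):
    """Return sorted files under in_dir matching pattern (hypothetical helper)."""
    root = Path(in_dir)
    matches = root.rglob(pattern) if recursive else root.glob(pattern)
    return sorted(p for p in matches if p.is_file())


def pick_method(path, default="airpls", best_method_map=None):
    """Choose a method for one file, falling back to the default.

    Assumption: best_method_map is keyed by file name.
    """
    return (best_method_map or {}).get(Path(path).name, default)


# Build a throwaway directory tree so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    (root / "a.csv").write_text("channel,mz,intensity\n")
    (root / "nested" / "b.csv").write_text("channel,mz,intensity\n")
    flat = [p.name for p in discover_files(root)]
    deep = [p.name for p in discover_files(root, recursive=True)]
    chosen = [pick_method(n, best_method_map={"b.csv": "poly"}) for n in deep]

print(flat, deep, chosen)
```

With recursive=False only the top-level file is found; with recursive=True the nested file is included and picks up its mapped method.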

class mioXpektron.baseline.BaselineMethodEvaluator(files=<factory>, methods=None, param_grid=None, use_small_param_preset=False, auto_scale_window_size=True, eval_clip_negative=False, topk_for_snr=5, raw_noise_quantile=0.2, flat_windows=None, metrics_for_composite=('rfzn', 'nar', 'snr', 'bbi', 'br', 'nbc'), n_jobs=-1)

Bases: object

Evaluate baseline algorithms on ToF‑SIMS files supplied as paths or globs.

Parameters:
files: List[str | Path]
methods: List[str] | None = None
param_grid: Dict[str, List[Dict]] | None = None
use_small_param_preset: bool = False
auto_scale_window_size: bool = True
eval_clip_negative: bool = False
topk_for_snr: int = 5
raw_noise_quantile: float = 0.2
flat_windows: List[Tuple[float, float]] | None = None
metrics_for_composite: Tuple[str, ...] = ('rfzn', 'nar', 'snr', 'bbi', 'br', 'nbc')
n_jobs: int = -1
labels: List[str]
specs: List[Tuple[str, Dict]]
evaluate(noise_quantile=None, n_jobs=None)
Parameters:
  • noise_quantile (float | None)

  • n_jobs (int | None)

warning_log()
Return type:

DataFrame

plot(out_dir='baseline_selection_output')
Parameters:

out_dir (str | Path)

Return type:

List[Path]

preview_overlay(file, methods=None, max_methods=5, save_to='baseline_selection_output', show_errors=True)

Plot raw, baseline and corrected overlays for a few methods on a single file.

Parameters:
  • file (str or Path) – Path to a single spectrum file (not a list!)

  • methods (list of str, optional) – Method names to plot. If None, uses top methods from evaluation.

  • max_methods (int) – Maximum number of methods to plot (default: 5)

  • save_to (str or Path, optional) – Directory to save plots. Set to None to skip saving.

  • show_errors (bool) – If True (default), print errors when methods fail instead of silently ignoring them.
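
The types of param_grid (Dict[str, List[Dict]]) and specs (List[Tuple[str, Dict]]) suggest that the grid is flattened into one (method, kwargs) pair per run, with a matching entry in labels. The sketch below shows one way such a flattening could work. The helper expand_param_grid and the label format are hypothetical, not taken from the evaluator's source.

```python
def expand_param_grid(param_grid):
    """Flatten {method: [kwargs, ...]} into (specs, labels) (hypothetical helper)."""
    specs, labels = [], []
    for method in sorted(param_grid):
        # Assumption: an empty list means "run the method once with defaults".
        for kwargs in param_grid[method] or [{}]:
            specs.append((method, dict(kwargs)))
            args = ", ".join(f"{k}={v}" for k, v in sorted(kwargs.items()))
            labels.append(f"{method}({args})" if args else method)
    return specs, labels


grid = {
    "median_filter": [{"window_size": 501}, {"window_size": 1001}],
    "airpls": [{"lam": 1e6}],
    "poly": [],
}
specs, labels = expand_param_grid(grid)
for label in labels:
    print(label)
```

Each entry in specs can then be evaluated independently, which fits the n_jobs parameter's implication of parallel execution.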

class mioXpektron.baseline.FlatParams(y_quantile: float = 0.2, grad_quantile: float = 0.4, curv_quantile: float = 0.4, savgol_window: int = 11, savgol_poly: int = 2, min_width: float = 0.2, min_points: int = 20)

Bases: object

Parameters:
y_quantile: float = 0.2
grad_quantile: float = 0.4
curv_quantile: float = 0.4
savgol_window: int = 11
savgol_poly: int = 2
min_width: float = 0.2
min_points: int = 20
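
The parameter names hint at how a flat region might be defined: low intensity, low gradient, and low curvature, sustained over a minimum number of points and a minimum m/z width. The sketch below illustrates that reading with plain finite differences. It is an assumption rather than the library's algorithm: find_flat_windows and quantile are hypothetical helpers, and the smoothing implied by savgol_window and savgol_poly is omitted.

```python
def quantile(values, q):
    """Simple rank-based quantile on a sorted copy of values (0 <= q <= 1)."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]


def find_flat_windows(mz, y, y_quantile=0.2, grad_quantile=0.4,
                      curv_quantile=0.4, min_width=0.2, min_points=20):
    """Return (start_mz, end_mz) windows that look flat (illustrative only)."""
    n = len(y)
    # First and second finite differences stand in for smoothed derivatives.
    grad = [abs(y[min(i + 1, n - 1)] - y[max(i - 1, 0)]) for i in range(n)]
    curv = [abs(y[min(i + 1, n - 1)] - 2 * y[i] + y[max(i - 1, 0)])
            for i in range(n)]
    y_cut = quantile(y, y_quantile)
    g_cut = quantile(grad, grad_quantile)
    c_cut = quantile(curv, curv_quantile)
    flat = [y[i] <= y_cut and grad[i] <= g_cut and curv[i] <= c_cut
            for i in range(n)]

    windows, start = [], None
    for i, is_flat in enumerate(flat + [False]):  # sentinel closes the last run
        if is_flat and start is None:
            start = i
        elif not is_flat and start is not None:
            end = i - 1
            # Keep runs that are long enough in both points and m/z width.
            if end - start + 1 >= min_points and mz[end] - mz[start] >= min_width:
                windows.append((mz[start], mz[end]))
            start = None
    return windows


# Synthetic spectrum: a quiet first half followed by a jagged second half.
mz = [i / 100 for i in range(200)]
y = [1.0 if i < 100 else 10.0 + 5.0 * (i % 3) for i in range(200)]
windows = find_flat_windows(mz, y)
print(windows)
```
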
class mioXpektron.baseline.ScanForFlatRegion(files: List[Union[str, Path]] = <factory>, out_dir: Union[str, Path] = 'flat_windows_out', n_jobs: int = -1, flat_params: FlatParams = <factory>, agg_params: AggregateParams = <factory>, auto_tune: bool = False)

Bases: object

Parameters:
files: List[str | Path]
out_dir: str | Path = 'flat_windows_out'
n_jobs: int = -1
flat_params: FlatParams
agg_params: AggregateParams
auto_tune: bool = False
run()
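
The AggregateParams fields (bin_width, coverage_threshold, top_k) suggest that per-file flat windows are binned along m/z, kept where enough files agree, and ranked. The sketch below illustrates one such aggregation. It is a guess at the approach, not the library's code: aggregate_windows is a hypothetical helper, window edges are snapped to the nearest bin edge, and ranking by width is an assumption.

```python
def aggregate_windows(per_file_windows, bin_width=0.1,
                      coverage_threshold=0.5, top_k=6):
    """Merge per-file flat windows into common windows (illustrative only)."""
    n_files = len(per_file_windows)
    counts = {}
    for windows in per_file_windows:
        covered = set()
        for lo, hi in windows:
            # Integer bin indices keep the adjacency checks exact.
            first = round(lo / bin_width)
            last = round(hi / bin_width)
            covered.update(range(first, last))
        for b in covered:  # each file votes at most once per bin
            counts[b] = counts.get(b, 0) + 1

    kept = sorted(b for b, c in counts.items()
                  if c / n_files >= coverage_threshold)
    merged = []
    for b in kept:
        if merged and b == merged[-1][1]:
            merged[-1][1] = b + 1      # extend the current run of bins
        else:
            merged.append([b, b + 1])  # start a new run
    # Assumption: rank the merged windows by width, widest first.
    merged.sort(key=lambda run: run[1] - run[0], reverse=True)
    return [(round(a * bin_width, 6), round(b * bin_width, 6))
            for a, b in merged[:top_k]]


per_file = [
    [(1.0, 2.0), (5.0, 5.5)],
    [(1.2, 2.4)],
    [(1.1, 1.9), (5.0, 5.4)],
]
common = aggregate_windows(per_file)
print(common)
```

With three files and the default threshold of 0.5, a bin survives only if at least two files mark it as flat.
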
mioXpektron.baseline.baseline_correction(intensities, method='airpls', window_size=101, poly_order=4, clip_negative=True, return_baseline=False, **kwargs)

Baseline-correct a 1‑D spectrum with pybaselines or custom filters.

Parameters:
  • intensities (array-like) – Raw y values.

  • method (str) – Algorithm name; see baseline_method_names().

  • window_size (int) – Kernel width for the two custom filters (“median_filter” and “adaptive_window”).

  • poly_order (int) – Polynomial order for the ‘poly’ alias.

  • clip_negative (bool) – If True, negative corrected values are set to 0.

  • return_baseline (bool) – If True, also return the estimated baseline.

  • **kwargs – Forwarded to the chosen algorithm (e.g. lam=1e6, p=0.01).

Return type:

corrected, or (corrected, baseline) when return_baseline is True
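
To make the roles of window_size, clip_negative, and return_baseline concrete, the sketch below applies a plain moving-median baseline to a short spectrum. It is a stand-in for illustration only: median_baseline_correct is a hypothetical function written with the standard library, and it does not reproduce the package's “median_filter” implementation or any pybaselines algorithm.

```python
from statistics import median


def median_baseline_correct(intensities, window_size=5, clip_negative=True,
                            return_baseline=False):
    """Subtract a moving-median baseline from a 1-D spectrum (illustrative)."""
    y = [float(v) for v in intensities]
    half = window_size // 2
    # Baseline estimate: median of the neighbours within +/- half a window,
    # with the window truncated at the edges of the spectrum.
    baseline = [median(y[max(0, i - half): i + half + 1])
                for i in range(len(y))]
    corrected = [yi - bi for yi, bi in zip(y, baseline)]
    if clip_negative:
        # Mirror the documented clip_negative behaviour: negatives become 0.
        corrected = [max(0.0, c) for c in corrected]
    return (corrected, baseline) if return_baseline else corrected


spectrum = [10, 10, 11, 50, 11, 10, 9, 10, 10]
corrected, baseline = median_baseline_correct(
    spectrum, window_size=5, return_baseline=True
)
unclipped = median_baseline_correct(spectrum, window_size=5, clip_negative=False)
print(corrected)
print(baseline)
```

The single peak at index 3 survives the subtraction, while the small dips below the baseline are zeroed only when clip_negative is True.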

mioXpektron.baseline.baseline_method_names()

Return a sorted list of available baseline algorithms.

The list is built from the public callables of pybaselines.Baseline, plus two custom filters (“median_filter”, “adaptive_window”) and a ‘poly’ alias. A few methods that are not 1‑D safe or are impractically slow are excluded.

Return type:

List[str]

mioXpektron.baseline.small_param_grid_preset(n_points=None)

A compact parameter grid for common methods.

Keys must match pybaselines.Baseline method names (plus ‘poly’ and our two filters).

Parameters:

n_points (int, optional) – Number of data points in the spectrum. If provided, window_size is scaled as a percentage of the data size. If None, moderate defaults suited to spectra of about 100K points are used.

Returns:

Parameter grid with method names as keys

Return type:

dict

Notes

Window sizes are calculated as:

  • Small: 0.05% of data (min 51)

  • Medium: 0.10% of data (min 101)

  • Large: 0.20% of data (min 501)

This adaptive scaling ensures that filter methods perform consistently across datasets of different sizes. Fixed window sizes work poorly:

  • For 10K points: window=101 is 1.0% (OK)

  • For 1M points: window=101 is 0.01% (too small, causes jagged baselines)

Examples

>>> # Auto-scale for 938K point spectrum
>>> grid = small_param_grid_preset(n_points=938000)
>>> grid['median_filter']
[{'window_size': 469}, {'window_size': 938}, {'window_size': 1876}]
>>> # Use defaults for unknown size
>>> grid = small_param_grid_preset()
>>> grid['median_filter']
[{'window_size': 501}, {'window_size': 1001}, {'window_size': 2001}]
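
The scaling rule in the Notes can be written out as a short calculation. The sketch below applies the documented percentages and minimums with integer arithmetic and reproduces the window sizes shown for the 938K-point example. The helper scaled_window_sizes is hypothetical, and the rounding behaviour for other sizes is an assumption.

```python
def scaled_window_sizes(n_points):
    """Return (small, medium, large) window sizes for n_points data points.

    Hypothetical helper: applies the percentages and minimums from the Notes.
    """
    # (fraction in parts per 10,000, minimum window): 0.05%, 0.10%, 0.20%
    rules = [(5, 51), (10, 101), (20, 501)]
    return tuple(max(minimum, n_points * parts // 10_000)
                 for parts, minimum in rules)


print(scaled_window_sizes(938_000))  # (469, 938, 1876)
print(scaled_window_sizes(10_000))   # (51, 101, 501), the minimums apply
```

For small spectra the minimums dominate, so the percentages only take effect once the spectrum is large enough.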

Modules

baseline_base

baseline_batch

baseline_eval

baseline_main

flat_window_suggester

flat_window_suggester_polars.py: a small application that discovers common "flat" m/z windows across a set of ToF‑SIMS spectra. Inputs are 3‑column tables: Channel, m/z, intensity (case‑insensitive).

test_baseline_selection

test_column_aliases

Test script to verify that column name aliasing works correctly for various naming conventions (Channel, channel, ch, idx, etc.)