Calibration

The recalibrate module converts raw channel-based ToF-SIMS spectra to calibrated m/z axes. It supports multiple Time-of-Flight models and provides both automatic and manual calibration workflows.

Current model families

The recalibration backend supports the following model families:

quad_sqrt: empirical TOF model t = k*sqrt(m) + c*m + t0
linear_sqrt: reduced two-parameter sqrt model
poly2: empirical quadratic calibration in channel space
reflectron: extended TOF model with a reflectron correction term
spline: non-parametric spline calibration
multisegment: piecewise quad_sqrt over user-defined mass ranges

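multisegment is available by explicit selection but remains experimental. If you use it, choose breakpoints so that every segment contains at least three reference masses. The API reference also lists a physical model, which is likewise treated as experimental and is excluded from the default model set.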
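As a rough illustration of the quad_sqrt relation listed above, the sketch below evaluates t = k*sqrt(m) + c*m + t0 and inverts it in closed form to recover m from a flight time or channel t. The function names and coefficient values are hypothetical and are not part of mioXpektron; the library's own fitting and inversion code may differ.

```python
import math


def quad_sqrt_time(m, k, c, t0):
    """Forward model: flight time (or channel) for mass m."""
    return k * math.sqrt(m) + c * m + t0


def quad_sqrt_mass(t, k, c, t0):
    """Invert t = k*sqrt(m) + c*m + t0 for m.

    With s = sqrt(m) this is the quadratic c*s**2 + k*s - (t - t0) = 0.
    The root is written in a form that stays finite when c == 0.
    Assumes k > 0 and a non-negative discriminant.
    """
    d = t - t0
    disc = k * k + 4.0 * c * d
    if disc < 0:
        raise ValueError("no real solution for this t")
    s = 2.0 * d / (k + math.sqrt(disc))
    return s * s


# Hypothetical coefficients, chosen only for illustration.
K, C, T0 = 1500.0, 0.02, 35.0

for mass in (1.0073, 27.0229, 104.1075):
    channel = quad_sqrt_time(mass, K, C, T0)
    recovered = quad_sqrt_mass(channel, K, C, T0)
    print(f"m={mass:9.4f}  t={channel:10.3f}  recovered={recovered:9.4f}")
```

Setting c to zero reduces the inverse to m = ((t - t0) / k)**2, the ideal square-root time-of-flight relation.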

Quick Example

from mioXpektron import FlexibleCalibrator, FlexibleCalibConfig

config = FlexibleCalibConfig(
    reference_masses=[1.0073, 27.0229, 29.0386, 41.0386, 57.0699, 104.1075],
    calibration_method="quad_sqrt",
    autodetect_method="parabolic",
    autodetect_fallback_policy="max",
    autodetect_strategy="mz",
    auto_screen_reference_masses=True,
)

calibrator = FlexibleCalibrator(config)
# file_list: sequence of spectrum file paths (str); calibrate() returns a DataFrame
summary = calibrator.calibrate(file_list)

AutoCalibrator

Fully automatic calibration using known reference masses:

from mioXpektron import AutoCalibrator, AutoCalibConfig

config = AutoCalibConfig(
    reference_masses=[1.0073, 22.9892, 38.9632, 58.0657, 86.0970, 184.0733],
    model="quad_sqrt",
    autodetect_method="gaussian",
    autodetect_fallback_policy="max",
    autodetect_strategy="mz",
    output_folder="calibrated/",
    max_workers=4,
)

calibrator = AutoCalibrator(config)
# file_list: sequence of spectrum file paths (str); calibrate() returns a DataFrame
results = calibrator.calibrate(file_list)

The calibrator:

Auto-detects calibrant channels from either the m/z axis or a channel-only bootstrap path.
Applies the requested peak-picking method near each candidate calibrant.
Fits the selected model or, for AutoCalibrator, compares the requested model set and keeps the best valid fit.
Applies the calibration to the full spectrum.
Writes calibrated spectra and summary tables to the output folder.
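The "keeps the best valid fit" step can be pictured with a small selection routine. The sketch below is an assumption about how such a comparison might work, not the library's actual rule: a candidate counts as valid when all of its calibrant residuals are finite and below a ceiling (borrowing the 500.0 ppm default of max_ppm_error), and the valid candidate with the lowest RMS ppm error wins. The function names and the residual values are made up.

```python
import math


def rms(values):
    """Root-mean-square of a non-empty list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def select_best_model(ppm_errors_by_model, max_ppm_error=500.0):
    """Return (name, score) of the valid model with the lowest RMS ppm error.

    A candidate is treated as invalid when it has no residuals, contains
    non-finite residuals, or exceeds max_ppm_error on any calibrant. This
    validity rule and the RMS score are assumptions made for illustration.
    Returns None when no candidate is valid.
    """
    best = None
    for name, errors in ppm_errors_by_model.items():
        if not errors or not all(math.isfinite(e) for e in errors):
            continue
        if max(abs(e) for e in errors) > max_ppm_error:
            continue
        score = rms(errors)
        if best is None or score < best[1]:
            best = (name, score)
    return best


# Made-up residuals (ppm) for three model families on one spectrum.
candidates = {
    "quad_sqrt": [4.0, -3.0, 2.5, -1.5],
    "linear_sqrt": [12.0, -9.0, 15.0, -11.0],
    "spline": [0.5, -0.4, 900.0, 0.2],  # one wild calibrant, so rejected
}
print(select_best_model(candidates))
```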

Autodetection modes

Both calibrators support two autodetection strategies:

autodetect_strategy="mz" searches around the existing m/z axis.
autodetect_strategy="bootstrap" reconstructs approximate channel positions directly from Channel and Intensity columns.

The bootstrap strategy now estimates both the slope and intercept of the channel-to-mass relationship before searching locally for each calibrant, so it is appropriate for spectra that do not yet have a trustworthy m/z axis.
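To make the slope-and-intercept idea concrete, here is a minimal sketch that assumes the channel position is approximately linear in sqrt(m), the leading term of the quad_sqrt model. It fits that line to a few seed peaks by ordinary least squares and predicts a channel window in which to search for another calibrant. The helper names, the seed values, and the window half-width are hypothetical; how the library chooses its initial peak assignments is not described here.

```python
import math


def fit_sqrt_line(masses, channels):
    """Least-squares fit of channel = slope*sqrt(m) + intercept."""
    xs = [math.sqrt(m) for m in masses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(channels) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, channels))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def search_window(mass, slope, intercept, half_width=20.0):
    """Channel interval centered on the predicted position of a calibrant."""
    center = slope * math.sqrt(mass) + intercept
    return center - half_width, center + half_width


# Synthetic seed peaks generated from a known line (slope 1500, intercept 35).
seed_masses = [1.0073, 22.9892, 38.9632]
seed_channels = [1500.0 * math.sqrt(m) + 35.0 for m in seed_masses]

slope, intercept = fit_sqrt_line(seed_masses, seed_channels)
lo, hi = search_window(184.0733, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} window=({lo:.1f}, {hi:.1f})")
```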

Peak-picking methods

The recalibration backend supports:

max
centroid
centroid_raw
parabolic
gaussian
voigt

Refined methods return fractional channel positions. centroid is now baseline-aware and apex-focused; centroid_raw preserves the earlier windowed center-of-mass behavior for comparison.
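The following sketch shows how a fractional channel position can arise and why a baseline matters for a centroid. It uses three-point parabolic interpolation and two simple centroids: a plain windowed center of mass, and one that subtracts the window minimum as a crude baseline. These are textbook formulas written for illustration; the function names are hypothetical, the apex-focused behavior of centroid is not modeled, and the library's implementations may differ.

```python
def parabolic_apex(intensities, i):
    """Fractional apex position from a parabola through samples i-1, i, i+1."""
    y0, y1, y2 = intensities[i - 1], intensities[i], intensities[i + 1]
    denom = y0 - 2.0 * y1 + y2
    if denom == 0:
        return float(i)  # flat top: keep the integer position
    return i + 0.5 * (y0 - y2) / denom


def centroid_plain(intensities, lo, hi):
    """Intensity-weighted mean channel over the window [lo, hi)."""
    window = intensities[lo:hi]
    total = sum(window)
    if total <= 0:
        raise ValueError("window has no signal")
    return sum((lo + j) * y for j, y in enumerate(window)) / total


def centroid_minus_baseline(intensities, lo, hi):
    """Centroid after subtracting the window minimum as a crude baseline."""
    window = intensities[lo:hi]
    base = min(window)
    weights = [y - base for y in window]
    total = sum(weights)
    if total <= 0:
        raise ValueError("window has no peak above the baseline")
    return sum((lo + j) * w for j, w in enumerate(weights)) / total


# A sampled parabola whose true apex lies at channel 10.3.
parabola = [100.0 - (x - 10.3) ** 2 for x in range(15)]
print(parabolic_apex(parabola, parabola.index(max(parabola))))

# A symmetric peak at channel 7 on a flat baseline of 10, with a window
# that extends further to the left than to the right of the peak.
spectrum = [10, 10, 10, 10, 10, 10, 40, 100, 40, 10, 10, 10]
print(centroid_plain(spectrum, 3, 10))
print(centroid_minus_baseline(spectrum, 3, 10))
```

In the second example the plain centroid is pulled toward the longer side of the window by the baseline counts, while the baseline-subtracted centroid stays on the peak.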

Fallback policy

autodetect_fallback_policy controls what happens when a refined method fails for a specific calibrant:

"max": fall back to a robust local maximum pick
"nan": keep the calibrant unresolved
"raise": stop the run immediately

The actual method used per calibrant is recorded in calibrator.last_autodetect_methods.
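The three policies can be sketched as a small wrapper around a refined picker. This is an illustrative stand-in rather than the library's code: the function name, the use of ValueError to signal a failed refinement, and the "refined" label are all assumptions. The second element of the returned tuple plays the role of the per-calibrant method record.

```python
import math


def pick_with_fallback(window, refine, policy="max"):
    """Try a refined pick; on failure apply the fallback policy.

    window : list of intensities around the candidate calibrant
    refine : callable returning a fractional index, or raising ValueError
    Returns (position, method_used).
    """
    try:
        return refine(window), "refined"
    except ValueError:
        if policy == "max":
            # Fall back to the index of the local maximum.
            return float(window.index(max(window))), "max"
        if policy == "nan":
            # Leave this calibrant unresolved.
            return math.nan, "nan"
        if policy == "raise":
            # Propagate the original failure and stop.
            raise
        raise ValueError(f"unknown fallback policy: {policy!r}")


def failing_refine(window):
    """Stand-in for a peak fit that does not converge."""
    raise ValueError("fit did not converge")


window = [3, 8, 21, 13, 5]
print(pick_with_fallback(window, failing_refine, "max"))  # (2.0, 'max')
print(pick_with_fallback(window, failing_refine, "nan"))  # (nan, 'nan')
```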

FlexibleCalibrator

For more control over the calibration process:

from mioXpektron import FlexibleCalibrator, FlexibleCalibConfig

config = FlexibleCalibConfig(
    reference_masses=[1.0073, 27.0229, 29.0386, 41.0386, 57.0699, 104.1075],
    calibration_method="quad_sqrt",
    autodetect_method="parabolic",
    autodetect_fallback_policy="max",
    autodetect_strategy="mz",
    auto_screen_reference_masses=True,
    multisegment_breakpoints=[50, 150],
    outlier_threshold=3.0,
)

calibrator = FlexibleCalibrator(config)
summary = calibrator.calibrate(file_list)

Features:

Single-model calibration when you want to evaluate one model family directly
Multiple peak-picking methods with explicit fallback control
Outlier detection using Huber regression
PPM and Dalton error reporting
Optional two-pass reference-mass screening with per-mass residual summaries
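For reference, the Dalton and ppm errors in the list above follow the usual definitions: the Dalton error is the measured mass minus the reference mass, and the ppm error is that difference divided by the reference mass, times 10^6. The sketch below computes both and then flags outliers with a median/MAD robust z-score. The MAD rule is a simpler stand-in chosen for illustration and is not Huber regression; the function names and the measured values are made up, and the threshold of 3.0 simply mirrors the outlier_threshold default.

```python
import statistics


def mass_errors(measured, reference):
    """Return (dalton_error, ppm_error) for one calibrant."""
    da = measured - reference
    return da, da / reference * 1e6


def flag_outliers(residuals, threshold=3.0):
    """Flag residuals whose robust z-score exceeds threshold.

    Uses the median and the MAD, scaled by 1.4826 so that it matches the
    standard deviation for normally distributed data.
    This is a stand-in for illustration, not Huber regression.
    """
    med = statistics.median(residuals)
    mad = statistics.median([abs(r - med) for r in residuals])
    scale = 1.4826 * mad
    if scale == 0:
        return [False] * len(residuals)
    return [abs(r - med) / scale > threshold for r in residuals]


reference = [1.0073, 27.0229, 29.0386, 41.0386, 57.0699, 104.1075]
# Made-up measured masses; the fifth one is deliberately far off.
measured = [1.00731, 27.02305, 29.03875, 41.03840, 57.07500, 104.10770]

ppm = [mass_errors(m, r)[1] for m, r in zip(measured, reference)]
for r, e, bad in zip(reference, ppm, flag_outliers(ppm)):
    print(f"{r:9.4f}  {e:8.2f} ppm  {'outlier' if bad else 'ok'}")
```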

Adaptive parameterization

Set auto_tune=True to derive calibration parameters from the data:

config = FlexibleCalibConfig(
    reference_masses=reference_masses,
    calibration_method="quad_sqrt",
    auto_tune=True,
)

calibrator = FlexibleCalibrator(config)
summary = calibrator.calibrate(file_list)

When auto_tune is active, the calibrator estimates:

autodetect_tol_da from observed peak widths near calibrant masses
multisegment_breakpoints from the calibrant mass distribution
outlier_threshold from the residual distribution after the first fit pass
screen_max_mean_abs_ppm and screen_min_valid_fraction from batch-level stability statistics

Each estimate falls back to a built-in value when there is too little data to compute it. See Adaptive Parameterization for the individual estimator functions.

Reference-mass screening

FlexibleCalibrator can perform a fit-only first pass, summarize calibrant stability, and refit using only stable reference masses:

config = FlexibleCalibConfig(
    reference_masses=reference_masses,
    calibration_method="quad_sqrt",
    auto_screen_reference_masses=True,
    screen_max_mean_abs_ppm=50.0,
    screen_min_valid_fraction=0.8,
    screen_min_count=3,
    screen_exclude_below_mz=1.5,
)

calibrator = FlexibleCalibrator(config)
summary = calibrator.calibrate(file_list)

print(calibrator.last_reference_masses_used)
print(calibrator.last_reference_masses_screened_out)

This is useful when a small number of unstable tissue-specific anchors dominate the overall calibration error.
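A screening pass of this kind can be sketched as a per-mass filter over first-pass residuals. The sketch assumes that a reference mass is kept only if it lies at or above the exclusion limit, is resolved in a large enough fraction of files, and has a small enough mean absolute ppm error. The parameter names mirror the screen_* settings without the prefix, but how the library actually combines them is an assumption, and screen_min_count and screen_max_median_abs_ppm are not modeled. All residual values are made up.

```python
import math


def screen_reference_masses(
    ppm_by_mass,
    max_mean_abs_ppm=50.0,
    min_valid_fraction=0.8,
    exclude_below_mz=1.5,
):
    """Split reference masses into (kept, screened_out).

    ppm_by_mass maps each reference mass to its per-file ppm errors from a
    first fit pass, with NaN where the calibrant was not resolved.
    """
    kept, screened_out = [], []
    for mass, errors in sorted(ppm_by_mass.items()):
        valid = [e for e in errors if not math.isnan(e)]
        fraction = len(valid) / len(errors) if errors else 0.0
        mean_abs = sum(abs(e) for e in valid) / len(valid) if valid else math.inf
        stable = (
            mass >= exclude_below_mz
            and fraction >= min_valid_fraction
            and mean_abs <= max_mean_abs_ppm
        )
        (kept if stable else screened_out).append(mass)
    return kept, screened_out


nan = math.nan
first_pass = {
    1.0073: [3.0, -2.0, 4.0, 1.0, -3.0],         # below the m/z limit
    27.0229: [5.0, -4.0, 6.0, 3.0, -5.0],        # stable
    58.0657: [80.0, -95.0, 70.0, 110.0, -60.0],  # large errors
    86.0970: [4.0, nan, nan, 2.0, 3.0],          # resolved in 3 of 5 files
    184.0733: [8.0, -6.0, 7.0, nan, 5.0],        # resolved in 4 of 5 files
}

kept, screened_out = screen_reference_masses(first_pass)
print("kept:", kept)
print("screened out:", screened_out)
```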

Multisegment calibration

multisegment fits independent quad_sqrt models over mass intervals defined by multisegment_breakpoints. For example:

config = FlexibleCalibConfig(
    reference_masses=reference_masses,
    calibration_method="multisegment",
    multisegment_breakpoints=[50, 150],
)

produces the segments 0-50, 50-150, and 150-inf.
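Before selecting multisegment, it can help to check how many reference masses fall into each segment, since every segment should contain at least three. The helper below is a hypothetical pre-check written with the standard library; it is not part of mioXpektron, and it assumes that a mass exactly equal to a breakpoint belongs to the upper segment.

```python
import bisect
import math


def segment_bounds(breakpoints):
    """Return (low, high) pairs for the segments implied by the breakpoints."""
    edges = [0.0, *sorted(breakpoints), math.inf]
    return list(zip(edges[:-1], edges[1:]))


def count_masses_per_segment(reference_masses, breakpoints):
    """Count how many reference masses fall into each segment."""
    bps = sorted(breakpoints)
    counts = [0] * (len(bps) + 1)
    for m in reference_masses:
        counts[bisect.bisect_right(bps, m)] += 1
    return counts


reference_masses = [1.0073, 27.0229, 29.0386, 41.0386, 57.0699, 104.1075]
breakpoints = [50, 150]

bounds = segment_bounds(breakpoints)
counts = count_masses_per_segment(reference_masses, breakpoints)
for (lo, hi), n in zip(bounds, counts):
    status = "ok" if n >= 3 else "too few reference masses"
    print(f"{lo:g}-{hi:g}: {n} ({status})")
```

With the reference list from the Quick Example, the counts are 4, 2, and 0, so only the 0-50 segment meets the three-mass guideline; either the breakpoints or the reference list would need to change.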

Debugging Calibration

Use the debug calibrator for detailed diagnostic output:

from mioXpektron import FlexibleCalibratorDebug, FlexibleCalibConfigDebug

config = FlexibleCalibConfigDebug(calibration_method="quad_sqrt")
calibrator = FlexibleCalibratorDebug(config)

# Produces detailed logs of each calibration step
result = calibrator.calibrate(channels, known_channels, known_masses)

The debug version logs:

Peak picking decisions at each reference mass
Model fit residuals and quality metrics
Outlier detection and removal steps
Final calibration coefficients and errors

Notebook workflow

The calibration comparison notebook NoteBooks/_01_Calibration_Methods_Comparison.ipynb has been updated to expose:

autodetect_fallback_policy
autodetect_strategy
centroid_raw
reference-mass screening settings
per-spectrum autodetect diagnostics

When rerunning the notebook after backend changes, restart the kernel or rerun the import cell so the local recalibration modules are reloaded explicitly.
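If restarting the kernel is inconvenient, an explicit reload can be done with importlib.reload from the standard library. The helper below is a hypothetical convenience, not something shipped with the notebook: it reloads every already-imported module whose name matches a package prefix, submodules first. Names bound earlier with from ... import ... still point at the old objects, so the import cell should be rerun afterwards.

```python
import importlib
import sys


def reload_package(prefix):
    """Reload all imported modules named prefix or prefix.*, deepest first.

    Returns the list of module names that were reloaded, in order.
    """
    names = sorted(
        (n for n in list(sys.modules) if n == prefix or n.startswith(prefix + ".")),
        key=lambda n: n.count("."),
        reverse=True,
    )
    reloaded = []
    for name in names:
        module = sys.modules.get(name)
        if module is not None:
            importlib.reload(module)
            reloaded.append(name)
    return reloaded


# Returns an empty list if the package has not been imported yet.
print(reload_package("mioXpektron"))
```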

API Reference

class mioXpektron.recalibrate.AutoCalibrator(config=None)

Bases: object

Automatic multi-model calibrator.

Fits all requested models, selects the best one per file, and writes calibrated spectra.

Parameters:

config (AutoCalibConfig | None)

last_autodetect_methods: Dict[str, List[str]]

calibrate(files, calib_channels_dict=None)

Calibrate all files with automatic model selection.

Parameters:

files (Sequence[str])
calib_channels_dict (Dict[str, Sequence[float]] | None)

Return type:

DataFrame

class mioXpektron.recalibrate.AutoCalibConfig(reference_masses, output_folder='calibrated_spectra', max_workers=None, autodetect_tol_da=None, autodetect_tol_ppm=None, autodetect_method='gaussian', autodetect_fallback_policy='max', autodetect_strategy='mz', prefer_recompute_from_channel=False, outlier_threshold=3.0, use_outlier_rejection=True, max_iterations=3, model=None, models_to_try=None, prefer_physical_models=True, min_calibrants=3, max_ppm_warning=100.0, max_ppm_error=500.0, use_bootstrap_init=True, spline_smoothing=None, multisegment_breakpoints=<factory>, instrument_params=<factory>)

Bases: object

Universal calibration configuration with robust options.

Parameters:

reference_masses (list of float) – Known calibrant ion masses (m/z).
model (str, optional) – Convenience shortcut — a single model name (or common alias like 'quadratic', 'tof', 'linear'). Resolved into models_to_try during __post_init__. Ignored when models_to_try is explicitly provided.
models_to_try (list of str, optional) – Explicit list of model names to fit. Default: all production-ready models (excludes experimental ones such as multisegment and physical).
output_folder (str)
max_workers (int | None)
autodetect_tol_da (float | None)
autodetect_tol_ppm (float | None)
autodetect_method (str)
autodetect_fallback_policy (str)
autodetect_strategy (str)
prefer_recompute_from_channel (bool)
outlier_threshold (float)
use_outlier_rejection (bool)
max_iterations (int)
prefer_physical_models (bool)
min_calibrants (int)
max_ppm_warning (float)
max_ppm_error (float)
use_bootstrap_init (bool)
spline_smoothing (float | None)
multisegment_breakpoints (List[float])
instrument_params (Dict[str, float])

reference_masses: List[float]

output_folder: str = 'calibrated_spectra'

max_workers: int | None = None

autodetect_tol_da: float | None = None

autodetect_tol_ppm: float | None = None

autodetect_method: str = 'gaussian'

autodetect_fallback_policy: str = 'max'

autodetect_strategy: str = 'mz'

prefer_recompute_from_channel: bool = False

outlier_threshold: float = 3.0

use_outlier_rejection: bool = True

max_iterations: int = 3

model: str | None = None

models_to_try: List[str] | None = None

prefer_physical_models: bool = True

min_calibrants: int = 3

max_ppm_warning: float = 100.0

max_ppm_error: float = 500.0

use_bootstrap_init: bool = True

spline_smoothing: float | None = None

multisegment_breakpoints: List[float]

instrument_params: Dict[str, float]

class mioXpektron.recalibrate.FlexibleCalibrator(config=None)

Bases: object

Single-model calibrator with user-selected method.

Unlike AutoCalibrator, this calibrator fits exactly one model and provides more control over outlier rejection, quality thresholds, and per-model parameters.

Parameters:

config (FlexibleCalibConfig | None)

last_autodetect_methods: Dict[str, List[str]]

last_reference_masses_initial: List[float]

last_reference_masses_used: List[float]

last_reference_masses_screened_out: List[float]

last_reference_mass_screening: DataFrame

last_failed_or_excluded_files: DataFrame

calibrate(files, calib_channels_dict=None)

Calibrate all files using the selected calibration method.

Parameters:

files (Sequence[str])
calib_channels_dict (Dict[str, Sequence[float]] | None)

Return type:

DataFrame

class mioXpektron.recalibrate.FlexibleCalibConfig(reference_masses, calibration_method='quad_sqrt', output_folder='calibrated_spectra', output_mz_range=None, max_workers=None, autodetect_tol_da=None, autodetect_tol_ppm=None, autodetect_method='gaussian', autodetect_fallback_policy='max', autodetect_strategy='mz', prefer_recompute_from_channel=False, outlier_threshold=3.0, use_outlier_rejection=True, max_iterations=3, min_calibrants=3, max_ppm_threshold=100.0, fail_on_high_error=False, retry_high_error_with_pruning=False, retry_high_error_with_mz_fallback=False, retry_high_error_max_removals=5, exclude_reference_masses=<factory>, auto_screen_reference_masses=False, screen_max_mean_abs_ppm=50.0, screen_max_median_abs_ppm=None, screen_min_valid_fraction=0.8, screen_min_count=3, screen_exclude_below_mz=1.5, spline_smoothing=None, multisegment_breakpoints=<factory>, instrument_params=<factory>, save_diagnostic_plots=False, verbose=True, auto_tune=False)

Bases: object

Configuration for flexible calibration with a single user-selected method.

Parameters:

reference_masses (List[float])
calibration_method (Literal['quad_sqrt', 'linear_sqrt', 'poly2', 'reflectron', 'multisegment', 'spline', 'physical'])
output_folder (str)
output_mz_range (Tuple[float | None, float | None] | None)
max_workers (int | None)
autodetect_tol_da (float | Sequence[float] | None)
autodetect_tol_ppm (float | None)
autodetect_method (str)
autodetect_fallback_policy (str)
autodetect_strategy (str)
prefer_recompute_from_channel (bool)
outlier_threshold (float)
use_outlier_rejection (bool)
max_iterations (int)
min_calibrants (int)
max_ppm_threshold (float | None)
fail_on_high_error (bool)
retry_high_error_with_pruning (bool)
retry_high_error_with_mz_fallback (bool)
retry_high_error_max_removals (int)
exclude_reference_masses (List[float])
auto_screen_reference_masses (bool)
screen_max_mean_abs_ppm (float)
screen_max_median_abs_ppm (float | None)
screen_min_valid_fraction (float)
screen_min_count (int)
screen_exclude_below_mz (float)
spline_smoothing (float | None)
multisegment_breakpoints (List[float])
instrument_params (Dict[str, float])
save_diagnostic_plots (bool)
verbose (bool)
auto_tune (bool)

reference_masses: List[float]

calibration_method: Literal['quad_sqrt', 'linear_sqrt', 'poly2', 'reflectron', 'multisegment', 'spline', 'physical'] = 'quad_sqrt'

output_folder: str = 'calibrated_spectra'

output_mz_range: Tuple[float | None, float | None] | None = None

max_workers: int | None = None

autodetect_tol_da: float | Sequence[float] | None = None

autodetect_tol_ppm: float | None = None

autodetect_method: str = 'gaussian'

autodetect_fallback_policy: str = 'max'

autodetect_strategy: str = 'mz'

prefer_recompute_from_channel: bool = False

outlier_threshold: float = 3.0

use_outlier_rejection: bool = True

max_iterations: int = 3

min_calibrants: int = 3

max_ppm_threshold: float | None = 100.0

fail_on_high_error: bool = False

retry_high_error_with_pruning: bool = False

retry_high_error_with_mz_fallback: bool = False

retry_high_error_max_removals: int = 5

exclude_reference_masses: List[float]

auto_screen_reference_masses: bool = False

screen_max_mean_abs_ppm: float = 50.0

screen_max_median_abs_ppm: float | None = None

screen_min_valid_fraction: float = 0.8

screen_min_count: int = 3

screen_exclude_below_mz: float = 1.5

spline_smoothing: float | None = None

multisegment_breakpoints: List[float]

instrument_params: Dict[str, float]

save_diagnostic_plots: bool = False

verbose: bool = True

auto_tune: bool = False