Calibration

The recalibrate module converts raw, channel-indexed ToF-SIMS spectra to a calibrated m/z axis. It supports several time-of-flight (TOF) models and provides both automatic and manual calibration workflows.

Current model families

The recalibration backend supports the following model families:

  • quad_sqrt: empirical TOF model t = k*sqrt(m) + c*m + t0

  • linear_sqrt: reduced two-parameter sqrt model

  • poly2: empirical quadratic calibration in channel space

  • reflectron: extended TOF model with a reflectron correction term

  • spline: non-parametric spline calibration

  • multisegment: piecewise quad_sqrt over user-defined mass ranges

multisegment is available only by explicit selection and remains experimental. If you use it, choose breakpoints so that every segment contains at least three reference masses, because each segment fits its own three-parameter quad_sqrt model (k, c, t0).
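As an illustration of the quad_sqrt family, the sketch below evaluates t = k*sqrt(m) + c*m + t0 and inverts it by treating the expression as a quadratic in sqrt(m). The function names are hypothetical and this is not the package's own implementation, only a minimal rendering of the formula given above.

```python
import math


def quad_sqrt_time(m, k, c, t0):
    """Forward model: flight time (or channel) predicted for mass m."""
    return k * math.sqrt(m) + c * m + t0


def quad_sqrt_mass(t, k, c, t0):
    """Invert t = c*s**2 + k*s + t0 for s = sqrt(m) and return m = s**2."""
    if c == 0.0:
        s = (t - t0) / k  # the model reduces to a straight line in sqrt(m)
    else:
        disc = k * k - 4.0 * c * (t0 - t)
        if disc < 0.0:
            raise ValueError("no real mass corresponds to this time value")
        # Choose the root that tends to (t - t0) / k as c approaches zero.
        s = (-k + math.sqrt(disc)) / (2.0 * c)
    if s < 0.0:
        raise ValueError("time value lies below the model offset")
    return s * s
```

Applying quad_sqrt_mass to the output of quad_sqrt_time with the same coefficients should return the original mass up to floating-point rounding.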

Quick Example

from mioXpektron import FlexibleCalibrator, FlexibleCalibConfig

config = FlexibleCalibConfig(
    reference_masses=[1.0073, 27.0229, 29.0386, 41.0386, 57.0699, 104.1075],
    calibration_method="quad_sqrt",
    autodetect_method="parabolic",
    autodetect_fallback_policy="max",
    autodetect_strategy="mz",
    auto_screen_reference_masses=True,
)

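# file_list holds the spectrum files to calibrate; it is supplied by the caller.
calibrator = FlexibleCalibrator(config)
# calibrate() returns a DataFrame summary (see API Reference).
summary = calibrator.calibrate(file_list)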

AutoCalibrator

Fully automatic calibration using known reference masses:

from mioXpektron import AutoCalibrator, AutoCalibConfig

config = AutoCalibConfig(
    reference_masses=[1.0073, 22.9892, 38.9632, 58.0657, 86.0970, 184.0733],
    model="quad_sqrt",
    autodetect_method="gaussian",
    autodetect_fallback_policy="max",
    autodetect_strategy="mz",
    output_folder="calibrated/",
    max_workers=4,
)

calibrator = AutoCalibrator(config)
results = calibrator.calibrate(file_list)

The calibrator:

  1. Auto-detects calibrant channels from either the m/z axis or a channel-only bootstrap path.

  2. Applies the requested peak-picking method near each candidate calibrant.

  3. Fits the selected model or, for AutoCalibrator, compares the requested model set and keeps the best valid fit.

  4. Applies the calibration to the full spectrum.

  5. Writes calibrated spectra and summary tables to the output folder.
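The model-fitting step (step 3) can be pictured, for quad_sqrt, as an ordinary least-squares problem in the three coefficients. The sketch below is standard-library only and solves the normal equations directly. The helper names are hypothetical, and the package may use a different solver, weighting, or robust loss.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: calibrants do not constrain the model")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_quad_sqrt(masses, channels):
    """Least-squares estimate of (k, c, t0) in t = k*sqrt(m) + c*m + t0."""
    if len(masses) < 3:
        raise ValueError("need at least three calibrants for three parameters")
    rows = [[math.sqrt(m), m, 1.0] for m in masses]
    # Normal equations: (A^T A) x = A^T t
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * t for r, t in zip(rows, channels)) for i in range(3)]
    k, c, t0 = solve_linear_system(ata, atb)
    return k, c, t0
```

The three-calibrant minimum in this sketch mirrors the min_calibrants=3 default listed in the API Reference.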

Autodetection modes

Both calibrators support two autodetection strategies:

  • autodetect_strategy="mz" searches around the existing m/z axis.

  • autodetect_strategy="bootstrap" reconstructs approximate channel positions directly from Channel and Intensity columns.

The bootstrap strategy now estimates both the slope and intercept of the channel-to-mass relationship before searching locally for each calibrant, so it is appropriate for spectra that do not yet have a trustworthy m/z axis.
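One way to picture the slope-and-intercept estimate is a straight-line fit of channel against sqrt(m) over a few tentatively identified anchor peaks, followed by a predicted search window for each calibrant. This is only a sketch under that assumption. The source does not state how the bootstrap path identifies its anchors, and all names below are hypothetical.

```python
import math


def fit_sqrt_line(anchor_masses, anchor_channels):
    """Ordinary least squares for channel = slope * sqrt(m) + intercept."""
    xs = [math.sqrt(m) for m in anchor_masses]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two anchors")
    mean_x = sum(xs) / n
    mean_y = sum(anchor_channels) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("anchors must have distinct masses")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, anchor_channels))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predicted_windows(reference_masses, slope, intercept, half_width):
    """Channel window (lo, hi) in which to search for each reference mass."""
    windows = {}
    for m in reference_masses:
        centre = slope * math.sqrt(m) + intercept
        windows[m] = (
            int(math.floor(centre - half_width)),
            int(math.ceil(centre + half_width)),
        )
    return windows
```

Each window can then be handed to a peak-picking method, which refines the calibrant position locally.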

Peak-picking methods

The recalibration backend supports:

  • max

  • centroid

  • centroid_raw

  • parabolic

  • gaussian

  • voigt

Refined methods return fractional channel positions. centroid is now baseline-aware and apex-focused; centroid_raw preserves the earlier windowed center-of-mass behavior for comparison.
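To show what a fractional channel position means, here is a minimal sketch of three textbook pickers: an integer local maximum, three-point parabolic interpolation around the apex, and a windowed center of mass. The function names are hypothetical and the code is not taken from the package. In particular, it does not reproduce the baseline-aware centroid variant.

```python
def pick_max(intensity, lo, hi):
    """Index of the highest sample in intensity[lo:hi]."""
    return max(range(lo, hi), key=lambda i: intensity[i])


def pick_parabolic(intensity, lo, hi):
    """Vertex of the parabola through the apex and its two neighbours."""
    i = pick_max(intensity, lo, hi)
    if i == 0 or i == len(intensity) - 1:
        return float(i)  # no neighbour on one side; keep the integer apex
    left, mid, right = intensity[i - 1], intensity[i], intensity[i + 1]
    curvature = left - 2.0 * mid + right
    if curvature >= 0.0:
        return float(i)  # flat or non-peaked triple; no refinement possible
    return i + 0.5 * (left - right) / curvature


def pick_center_of_mass(intensity, lo, hi):
    """Intensity-weighted mean channel over intensity[lo:hi]."""
    total = sum(intensity[lo:hi])
    if total <= 0.0:
        raise ValueError("window has no signal")
    return sum(i * intensity[i] for i in range(lo, hi)) / total
```

For a peak that is exactly parabolic near its apex, pick_parabolic recovers the true apex position, whereas pick_max can only return the nearest integer channel.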

Fallback policy

autodetect_fallback_policy controls what happens when a refined method fails for a specific calibrant:

  • "max": fall back to a robust local maximum pick

  • "nan": keep the calibrant unresolved

  • "raise": stop the run immediately

The actual method used per calibrant is recorded in calibrator.last_autodetect_methods.
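The three policies amount to a small dispatch around the refined pick. The sketch below assumes the refined and maximum pickers are passed in as callables. The function name, the returned labels, and the exception types are hypothetical and may not match what the package records in last_autodetect_methods.

```python
import math


def pick_with_fallback(refined_pick, max_pick, policy="max"):
    """Run refined_pick(); on failure apply the fallback policy.

    Returns a (position, method_used) pair.
    """
    if policy not in ("max", "nan", "raise"):
        raise ValueError(f"unknown fallback policy: {policy!r}")
    try:
        position = refined_pick()
        if math.isfinite(position):
            return position, "refined"
        error = ValueError("refined pick returned a non-finite position")
    except (ValueError, RuntimeError) as exc:
        error = exc
    if policy == "raise":
        raise RuntimeError("refined peak pick failed") from error
    if policy == "nan":
        return math.nan, "unresolved"  # calibrant stays unresolved
    return max_pick(), "max"  # robust local-maximum fallback
```

Returning the method alongside the position is what makes a per-calibrant record of the method actually used possible.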

FlexibleCalibrator

For more control over the calibration process:

from mioXpektron import FlexibleCalibrator, FlexibleCalibConfig

config = FlexibleCalibConfig(
    reference_masses=[1.0073, 27.0229, 29.0386, 41.0386, 57.0699, 104.1075],
    calibration_method="quad_sqrt",
    autodetect_method="parabolic",
    autodetect_fallback_policy="max",
    autodetect_strategy="mz",
    auto_screen_reference_masses=True,
    multisegment_breakpoints=[50, 150],
    outlier_threshold=3.0,
)

calibrator = FlexibleCalibrator(config)
summary = calibrator.calibrate(file_list)

Features:

  • Single-model calibration, for evaluating one model family directly

  • Multiple peak-picking methods with explicit fallback control

  • Outlier detection using Huber regression

  • PPM and Dalton error reporting

  • Optional two-pass reference-mass screening with per-mass residual summaries
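For reference, the two error measures are conventionally defined as below. The function names are hypothetical, and the package's own summary columns may be named or aggregated differently.

```python
def dalton_error(measured_mz, reference_mz):
    """Absolute mass error in daltons."""
    return measured_mz - reference_mz


def ppm_error(measured_mz, reference_mz):
    """Relative mass error in parts per million."""
    return (measured_mz - reference_mz) / reference_mz * 1e6
```

A calibrated peak at m/z 100.001 for a reference mass of 100.000 is off by 0.001 Da, which is 10 ppm.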

Adaptive parameterization

Set auto_tune=True to derive calibration parameters from the data:

config = FlexibleCalibConfig(
    reference_masses=reference_masses,
    calibration_method="quad_sqrt",
    auto_tune=True,
)

calibrator = FlexibleCalibrator(config)
summary = calibrator.calibrate(file_list)

When auto_tune is active, the calibrator estimates:

  • autodetect_tol_da from observed peak widths near calibrant masses

  • multisegment_breakpoints from the calibrant mass distribution

  • outlier_threshold from the residual distribution after the first fit pass

  • screen_max_mean_abs_ppm and screen_min_valid_fraction from batch-level stability statistics

Each estimate falls back to a default value when too little data is available to compute it. See Adaptive Parameterization for the individual estimator functions.
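The estimate-with-fallback pattern can be sketched for the tolerance case as follows. Everything here is an assumption made for illustration: the function name, the scale factor of 2, the minimum of three peaks, and the 0.2 Da fallback are not taken from the package.

```python
from statistics import median


def estimate_tolerance_da(peak_widths_da, scale=2.0, min_peaks=3, fallback=0.2):
    """Tolerance as a multiple of the median peak width, with a fallback.

    peak_widths_da holds one width (in Da) per calibrant peak; None marks
    a peak whose width could not be measured.
    """
    usable = [w for w in peak_widths_da if w is not None and w > 0.0]
    if len(usable) < min_peaks:
        return fallback  # too little data: use the default instead
    return scale * median(usable)
```

Using the median rather than the mean keeps a single unusually broad peak from inflating the tolerance.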

Reference-mass screening

FlexibleCalibrator can perform a fit-only first pass, summarize calibrant stability, and refit using only stable reference masses:

config = FlexibleCalibConfig(
    reference_masses=reference_masses,
    calibration_method="quad_sqrt",
    auto_screen_reference_masses=True,
    screen_max_mean_abs_ppm=50.0,
    screen_min_valid_fraction=0.8,
    screen_min_count=3,
    screen_exclude_below_mz=1.5,
)

calibrator = FlexibleCalibrator(config)
summary = calibrator.calibrate(file_list)

print(calibrator.last_reference_masses_used)
print(calibrator.last_reference_masses_screened_out)

This is useful when a small number of unstable tissue-specific anchors dominate the overall calibration error.
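The screening decision can be sketched as a filter over per-mass residuals from the first pass. The sketch below reuses the meaning suggested by the configuration names, but the exact semantics are assumptions. In particular, it treats the minimum count as the smallest acceptable number of surviving reference masses and keeps the original set when fewer survive. The function name is hypothetical.

```python
def screen_reference_masses(
    ppm_by_mass,
    n_files,
    max_mean_abs_ppm=50.0,
    min_valid_fraction=0.8,
    min_count=3,
    exclude_below_mz=1.5,
):
    """Split reference masses into (kept, screened_out).

    ppm_by_mass maps each reference mass to its per-file ppm residuals
    from the first pass; None marks a file where it was not resolved.
    """
    if n_files <= 0:
        raise ValueError("n_files must be positive")
    kept, screened_out = [], []
    for mass in sorted(ppm_by_mass):
        valid = [abs(r) for r in ppm_by_mass[mass] if r is not None]
        stable = (
            mass >= exclude_below_mz
            and bool(valid)
            and len(valid) / n_files >= min_valid_fraction
            and sum(valid) / len(valid) <= max_mean_abs_ppm
        )
        (kept if stable else screened_out).append(mass)
    if len(kept) < min_count:
        # Too few stable anchors to refit; keep the original set instead.
        return sorted(ppm_by_mass), []
    return kept, screened_out
```

With the default thresholds, a mass is dropped if it lies below m/z 1.5, is resolved in fewer than 80% of the files, or has a mean absolute residual above 50 ppm.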

Multisegment calibration

multisegment fits independent quad_sqrt models over mass intervals defined by multisegment_breakpoints. For example:

config = FlexibleCalibConfig(
    reference_masses=reference_masses,
    calibration_method="multisegment",
    multisegment_breakpoints=[50, 150],
)

produces three m/z segments: 0-50, 50-150, and 150-inf (no upper bound).
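Before selecting multisegment, it can help to check how many reference masses fall into each segment. The sketch below derives the segments from the breakpoints and flags those with fewer than three reference masses. The helper names are hypothetical, and assigning a mass that sits exactly on a breakpoint to the upper segment is an assumption.

```python
from bisect import bisect_right


def segment_bounds(breakpoints):
    """Turn breakpoints into (lower, upper) mass intervals."""
    edges = [0.0] + sorted(breakpoints) + [float("inf")]
    return list(zip(edges[:-1], edges[1:]))


def references_per_segment(reference_masses, breakpoints):
    """Number of reference masses falling in each segment."""
    cuts = sorted(breakpoints)
    counts = [0] * (len(cuts) + 1)
    for mass in reference_masses:
        counts[bisect_right(cuts, mass)] += 1
    return counts


def underpopulated_segments(reference_masses, breakpoints, min_per_segment=3):
    """Segments with too few reference masses for a three-parameter fit."""
    bounds = segment_bounds(breakpoints)
    counts = references_per_segment(reference_masses, breakpoints)
    return [b for b, n in zip(bounds, counts) if n < min_per_segment]
```

For the reference masses used in the Quick Example and breakpoints [50, 150], the per-segment counts are 4, 2, and 0, so the upper two segments would need more calibrants or different breakpoints.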

Debugging Calibration

Use the debug calibrator for detailed diagnostic output:

from mioXpektron import FlexibleCalibratorDebug, FlexibleCalibConfigDebug

config = FlexibleCalibConfigDebug(calibration_method="quad_sqrt")
calibrator = FlexibleCalibratorDebug(config)

# Produces detailed logs of each calibration step
result = calibrator.calibrate(channels, known_channels, known_masses)

The debug version logs:

  • Peak picking decisions at each reference mass

  • Model fit residuals and quality metrics

  • Outlier detection and removal steps

  • Final calibration coefficients and errors

Notebook workflow

The calibration comparison notebook NoteBooks/_01_Calibration_Methods_Comparison.ipynb has been updated to expose:

  • autodetect_fallback_policy

  • autodetect_strategy

  • centroid_raw

  • reference-mass screening settings

  • per-spectrum autodetect diagnostics

When rerunning the notebook after backend changes, restart the kernel or rerun the import cell so the local recalibration modules are reloaded explicitly.

API Reference

class mioXpektron.recalibrate.AutoCalibrator(config=None)

Bases: object

Automatic multi-model calibrator.

Fits all requested models, selects the best one per file, and writes calibrated spectra.

Parameters:

config (AutoCalibConfig | None)

last_autodetect_methods: Dict[str, List[str]]
calibrate(files, calib_channels_dict=None)

Calibrate all files with automatic model selection.

Parameters:
  • files

  • calib_channels_dict

Return type:

DataFrame

class mioXpektron.recalibrate.AutoCalibConfig(reference_masses, output_folder='calibrated_spectra', max_workers=None, autodetect_tol_da=None, autodetect_tol_ppm=None, autodetect_method='gaussian', autodetect_fallback_policy='max', autodetect_strategy='mz', prefer_recompute_from_channel=False, outlier_threshold=3.0, use_outlier_rejection=True, max_iterations=3, model=None, models_to_try=None, prefer_physical_models=True, min_calibrants=3, max_ppm_warning=100.0, max_ppm_error=500.0, use_bootstrap_init=True, spline_smoothing=None, multisegment_breakpoints=<factory>, instrument_params=<factory>)

Bases: object

Universal calibration configuration with robust options.

Parameters:
  • reference_masses (list of float) – Known calibrant ion masses (m/z).

  • model (str, optional) – Convenience shortcut: a single model name (or a common alias such as 'quadratic', 'tof', or 'linear'). Resolved into models_to_try during __post_init__. Ignored when models_to_try is explicitly provided.

  • models_to_try (list of str, optional) – Explicit list of model names to fit. Default: all production-ready models (excludes experimental ones such as multisegment and physical).

  • output_folder (str)

  • max_workers (int | None)

  • autodetect_tol_da (float | None)

  • autodetect_tol_ppm (float | None)

  • autodetect_method (str)

  • autodetect_fallback_policy (str)

  • autodetect_strategy (str)

  • prefer_recompute_from_channel (bool)

  • outlier_threshold (float)

  • use_outlier_rejection (bool)

  • max_iterations (int)

  • prefer_physical_models (bool)

  • min_calibrants (int)

  • max_ppm_warning (float)

  • max_ppm_error (float)

  • use_bootstrap_init (bool)

  • spline_smoothing (float | None)

  • multisegment_breakpoints (List[float])

  • instrument_params (Dict[str, float])

reference_masses: List[float]
output_folder: str = 'calibrated_spectra'
max_workers: int | None = None
autodetect_tol_da: float | None = None
autodetect_tol_ppm: float | None = None
autodetect_method: str = 'gaussian'
autodetect_fallback_policy: str = 'max'
autodetect_strategy: str = 'mz'
prefer_recompute_from_channel: bool = False
outlier_threshold: float = 3.0
use_outlier_rejection: bool = True
max_iterations: int = 3
model: str | None = None
models_to_try: List[str] | None = None
prefer_physical_models: bool = True
min_calibrants: int = 3
max_ppm_warning: float = 100.0
max_ppm_error: float = 500.0
use_bootstrap_init: bool = True
spline_smoothing: float | None = None
multisegment_breakpoints: List[float]
instrument_params: Dict[str, float]

class mioXpektron.recalibrate.FlexibleCalibrator(config=None)

Bases: object

Single-model calibrator with user-selected method.

Unlike AutoCalibrator, this calibrator fits exactly one model and provides more control over outlier rejection, quality thresholds, and per-model parameters.

Parameters:

config (FlexibleCalibConfig | None)

last_autodetect_methods: Dict[str, List[str]]
last_reference_masses_initial: List[float]
last_reference_masses_used: List[float]
last_reference_masses_screened_out: List[float]
last_reference_mass_screening: DataFrame
last_failed_or_excluded_files: DataFrame
calibrate(files, calib_channels_dict=None)

Calibrate all files using the selected calibration method.

Parameters:
  • files

  • calib_channels_dict

Return type:

DataFrame

class mioXpektron.recalibrate.FlexibleCalibConfig(reference_masses, calibration_method='quad_sqrt', output_folder='calibrated_spectra', output_mz_range=None, max_workers=None, autodetect_tol_da=None, autodetect_tol_ppm=None, autodetect_method='gaussian', autodetect_fallback_policy='max', autodetect_strategy='mz', prefer_recompute_from_channel=False, outlier_threshold=3.0, use_outlier_rejection=True, max_iterations=3, min_calibrants=3, max_ppm_threshold=100.0, fail_on_high_error=False, retry_high_error_with_pruning=False, retry_high_error_with_mz_fallback=False, retry_high_error_max_removals=5, exclude_reference_masses=<factory>, auto_screen_reference_masses=False, screen_max_mean_abs_ppm=50.0, screen_max_median_abs_ppm=None, screen_min_valid_fraction=0.8, screen_min_count=3, screen_exclude_below_mz=1.5, spline_smoothing=None, multisegment_breakpoints=<factory>, instrument_params=<factory>, save_diagnostic_plots=False, verbose=True, auto_tune=False)

Bases: object

Configuration for flexible calibration with a single user-selected method.

Parameters:
  • reference_masses (List[float])

  • calibration_method (Literal['quad_sqrt', 'linear_sqrt', 'poly2', 'reflectron', 'multisegment', 'spline', 'physical'])

  • output_folder (str)

  • output_mz_range (Tuple[float | None, float | None] | None)

  • max_workers (int | None)

  • autodetect_tol_da (float | Sequence[float] | None)

  • autodetect_tol_ppm (float | None)

  • autodetect_method (str)

  • autodetect_fallback_policy (str)

  • autodetect_strategy (str)

  • prefer_recompute_from_channel (bool)

  • outlier_threshold (float)

  • use_outlier_rejection (bool)

  • max_iterations (int)

  • min_calibrants (int)

  • max_ppm_threshold (float | None)

  • fail_on_high_error (bool)

  • retry_high_error_with_pruning (bool)

  • retry_high_error_with_mz_fallback (bool)

  • retry_high_error_max_removals (int)

  • exclude_reference_masses (List[float])

  • auto_screen_reference_masses (bool)

  • screen_max_mean_abs_ppm (float)

  • screen_max_median_abs_ppm (float | None)

  • screen_min_valid_fraction (float)

  • screen_min_count (int)

  • screen_exclude_below_mz (float)

  • spline_smoothing (float | None)

  • multisegment_breakpoints (List[float])

  • instrument_params (Dict[str, float])

  • save_diagnostic_plots (bool)

  • verbose (bool)

  • auto_tune (bool)

reference_masses: List[float]
calibration_method: Literal['quad_sqrt', 'linear_sqrt', 'poly2', 'reflectron', 'multisegment', 'spline', 'physical'] = 'quad_sqrt'
output_folder: str = 'calibrated_spectra'
output_mz_range: Tuple[float | None, float | None] | None = None
max_workers: int | None = None
autodetect_tol_da: float | Sequence[float] | None = None
autodetect_tol_ppm: float | None = None
autodetect_method: str = 'gaussian'
autodetect_fallback_policy: str = 'max'
autodetect_strategy: str = 'mz'
prefer_recompute_from_channel: bool = False
outlier_threshold: float = 3.0
use_outlier_rejection: bool = True
max_iterations: int = 3
min_calibrants: int = 3
max_ppm_threshold: float | None = 100.0
fail_on_high_error: bool = False
retry_high_error_with_pruning: bool = False
retry_high_error_with_mz_fallback: bool = False
retry_high_error_max_removals: int = 5
exclude_reference_masses: List[float]
auto_screen_reference_masses: bool = False
screen_max_mean_abs_ppm: float = 50.0
screen_max_median_abs_ppm: float | None = None
screen_min_valid_fraction: float = 0.8
screen_min_count: int = 3
screen_exclude_below_mz: float = 1.5
spline_smoothing: float | None = None
multisegment_breakpoints: List[float]
instrument_params: Dict[str, float]
save_diagnostic_plots: bool = False
verbose: bool = True
auto_tune: bool = False