Adaptive Parameterization

The adaptive module provides data-driven estimators that replace fixed thresholds and constants with values derived from your spectra. All estimators are opt-in: existing defaults remain unchanged unless you explicitly activate adaptive tuning.

Motivation

Many pipeline parameters (calibration tolerance, outlier threshold, normalization target, denoise cutoffs, baseline quantiles) are typically set by convention. The adaptive module estimates these values from a pilot sample of your spectra, making the pipeline more robust across instruments, sample types, and acquisition conditions.

Quick Start

The simplest way to activate adaptive parameterization is to set auto_tune=True on the relevant config:

from mioXpektron import FlexibleCalibrator, FlexibleCalibConfig

config = FlexibleCalibConfig(
    reference_masses=my_masses,
    calibration_method="quad_sqrt",
    auto_tune=True,           # <-- activates data-driven estimation
)

calibrator = FlexibleCalibrator(config)
summary = calibrator.calibrate(file_list)

When auto_tune=True, the calibrator will:

  1. Estimate autodetect_tol_da from peak widths near calibrant masses.

  2. Derive multisegment_breakpoints from the calibrant distribution.

  3. Enable auto_screen_reference_masses automatically.

  4. After the first fit pass, derive outlier_threshold from residuals.

  5. Derive screen_max_mean_abs_ppm and screen_min_valid_fraction from the batch-level stability table.

The pipeline config supports the same flag:

from mioXpektron import PipelineConfig, run_pipeline

config = PipelineConfig(auto_tune=True)
intensity_df, area_df = run_pipeline(files, config=config)

This estimates mz_tolerance from median channel spacing and normalization_target from the median raw TIC.

The flat-window scanner and denoise evaluator also accept the flag:

from mioXpektron import ScanForFlatRegion
scanner = ScanForFlatRegion(files=my_files, auto_tune=True)
scanner.run()

from mioXpektron.denoise import compare_methods_in_windows
rollup, summary, detail = compare_methods_in_windows(
    x, y, windows,
    auto_tune=True,
    auto_tune_files=my_files,
)

Using Individual Estimators

Each estimator can be called directly if you want fine-grained control:

import numpy as np

from mioXpektron.adaptive import (
    estimate_autodetect_tolerance,
    estimate_outlier_threshold,
    estimate_screening_thresholds,
    estimate_multisegment_breakpoints,
    estimate_normalization_target,
    estimate_mz_tolerance,
    estimate_flat_params,
    estimate_denoise_params,
    estimate_bootstrap_heuristics,
    auto_tune_calib_config,
)

# files, reference_masses, ppm_residuals, and stability_df are assumed to be
# defined by the caller.

# Calibration tolerance from peak widths
tol_da = estimate_autodetect_tolerance(files, reference_masses)

# Outlier threshold from residual distribution
threshold = estimate_outlier_threshold(np.array(ppm_residuals))

# Screening thresholds from stability summary
screen = estimate_screening_thresholds(stability_df)

# Multisegment breakpoints from mass distribution
bps = estimate_multisegment_breakpoints(reference_masses, n_segments=3)

# Pipeline parameters
norm_target = estimate_normalization_target(files)
mz_tol = estimate_mz_tolerance(files)

# Flat-window parameters
flat_overrides = estimate_flat_params(files)

# Denoise parameters
denoise_overrides = estimate_denoise_params(files)

# Bootstrap heuristic overrides
bootstrap_ov = estimate_bootstrap_heuristics(files)

# Build a complete config with data-driven values
config = auto_tune_calib_config(files, reference_masses)

Estimator Reference

Adaptive parameterization helpers for the mioXpektron pipeline.

Each estimator derives a value from data that would otherwise be a fixed constant. All functions are pure (no mutation of global state) and return plain Python / NumPy values that callers feed into the existing config dataclasses.

Usage is opt-in: every config keeps its current defaults; the new auto_tune=True flag triggers adaptive estimation.

mioXpektron.adaptive.estimate_autodetect_tolerance(files, reference_masses, *, sample_n=10, quantile=0.9)

Estimate autodetect_tol_da from observed peak widths near calibrant m/z values.

Reads a sample of spectra, measures the FWHM of the strongest peak within ±1 Da of each reference mass, and returns the quantile-level quantile of those widths (the 0.9 quantile by default), clamped to [0.05, 2.0] Da.

Return type: float
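
The width-to-tolerance step described above can be sketched with NumPy. This is a minimal illustration, not the library's implementation: the helper names fwhm_near and tolerance_from_widths are hypothetical, the FWHM is read off the sampled grid by walking outward from the apex without interpolation, and raising on an empty width list is a choice made for this sketch only.

```python
import numpy as np


def fwhm_near(mz, intensity, target, window=1.0):
    """Approximate FWHM (Da) of the strongest peak within +/-window of target."""
    mz = np.asarray(mz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    mask = np.abs(mz - target) <= window
    if mask.sum() < 3:
        return None  # too few channels to measure a width
    x, y = mz[mask], intensity[mask]
    apex = int(np.argmax(y))
    if y[apex] <= 0:
        return None  # no signal in the window
    half = y[apex] / 2.0
    # Walk outward from the apex until the signal drops to half maximum.
    left = apex
    while left > 0 and y[left] > half:
        left -= 1
    right = apex
    while right < len(y) - 1 and y[right] > half:
        right += 1
    return float(x[right] - x[left])


def tolerance_from_widths(widths, quantile=0.9, bounds=(0.05, 2.0)):
    """Take the requested quantile of the measured widths and clamp it."""
    valid = np.asarray([w for w in widths if w is not None], dtype=float)
    if valid.size == 0:
        raise ValueError("no peak widths could be measured")
    return float(np.clip(np.quantile(valid, quantile), *bounds))
```

Using a high quantile rather than the mean keeps the tolerance wide enough for the broadest calibrant peaks, while the clamp guards against a single malformed peak producing an unusable value.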

mioXpektron.adaptive.estimate_outlier_threshold(residuals, *, target_false_rejection_rate=0.01, bounds=(2.0, 5.0))

Derive outlier_threshold from observed residual spread.

Takes the empirical (1 - target_false_rejection_rate) quantile of the absolute MAD-scaled z-scores and clamps it to bounds.

Return type: float
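
A rough NumPy sketch of the MAD-scaled quantile rule follows. The function name outlier_threshold_sketch is hypothetical, and two details are assumptions because the text does not state them: the MAD is multiplied by 1.4826 (the usual consistency factor for normally distributed data), and degenerate input raises instead of falling back to a default.

```python
import numpy as np


def outlier_threshold_sketch(residuals, target_false_rejection_rate=0.01,
                             bounds=(2.0, 5.0)):
    """Quantile of absolute robust z-scores, clamped to bounds."""
    r = np.asarray(residuals, dtype=float)
    r = r[np.isfinite(r)]
    if r.size == 0:
        raise ValueError("no finite residuals")
    center = np.median(r)
    mad = np.median(np.abs(r - center))
    if mad == 0:
        raise ValueError("MAD is zero; residuals have no spread")
    # Assumption: scale the MAD so it estimates sigma for normal data.
    z = np.abs(r - center) / (1.4826 * mad)
    q = np.quantile(z, 1.0 - target_false_rejection_rate)
    return float(np.clip(q, *bounds))
```

For roughly normal residuals and the default rate of 0.01, this lands near 2.6, so about 1% of well-behaved points would be rejected. Heavy-tailed residuals push the quantile up until it reaches the upper bound.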

mioXpektron.adaptive.estimate_screening_thresholds(stability_df, *, ppm_quantile=0.85, valid_frac_quantile=0.2)

Derive screen_max_mean_abs_ppm and screen_min_valid_fraction from a reference-mass stability table (output of summarize_reference_mass_stability).

Returns a dict with keys screen_max_mean_abs_ppm and screen_min_valid_fraction.

Return type: Dict[str, float]
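
One plausible reading of the two quantile parameters is sketched below. It is an assumption that each threshold is a plain quantile of one per-mass column of the stability table. The function name and the two input arrays (per-mass mean absolute ppm error and per-mass valid fraction) are hypothetical stand-ins, since the table's column names are not given here.

```python
import numpy as np


def screening_thresholds_sketch(mean_abs_ppm, valid_fraction,
                                ppm_quantile=0.85, valid_frac_quantile=0.2):
    """Derive both screening thresholds as quantiles of per-mass statistics."""
    ppm = np.asarray(mean_abs_ppm, dtype=float)
    frac = np.asarray(valid_fraction, dtype=float)
    return {
        # Masses whose mean |ppm| error exceeds this would be screened out.
        "screen_max_mean_abs_ppm": float(np.quantile(ppm, ppm_quantile)),
        # Masses matched in fewer spectra than this would be screened out.
        "screen_min_valid_fraction": float(np.quantile(frac, valid_frac_quantile)),
    }
```

Under this reading, the defaults would keep roughly the most accurate 85% of reference masses and drop roughly the 20% that are matched least often.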

mioXpektron.adaptive.estimate_multisegment_breakpoints(reference_masses, n_segments=3)

Place segment breakpoints at quantile boundaries of the reference mass range so each segment contains roughly equal calibrant counts.

Return type: List[float]
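
Quantile-based breakpoint placement can be illustrated in a few lines of NumPy. The function name quantile_breakpoints is hypothetical, and returning an empty list for fewer than two segments is an assumption of this sketch.

```python
import numpy as np


def quantile_breakpoints(reference_masses, n_segments=3):
    """Return n_segments - 1 breakpoints at evenly spaced quantiles."""
    masses = np.sort(np.asarray(reference_masses, dtype=float))
    if n_segments < 2:
        return []  # a single segment needs no breakpoints
    probs = np.arange(1, n_segments) / n_segments
    return [float(b) for b in np.quantile(masses, probs)]
```

Because the breakpoints follow the calibrant distribution rather than the m/z axis, a region dense in reference masses gets narrower segments than a sparse one.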

mioXpektron.adaptive.estimate_normalization_target(files, *, sample_n=20, mz_min=None, mz_max=None)

Estimate normalization_target as the median raw TIC across a sample of spectra. Falls back to 1e6 on failure.

Return type: float
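
The median-TIC rule with its fallback might look like the sketch below. It works on in-memory (mz, intensity) pairs instead of files, because the file-reading step is not described here. The function name is hypothetical, and treating non-positive or non-finite TIC values as failures is an assumption.

```python
import numpy as np


def normalization_target_sketch(spectra, mz_min=None, mz_max=None, fallback=1e6):
    """Median total ion current over spectra given as (mz, intensity) pairs."""
    tics = []
    for mz, intensity in spectra:
        mz = np.asarray(mz, dtype=float)
        intensity = np.asarray(intensity, dtype=float)
        mask = np.ones(mz.shape, dtype=bool)
        if mz_min is not None:
            mask &= mz >= mz_min
        if mz_max is not None:
            mask &= mz <= mz_max
        tic = float(intensity[mask].sum())
        # Assumption: skip spectra whose TIC is unusable.
        if np.isfinite(tic) and tic > 0:
            tics.append(tic)
    return float(np.median(tics)) if tics else fallback
```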

mioXpektron.adaptive.estimate_mz_tolerance(files, *, sample_n=10, multiplier=3.0)

Estimate mz_tolerance from observed median m/z spacing, scaled by multiplier. Clamped to [0.01, 1.0].

Return type: float
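
As a minimal sketch of the spacing rule, the snippet below takes the median channel spacing of each m/z axis, then the median across spectra, and scales and clamps the result. Aggregating per spectrum before taking the overall median is an assumption, and the function name is hypothetical.

```python
import numpy as np


def mz_tolerance_sketch(mz_axes, multiplier=3.0, bounds=(0.01, 1.0)):
    """Scale the median m/z channel spacing and clamp it to bounds."""
    spacings = [
        float(np.median(np.diff(np.asarray(mz, dtype=float))))
        for mz in mz_axes
        if len(mz) > 1
    ]
    if not spacings:
        raise ValueError("need at least one axis with two or more channels")
    return float(np.clip(multiplier * np.median(spacings), *bounds))
```

With the default multiplier of 3.0, an axis sampled every 0.02 yields a tolerance of 0.06, so a peak can shift by up to three channels and still match.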

mioXpektron.adaptive.estimate_flat_params(files, *, sample_n=10)

Estimate savgol_window and quantile thresholds for FlatParams from the data.

Returns a dict of keyword overrides suitable for dataclasses.replace(FlatParams(), **result).

Return type: Dict[str, object]
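
The dataclasses.replace pattern mentioned above is shown here with a stand-in dataclass so the snippet runs without the library. FlatParamsStandIn, the quantile_low field, and all values are hypothetical; only savgol_window is a field name taken from the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FlatParamsStandIn:
    # Stand-in for FlatParams; defaults are placeholders, not library values.
    savgol_window: int = 11
    quantile_low: float = 0.1  # hypothetical field


# A dict shaped like the estimator's return value (values are made up).
overrides = {"savgol_window": 21}

defaults = FlatParamsStandIn()
tuned = replace(defaults, **overrides)  # new instance; defaults is untouched
```

Only the keys present in the dict are replaced, so any field the estimator does not return keeps its default.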

mioXpektron.adaptive.estimate_denoise_params(files, *, sample_n=5)

Estimate hf_cutoff_frac and max_peaks for the denoise selection evaluator from pilot spectra.

Returns a dict of keyword overrides for compare_denoising_methods.

Return type: Dict[str, object]

mioXpektron.adaptive.estimate_bootstrap_heuristics(files, *, sample_n=10)

Derive adaptive bootstrap peak-matching constants from observed channel statistics (noise, spacing, range).

Returns a dict whose keys match the _BOOTSTRAP_* constant names in _models.py (without the leading underscore).

Return type: Dict[str, float]

mioXpektron.adaptive.auto_tune_calib_config(files, reference_masses, *, base_config=None, sample_n=10)

Build a FlexibleCalibConfig with data-driven parameters.

Starts from base_config (or the default) and replaces tolerance, screening values, and breakpoints with adaptive estimates. The caller can further override any field afterwards.

Returns a FlexibleCalibConfig instance.

Parameters: