mioXpektron.baseline.flat_window_suggester
flat_window_suggester_polars.py
Small application to discover common “flat” m/z windows across a set of ToF‑SIMS spectra. Inputs are 3‑column tables: Channel, m/z, intensity (case‑insensitive).
Key changes vs. the original:
- Replaced all pandas operations with Polars (Rust/Arrow backend).
- Added support for providing an explicit list of file paths or glob patterns, so data can be spread across many folders (no need for a single root dir).
- Kept the numerical core (NumPy/SciPy) for smoothing & derivatives.
What it does (unchanged conceptually)
Per spectrum:
- Smooth intensities (Savitzky–Golay) and compute 1st/2nd derivatives.
- Flag baseline‑candidate points where simultaneously:
- Merge contiguous candidate points into segments; keep segments that satisfy minimum width & minimum number of points.
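The per-spectrum steps can be sketched roughly as follows. This is an illustration only, not the module's implementation: the function name `find_flat_segments_sketch` is hypothetical, and the candidate rule (smoothed intensity, absolute first derivative and absolute second derivative each at or below their own quantile) is an assumption inferred from the `FlatParams` field names. The defaults mirror the `FlatParams` defaults.

```python
import numpy as np
from scipy.signal import savgol_filter


def find_flat_segments_sketch(x, y, y_quantile=0.2, grad_quantile=0.4,
                              curv_quantile=0.4, savgol_window=11,
                              savgol_poly=2, min_width=0.2, min_points=20):
    """Return [(lo, hi, n_points), ...] for one spectrum (illustrative only)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Smooth and differentiate with the same Savitzky-Golay filter.
    # Derivatives are taken per sample index, which is sufficient here
    # because they are only compared against their own quantiles.
    ys = savgol_filter(y, savgol_window, savgol_poly)
    d1 = np.abs(savgol_filter(y, savgol_window, savgol_poly, deriv=1))
    d2 = np.abs(savgol_filter(y, savgol_window, savgol_poly, deriv=2))

    # ASSUMPTION: a point is a baseline candidate when all three
    # quantities are at or below their respective quantiles.
    mask = (
        (ys <= np.quantile(ys, y_quantile))
        & (d1 <= np.quantile(d1, grad_quantile))
        & (d2 <= np.quantile(d2, curv_quantile))
    )

    # Merge contiguous candidate points into runs [start, stop).
    padded = np.concatenate(([False], mask, [False])).astype(int)
    edges = np.flatnonzero(np.diff(padded))
    starts, stops = edges[::2], edges[1::2]

    # Keep runs that are long enough in points and wide enough in m/z.
    segments = []
    for s, e in zip(starts, stops):
        lo, hi, n = x[s], x[e - 1], e - s
        if n >= min_points and (hi - lo) >= min_width:
            segments.append((float(lo), float(hi), int(n)))
    return segments
```

Calling it with an m/z array and an intensity array returns segments that lie away from strong peaks. Quantile thresholds adapt to each spectrum, so no absolute intensity cut-off has to be chosen.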
Across all spectra:
- Discretize the global m/z range into bins (width = bin_width).
- For each file, mark bins covered by any of its segments.
- Compute the coverage fraction per bin (#files covering bin / #files total).
- Extract contiguous regions whose coverage ≥ coverage_threshold.
- Rank regions by mean coverage (then by width) and return top_k windows.
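The aggregation steps can be illustrated with a small pure-Python sketch. The function name `aggregate_windows_sketch` and its return shape are hypothetical, and it assumes a bin counts as covered when a segment overlaps it even partially. The real `aggregate_common_windows` also returns a Polars coverage table, which is omitted here.

```python
import math


def aggregate_windows_sketch(segments_by_file, x_min, x_max,
                             bin_width=0.1, coverage_threshold=0.5, top_k=6):
    """Map {file: [(lo, hi), ...]} to a ranked list of (lo, hi, mean_coverage)."""
    n_bins = max(1, math.ceil((x_max - x_min) / bin_width))
    n_files = len(segments_by_file)
    counts = [0] * n_bins

    for segments in segments_by_file.values():
        covered = set()  # each file contributes at most once per bin
        for lo, hi in segments:
            # ASSUMPTION: partial overlap is enough to mark a bin.
            first = max(0, int((lo - x_min) // bin_width))
            last = min(n_bins - 1, int((hi - x_min) // bin_width))
            covered.update(range(first, last + 1))
        for b in covered:
            counts[b] += 1

    coverage = [c / n_files for c in counts] if n_files else [0.0] * n_bins

    # Extract contiguous runs of bins at or above the threshold.
    # The trailing -inf sentinel closes a run that reaches the last bin.
    regions, start = [], None
    for b, cov in enumerate(coverage + [float("-inf")]):
        if cov >= coverage_threshold and start is None:
            start = b
        elif cov < coverage_threshold and start is not None:
            run = coverage[start:b]
            lo = x_min + start * bin_width
            hi = x_min + b * bin_width
            regions.append((lo, hi, sum(run) / len(run)))
            start = None

    # Rank by mean coverage, then by width, and keep the best top_k.
    regions.sort(key=lambda r: (r[2], r[1] - r[0]), reverse=True)
    return regions[:top_k]
```

Counting each file at most once per bin keeps the coverage fraction within [0, 1] even when several segments from one file fall into the same bin.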
Outputs
out_dir / per_file_segments.csv (Polars CSV)
out_dir / flat_windows_suggestions.csv (Polars CSV with coverage stats)
out_dir / flat_windows.json (list[[lo, hi], …])
out_dir / coverage_curve.(png|pdf) (plot of coverage vs m/z)
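As a sketch of how the `flat_windows.json` payload might be consumed downstream, the snippet below parses a list of `[lo, hi]` pairs and tests whether an m/z value falls inside any window. The helper names and the example values are hypothetical, and treating the window bounds as inclusive is an assumption.

```python
import json


def load_flat_windows(text):
    """Parse the JSON payload and return a list of (lo, hi) float tuples."""
    return [(float(lo), float(hi)) for lo, hi in json.loads(text)]


def in_any_window(mz, windows):
    """True if mz lies inside at least one [lo, hi] window (inclusive)."""
    return any(lo <= mz <= hi for lo, hi in windows)


# Hypothetical file content; real values depend on the data set.
example = "[[12.3, 12.9], [40.1, 41.0]]"
windows = load_flat_windows(example)
baseline_points = [mz for mz in (12.5, 13.0, 40.5) if in_any_window(mz, windows)]
```

In practice the text would come from reading `out_dir / flat_windows.json` rather than from an inline string.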
Functions
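| aggregate_common_windows(segments_by_file, x_minmax, agg) | Merge per-file segments into common windows via m/z bin coverage. |
| find_flat_segments(x, y, p) | Return list of (lo, hi, n_points) flat segments for one spectrum. |
| read_spectrum_table(path) | Robust reader that returns Polars DataFrame with standardized columns. |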
Classes
| AggregateParams |
| FlatParams |
| ScanForFlatRegion |
- mioXpektron.baseline.flat_window_suggester.read_spectrum_table(path)
Robust reader that returns Polars DataFrame with standardized columns. Tries comma, tab, then whitespace-delimited tables (with ‘#’ comments).
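The delimiter fallback can be illustrated with a standard-library sketch. It is not the module's reader: the real function returns a Polars DataFrame, whereas this hypothetical `parse_spectrum_text` works on raw text and returns plain lists. It also assumes the columns appear in the order Channel, m/z, intensity instead of matching header names.

```python
def _split(line, sep):
    # sep=None makes str.split break on any run of whitespace.
    return [part.strip() for part in line.split(sep)]


def parse_spectrum_text(text):
    """Return {'channel': [...], 'mz': [...], 'intensity': [...]} from raw text."""
    # Drop blank lines and '#' comment lines before guessing the delimiter.
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]

    rows = []
    for sep in (",", "\t", None):  # comma, tab, then whitespace
        candidate = [_split(ln, sep) for ln in lines]
        if candidate and all(len(r) == 3 for r in candidate):
            rows = candidate
            break
    else:
        raise ValueError("could not parse a 3-column table")

    # Drop a header row if the first row is not numeric.
    try:
        float(rows[0][1])
    except ValueError:
        rows = rows[1:]

    # ASSUMPTION: columns are positional (channel, m/z, intensity).
    return {
        "channel": [int(float(r[0])) for r in rows],
        "mz": [float(r[1]) for r in rows],
        "intensity": [float(r[2]) for r in rows],
    }
```

A delimiter is accepted only when every data line splits into exactly three fields, so a tab-separated file is not misread as a single comma-separated column.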
- class mioXpektron.baseline.flat_window_suggester.FlatParams(y_quantile: float = 0.2, grad_quantile: float = 0.4, curv_quantile: float = 0.4, savgol_window: int = 11, savgol_poly: int = 2, min_width: float = 0.2, min_points: int = 20)
Bases: object
- mioXpektron.baseline.flat_window_suggester.find_flat_segments(x, y, p)
Return list of (lo, hi, n_points) flat segments for one spectrum.
- class mioXpektron.baseline.flat_window_suggester.AggregateParams(bin_width: float = 0.1, coverage_threshold: float = 0.5, top_k: int = 6)
Bases: object
- mioXpektron.baseline.flat_window_suggester.aggregate_common_windows(segments_by_file, x_minmax, agg)
Merge per-file segments into common windows via m/z bin coverage. Returns (windows, coverage_table_df), where coverage_table_df is a Polars DataFrame.
- class mioXpektron.baseline.flat_window_suggester.ScanForFlatRegion(files: List[Union[str, Path]] = <factory>, out_dir: Union[str, Path] = 'flat_windows_out', n_jobs: int = -1, flat_params: FlatParams = <factory>, agg_params: AggregateParams = <factory>, auto_tune: bool = False)
Bases: object
Parameters:
n_jobs (int)
flat_params (FlatParams)
agg_params (AggregateParams)
auto_tune (bool)
- flat_params: FlatParams
- agg_params: AggregateParams