mioXpektron.normalization.normalization_eval
Normalization method evaluation for ToF-SIMS data.
Evaluates multiple normalization strategies on a set of labelled spectra using unsupervised, supervised and spectral-quality metrics, then ranks them with composite scores. The approach follows the one established in xpectrass for FTIR data, adapted to the specifics of ToF-SIMS (Poisson counting statistics, high dynamic range, ion-yield variation).
Usage
>>> from mioXpektron.normalization import NormalizationEvaluator
>>> evaluator = NormalizationEvaluator(files=["spectra/*.txt"])
>>> summary = evaluator.evaluate()
>>> evaluator.plot()
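To make the ToF-SIMS specifics mentioned above concrete, the following is a minimal sketch of two generic normalization ideas: total-ion-count (TIC) scaling, which compensates for overall ion-yield variation, and a square-root transform, which stabilizes the variance of Poisson-distributed counts. The function names `tic_normalize` and `sqrt_transform` are hypothetical and are not taken from mioXpektron; which methods the package actually offers is not stated here.

```python
import numpy as np

def tic_normalize(X, eps=1e-12):
    """Divide each spectrum (row) by its total ion count."""
    X = np.asarray(X, dtype=float)
    totals = X.sum(axis=1, keepdims=True)
    # eps avoids division by zero for an all-zero spectrum.
    return X / (totals + eps)

def sqrt_transform(X):
    """Square-root transform; for Poisson counts the variance equals the
    mean, so sqrt(counts) has roughly constant variance."""
    return np.sqrt(np.clip(np.asarray(X, dtype=float), 0.0, None))

# Two spectra with the same shape but a tenfold difference in ion yield.
counts = np.array([[100.0, 300.0, 600.0],
                   [10.0, 30.0, 60.0]])

X_tic = tic_normalize(counts)
print(X_tic)                   # both rows become identical after TIC scaling
print(sqrt_transform(counts))  # compresses the dynamic range
```

After TIC scaling the two rows coincide, which is the behaviour a normalization method should show for spectra that differ only in overall yield.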
Functions
evaluate_one_method – Evaluate a single normalisation method on the spectra matrix.
spectral_angle – Spectral Angle Mapper (SAM) in radians; lower => more similar shape.
within_group_mean_sam – Mean SAM across all pairs within each group (technical replicates).
Classes
NormalizationEvaluator – Evaluate normalization methods on labelled ToF-SIMS spectra.
- mioXpektron.normalization.normalization_eval.spectral_angle(a, b, eps=1e-12)[source]
Spectral Angle Mapper (SAM) in radians; lower => more similar shape.
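As a rough illustration of the quantity described here, the spectral angle is the arccosine of the cosine similarity between two spectra treated as vectors. The sketch below is not the package's source: the name `spectral_angle_sketch` is hypothetical, and the way `eps` enters the denominator is an assumption based only on the signature.

```python
import numpy as np

def spectral_angle_sketch(a, b, eps=1e-12):
    """Angle in radians between two spectra treated as vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Assumed role of eps: guard against division by zero for empty spectra.
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    # Clip to the valid arccos domain to absorb floating-point overshoot.
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

spectrum = np.array([1.0, 2.0, 3.0])
# A scaled copy has the same shape, so the angle is close to 0.
print(spectral_angle_sketch(spectrum, 10.0 * spectrum))
# Spectra with no overlapping signal are orthogonal (angle of pi/2).
print(spectral_angle_sketch(np.array([1.0, 0.0]), np.array([0.0, 1.0])))
```

Because the angle ignores overall scale, it measures shape similarity only, which is why it suits comparisons between spectra of differing total intensity.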
- mioXpektron.normalization.normalization_eval.within_group_mean_sam(X, groups)[source]
Mean SAM across all pairs within each group (technical replicates).
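One way to read this description is sketched below: compute the angle for every unordered pair of spectra that share a group label, then average. Whether the real function pools all pairs or averages per-group means first is not stated, so pooling is an assumption here, as are the hypothetical names `_angle` and `within_group_mean_sam_sketch`.

```python
from itertools import combinations

import numpy as np

def _angle(a, b, eps=1e-12):
    """Spectral angle in radians (helper for this sketch)."""
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def within_group_mean_sam_sketch(X, groups):
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    angles = []
    for label in np.unique(groups):
        members = X[groups == label]
        # Every unordered pair of replicates inside this group.
        for i, j in combinations(range(len(members)), 2):
            angles.append(_angle(members[i], members[j]))
    # Assumption: pool all pairs; return NaN if no group has two members.
    return float(np.mean(angles)) if angles else float("nan")

X = np.array([[1.0, 0.0], [0.0, 1.0],   # group "a": orthogonal replicates
              [1.0, 1.0], [2.0, 2.0]])  # group "b": same shape, different scale
groups = np.array(["a", "a", "b", "b"])
mean_sam = within_group_mean_sam_sketch(X, groups)
print(mean_sam)
```

A lower value means technical replicates look more alike after normalization, so this works as a spectral-quality metric where smaller is better.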
- mioXpektron.normalization.normalization_eval.evaluate_one_method(X_raw, groups, mz_values, method, method_kwargs=None, n_clusters=None, cluster_bootstrap_rounds=30, cluster_bootstrap_frac=0.8, random_state=0, compute_supervised=True)[source]
Evaluate a single normalisation method on the spectra matrix.
- Parameters:
X_raw (np.ndarray) – (n_samples, n_channels) raw intensity matrix.
groups (np.ndarray) – (n_samples,) label per sample.
mz_values (np.ndarray) – (n_channels,) m/z axis shared by all spectra.
method (str) – Normalization method name.
method_kwargs (dict, optional) – Extra keyword arguments forwarded to normalize().
n_clusters (int, optional) – Number of clusters. Defaults to number of unique groups.
cluster_bootstrap_rounds (int) – Bootstrap rounds for cluster stability.
cluster_bootstrap_frac (float) – Fraction of samples per bootstrap round.
random_state (int) – RNG seed.
compute_supervised (bool) – If True and scikit-learn is available, run supervised CV.
- Returns:
Keys include method, all metric values, and compute_time_sec.
- Return type:
- class mioXpektron.normalization.normalization_eval.NormalizationEvaluator(files=<factory>, methods=None, method_kwargs_map=None, mz_min=None, mz_max=None, n_clusters=None, cluster_bootstrap_rounds=30, cluster_bootstrap_frac=0.8, random_state=0, compute_supervised=True, n_jobs=-1, group_patterns=None, group_fn=None)[source]
Bases: object
Evaluate normalization methods on labelled ToF-SIMS spectra.
- Parameters:
files (list of str or Path) – Paths or glob patterns expanding to spectrum text files.
methods (list of str, optional) – Normalization method names. Defaults to a sensible subset.
method_kwargs_map (dict, optional) – {method_name: {kwarg: value, ...}} for method-specific params.
mz_min (float, optional) – m/z range to import.
mz_max (float, optional) – m/z range to import.
n_clusters (int, optional) – Number of clusters for KMeans evaluation. Auto-detected if omitted.
cluster_bootstrap_rounds (int) – Bootstrap rounds for stability metric.
random_state (int) – RNG seed for reproducibility.
compute_supervised (bool) – Run supervised classification (requires scikit-learn + >=2 groups).
n_jobs (int) – Parallel workers (joblib). -1 = all CPUs, 1 = sequential.
cluster_bootstrap_frac (float)
group_fn (Any | None)
Examples
>>> evaluator = NormalizationEvaluator(files=["data/*.txt"])
>>> summary = evaluator.evaluate()
>>> evaluator.plot()
- evaluate()[source]
Evaluate all methods and return a scored DataFrame.
- Returns:
One row per method, sorted by score_combined (descending). Includes raw metrics, z-scored metrics, and four composite scores.
- Return type:
pd.DataFrame
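The general idea of z-scoring metrics and combining them into a composite score can be sketched as follows. This is an illustration only: the metric names, the numbers, the sign handling, and the unweighted mean are all assumptions, and the four composite scores the evaluator actually computes are not specified here.

```python
import numpy as np

# Hypothetical raw metrics for three methods (made-up numbers).
methods = ["method_a", "method_b", "method_c"]
silhouette = np.array([0.30, 0.50, 0.40])  # assumed metric, higher is better
within_sam = np.array([0.25, 0.10, 0.30])  # within-group SAM, lower is better

def zscore(values):
    """Standardize a metric across methods; all zeros if it is constant."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    return (values - values.mean()) / std if std > 0 else np.zeros_like(values)

# Flip the sign of lower-is-better metrics so larger always means better.
z_silhouette = zscore(silhouette)
z_within_sam = -zscore(within_sam)

# Assumption: unweighted mean of the z-scores as one composite score.
score = (z_silhouette + z_within_sam) / 2.0

# Sort descending, mirroring the "sorted by score (descending)" convention.
ranking = [methods[i] for i in np.argsort(-score)]
print(ranking)
```

Z-scoring puts metrics with different units and ranges on a common scale before they are averaged, so no single metric dominates purely because of its magnitude.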
- plot(out_dir='normalization_selection_output', save=True)[source]
Generate evaluation plots (box plots, bar charts, radar).
- print_summary(top_n=5)[source]
Print a ranked summary of evaluation results.
- Parameters:
top_n (int, default 5) – Number of top methods to display per score variant.
- Return type:
None
- preview_overlay(file, methods=None, max_methods=5, mz_min=None, mz_max=None, save_to='normalization_selection_output')[source]
Plot raw vs normalised overlays for quick visual comparison.
- Parameters:
file (str or Path) – Single spectrum file to visualise.
methods (list of str, optional) – Methods to overlay. Defaults to top methods from evaluation.
max_methods (int) – Cap on the number of overlays.
mz_min (float, optional) – m/z window for the plot.
mz_max (float, optional) – m/z window for the plot.
save_to (str, Path, or None) – Save directory (relative to OUTPUT_DIR). None skips saving.
- Return type:
None