mioXpektron.normalization.normalization

Normalization methods for ToF-SIMS mass spectrometry data.

Provides multiple normalization strategies ranging from simple scaling (TIC, max) to variance-stabilizing transforms (Poisson, sqrt, VSN) and robust methods (median, RMS, PQN). Each function operates on a single 1-D intensity array; batch helpers live in preprocessing.py.

All public functions share a common contract:
  • Accept a 1-D array-like of intensities.

  • Return a 1-D np.ndarray of the same length.

  • Replace NaN / negative artefacts with zero by default.
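The contract above can be illustrated with a minimal NumPy sketch. The helper name `clean_1d_sketch` is hypothetical and not part of the module; it only shows one plausible way to coerce input to a 1-D array and zero out NaN and negative values.

```python
import numpy as np

def clean_1d_sketch(intensities):
    """Hypothetical helper: coerce to a 1-D float array and zero NaN/negatives."""
    arr = np.asarray(intensities, dtype=float).ravel()
    # NaN and negative artefacts are both replaced with zero.
    bad = np.isnan(arr) | (arr < 0)
    return np.where(bad, 0.0, arr)

cleaned = clean_1d_sketch([5.0, float("nan"), -2.0, 3.0])
```

The output keeps the input length, so downstream code can rely on a stable channel index.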

Functions

log_normalization(intensities[, pseudo_count])

Log(1 + intensity) transform for high-dynamic-range spectra.

mass_stratified_pqn_normalization(intensities)

Apply PQN separately across coarse m/z strata.

max_normalization(intensities)

Scale intensities so the maximum value equals 1.

median_normalization(intensities[, ...])

Scale intensities so the median equals target_median.

median_of_ratios_normalization(intensities)

DESeq2-style median-of-ratios normalization.

minmax_normalization(intensities[, ...])

Scale intensities to a fixed range (default [0, 1]).

multi_ion_reference_normalization(intensities)

Normalize using multiple reference ions and a robust median ratio.

normalization_method_names()

Return a sorted list of available 1-D normalization method names.

normalize(intensities[, method])

Apply a named normalization method to a 1-D intensity array.

pareto_normalization(intensities[, mean, ...])

Pareto scale a spectrum using dataset-level feature statistics.

poisson_scaling(intensities)

Poisson (square-root mean) scaling for count data.

pqn_normalization(intensities[, reference])

Probabilistic Quotient Normalization.

rms_normalization(intensities[, target_rms])

Scale intensities so the root-mean-square equals target_rms.

robust_snv_normalization(intensities[, ...])

Robust SNV using median and MAD instead of mean and standard deviation.

selected_ion_normalization(intensities[, ...])

Normalise to a single reference peak (e.g. substrate or matrix ion).

snv_normalization(intensities)

Standard Normal Variate: centre and scale to unit variance.

sqrt_normalization(intensities)

Square-root variance-stabilising transform.

tic_normalization(intensities[, target_tic])

Scale intensities so the total-ion current equals target_tic.

vector_normalization(intensities)

Scale intensities to unit L2 norm (vector length = 1).

vsn_normalization(intensities)

Variance-stabilising normalization via arcsinh transform.

mioXpektron.normalization.normalization.normalization_method_names()

Return a sorted list of available 1-D normalization method names.

Return type:

List[str]

mioXpektron.normalization.normalization.normalize(intensities, method='tic', **kwargs)

Apply a named normalization method to a 1-D intensity array.

Parameters:
  • intensities (array-like) – Raw intensity values (1-D).

  • method (str, default "tic") – Name of the normalization method. Call normalization_method_names() for the full list.

  • **kwargs – Method-specific keyword arguments forwarded to the underlying function (e.g. target_tic for TIC, reference_idx for selected-ion normalization).

Returns:

Normalized intensity values.

Return type:

np.ndarray

Raises:

ValueError – If method is not recognised.
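A name-based dispatcher of this kind is often built on a dictionary of callables. The sketch below is an assumption about the general pattern, not the module's actual code; `normalize_sketch`, `_METHODS_SKETCH`, and the two private helpers are hypothetical names.

```python
import numpy as np

def _tic(x, target_tic=1e6):
    x = np.asarray(x, dtype=float)
    total = x.sum()
    return x * (target_tic / total) if total > 0 else x.copy()

def _max(x):
    x = np.asarray(x, dtype=float)
    peak = x.max()
    return x / peak if peak > 0 else x.copy()

# Hypothetical registry; the real module may organise this differently.
_METHODS_SKETCH = {"tic": _tic, "max": _max}

def normalize_sketch(intensities, method="tic", **kwargs):
    """Look up a method by name and forward keyword arguments to it."""
    try:
        func = _METHODS_SKETCH[method]
    except KeyError:
        names = sorted(_METHODS_SKETCH)
        raise ValueError(f"Unknown method {method!r}; choose from {names}") from None
    return func(intensities, **kwargs)

scaled = normalize_sketch([1.0, 3.0], method="tic", target_tic=100.0)
```

Keeping the registry in one place also makes a helper like normalization_method_names() a one-line `sorted(...)` over its keys.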

mioXpektron.normalization.normalization.tic_normalization(intensities, target_tic=1000000.0)

Scale intensities so the total-ion current equals target_tic.

This is the most common normalisation in ToF-SIMS. Each spectrum is multiplied by target_tic / sum(intensities) so that all spectra share the same TIC.

Parameters:
  • intensities (array-like) – Raw ion counts or intensities.

  • target_tic (float or None, default 1e6) – Desired total-ion current after scaling. Pass None to skip the rescaling.

Return type:

np.ndarray
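The scaling rule target_tic / sum(intensities) can be written out as a short NumPy sketch. `tic_scale_sketch` is a hypothetical name, and returning the spectrum unchanged for `target_tic=None` or an all-zero spectrum is an assumption made for this example.

```python
import numpy as np

def tic_scale_sketch(intensities, target_tic=1e6):
    """Multiply a spectrum by target_tic / sum(intensities)."""
    x = np.asarray(intensities, dtype=float)
    total = x.sum()
    if target_tic is None or total <= 0:
        # Assumption: nothing sensible to scale to, so return an unscaled copy.
        return x.copy()
    return x * (target_tic / total)

spectrum = np.array([10.0, 30.0, 60.0])
scaled = tic_scale_sketch(spectrum, target_tic=1000.0)
```

Relative peak ratios are preserved; only the overall scale changes.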

mioXpektron.normalization.normalization.median_normalization(intensities, target_median=1.0)

Scale intensities so the median equals target_median.

More robust than TIC when a few dominant peaks (e.g. substrate ions) inflate the total-ion current.

Parameters:
  • intensities (array-like)

  • target_median (float, default 1.0)

Return type:

np.ndarray

mioXpektron.normalization.normalization.rms_normalization(intensities, target_rms=1.0)

Scale intensities so the root-mean-square equals target_rms.

A compromise between TIC (dominated by big peaks) and median (ignores peak structure).

Parameters:
  • intensities (array-like)

  • target_rms (float, default 1.0)

Return type:

np.ndarray
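As a rough illustration of RMS scaling, the sketch below rescales a spectrum so that sqrt(mean(x**2)) equals the target. The function name is hypothetical and the zero-RMS guard is an assumption.

```python
import numpy as np

def rms_scale_sketch(intensities, target_rms=1.0):
    """Rescale so the root-mean-square of the spectrum equals target_rms."""
    x = np.asarray(intensities, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    if rms == 0:
        # Assumption: an all-zero spectrum is returned unchanged.
        return x.copy()
    return x * (target_rms / rms)

scaled = rms_scale_sketch([3.0, 4.0], target_rms=1.0)
```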

mioXpektron.normalization.normalization.max_normalization(intensities)

Scale intensities so the maximum value equals 1.

Parameters:

intensities (array-like)

Return type:

np.ndarray

mioXpektron.normalization.normalization.vector_normalization(intensities)

Scale intensities to unit L2 norm (vector length = 1).

Useful for comparing spectral shape irrespective of total signal.

Parameters:

intensities (array-like)

Return type:

np.ndarray

mioXpektron.normalization.normalization.snv_normalization(intensities)

Standard Normal Variate: centre and scale to unit variance.

Commonly used before multivariate analysis (PCA, PLS-DA) to remove multiplicative scatter effects.

Parameters:

intensities (array-like)

Returns:

Mean-centred, variance-scaled spectrum. Note: values can be negative, which is expected for SNV.

Return type:

np.ndarray
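A minimal sketch of SNV, assuming the population standard deviation (ddof=0); the module may use a different convention. `snv_sketch` is a hypothetical name.

```python
import numpy as np

def snv_sketch(intensities):
    """Centre a spectrum on its mean and scale it to unit variance."""
    x = np.asarray(intensities, dtype=float)
    std = x.std()  # population standard deviation (ddof=0), an assumption
    if std == 0:
        # Assumption: a flat spectrum maps to all zeros.
        return np.zeros_like(x)
    return (x - x.mean()) / std

z = snv_sketch([1.0, 2.0, 3.0, 4.0])
```

Channels below the spectrum mean come out negative, which is why the output is not clipped to zero.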

mioXpektron.normalization.normalization.robust_snv_normalization(intensities, mad_scale=1.4826)

Robust SNV using median and MAD instead of mean and standard deviation.

This is less sensitive to a few dominant ions than classical SNV and is therefore a better fit when substrate/matrix peaks dominate part of the spectrum.

Parameters:
  • intensities (array-like)

  • mad_scale (float, default 1.4826) – Consistency factor turning MAD into a robust standard deviation estimate for approximately Gaussian data.

Returns:

Median-centred, MAD-scaled spectrum. Negative values are expected.

Return type:

np.ndarray
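The median/MAD variant can be sketched as follows. The function name is hypothetical, and mapping a zero-MAD spectrum to zeros is an assumption for this example.

```python
import numpy as np

def robust_snv_sketch(intensities, mad_scale=1.4826):
    """Centre on the median and scale by mad_scale * MAD."""
    x = np.asarray(intensities, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))  # median absolute deviation
    scale = mad_scale * mad
    if scale == 0:
        # Assumption: no spread to scale by, so return zeros.
        return np.zeros_like(x)
    return (x - med) / scale

# One dominant ion (1000) barely affects the centre and scale estimates.
z = robust_snv_sketch([1.0, 2.0, 3.0, 4.0, 1000.0])
```

Here the median is 3 and the MAD is 1, so the dominant channel does not shrink the other channels toward zero as it would with a mean/std estimate.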

mioXpektron.normalization.normalization.poisson_scaling(intensities)

Poisson (square-root mean) scaling for count data.

Each channel is divided by the square root of the spectrum's mean intensity, sqrt(mean_intensity). This equalises the weight of low- and high-count channels when ToF-SIMS data follow Poisson statistics. Widely used before PCA.

Parameters:

intensities (array-like)

Return type:

np.ndarray
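Following the description above, a sketch of the spectrum-level form divides every channel by the square root of the spectrum mean. `poisson_scale_sketch` is a hypothetical name and the non-positive-mean guard is an assumption.

```python
import numpy as np

def poisson_scale_sketch(intensities):
    """Divide each channel by sqrt(mean intensity of the spectrum)."""
    x = np.asarray(intensities, dtype=float)
    mean = x.mean()
    if mean <= 0:
        # Assumption: an empty-signal spectrum is returned unchanged.
        return x.copy()
    return x / np.sqrt(mean)

scaled = poisson_scale_sketch([0.0, 4.0, 8.0])
```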

mioXpektron.normalization.normalization.sqrt_normalization(intensities)

Square-root variance-stabilising transform.

sqrt(intensity) stabilises the variance of Poisson-distributed ion counts. Often combined with mean-centering before PCA.

Parameters:

intensities (array-like)

Return type:

np.ndarray

mioXpektron.normalization.normalization.log_normalization(intensities, pseudo_count=1.0)

Log(1 + intensity) transform for high-dynamic-range spectra.

Parameters:
  • intensities (array-like)

  • pseudo_count (float, default 1.0) – Added before taking the log to avoid log(0).

Return type:

np.ndarray

mioXpektron.normalization.normalization.selected_ion_normalization(intensities, reference_idx=None, reference_intensity=None, target=1.0)

Normalise to a single reference peak (e.g. substrate or matrix ion).

Provide either reference_idx (index into the intensity array) or reference_intensity (the absolute value to divide by).

Parameters:
  • intensities (array-like)

  • reference_idx (int, optional) – Index of the reference peak in intensities.

  • reference_intensity (float, optional) – Absolute intensity value to normalise against.

  • target (float, default 1.0) – Target value for the reference peak after normalisation.

Return type:

np.ndarray
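The two ways of supplying the reference can be sketched like this. The function name is hypothetical, and both the precedence of reference_intensity over reference_idx and the error handling are assumptions, not documented behaviour.

```python
import numpy as np

def selected_ion_sketch(intensities, reference_idx=None,
                        reference_intensity=None, target=1.0):
    """Scale a spectrum so the chosen reference peak equals target."""
    x = np.asarray(intensities, dtype=float)
    if reference_intensity is None:
        if reference_idx is None:
            raise ValueError("Provide reference_idx or reference_intensity")
        reference_intensity = x[reference_idx]
    if reference_intensity <= 0:
        # Assumption: a missing reference peak is treated as an error.
        raise ValueError("Reference intensity must be positive")
    return x * (target / reference_intensity)

by_index = selected_ion_sketch([50.0, 200.0, 25.0], reference_idx=1)
by_value = selected_ion_sketch([50.0, 200.0, 25.0], reference_intensity=200.0)
```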

mioXpektron.normalization.normalization.multi_ion_reference_normalization(intensities, reference_indices=None, reference_values=None, target=1.0)

Normalize using multiple reference ions and a robust median ratio.

Parameters:
  • intensities (array-like)

  • reference_indices (sequence of int) – Indices of stable reference ions in the spectrum.

  • reference_values (sequence of float, optional) – Expected intensities for the same reference ions. When provided the spectrum is scaled by the median observed/reference ratio. When omitted, the median observed intensity is scaled to target.

  • target (float, default 1.0) – Target robust centre when reference_values is omitted.

Return type:

np.ndarray
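The two branches described for reference_values can be sketched as below. `multi_ion_sketch` is a hypothetical name, and returning the input unchanged for a non-finite or non-positive factor is an assumption.

```python
import numpy as np

def multi_ion_sketch(intensities, reference_indices,
                     reference_values=None, target=1.0):
    """Scale by a median ratio taken over several reference ions."""
    x = np.asarray(intensities, dtype=float)
    observed = x[np.asarray(reference_indices, dtype=int)]
    if reference_values is None:
        # Scale the median observed reference intensity to `target`.
        factor = np.median(observed) / target
    else:
        # Median of observed/expected ratios across the reference ions.
        factor = np.median(observed / np.asarray(reference_values, dtype=float))
    if not np.isfinite(factor) or factor <= 0:
        return x.copy()  # assumption: leave the spectrum unscaled
    return x / factor

spectrum = [20.0, 5.0, 40.0, 60.0]
with_refs = multi_ion_sketch(spectrum, [0, 2, 3], reference_values=[10.0, 20.0, 20.0])
without_refs = multi_ion_sketch(spectrum, [0, 2, 3])
```

In the first call the ratios are 2, 2, and 3, so the median (2) ignores the one disagreeing reference ion.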

mioXpektron.normalization.normalization.pqn_normalization(intensities, reference=None)

Probabilistic Quotient Normalization.

Designed for compositional data where a few species dominate. Divides each channel by the median quotient relative to a reference spectrum.

Parameters:
  • intensities (array-like)

  • reference (array-like or None) – Reference spectrum (e.g. median of a dataset). If None, falls back to TIC normalization with a warning.

Return type:

np.ndarray
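The quotient step can be sketched in a few lines. This is a simplified illustration under stated assumptions: `pqn_sketch` is a hypothetical name, only channels that are positive in both spectra contribute to the median, and any preliminary TIC scaling is omitted.

```python
import numpy as np

def pqn_sketch(intensities, reference):
    """Divide a spectrum by its median quotient against a reference."""
    x = np.asarray(intensities, dtype=float)
    ref = np.asarray(reference, dtype=float)
    valid = (x > 0) & (ref > 0)  # assumption: ignore empty channels
    if not valid.any():
        return x.copy()
    quotient = np.median(x[valid] / ref[valid])
    return x / quotient

reference = np.array([10.0, 20.0, 30.0, 40.0])
sample = 2.0 * reference   # twice the signal overall...
sample[3] = 400.0          # ...plus one channel that is genuinely elevated
normalized = pqn_sketch(sample, reference)
```

The median quotient is 2, so the dilution-like factor is removed while the elevated channel stays elevated.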

mioXpektron.normalization.normalization.mass_stratified_pqn_normalization(intensities, mz_values=None, reference=None, strata=None)

Apply PQN separately across coarse m/z strata.

This keeps a global TIC-normalised baseline while estimating local PQN size factors for different m/z regions.

Parameters:
  • intensities (array-like)

  • mz_values (array-like) – m/z axis shared with intensities.

  • reference (array-like) – Dataset-level reference spectrum on the same m/z grid.

  • strata (sequence of tuple(float, float), optional) – m/z windows [(lo, hi), ...], each inclusive of lo and exclusive of hi. Defaults to [(0, 100), (100, 400), (400, inf)].

Return type:

np.ndarray
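A per-window version of the quotient step might look like the sketch below. It is an illustration only: `stratified_pqn_sketch` and `DEFAULT_STRATA_SKETCH` are hypothetical names, the global TIC baseline mentioned above is left out for brevity, and windows without usable channels are left unscaled by assumption.

```python
import numpy as np

DEFAULT_STRATA_SKETCH = ((0.0, 100.0), (100.0, 400.0), (400.0, np.inf))

def stratified_pqn_sketch(intensities, mz_values, reference,
                          strata=DEFAULT_STRATA_SKETCH):
    """Estimate one PQN size factor per m/z window and apply it locally."""
    x = np.asarray(intensities, dtype=float)
    mz = np.asarray(mz_values, dtype=float)
    ref = np.asarray(reference, dtype=float)
    out = x.copy()
    for lo, hi in strata:
        in_window = (mz >= lo) & (mz < hi)  # lo inclusive, hi exclusive
        valid = in_window & (x > 0) & (ref > 0)
        if valid.any():
            factor = np.median(x[valid] / ref[valid])
            out[in_window] = x[in_window] / factor
    return out

mz = np.array([50.0, 80.0, 150.0, 300.0, 500.0])
reference = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
# Each window carries a different gain: x2, x4, and x8.
sample = reference * np.array([2.0, 2.0, 4.0, 4.0, 8.0])
normalized = stratified_pqn_sketch(sample, mz, reference)
```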

mioXpektron.normalization.normalization.median_of_ratios_normalization(intensities, reference=None)

DESeq2-style median-of-ratios normalization.

Uses the per-channel geometric-mean spectrum of the dataset as the reference, then normalises the sample by its median ratio to that reference. Robust to compositional effects.

Parameters:
  • intensities (array-like)

  • reference (array-like or None) – Pre-computed geometric-mean reference. If None, falls back to TIC normalization with a warning.

Return type:

np.ndarray
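Because the function works on one spectrum at a time, the geometric-mean reference has to be built from the dataset beforehand. The sketch below shows one way to do both steps; `geometric_mean_reference_sketch` and `median_of_ratios_sketch` are hypothetical names, and zeroing reference channels that contain any non-positive value is an assumption.

```python
import numpy as np

def geometric_mean_reference_sketch(spectra):
    """Per-channel geometric mean over a (n_samples, n_channels) array."""
    spectra = np.asarray(spectra, dtype=float)
    positive = (spectra > 0).all(axis=0)
    ref = np.zeros(spectra.shape[1])
    # Channels with any zero are excluded (left at 0), as in DESeq2-style schemes.
    ref[positive] = np.exp(np.log(spectra[:, positive]).mean(axis=0))
    return ref

def median_of_ratios_sketch(intensities, reference):
    """Divide a spectrum by its median ratio to the reference."""
    x = np.asarray(intensities, dtype=float)
    ref = np.asarray(reference, dtype=float)
    valid = (x > 0) & (ref > 0)
    if not valid.any():
        return x.copy()
    return x / np.median(x[valid] / ref[valid])

dataset = np.array([[1.0, 2.0, 4.0], [4.0, 8.0, 16.0]])
reference = geometric_mean_reference_sketch(dataset)
normalized = np.array([median_of_ratios_sketch(row, reference) for row in dataset])
```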

mioXpektron.normalization.normalization.vsn_normalization(intensities)

Variance-stabilising normalization via arcsinh transform.

arcsinh(x) behaves like log(2x) for large values but handles zeros and small values gracefully. Suitable for high-dynamic-range ToF-SIMS spectra.

Parameters:

intensities (array-like)

Return type:

np.ndarray
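The two regimes of arcsinh can be checked numerically with plain NumPy. This snippet only demonstrates the transform itself, with made-up intensity values, and does not call the module.

```python
import numpy as np

x = np.array([0.0, 1.0, 1e4, 1e6])  # illustrative intensities
transformed = np.arcsinh(x)

# For large x, arcsinh(x) = log(x + sqrt(x**2 + 1)) approaches log(2x).
large_x_approx = np.log(2.0 * x[2:])
```

Unlike a plain logarithm, the transform is defined at zero and returns exactly 0 there, so no pseudo-count is needed.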

mioXpektron.normalization.normalization.minmax_normalization(intensities, feature_range=(0.0, 1.0))

Scale intensities to a fixed range (default [0, 1]).

Parameters:
  • intensities (array-like)

  • feature_range (tuple of float, default (0.0, 1.0))

Return type:

np.ndarray
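A sketch of min-max scaling to an arbitrary range follows. `minmax_sketch` is a hypothetical name, and mapping a constant spectrum to the lower bound is an assumption.

```python
import numpy as np

def minmax_sketch(intensities, feature_range=(0.0, 1.0)):
    """Linearly map the spectrum minimum/maximum onto feature_range."""
    x = np.asarray(intensities, dtype=float)
    lo, hi = feature_range
    span = x.max() - x.min()
    if span == 0:
        # Assumption: a flat spectrum maps to the lower bound.
        return np.full_like(x, lo)
    return lo + (x - x.min()) * (hi - lo) / span

scaled = minmax_sketch([2.0, 4.0, 6.0], feature_range=(-1.0, 1.0))
```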

mioXpektron.normalization.normalization.pareto_normalization(intensities, mean=None, std=None, eps=1e-12)

Pareto scale a spectrum using dataset-level feature statistics.

Pareto scaling is a dataset-level transform commonly used before PCA: each feature is mean-centred and divided by sqrt(std_feature). This down-weights very intense ions less aggressively than autoscaling while still reducing dominance by a few channels.

Parameters:
  • intensities (array-like)

  • mean (array-like) – Per-feature dataset mean with the same shape as intensities.

  • std (array-like) – Per-feature dataset standard deviation with the same shape as intensities.

  • eps (float, default 1e-12) – Numerical floor preventing division by zero.

Returns:

Mean-centred, Pareto-scaled spectrum. Negative values are expected.

Return type:

np.ndarray

Raises:

ValueError – If dataset-level mean/std arrays are not provided.
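Since the mean and standard deviation are dataset-level quantities, they are computed once over all spectra and then applied per spectrum. The sketch below assumes eps acts as a floor on the standard deviation and uses the sample standard deviation (ddof=1); both choices, and the name `pareto_sketch`, are assumptions.

```python
import numpy as np

def pareto_sketch(intensities, mean, std, eps=1e-12):
    """Mean-centre each feature and divide by sqrt(std)."""
    x = np.asarray(intensities, dtype=float)
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    # Assumption: eps floors the standard deviation to avoid division by zero.
    return (x - mean) / np.sqrt(np.maximum(std, eps))

# Rows are spectra, columns are features (m/z channels).
dataset = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
feature_mean = dataset.mean(axis=0)
feature_std = dataset.std(axis=0, ddof=1)
scaled = pareto_sketch(dataset[0], feature_mean, feature_std)
```

With autoscaling both features would end up at -1 for this spectrum; under Pareto scaling the more intense feature keeps a larger magnitude.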