mioXpektron.normalization.preprocessing
Functions
- batch_tic_norm: Batch‑import and preprocess multiple ToF‑SIMS spectra, then save the (m/z, normalized_intensity) arrays for each file as a tab‑separated text file in output_dir.
- data_preprocessing: Import and preprocess ToF-SIMS data from a text file.
- resample_spectrum: Resample a spectrum onto a target m/z grid.
Classes
- BatchTicNorm: Batch TIC normalization for multiple spectra files using Polars and concurrent.futures.
- mioXpektron.normalization.preprocessing.resample_spectrum(mz_values, intensity_values, target_mz, method='linear')[source]
Resample a spectrum onto a target m/z grid.
The input axis is sorted, duplicate m/z positions are collapsed to their first occurrence, and values outside the native m/z range are filled with zero. Supported interpolation methods are linear, pchip, akima, makima, and cubic.
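The three steps described above (sort, collapse duplicates, zero-fill outside the native range) can be sketched for the linear case with NumPy. This is an illustration only, not the library's implementation: the name resample_linear_sketch is hypothetical, and it assumes that "first occurrence" refers to the order of points in the input arrays.

```python
import numpy as np

def resample_linear_sketch(mz_values, intensity_values, target_mz):
    """Hypothetical stand-in for resample_spectrum(..., method='linear')."""
    mz = np.asarray(mz_values, dtype=float)
    inten = np.asarray(intensity_values, dtype=float)

    # A stable sort keeps the input order among equal m/z values.
    order = np.argsort(mz, kind="stable")
    mz, inten = mz[order], inten[order]

    # np.unique reports the index of the first occurrence of each value,
    # so later duplicates of the same m/z are dropped.
    mz, first = np.unique(mz, return_index=True)
    inten = inten[first]

    # Linear interpolation; targets outside the native range are set to 0.
    target = np.asarray(target_mz, dtype=float)
    return np.interp(target, mz, inten, left=0.0, right=0.0)

# Unsorted input with a duplicated m/z of 2.0 (the second copy is ignored).
resampled = resample_linear_sketch(
    [3.0, 1.0, 2.0, 2.0], [30.0, 10.0, 20.0, 99.0], [0.5, 1.5, 2.0, 3.5]
)
```

Here the targets 0.5 and 3.5 fall outside the native range of 1.0 to 3.0 and are therefore zero, while 1.5 is interpolated between the points at 1.0 and 2.0.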
- mioXpektron.normalization.preprocessing.data_preprocessing(file_path, mz_min=None, mz_max=None, normalization_target=1000000.0, verbose=True, return_all=False)[source]
Import and preprocess ToF-SIMS data from a text file.
Parameters:
- file_path : str
Path to the ToF-SIMS data file
- mz_min, mz_max : float, optional
m/z range to import
- normalization_target : float or None
Target TIC for normalization, or None to skip
- verbose : bool
Print progress if True
- return_all : bool
If True, return all intermediate arrays
Returns:
- mz_values : numpy.ndarray
- normalized_intensities : numpy.ndarray
- sample_name : str
- group : str
- (optionally: intermediate arrays)
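As a rough illustration of what normalization_target controls, the sketch below rescales a spectrum so that its total ion count (TIC) equals the target. It assumes that the TIC is the plain sum of the intensities in the imported m/z range and that normalization is a single scalar rescaling; the function name tic_normalize_sketch and the zero-TIC handling are hypothetical and not taken from the library.

```python
import numpy as np

def tic_normalize_sketch(intensities, normalization_target=1e6):
    """Scale intensities so that their sum equals normalization_target."""
    inten = np.asarray(intensities, dtype=float)

    # None means "skip normalization", mirroring the documented parameter.
    if normalization_target is None:
        return inten.copy()

    tic = inten.sum()
    if tic == 0:
        # Assumption: leave an all-zero spectrum unchanged rather than divide by zero.
        return inten.copy()

    return inten * (normalization_target / tic)

normalized = tic_normalize_sketch([2.0, 3.0, 5.0])
unchanged = tic_normalize_sketch([2.0, 3.0, 5.0], normalization_target=None)
```

Relative peak heights are preserved by this rescaling; only the overall scale changes.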
- mioXpektron.normalization.preprocessing.batch_tic_norm(input_pattern, output_dir='normalized_spectra', mz_min=None, mz_max=None, normalization_target=1000000.0, verbose=False)[source]
Batch‑import and preprocess multiple ToF‑SIMS spectra, then save the (m/z, normalized_intensity) arrays for each file as a tab‑separated text file in output_dir.
- Parameters:
input_pattern (str) – Glob pattern (e.g. ‘spectra/*.txt’) that expands to the input files.
output_dir (str) – Folder where ‘<original‑name>_normalized.txt’ will be written; created if it does not already exist.
mz_min (float | None) – Passed through to data_preprocessing.
mz_max (float | None) – Passed through to data_preprocessing.
normalization_target (float | None) – Passed through to data_preprocessing.
verbose (bool) – Passed through to data_preprocessing.
- Returns:
Paths of the files written, in processing order.
- Return type:
List[str]
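The output naming and file layout described above can be sketched with the standard library alone. The helper names below are hypothetical, and the sketch assumes that "original‑name" means the input file name without its extension; the library may derive the name differently.

```python
import csv
import os
import tempfile

def normalized_path_sketch(input_path, output_dir="normalized_spectra"):
    """Build '<original-name>_normalized.txt' inside output_dir."""
    stem = os.path.splitext(os.path.basename(input_path))[0]
    return os.path.join(output_dir, f"{stem}_normalized.txt")

def write_spectrum_sketch(path, mz_values, intensities):
    """Write (m/z, normalized_intensity) pairs as tab-separated text."""
    # Create the output folder if it does not already exist.
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerows(zip(mz_values, intensities))
    return path

# Demonstration in a temporary folder so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    out_path = normalized_path_sketch(
        "spectra/sample_01.txt", os.path.join(tmp, "normalized_spectra")
    )
    write_spectrum_sketch(out_path, [1.0, 2.0], [250000.0, 750000.0])
    with open(out_path) as handle:
        written_lines = handle.read().splitlines()
    out_name = os.path.basename(out_path)
```

Each output line holds one m/z value and one normalized intensity separated by a tab, which matches the two-column layout the function description names.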
- class mioXpektron.normalization.preprocessing.BatchTicNorm(input_pattern, output_dir='normalized_spectra', normalization_target=1000000.0, n_workers=-1, verbose=True)[source]
Bases: object
Batch TIC normalization for multiple spectra files using Polars and concurrent.futures.
Supports both CSV and TXT file formats:
- CSV: Uses ‘corrected_intensity’ if available, otherwise ‘intensity’
- TXT: Tab-separated m/z and intensity values
Output files contain: channel, mz, intensity (normalized)
- __init__(input_pattern, output_dir='normalized_spectra', normalization_target=1000000.0, n_workers=-1, verbose=True)[source]
Initialize BatchTicNorm processor.
- Parameters:
input_pattern (str) – Glob pattern for input files (e.g., ‘data/*.csv’ or ‘data/*.txt’)
output_dir (str) – Directory to save normalized files
normalization_target (float) – Target TIC value for normalization (default: 1e6)
n_workers (int) – Number of parallel workers (default: -1)
verbose (bool) – Print progress information
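The column-selection rule and the parallel fan-out can be sketched as follows. This is not the class's implementation: it swaps Polars for the standard csv module so that the example has no third-party dependencies, it uses a thread pool because the source only states that concurrent.futures is involved, and it assumes that the input CSV has an ‘mz’ column and that channel is a zero-based row index. All function names are hypothetical.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor

def pick_intensity_column(fieldnames):
    """Prefer 'corrected_intensity' when present, otherwise 'intensity'."""
    return "corrected_intensity" if "corrected_intensity" in fieldnames else "intensity"

def normalize_csv_text(text, normalization_target=1e6):
    """Return (channel, mz, normalized intensity) rows for one CSV document."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if not rows:
        return []
    column = pick_intensity_column(rows[0].keys())
    values = [float(row[column]) for row in rows]
    tic = sum(values)
    # Assumption: an all-zero spectrum is passed through unscaled.
    scale = normalization_target / tic if tic else 1.0
    return [
        (channel, float(row["mz"]), value * scale)
        for channel, (row, value) in enumerate(zip(rows, values))
    ]

raw = "mz,intensity\n1.0,1\n2.0,3\n"
corrected = "mz,intensity,corrected_intensity\n1.0,9,2\n2.0,9,2\n"

# pool.map returns results in input order, whichever worker finishes first.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(normalize_csv_text, [raw, corrected]))
```

In the second document the ‘intensity’ column is ignored because ‘corrected_intensity’ is present, so both rows end up with half of the target TIC.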