Utilities
The utils module provides file I/O, batch processing orchestration, and statistical analysis tools.
Data Import
Load ToF-SIMS spectra from text files:
from mioXpektron import import_data

mz, intensity, sample_name, group = import_data(
    "spectrum.txt",
    mz_min=1.0,    # lower m/z bound (inclusive)
    mz_max=300.0,  # upper m/z bound (inclusive)
)
The importer:
Auto-detects separators (tab, comma, space)
Skips comment lines (#, //)
Infers sample names from filenames
Infers sample groups from filename patterns
Supports optional m/z range filtering
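The parsing behaviour listed above can be illustrated with a minimal standard-library sketch. This is not the mioXpektron implementation: the function name parse_spectrum_text is hypothetical, and splitting on any tab, comma, or run of spaces stands in for whatever separator detection the importer actually performs.

```python
import re

def parse_spectrum_text(text, mz_min=None, mz_max=None):
    """Hypothetical parser: return (mz, intensity) lists from raw file text."""
    mz, intensity = [], []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines starting with '#' or '//'.
        if not line or line.startswith(("#", "//")):
            continue
        # Accept tab, comma, or space as the separator (assumption: the
        # real importer detects one separator per file instead).
        fields = [f for f in re.split(r"[\t, ]+", line) if f]
        try:
            x, y = float(fields[0]), float(fields[1])
        except (ValueError, IndexError):
            # Header rows such as "m/z<TAB>Intensity" or malformed lines.
            continue
        # Optional inclusive m/z range filter.
        if mz_min is not None and x < mz_min:
            continue
        if mz_max is not None and x > mz_max:
            continue
        mz.append(x)
        intensity.append(y)
    return mz, intensity


sample = (
    "# exported spectrum\n"
    "// second comment style\n"
    "m/z\tIntensity\n"
    "0.5\t10\n"
    "1.0\t20\n"
    "150.25,30\n"
    "300.0 40\n"
    "301.0 50\n"
)
mz, intensity = parse_spectrum_text(sample, mz_min=1.0, mz_max=300.0)
```

Only the rows with m/z between 1.0 and 300.0 survive; both comment styles and the header row are dropped.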
Batch Processing
Run parallel peak extraction and alignment across many spectra:
from mioXpektron.utils import batch_processing

# file_list: paths of the spectrum files to process
peaks_df, intensity_df, area_df = batch_processing(
    file_list,
    max_workers=4,  # number of parallel workers
    mz_min=1.0,
    mz_max=300.0,
    normalization_target=1e6,
    mz_tolerance=0.2,
)
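To give a feel for what max_workers and normalization_target control, here is a small sketch of fanning per-spectrum work out to a worker pool. It is an illustration only, under two assumptions: normalization_target is read as "scale each spectrum so its intensities sum to this value", and threads are used for simplicity (the library may use processes or a different normalization). The names normalize_spectrum and process_all are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def normalize_spectrum(intensity, target=1e6):
    """Scale intensities so they sum to `target` (assumed meaning)."""
    total = sum(intensity)
    if total == 0:
        # Nothing to scale; return a copy unchanged.
        return list(intensity)
    scale = target / total
    return [value * scale for value in intensity]

def process_all(spectra, max_workers=4, target=1e6):
    """Normalize every spectrum in parallel.

    spectra: dict mapping sample name -> list of intensities.
    Returns a dict with the same keys and normalized intensities.
    """
    names = list(spectra)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = pool.map(
            lambda name: normalize_spectrum(spectra[name], target), names
        )
        return dict(zip(names, results))


normalized = process_all(
    {"a": [1, 1, 2], "b": [0, 0]}, max_workers=2, target=100
)
```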
Statistical Analysis
The analysis submodule provides tools for downstream statistical analysis of aligned peak matrices.
Benjamini-Hochberg FDR Correction
from mioXpektron.utils.analysis import bh_fdr
q_values = bh_fdr(p_values)
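For readers unfamiliar with the procedure, the standard Benjamini-Hochberg adjustment can be written in a few lines of plain Python. This is a reference sketch of the textbook algorithm, not the source of bh_fdr; the name bh_adjust is hypothetical, and the library's handling of NaNs or array inputs may differ.

```python
def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values).

    Each q-value is min over ranks k >= rank(p) of p_(k) * m / k,
    which keeps the adjusted values monotone in the original p-values.
    """
    m = len(p_values)
    # Indices of p_values sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, carrying the running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


q_example = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

The output is returned in the same order as the input, so it can be attached directly to the table the p-values came from.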
Univariate Testing
from mioXpektron.utils.analysis import compute_univariate_tests
results = compute_univariate_tests(intensity_df, groups)
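This page does not state which tests compute_univariate_tests runs or what columns it returns. As an illustration of the kind of per-peak statistics that typically feed a volcano plot, the sketch below computes Welch's t statistic and a log2 fold change for one peak across two groups, using only the standard library. The function names are hypothetical and the choice of Welch's test is an assumption.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    a, b: intensity values of one peak in two groups (each needs at
    least two values and the pooled variance must be non-zero).
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def log2_fold_change(a, b):
    """log2 of the ratio of group means (both means must be positive)."""
    return math.log2(mean(a) / mean(b))


group_a = [4.0, 6.0, 8.0]
group_b = [1.0, 2.0, 3.0]
t_stat, dof = welch_t(group_a, group_b)
lfc = log2_fold_change(group_a, group_b)
```

Turning the t statistic into a p-value requires the Student t distribution (for example from SciPy), which is left out here to keep the sketch dependency-free.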
Visualization Helpers
The analysis submodule also includes plotting utilities for:
Volcano plots
PCA and UMAP projections
ROC curves
Heatmaps
API Reference
- mioXpektron.utils.import_data(file_path, mz_min=None, mz_max=None, group_patterns=None, group_fn=None)
Import ToF-SIMS data from a spectrum file.
- Parameters:
file_path (str) – Path to the ToF-SIMS data file. Supports tab-delimited .txt exports with m/z + Intensity columns and CSV exports with mz + corrected_intensity or intensity columns.
mz_min (float, optional) – Minimum m/z value to be imported (inclusive).
mz_max (float, optional) – Maximum m/z value to be imported (inclusive).
group_patterns (dict[str, str], optional) – Mapping of {regex_pattern: group_label}. Patterns are tested against the sample name (filename without extension) in order; the first match determines the group. Defaults to {'_CC...': 'Cancer', '_CT...': 'Control'}.
group_fn (callable, optional) – A function (sample_name: str) -> str that returns the group label directly. When provided, this takes priority over group_patterns.
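The precedence between group_fn and group_patterns described above can be sketched as follows. This is an illustration of the documented rules, not the library code: resolve_group is a hypothetical helper, the regexes `_CC\d+` and `_CT\d+` are made-up examples (the real defaults are abbreviated in this reference), the use of re.search is an assumption, and the "Unknown" fallback is an assumption because the behaviour for unmatched names is not documented here.

```python
import re
from pathlib import Path

def resolve_group(file_path, group_patterns=None, group_fn=None,
                  default="Unknown"):
    """Return (sample_name, group) for a spectrum file path."""
    # Sample name is the filename without its extension.
    sample_name = Path(file_path).stem
    # group_fn, when given, takes priority over group_patterns.
    if group_fn is not None:
        return sample_name, group_fn(sample_name)
    # Test patterns in insertion order; the first match wins.
    for pattern, label in (group_patterns or {}).items():
        if re.search(pattern, sample_name):
            return sample_name, label
    # Assumed fallback when nothing matches.
    return sample_name, default


patterns = {r"_CC\d+": "Cancer", r"_CT\d+": "Control"}  # hypothetical
by_pattern = resolve_group("data/run1_CC03.txt", group_patterns=patterns)
by_function = resolve_group(
    "data/run1_CC03.txt",
    group_patterns=patterns,
    group_fn=lambda name: name.split("_")[0],
)
```

Passing a callable is the more flexible option when group membership cannot be expressed as a single regular expression per label.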
- Returns:
mz (np.ndarray) – Mass-to-charge ratio values.
intensity (np.ndarray) – Intensity values.
sample_name (str) – Sample name extracted from file name.
group (str) – Group label derived from the filename.
- Return type: