Systematic Feature Filtering in Exploratory Metabolomics: Application toward Biomarker Discovery

Authors	GADARA Darshak Chandulal; COUFALÍKOVÁ Kateřina; BOSÁK Juraj; ŠMAJS David; SPÁČIL Zdeněk
Year of publication	2021
Type	Article in Periodical
Magazine / Source	Analytical Chemistry
MU Faculty or unit	Faculty of Science
Citation
web	https://pubs.acs.org/doi/10.1021/acs.analchem.1c00816
Doi	http://dx.doi.org/10.1021/acs.analchem.1c00816
Keywords	SPECTROMETRY-BASED METABOLOMICS; LC-MS METABOLOMICS; ANNOTATION; TOOLS; CHALLENGES; SOFTWARE; XCMS
Description	Exploratory mass spectrometry-based metabolomics generates a plethora of features in a single analysis. However, >85% of detected features are typically false positives due to inefficient elimination of chimeric signals and chemical noise not relevant for biological and clinical data interpretation. Data processing is considered a bottleneck to unraveling the translational potential of metabolomics. Here, we describe a systematic workflow to refine exploratory metabolomics data and reduce reported false positives. We applied the feature filtering workflow in a case/control study exploring common variable immunodeficiency (CVID). In the first stage, features were detected from raw liquid chromatography-mass spectrometry data by XCMS Online processing, blank subtraction, and reproducibility assessment. Detected features were annotated in metabolomics databases to produce a list of tentative identifications. We scrutinized the physicochemical properties of the tentative identifications by comparing predicted and experimental reversed-phase liquid chromatography (LC) retention times. A prediction model used a linear regression of 42 retention indices with the cLogP ranging from -6 to 11. The LC retention time probes the physicochemical properties and effectively reduces the number of tentatively identified metabolites, which are further submitted to statistical analysis. We applied the retention time-based analytical feature filtering workflow to datasets from the Metabolomics Workbench (www.metabolomicsworkbench.org), demonstrating its broad applicability. A subset of tentatively identified metabolites significantly different in CVID patients was validated by MS/MS acquisition to confirm the structures of potential CVID biomarkers and virtually eliminate false positives. Our exploratory metabolomics data processing workflow effectively removes false positives caused by the chemical background and chimeric signals inherent to the analytical technique. It reduced the number of tentatively identified metabolites by 88%, from 6940 features initially detected in XCMS to 839 tentative identifications, and streamlined subsequent statistical analysis and data interpretation.
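The retention time-based filtering step described above can be illustrated with a minimal sketch. It assumes a simple least-squares line relating cLogP to retention time and a fixed acceptance window. The calibration values, the candidate list, the 1.0 min tolerance, and the function names are all hypothetical and are not taken from the publication, which states only that the model used a linear regression of 42 retention indices with cLogP from -6 to 11.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def filter_by_retention_time(candidates, slope, intercept, tolerance_min):
    """Keep candidates whose observed RT lies within tolerance of the predicted RT.

    candidates: iterable of (name, clogp, observed_rt_min) tuples.
    """
    kept = []
    for name, clogp, rt_obs in candidates:
        rt_pred = slope * clogp + intercept
        if abs(rt_obs - rt_pred) <= tolerance_min:
            kept.append(name)
    return kept


# Hypothetical calibration standards: cLogP versus retention time (min).
# The values are made up and exactly linear to keep the example easy to follow.
calib_clogp = [-6.0, -2.0, 0.0, 3.0, 7.0, 11.0]
calib_rt = [1.0, 5.0, 7.0, 10.0, 14.0, 18.0]

slope, intercept = fit_line(calib_clogp, calib_rt)

# Hypothetical tentative identifications: (name, cLogP, observed RT in min).
candidates = [
    ("candidate_A", 2.0, 9.3),    # close to the predicted RT
    ("candidate_B", -3.0, 12.5),  # far too late for such a polar compound
    ("candidate_C", 8.0, 15.6),   # close to the predicted RT
]

# The 1.0 min window is an assumption chosen for illustration only.
kept = filter_by_retention_time(candidates, slope, intercept, tolerance_min=1.0)
print(kept)
```

In this toy data set, candidate_B is discarded because its observed retention time is far from the value predicted from its cLogP. In practice the acceptance window would need to reflect the prediction error of the fitted model rather than a fixed constant.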
Related projects:	CETOCOEN Plus; Mapping interactions between basic metabolic processes and the gut microflora; Transformative stem cell-based model of Alzheimer’s disease and advanced analytics to study the role of membrane lipids in the pathogenesis; Clinically relevant biochemical, immunological, and cellular biomarkers of Alzheimer’s disease and aging; CETOCOEN Excellence; CETOCOEN Excellence; RECETOX RI