A critical analysis of Adaptive Box-Cox transformation for skewed distributed data management: metabolomics of Spanish and Argentinian truffles as a case study

dc.bibliographicCitation.stpage343704es_ES
dc.bibliographicCitation.titleAnalytica Chimica Actaen
dc.bibliographicCitation.volume1345es_ES
dc.contributor.authorSibono, Leonardoes_ES
dc.contributor.authorGrosso, Massimilianoes_ES
dc.contributor.authorTejedor Calvo, Raqueles_ES
dc.contributor.authorCasula, Mattiaes_ES
dc.contributor.authorMarco Montori, Pedroes_ES
dc.contributor.authorGarcía Barreda, Sergies_ES
dc.contributor.authorManis, Cristinaes_ES
dc.contributor.authorCaboni, Pierluigies_ES
dc.coverage.spatialCiencia vegetales_ES
dc.date.accessioned2025-02-03T11:51:08Z
dc.date.available2025-02-03T11:51:08Z
dc.date.issued2025es_ES
dc.date.updated2025-01-31T10:00:16Z
dc.description.abstractBackground: Metabolic variations retrieved in metabolomic data are considered a benchmark for detecting biomatrix variability. Identifying target metabolites is therefore crucial for tracking any substrate modification and protecting the substrate from undesired alteration. Unfortunately, this task can be undermined by false positives, which are often triggered by complicated data distributions. In this work, we investigated the metabolic profile of Spanish and Argentine truffles using a robust methodology. The issue of skewed data distributions was addressed through a normalisation preprocessing step, which enhanced biomarker identification and sample classification. Results: A parametric test (ANOVA), applied to data with improved normality, was employed to define the target metabolites that vary significantly between the two regions of origin, Spain and Argentina. Specifically, the Adaptive Box-Cox transformation was employed to improve the ANOVA test's performance by fitting the data distributions to a Gaussian variable. Using the Bonferroni-Holm method for false discovery rate correction, we demonstrated the effectiveness of this transformation for the case under investigation. Results were compared with two non-parametric tests (Kruskal-Wallis and a permutation test), selected as a reference methodology, to provide a better understanding of the non-normal distributions often encountered in metabolomic data analysis. Of the 57 metabolites investigated, 17 exhibited notable variability across the two geographical regions. The validity of this methodology was supported by the discrimination of samples belonging to different groups. In this regard, both univariate and multivariate statistical models were tested through Monte Carlo simulations and yielded consistent results. Significance: Data analysis outcomes are sensitive to variable distributions. The present study shows an effective tool to increase data normality, thereby enhancing the statistical power for biomarker discovery and improving the models’ classification performance. These results are consistent with current knowledge in the field of food sciences, enabling their application in advancing research on truffle analysis.en
dc.description.otherMetabolomicsen
dc.description.otherFooden
dc.description.otherData preprocessingen
dc.description.otherMass spectrometryen
dc.description.otherGeographical originen
dc.description.otherBiomarker discoveryen
dc.description.statusPublishedes_ES
dc.identifier.citationSibono, L.; Grosso, M.; Tejedor-Calvo, E.; Casula, M.; Marco, P.; Garcia Barreda, S.; Manis, C.; Caboni, P. A critical analysis of Adaptive Box-Cox transformation for skewed distributed data management: metabolomics of Spanish and Argentinian truffles as a case study, Analytica Chimica Acta, 2025, 343704
dc.identifier.issn00032670
dc.identifier.urihttp://hdl.handle.net/10532/7473
dc.language.isoenes_ES
dc.relation.doihttps://doi.org/10.1016/j.aca.2025.343704es_ES
dc.relation.urihttps://doi.org/10.1016/j.aca.2025.343704es_ES
dc.rightsAtribución-NoComercial-SinDerivadas 3.0 Españaes_ES
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/3.0/es/es_ES
dc.subject.agrovocMetabolómicaes
dc.subject.agrovocAlimentoses
dc.subject.agrovocEspectrometría de masases
dc.subject.agrovocProcesamiento de datoses
dc.subject.agrovocProcedenciaes
dc.subject.agrovocMarcadores genéticoses
dc.subject.otherAlimentos
dc.subject.otherEspectrometría de masas
dc.subject.otherMarcadores genéticos
dc.subject.otherMetabolómica
dc.subject.otherProcedencia
dc.subject.otherProcesamiento De Datos
dc.titleA critical analysis of Adaptive Box-Cox transformation for skewed distributed data management: metabolomics of Spanish and Argentinian truffles as a case studyen
dc.typeJournal Contribution*
dc.type.refereedRefereedes_ES
dc.type.specifiedArticlees_ES

Files

Original bundle
Name:
10128743.pdf
Size:
1.7 MB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
0 B
Format:
Item-specific license agreed upon to submission