Metabolomic Data Handling and Analysis
GC-MS provides a vast and complex three-dimensional data-set for each sample (retention time, mass-to-charge ratio and intensity). Principal components analysis (PCA) is well suited to the analysis of metabolomic data-sets because it helps to pick out the relevant components (compounds) from the many present in each sample. As we potentially have hundreds of metabolites in each of four chromatographic methods (GC, AQ and NP, LC pos and neg), we need methods that reduce the number of variables while retaining as much as possible of the variation within our data-sets. The output from PCA takes the form of scores and loadings, which are best assessed graphically.
The benefits of this method are that:
- interactions between compounds can be identified
- compounds need not be significantly different between treatments to have a combined effect
- the method is not targeted so unknown interactions can be found (GM?).
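As a rough illustration of how scores and loadings arise, the sketch below computes the first principal component of a small, invented table of peak areas (rows are samples, columns are metabolites). It is not the software used for the analyses described here; the data values and function names are hypothetical, and power iteration is used only to keep the example free of external libraries.

```python
import math

def center_columns(data):
    """Subtract each column (metabolite) mean so PCA describes variation about the mean."""
    n = len(data)
    means = [sum(row[j] for row in data) / n for j in range(len(data[0]))]
    return [[row[j] - means[j] for j in range(len(means))] for row in data]

def first_component(centered, iterations=500):
    """Return (loadings, scores, explained fraction) for PC1 via power iteration."""
    n, p = len(centered), len(centered[0])
    # Covariance matrix between metabolites (p x p).
    cov = [[sum(centered[i][a] * centered[i][b] for i in range(n)) / (n - 1)
            for b in range(p)] for a in range(p)]
    # Uniform start vector; assumes it is not orthogonal to the leading eigenvector.
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[a] * sum(cov[a][b] * vec[b] for b in range(p))
                     for a in range(p))
    total_variance = sum(cov[a][a] for a in range(p))
    # Scores are the samples projected onto the loading vector.
    scores = [sum(row[j] * vec[j] for j in range(p)) for row in centered]
    return vec, scores, eigenvalue / total_variance

# Hypothetical peak areas: three control samples, then three treated samples.
data = [
    [10.0, 5.0, 1.0],
    [11.0, 5.5, 1.2],
    [9.0, 4.5, 0.9],
    [20.0, 10.0, 1.1],
    [21.0, 10.4, 1.0],
    [19.0, 9.6, 0.8],
]

loadings, scores, explained = first_component(center_columns(data))
print("loadings:", [round(v, 3) for v in loadings])
print("scores:", [round(s, 2) for s in scores])
print("fraction of variance on PC1:", round(explained, 3))
```

In this invented example the first two metabolites rise together under treatment, so they carry the largest loadings, and the scores separate the two groups of samples along PC1.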
Due to the complexity of the data-sets, individual data items that differ significantly between treatments often have no influence on the top PCA scores. Using analysis of variance, and retaining only the data that are significant, we can use the pairwise SEDs (standard errors of the difference) as input to cluster analysis, which can potentially indicate pathway flux changes between our treatments. These methods are currently being developed using a subset of the Commonwealth Potato Collection held at SCRI.
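One possible reading of this approach is sketched below, since the method is still under development and its details are not given here. Each metabolite is tested with a one-way analysis of variance; for those that pass, every pairwise difference between treatment means is divided by its SED, taken here as the square root of the residual mean square times (1/n1 + 1/n2). The resulting profiles are the kind of input a cluster analysis could use. The metabolite names, values and the fixed critical F value are assumptions made for illustration.

```python
import math
from itertools import combinations

def one_way_anova(groups):
    """Return (F statistic, residual mean square) for {treatment: replicate values}."""
    values = [v for reps in groups.values() for v in reps]
    grand_mean = sum(values) / len(values)
    means = {name: sum(reps) / len(reps) for name, reps in groups.items()}
    ss_between = sum(len(reps) * (means[name] - grand_mean) ** 2
                     for name, reps in groups.items())
    ss_within = sum((v - means[name]) ** 2
                    for name, reps in groups.items() for v in reps)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (len(values) - len(groups))
    return ms_between / ms_within, ms_within

def pairwise_sed_profile(groups, ms_within):
    """Difference between each pair of treatment means, expressed in SED units."""
    profile = {}
    for a, b in combinations(sorted(groups), 2):
        sed = math.sqrt(ms_within * (1 / len(groups[a]) + 1 / len(groups[b])))
        diff = sum(groups[a]) / len(groups[a]) - sum(groups[b]) / len(groups[b])
        profile[(a, b)] = diff / sed
    return profile

# Hypothetical data: three treatments with three replicates each.
metabolites = {
    "metabolite_1": {"T1": [10.0, 11.0, 9.0],
                     "T2": [20.0, 21.0, 19.0],
                     "T3": [10.5, 9.5, 10.0]},
    "metabolite_2": {"T1": [5.0, 6.0, 4.0],
                     "T2": [5.5, 4.5, 5.0],
                     "T3": [5.0, 5.5, 4.5]},
}

# Assumed threshold: tabulated 5% critical value of F with (2, 6) degrees of freedom.
F_CRITICAL = 5.14

profiles = {}
for name, groups in metabolites.items():
    f_stat, ms_within = one_way_anova(groups)
    if f_stat > F_CRITICAL:
        profiles[name] = pairwise_sed_profile(groups, ms_within)

for name, profile in profiles.items():
    print(name, {pair: round(value, 2) for pair, value in profile.items()})
```

Only the first invented metabolite passes the filter. Its profile shows T2 differing strongly from both T1 and T3, while T1 and T3 are indistinguishable; metabolites with similar profiles would be expected to group together in a subsequent cluster analysis.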
Due to the complexity of the raw data-set and the large number of components that need to be quantified, we use software which deconvolutes co-eluting peaks by extracting chromatograms of the individual mass ions underneath each peak. Many packages use deconvolution algorithms. We currently use the Automated Mass Spectral Deconvolution and Identification System (AMDIS) to view and identify components within GC-MS data. AMDIS, in conjunction with Xcalibur, has been the preferred system for building up the list of compounds that we search for in our data files.
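The idea of looking at individual mass ions underneath a peak can be sketched with a few invented scans. This is not the AMDIS algorithm, only the extracted ion step it builds on: the total ion count sums every mass in a scan, whereas an extracted ion chromatogram follows a single mass through time. The retention times, masses and intensities are hypothetical.

```python
def total_ion_chromatogram(scans):
    """Sum the intensities of all masses in each scan."""
    return [(rt, sum(spectrum.values())) for rt, spectrum in scans]

def extracted_ion_chromatogram(scans, mz, tolerance=0.5):
    """Follow one mass (within a tolerance) across all scans."""
    return [(rt, sum(intensity for mass, intensity in spectrum.items()
                     if abs(mass - mz) <= tolerance))
            for rt, spectrum in scans]

def apex(trace):
    """Retention time at which a trace reaches its maximum."""
    return max(trace, key=lambda point: point[1])[0]

# Hypothetical scans: (retention time in minutes, {m/z: intensity}).
# m/z 73 belongs to an intense compound, m/z 217 to a weak co-eluting one.
scans = [
    (5.8, {73.0: 100.0, 217.0: 5.0}),
    (5.9, {73.0: 400.0, 217.0: 20.0}),
    (6.0, {73.0: 900.0, 217.0: 60.0}),
    (6.1, {73.0: 1000.0, 217.0: 20.0}),
    (6.2, {73.0: 600.0, 217.0: 5.0}),
]

tic = total_ion_chromatogram(scans)
eic_major = extracted_ion_chromatogram(scans, 73.0)
eic_minor = extracted_ion_chromatogram(scans, 217.0)

print("TIC apex:", apex(tic))
print("m/z 73 apex:", apex(eic_major))
print("m/z 217 apex:", apex(eic_minor))
```

In these invented scans the total ion count follows the intense compound, so the weaker compound, whose own mass peaks one scan earlier, only becomes apparent in its extracted trace.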
The following figure shows, in the top window, the total ion count (TIC) in white together with two masses from different compounds; the first belongs to a less intense compound which is not visible in the TIC. The next window plots the masses that AMDIS has identified as abundant, and the last window shows two spectra associated with the retention time of the less intense compound. The black trace represents the total scan at six minutes and the white trace is the extracted, or deconvoluted, spectrum that AMDIS has identified.







