Abstract 18329: Unsupervised Hierarchical Cluster Analysis of Combined Metabolomic and Proteomic Profiling Data Sets From Participants in a Community-Based Cohort Study
Introduction: The integration of metabolomic and proteomic profiling data sets from large cohorts promises to identify novel molecular pathways and therapeutic targets in cardiovascular disease. We hypothesized that unsupervised hierarchical cluster analysis (HCA) of combined proteomic and metabolomic data sets would provide a framework for unbiased identification of novel protein-metabolite associations.
Methods: We previously measured 1129 proteins using the aptamer-based SOMAscan platform (SomaLogic, Boulder, CO) and 239 metabolites using liquid chromatography-tandem mass spectrometry in archived plasma samples collected from 899 participants of Exam 5 of the Framingham Heart Study Offspring Cohort. Data were normalized and standardized using the inverse transformation. HCA of all proteins and metabolites was performed with absolute Pearson correlation distance, followed by dendrogram formation using average agglomeration. This process was blinded to all clinical and demographic data. Adaptive branch pruning was then used to identify unique clusters. Analyses were conducted using R (Vienna, Austria) with packages MKmisc, stats, and dynamicTreeCut.
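As an illustration of the clustering step only, the following is a minimal sketch and not the authors' R pipeline. It assumes the absolute Pearson correlation distance takes the form 1 - |r|, and it replaces adaptive branch pruning with a fixed cut height, a deliberately simpler stand-in. The feature names, the simulated data, and the cut height of 0.5 are all hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def abs_corr_distance(features):
    """Pairwise distance 1 - |r| (assumed form), so that strongly negative
    and strongly positive correlations are both treated as 'close'."""
    names = list(features)
    dist = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            dist[frozenset((a, b))] = 1.0 - abs(r)
    return names, dist


def average_linkage_clusters(names, dist, cut_height):
    """Average-linkage agglomeration, stopped once the closest pair of
    clusters is farther apart than cut_height (a fixed-height cut, used
    here in place of adaptive branch pruning)."""
    clusters = [[name] for name in names]

    def between(c1, c2):
        # Average linkage: mean of all pairwise member distances.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min(
            (between(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cut_height:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical toy data: two latent factors, two features per factor.
rng = random.Random(0)
n_samples = 200
factor_1 = [rng.gauss(0, 1) for _ in range(n_samples)]
factor_2 = [rng.gauss(0, 1) for _ in range(n_samples)]


def noisy(values, sign=1.0):
    """Return a noisy, optionally sign-flipped copy of a latent factor."""
    return [sign * v + rng.gauss(0, 0.3) for v in values]


features = {
    "protein_A": noisy(factor_1),
    "metabolite_B": noisy(factor_1, sign=-1.0),  # inversely correlated with protein_A
    "protein_C": noisy(factor_2),
    "metabolite_D": noisy(factor_2),
}

names, dist = abs_corr_distance(features)
clusters = average_linkage_clusters(names, dist, cut_height=0.5)
print(clusters)
```

In this toy example the inversely correlated pair (protein_A, metabolite_B) falls into the same cluster, which is the practical consequence of using an absolute correlation distance. An adaptive pruning method would instead choose cut points branch by branch from the dendrogram shape.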
Results: HCA identified 45 unique clusters. Clusters ranged from 10 to 97 members with a median size of 24. As expected, several clusters contained proteins and metabolites with established associations. For example, one cluster contained proteins and metabolites associated with lipid metabolism: 19 triacylglycerols, 11 cholesterol esters, 4 diacylglycerols, 7 phosphatidylcholines, and 4 apolipoproteins. The protein PCSK9 was also a member, and correlated strongly with lipid metabolites (r = 0.29, P = 3×10⁻¹⁵). The analysis also identified novel associations with high biological plausibility. As an example, mannose-binding lectin, whose expression is known to be induced in PCSK9 gain-of-function animal models, was an unexpected member of the cluster, with a significant correlation to both PCSK9 (r = -0.18, P = 8×10⁻⁸) and lipid metabolites (r = -0.20, P = 4×10⁻⁸).
Conclusions: HCA can be successfully applied to combined proteomic and metabolomic data sets to identify novel protein-metabolite associations, and may highlight novel pathways for investigation.
Author Disclosures: D.H. Katz: None. M.D. Benson: None. Q. Yang: None. M.J. Keyes: None. D. Shen: None. S. Sinha: None. J.E. Morningstar: None. D. Ngo: None. J.F. O’Sullivan: None. X. Shi: None. L.A. Farrell: None. R.S. Vasan: None. T.J. Wang: None. R.E. Gerszten: None.
© 2016 by American Heart Association, Inc.