Single Nucleotide Polymorphisms in Multiple Novel Thrombospondin Genes May Be Associated With Familial Premature Myocardial Infarction
Background Recent advances in high-throughput genomics technology have expanded our ability to catalogue allelic variants in large sets of candidate genes related to premature coronary artery disease.
Methods and Results A total of 398 families were identified in 15 participating medical centers; they fulfilled the criteria of myocardial infarction, revascularization, or a significant coronary artery lesion diagnosed before 45 years in men or 50 years in women. A total of 62 vascular biology genes and 72 single-nucleotide polymorphisms were assessed. Previously undescribed variants in 3 related members of the thrombospondin protein family were prominent among a small set of single-nucleotide polymorphisms that showed a statistical association with premature coronary artery disease. A missense variant of thrombospondin-4 (A387P) showed the strongest association, with an adjusted odds ratio for myocardial infarction of 1.89 (P=0.002 adjusted for covariates) for individuals carrying the P allele. A variant in the 3′ untranslated region of thrombospondin-2 (change of thymidine to guanine) seemed to have a protective effect against myocardial infarction in individuals homozygous for the variant (adjusted odds ratio of 0.31; P=0.0018). A missense variant in thrombospondin-1 (N700S) was associated with an adjusted odds ratio for coronary artery disease of 11.90 (P=0.041) in homozygous individuals, who also had the lowest level of thrombospondin-1 by plasma assay (P=0.0019).
Conclusions This large-scale genetic study has identified the potential of multiple novel variants in the thrombospondin gene family to be associated with familial premature myocardial infarction. Notwithstanding multiple caveats, thrombospondins specifically and high-throughput genomic technology in general deserve further study in familial ischemic heart disease.
Received October 15, 2001; accepted October 18, 2001.
We began a case-control study using high-throughput genomics technology to examine the role of common genetic variants in a large number of candidate genes for premature, familial coronary artery disease (CAD) and myocardial infarction. Candidate genes were chosen for their acknowledged role in endothelial cell biology, vascular biology, lipid metabolism, and the coagulation cascade. In the present article, we describe the results of our analysis of 72 single-nucleotide polymorphisms (SNPs) drawn from 62 candidate genes in 352 CAD cases and 418 controls, an effort that entailed generating >50 000 individual genotypes; this is, to date, the largest such genetic association study.
Case and Control Population
Fifteen medical centers in the United States (see Appendix) participated in the enrollment of probands and their affected siblings. Each proband was required to have developed CAD by 45 years if male or 50 years if female, as manifested by a myocardial infarction, surgical or percutaneous coronary revascularization, or a coronary angiogram with evidence of at least a 70% stenosis in a major epicardial artery. At least one sibling who also fulfilled these criteria had to be alive to qualify for inclusion, and the proband along with affected sibling(s) answered a health questionnaire, had anthropometric measures taken, and had blood drawn for measurement of serum markers and extraction of DNA. The protocol was approved by the institutional review board at each participating institution. All patients gave informed consent to participate. For the purpose of our case-control study, a series of unrelated singleton cases was selected such that only one affected individual from each family was represented, giving preference to the sibling with the earlier age of onset. The case series was limited to white families. Controls representing a general, unselected population of white Americans were identified through random-digit phone dialing in the Atlanta, Georgia, area.
Variant Allele Discovery, Validation, and Genotyping
Cell lines derived from an ethnically diverse population were obtained and used for SNP discovery by methods previously described in detail.1 Genomic sequences representing the coding and partial regulatory regions of genes were amplified by polymerase chain reaction and screened using 2 independent methods: denaturing high-performance liquid chromatography or variant detector arrays (Affymetrix). An average of 114 chromosomes were screened for each gene, providing 99% power to detect alleles of >5% frequency and 65% power to detect alleles of >1% frequency. Using these methods, the overall sensitivity of SNP discovery is >90%.1 Sequencing was performed to validate each putative SNP, and genotyping was performed with single-base extension using either fluorescence energy transfer or fluorescence polarization. A total of 85 SNPs were genotyped, including at least one from each of 62 genes related to vascular biology. SNPs were prioritized for genotyping on the basis of a preference for missense variation in protein sequence or high allele frequency in and around coding sequence; 17 SNPs were not identified at least once when assessed in a subset of 96 control individuals, and therefore, they were judged to be too rare to justify genotyping in the complete set of cases and controls. The final number of SNPs analyzed in the case-control study was 72.
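The quoted discovery power can be approximated with a simple calculation. The sketch below assumes that chromosomes are sampled independently and that an allele counts as detected if it appears at least once among the 114 screened chromosomes; the function name is illustrative, and the exact calculation behind the published figures (see reference 1) may differ. Under these assumptions the power is about 99.7% for a 5% allele and about 68% for a 1% allele, close to the stated 99% and 65%.

```python
def detection_power(allele_freq: float, n_chromosomes: int) -> float:
    """Probability that an allele of the given frequency is seen at least
    once among n independently sampled chromosomes (assumed model)."""
    # The allele is missed only if every sampled chromosome lacks it.
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes


if __name__ == "__main__":
    for freq in (0.05, 0.01):
        power = detection_power(freq, 114)
        print(f"allele frequency {freq:.0%}: power {power:.1%}")
```

Because the power depends only on the allele frequency and the number of chromosomes, the same function shows how quickly sensitivity falls for rarer alleles at a fixed screening depth.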
All analyses were done using the SAS statistical package (Version 8.0, SAS Institute Inc). Differences between cases and controls were assessed with a χ² statistic for categorical covariates and the Wilcoxon statistic for continuous covariates. Significance was determined using a continuity-adjusted χ² or Fisher’s exact test for each genotype compared with the homozygous wild-type for that locus. Odds ratios were calculated and presented with 95% confidence intervals. Multivariate logistic regression was used to adjust for sex, presence of hypertension, diabetes, and body mass index using the LOGISTIC procedure in SAS.
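As an illustration of the unadjusted comparison of one genotype against the homozygous wild-type, the following sketch computes an odds ratio and a 95% confidence interval from a 2×2 table. It uses the Woolf (log-odds) interval, which is an assumption: the text does not say how the intervals were derived, and the analyses themselves were run in SAS. The counts in the usage example are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(case_var, case_ref, ctrl_var, ctrl_ref, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-odds) confidence interval.

    case_var / ctrl_var: cases / controls with the variant genotype
    case_ref / ctrl_ref: cases / controls with the reference genotype
    All four counts must be positive.
    """
    odds_ratio = (case_var * ctrl_ref) / (case_ref * ctrl_var)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / case_var + 1 / case_ref + 1 / ctrl_var + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    or_, lo, hi = odds_ratio_ci(40, 60, 25, 75)
    print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Adjusted odds ratios such as those reported in the abstract require a logistic regression with the covariates listed above and cannot be reproduced from a 2×2 table alone.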
After identifying thrombospondin-1 as a gene implicated in premature CAD, the plasma samples from 240 cases that were previously collected in citrate anticoagulant and stored at −70°C were evaluated with an ELISA assay developed from a published procedure.2 All samples were analyzed in duplicate. Repeat measures of plasma thrombospondin were strongly correlated over all ranges of thrombospondin (r²=0.96).
The demographic characteristics of the 352 cases and 418 controls are presented in the Table. Cases were more likely than controls to be male, older, diabetic, hypertensive, and have a higher body mass index. The most common event that led to the inclusion of a case into the study was myocardial infarction (54%). Cases were enrolled in the study, on average, 9 years after their qualifying event, suggesting a survivor bias.
Genotype distributions for cases and controls are shown in the Online Table (can be found at http://www.circulationaha.org) for all loci examined. Eleven SNPs in 9 genes showed statistically significant differences (P<0.05) between cases and controls for CAD, myocardial infarction, or both. A variant in only one of these genes, MTHFR (C677T), has been the subject of conflicting reports of association with CAD.3–7 Besides MTHFR, the other variants represent novel associations. Our study did not confirm statistical association for several other variants that have been previously linked with a risk of thrombotic cardiovascular disease or CAD, including the PLA2 allele (L33P) of the platelet glycoprotein IIb/IIIa receptor, factor V Leiden (R506Q),8,9 prothrombin G20210A,9 hemochromatosis (C282Y),10 or P-selectin (T715P)11 variants.
The THBS4 variant A387P in the third repeat type II unit may affect the secondary structure of the protein and disrupt the Ca²⁺ binding site. The THBS1 N700S variant occurs in the first type III unit. Patients who were homozygous for the THBS1 variant (SS) had the highest odds ratio for myocardial infarction (8.16) and a significantly lower plasma level of THBS1 than other genotypes (median levels, 88 ng/mL for SS, 235 ng/mL for NS, and 189 ng/mL for NN genotypes; P=0.0019). Correcting for 100 independent hypotheses (≈50 genes tested for two outcomes) resulted in none of the associations reaching P<0.05. Only THBS4 reached a level of significance of P<0.10 for the association with myocardial infarction.
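To make the multiple-testing correction concrete, the sketch below shows what a correction for 100 independent hypotheses implies for an individual P value. It assumes a Bonferroni correction, which the text does not name explicitly; the function names and the raw P value in the usage example are hypothetical. Under this assumption, an uncorrected P value must fall below 0.0005 to remain significant at the 0.05 level and below 0.001 to remain significant at the 0.10 level.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    n_tests = 100  # about 50 genes tested for two outcomes
    for alpha in (0.05, 0.10):
        cutoff = bonferroni_threshold(alpha, n_tests)
        print(f"family-wise alpha {alpha}: per-test cutoff {cutoff:.4f}")
    # Hypothetical raw P value, for illustration only.
    print(f"corrected P: {bonferroni_adjust(0.0008, n_tests):.2f}")
```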
Test for Population Stratification
Underlying population stratification, caused by unequal proportions of ethnicities in the case and control populations, or ethnic admixture, can lead to spurious associations in case-control studies. Because we used a general population control group with self-reported ethnicity, we tested for the presence of population stratification in our sample using the method described by Reich and Goldstein.12 We selected 96 evenly spaced, unlinked SNPs from the SNP consortium database and successfully genotyped 72 of the markers in 100 randomly selected cases and 100 of the controls. Given the extremely low probability that any of these markers is linked to a causal variant, we expect that in the absence of population stratification, the χ² values would follow a χ² distribution with 1 degree of freedom, which has a mean of 1. The χ² values generated by our markers have a mean of 1.2, which was not significantly different from the value expected under the null hypothesis (P>0.05). Thus, the results of our analysis indicate that the general population controls used in this study are well matched to the cases. According to the population genetic simulations performed by Pritchard and Rosenberg13 for similar study designs, the probability of obtaining a spurious association at the candidate locus while failing to detect stratification (at the 0.05 level) is ≈5%.
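One simple way to check whether a mean χ² of 1.2 across 72 unlinked markers is compatible with the null hypothesis is to sum the per-marker statistics: under the null, the sum of 72 independent 1-degree-of-freedom χ² values follows a χ² distribution with 72 degrees of freedom. The sketch below is an assumption-laden illustration rather than the authors' exact procedure; it uses the Wilson-Hilferty normal approximation for the tail probability so that only the standard library is needed. With these inputs the approximate P value is about 0.12, consistent with the reported P>0.05.

```python
import math


def chi2_upper_tail(x: float, df: int) -> float:
    """Approximate P(chi-square with df degrees of freedom > x) using the
    Wilson-Hilferty cube-root normal approximation."""
    v = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - v)) / math.sqrt(v)
    # Upper tail of the standard normal distribution at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    n_markers = 72      # unlinked markers successfully genotyped
    mean_chi2 = 1.2     # observed mean of the per-marker statistics
    total = n_markers * mean_chi2
    p_value = chi2_upper_tail(total, n_markers)
    print(f"summed chi-square = {total:.1f} on {n_markers} df, P ~ {p_value:.2f}")
```

An exact tail probability from a statistics library would be preferable in practice; the approximation is used here only to keep the example self-contained.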
The work described here exploits high-throughput genomic technologies to perform a large-scale case-control genetic association study in patients with familial premature CAD. In total, some 72 SNPs were analyzed in ≈770 individuals, thus representing >50 000 genotypes. This study represents an attempt to use contemporary genomic technology to associate a large set of SNPs in a group of candidate genes implicated in arterial thrombosis and vascular biology with familial, premature coronary disease. Three novel SNPs from 3 distinct thrombospondin genes emerged as among the most highly associated variants. Each variant allele formed the basis of “at risk” genotypes that were each significantly associated with familial, premature myocardial infarction. The thrombospondin family of 5 extracellular matrix glycoproteins is known to play a pivotal role in cell adhesion: modulating vascular injury, coagulation, and angiogenesis and serving as a key ligand for CD36, an oxidized LDL receptor, and for integrins, including αvβ3.14–16 Thrombospondins have been demonstrated in atherosclerotic plaque,17,18 and thrombospondin deficiency has been associated with increased levels of matrix metalloproteinase-2, a protein linked to the vulnerability of atherosclerotic plaque.19 The common SNP variant in the type 3 repeat of thrombospondin-4 (A387P) is predicted to affect folding and secretion of the protein and to disrupt the calcium binding site.20,21 Indeed, although not previously implicated, the thrombospondin protein family members play critical roles in vascular integrity and thrombosis and, if altered, may be particularly likely to play a role in premature atherosclerosis and myocardial infarction.
It is clear that considerable additional work is needed to extend our findings. Replication of our novel statistical associations in independent populations of patients with familial, premature myocardial infarction, as well as nonfamilial, older onset disease will be critical to determining which of our observations may be generalizable. The identification of additional SNPs in and around these genes will be necessary to determine which individual variants or haplotypes are the true underlying cause of the observed associations. Defining the molecular mechanism linking the thrombospondin SNPs to adverse clinical outcomes will be key to understanding the pathophysiology of this pathway. Finally, multivariate models that incorporate the validated SNP associations and assess their interactive role with each other and additional covariates will provide the basis of comprehensive risk assessment models for ultimate use in clinical practice.
Given the large number of tests performed in our study, we anticipate that some of the apparent statistical associations we report will, in fact, have occurred by chance and that such associations will be indistinguishable from those that reflect some true underlying biological predisposition. Our initial attempts to replicate the association of the thrombospondin-4 variant in 2 smaller series of patients with early-onset CAD have failed to confirm the generality of our observation. Such failures to replicate genetic associations in independent populations are a vexing aspect of genetic association studies22 and reflect the difficulty of defining uniform clinical end points and the effects of confounding environmental influence and population-specific genetic modifiers. Inclusion in our patient cohort required that 2 members of a family were both affected by premature CAD. However, neither of the independent populations we tested with early-onset CAD was defined using such a stringent inclusion criterion, and the subset of myocardial infarction cases was not rigorously evaluated; these differences may account for our failure to replicate. The coincidence of finding association with 3 distinct SNPs in thrombospondin family members and the functional correlation of low plasma levels with the highest risk genotype for thrombospondin-1 strengthen the hypothesis of a potential biological link between thrombospondin variation and early-onset CAD and warrant further study to validate the association and elucidate the biological mechanism.
Coronary atherosclerosis is still the most important cause of death, and one would expect that a continuum in genetic liability exists between premature and typical CAD and myocardial infarction. The thrombospondin variants identified are interesting, but the interpretation of their actual importance relies on considerable further study, requiring independent replication and proof of a cause-and-effect relationship for the variants directly influencing the disease. Further work with genome-wide scanning of our sibships may prove helpful, as has been the case in identifying potential genes in a Finnish population with premature coronary disease23 or in sibships associated with longevity.24 Our study highlights some of the limitations in high-throughput candidate gene investigation and emphasizes that such work should be considered as exploratory and hypothesis-generating.
GeneQuest Investigators and Collaborators
Cleveland Clinic Foundation, Cleveland, Ohio: Eric J. Topol (Study Chairperson), David J. Moliterno, Gurunathan Murugesan, Olga Stenina, Kandace Kottke-Marchant, Edward F. Plow, Ruth Cannata, Patricia Welsh, and Monique Rosenthal; Emory University Hospital, Atlanta, Ga: Spencer B. King III, William Anderson, Joe Jean Borowski, and Kris Anderberg; Mayo Clinic, Rochester, Minn: David R. Holmes, Jr, Charanjit Rihal, and Sharon McIntire-Langworthy; University of Alabama Medical Center, Birmingham: William Rogers and Ann Snider; Duke University Medical Center, Durham, NC: L. Kristin Newby and Laura Drew; the Lindner Center for Clinical Cardiovascular Research, Cincinnati, Ohio: Dean Kereiakes, Eli Roth, and Louise Wohlford; LeBauer Cardiovascular Research Foundation, Greensboro, NC: Anthony De Franco and Teresa Schrader; St Joseph Hospital, Savannah, Ga: Phillip Gainey and Sandra Arsenault; Lancaster Heart Foundation, Lancaster, Pa: Paul Casale and Joann Tuzi; Latter Day Saints Hospital, Salt Lake City, Utah: Jeffrey Anderson, Juli Jerman, Rob Pearson, and Ann Allen; Diabetes and Glandular Associates, San Antonio, Tex: Sherwyn Schwartz and Sue Beasie; St Louis University Hospital, St Louis, Mo: Frank Aguirre, Sandra Aubuchon, and Kristin Weisbrod; the Heart Group, Saginaw, Mich: Jeffrey Carney and Muriel Harris; Michigan Heart and Vascular Institute, Ypsilanti: Jim Bengtson and Mary Adolphson; Oregon Cardiology Clinic, Portland: John Rudoff and Sue Williams; Whitehead Institute, MIT Center for Genome Research, Cambridge, Mass: Stacey Bolk, George Q. Daley, and Eric S. Lander; Millennium Pharmaceuticals, Cambridge, Mass: Jennifer Metivier and Jeannette McCarthy.
*The names of the investigators, research coordinators, and all collaborators are presented in the Appendix.
This article was published Online on October 29, 2001 (Circulation. 2001;104:r24-r27).
This article has a Data Supplement (Table), which can be found Online at http://www.circulationaha.org
Presented in part at the 73rd Scientific Sessions of the American Heart Association, New Orleans, La, November 13, 2000 and published in abstract form (Circulation. 2000;102[suppl II]:II-31).
Cargill M, Altshuler D, Ireland J, et al. Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat Genet. 1999; 22: 231–238.
Bergseth G, Lappegard KT, Videm V, et al. A novel enzyme immunoassay for plasma thrombospondin: comparison with beta-thromboglobulin as platelet activation marker in vitro and in vivo. Thromb Res. 2000; 99: 41–50.
Brattstrom L, Wilcken EL, Ohrvik J, et al. Common methylenetetrahydrofolate reductase gene mutation leads to hyperhomocysteinemia but not to vascular disease. Circulation. 1998; 98: 2520–2526.
Morita H, Taguchi J, Kurihara H, et al. Genetic polymorphism of 5,10-methylenetetrahydrofolate reductase (MTHFR) as a risk factor for coronary artery disease. Circulation. 1997; 95: 2032–2036.
Mager A, Lalezari S, Shohat T, et al. Methylenetetrahydrofolate reductase genotypes and early-onset coronary artery disease. Circulation. 1999; 100: 2406–2410.
Brugada R, Marian AJ. A common mutation in methylenetetrahydrofolate reductase gene is not a major risk of coronary artery disease or myocardial infarction. Atherosclerosis. 1997; 128: 107–112.
Abbate R, Sardi I, Pepe G, et al. The high prevalence of thermolabile 5–10 methylenetetrahydrofolate reductase (MTHFR) in Italians is not associated to an increased risk for coronary artery disease (CAD). Thromb Haemost. 1998; 79: 727–730.
Ridker PM, Hennekens CH, Lindpaintner K, et al. Mutation in the gene coding for coagulation factor V and the risk of myocardial infarction, stroke, and venous thrombosis in apparently healthy men. N Engl J Med. 1995; 332: 912–917.
De Stefano V, Martinelli I, Mannucci PM, et al. The risk of recurrent deep venous thrombosis among heterozygous carriers of both factor V Leiden and the G20210A prothrombin mutation. N Engl J Med. 1999; 341: 801–806.
Tuomainen TP, Kontula K, Nyyssonen K, et al. Increased risk of acute myocardial infarction in carriers of the hemochromatosis gene Cys282Tyr mutation: a prospective cohort study in men in eastern Finland. Circulation. 1999; 100: 1274–1279.
Kee F, Morrison C, Evans AE, et al. Polymorphisms of the P-selectin gene and risk of myocardial infarction in men and women in the ECTIM extension study: etude cas-temoin de l’infarctus myocarde. Heart. 2000; 84: 548–552.
Reich DE, Goldstein DB. Detecting association in a case-control study while correcting for population stratification. Genet Epidemiol. 2001; 20: 4–16.
Pritchard JK, Rosenberg NA. Use of unlinked genetic markers to detect population stratification in association studies. Am J Hum Genet. 1999; 65: 220–228.
Simantov R, Febbraio M, Crombie R, et al. Histidine-rich glycoprotein inhibits the antiangiogenic effect of thrombospondin-1. J Clin Invest. 2001; 107: 45–52.
Laherty CD, O’Rourke K, Wolf FW, et al. Characterization of mouse thrombospondin 2 sequence and expression during cell growth and development. J Biol Chem. 1992; 267: 3274–3281.
Bornstein P. Diversity of function is inherent in matricellular proteins: an appraisal of thrombospondin 1. J Cell Biol. 1995; 130: 503–506.
Wight TN, Raugi GJ, Mumby SM, et al. Light microscopic immunolocation of thrombospondin in human tissues. J Histochem Cytochem. 1985; 33: 295–302.
Riessen R, Fenchel M, Chen H, et al. Cartilage oligomeric matrix protein (thrombospondin-5) is expressed by human vascular smooth muscle cells. Arterioscler Thromb Vasc Biol. 2001; 21: 47–54.
Yang Z, Kyriakides TR, Bornstein P. Matricellular proteins as modulators of cell-matrix interactions: adhesive defect in thrombospondin 2-null fibroblasts is a consequence of increased levels of matrix metalloproteinase-2. Mol Biol Cell. 2000; 11: 3353–3364.
Bornstein P. Thrombospondins: structure and regulation of expression. FASEB J. 1992; 6: 3290–3299.
Geourjon C, Deleage G. SOPM: a self-optimized method for protein secondary structure prediction. Protein Eng. 1994; 7: 157–164.
von Kodolitsch Y, Pyeritz RE, Rogan PK. Splice-site mutations in atherosclerosis candidate genes: relating individual information to phenotype. Circulation. 1999; 100: 693–699.
Pajukanta P, Cargill M, Viitanen L, et al. Two loci on chromosomes 2 and X for premature coronary heart disease identified in early- and late-settlement populations of Finland. Am J Hum Genet. 2000; 67: 1481–1493.
Puca AA, Daly MJ, Brewster SJ, et al. A genome-wide scan for linkage to human exceptional longevity identifies a locus on chromosome 4. Proc Natl Acad Sci U S A. 2001; 98: 10505–10508.