(Circulation. 2008;118:1057-1063.)
© 2008 American Heart Association, Inc.
Statistical Primer for Cardiovascular Research
From the Division of Statistical Genomics, Department of Genetics, and the Division of Biostatistics, Washington University School of Medicine, St Louis, Mo.
Correspondence to Ingrid B. Borecki, PhD, Division of Statistical Genomics, Washington University School of Medicine, Campus Box 8506, 4444 Forest Park Blvd, St Louis, MO 63108. E-mail iborecki{at}wustl.edu
Key Words: atherosclerosis; epidemiology; genetics; inheritance patterns; mapping; meta-analysis; statistics
| Introduction |
|---|
Pedigree studies have been used fruitfully to identify genes influencing a wide range of monogenic, highly penetrant traits of biomedical importance, including a variety of inborn errors of metabolism and other genetic diseases (eg, cystic fibrosis, Duchenne muscular dystrophy, Huntington disease). These have been documented extensively in Mendelian Inheritance in Man (V.A. McKusick; http://www.ncbi.nlm.nih.gov/sites/entrez?db=OMIM).1 In general, complex traits, such as coronary heart disease and its risk factors, are distinguished from these conditions in that (1) they are relatively common; (2) although they cluster in families, they do not demonstrate clean mendelian segregation patterns, suggesting the possibility of multiple underlying genes; (3) they are often influenced by >1 underlying pathway where several defects or mutations might contribute to phenotypic variation; (4) the marginal effect of any single gene on a relevant clinical end point such as atherosclerosis is likely to be small; and (5) alternative mechanisms or pathways may lead to a particular clinical outcome, that is, there may be genetic heterogeneity. These properties produce many challenges to the dissection of the genetic architecture of complex traits, some of which are advantageously met with family studies.
Family studies have several favorable features for gene discovery. Studies of extended pedigrees, or even nuclear families, are likely to represent a more homogeneous and limited set of causative genes and pathways. These features enhance statistical power for gene discovery. This approach has allowed discovery of novel loci and pathways; in a recent example, ascertainment of families with early coronary artery disease and apparent mendelian segregation led to the identification of a novel associated mutation in LRP6 (low-density lipoprotein receptor–related protein 6) (Mani et al, 2007).2 Clinical characteristics common to family members also can be used to reduce heterogeneity by defining subgroups of families for analysis (eg, early-onset breast cancer; Hall et al, 1990),3 or, in the domain of cardiovascular disease, families with maturity-onset diabetes of the young (Bowden et al, 1992)4 or familial combined hyperlipidemia (Badzioch et al, 2004).5 Analysis of trait segregation in carefully characterized pedigrees and demonstration of linkage with known genetic markers remain particularly robust and fruitful approaches to gene discovery.
Another favorable feature of family studies in contrast to studies of unrelated individuals rests in the issue of controls. The analysis of phenotypes among family members is controlled to some extent for both genetic background and environmental exposures. Because family members share a predictable proportion of their genes identical-by-descent, the background genetic variation is controlled to some extent as a function of the degree of relationship (or kinship coefficient), which can be modeled as a polygenic component. In the extreme, monozygotic twins have a strong control for genotype, leaving trait variation to epigenetic phenomena, environmental modifiers, or interactions. Similarly, (close) family members also tend to have more homogeneous environmental exposures, living in similar geographic locations with similar socioeconomic status, and perhaps even similar health-related habits such as diet, smoking, alcohol consumption, and habitual physical activity. Although these factors are not as strongly controlled as they might be in animal models, studies of families reduce residual noise variance, thereby enhancing power to detect relevant trait determinants.
Finally, on the technical side, family data allow a deeper level of genotyping quality control than is possible in studies of unrelated individuals. High rates of mendelian inconsistencies (see Glossary in the online-only Data Supplement) or markers that show significant deviations from Hardy-Weinberg equilibrium (see Glossary in the online-only Data Supplement) can be signs of genotype error, sample mixups, and quality problems.
On the other hand, family studies have disadvantages. It is more difficult and therefore more costly to identify, recruit, and enroll entire pedigrees than it is to study unrelated individuals, especially in a mobile society such as that of the United States. If one wishes to study the extremes of a distribution, such as hypertension versus hypotension or high versus low atherosclerosis as measured by intimal-medial wall thickness or coronary artery calcification, a case-control study will definitely be simpler and cheaper and may be a more efficient design, with all other factors being equal (eg, good matching, homogeneity of environmental exposures, and control of background genetic variation).
| Approaches to Gene Mapping |
|---|
Linkage
In linkage studies, we seek to identify trait loci that cosegregate with known genetic markers within families. Trait and marker loci will remain on the same gametic haplotype as a function of the distance between the 2 loci, which can be measured as the recombination frequency. Classic parametric linkage analysis explicitly models the linkage, estimating the recombination fraction under a variety of trait models (eg, dominant, additive, recessive) with appropriate penetrance functions. The support for linkage can be quantified as the log (to the base 10) of a likelihood ratio of a linkage hypothesis compared with a null model, called the logarithm of odds (LOD) score (Morton, 1955; Cottingham et al, 1993).7,8 In this approach, the effect of the locus influencing either a disease or a quantitative trait is explicitly modeled as a diallelic locus with a specific penetrance. This paradigm is quite powerful and is appropriate when good justification exists for the assumed genetic model. It has been used extensively to create the map of the human genome (among genetic markers whose genotype is equivalent to the phenotype) as well as for a number of diseases in which the mode of inheritance is well known (eg, fully penetrant, recessive inheritance for cystic fibrosis). However, in the case of complex traits, strong assumptions about a single underlying trait locus seem untenable, and alternative methods have emerged that avoid the pitfall of almost certainly incorrect assumptions about mode of inheritance.
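The LOD score described above can be illustrated with a minimal Python sketch. It assumes the simplest textbook situation, which is not the general pedigree likelihood used in practice: a set of phase-known, fully informative meioses, of which a known number are recombinant. The function names `lod_score` and `max_lod` and the example counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD at recombination fraction theta versus free recombination (0.5)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    nonrecombinants = meioses - recombinants
    if theta == 0.0:
        # At theta = 0 a single recombinant makes the likelihood zero.
        if recombinants > 0:
            return float("-inf")
        log_lik = 0.0
    else:
        log_lik = (recombinants * math.log10(theta)
                   + nonrecombinants * math.log10(1.0 - theta))
    log_null = meioses * math.log10(0.5)
    return log_lik - log_null


def max_lod(recombinants, meioses, grid=50):
    """Maximize the LOD over an evenly spaced grid of theta in [0, 0.5]."""
    thetas = [0.5 * k / grid for k in range(grid + 1)]
    return max((lod_score(recombinants, meioses, t), t) for t in thetas)
```

For example, `max_lod(2, 20)` returns the maximized LOD together with the grid value of the recombination fraction at which it occurs; with 2 recombinants in 20 meioses the maximum falls at a recombination fraction of 0.1.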
Nonparametric methods obviate the need to characterize the trait locus by focusing on relative pairs (eg, sibs) and the correlation between the allele identity at specific locations and the similarity of their phenotypes. Thus, for a disease trait, affected sib pairs would be expected to share a greater proportion of alleles identical by descent at a marker that is linked to a trait locus than expected under the mendelian null (50% allele sharing). Likewise, the more alleles shared identical-by-descent at a linked marker locus, the more similar are the quantitative phenotypes under the alternative hypothesis of linkage; under the null, no relationship exists between the two. These general expectations have given rise to a number of statistical strategies for linkage analysis including nonparametric linkage scores (Kruglyak et al, 1996)9 and variance components models (Almasy and Blangero, 1998; Province et al, 2003; Abecasis et al, 2002).10–12
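The affected sib pair expectation can be made concrete with a short sketch of a "mean" allele-sharing test. This is an illustration under stated assumptions rather than any of the cited methods: it assumes that the number of alleles shared identical by descent (0, 1, or 2) is known without ambiguity for every pair, and it uses the null distribution (1/4, 1/2, 1/4), under which the proportion shared has mean 1/2 and variance 1/8. The function name `asp_mean_test` is hypothetical.

```python
import math


def asp_mean_test(ibd_counts):
    """One-sided test for excess IBD sharing in affected sib pairs.

    ibd_counts holds, for each pair, the number of marker alleles
    shared identical by descent (0, 1, or 2).
    Returns (mean proportion shared, z statistic, one-sided P value).
    """
    n = len(ibd_counts)
    mean_share = sum(ibd_counts) / (2.0 * n)
    # Null: proportion shared has mean 1/2 and variance 1/8 per pair.
    z = (mean_share - 0.5) / math.sqrt(1.0 / (8.0 * n))
    # Upper-tail standard normal probability.
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return mean_share, z, p_value
```

In 100 pairs with 20, 50, and 30 pairs sharing 0, 1, and 2 alleles, the mean sharing is 55%, which is only modestly above the null value of 50%.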
The power of nonparametric linkage has been extensively characterized. The actual effect size of a locus is a function of the allele frequencies, the penetrance, and the recombination fraction (where the latter 2 are confounded). In addition, all the usual factors affect power, including sample size, pedigree informativeness (size and variety of biological relationships), marker informativeness, the test statistic, and critical value for the statistical test. Critically, linkage analysis is inherently limited as to the smallest effect detectable when all other parameters are asymptotically at their maxima (eg, as large a sample size as could be realistically obtained) or fixed to their optimal values (eg, recombination fraction θ=0). Risch (1990)13 explored the power characteristics of a nonparametric linkage statistic under various conditions and reported adequate power (≥80%) to detect loci accounting for a minimal elevated sibling recurrence risk of 1.5 to 1.7. Similarly, simulation studies performed for the Family Heart Study with >3300 subjects in 510 extended families demonstrate that, with the use of variance components linkage models, only loci influencing a minimum of 8% to 10% of the trait variation are potentially detectable even allowing for a liberal critical significance level of 5% (Figure 1). For perspective, the apolipoprotein E locus accounts for 8% of variation in total cholesterol and 4% in low-density lipoprotein cholesterol (Boerwinkle et al, 1987),14 and the ε4 allele is associated with an increased risk of coronary heart disease (odds ratio 1.25) compared with the ε3 allele (Wilson et al, 1994).15 It is not clear whether apolipoprotein E would have been detected as an important coronary heart disease locus via linkage. Although locus discovery by linkage is a robust strategy, it is likely to be productive only for regions with a substantial effect on the trait variation, either via a single locus or a cluster of loci each with smaller effect.
Association
Association studies seek to directly correlate allelic variation with phenotypic variation, with the goal of statistically identifying putative genetic causes. If the relevant genetic variation is measured (eg, apolipoprotein E variants), then the power for discovery is simply a function of the effect size. However, genomewide association scans utilize panels of markers that either are anonymous and uniformly distributed or are tags for common haplotypes across the genome. Typically, these markers are common (minor allele frequency ≥5%) single nucleotide polymorphisms (SNPs), which serve well as markers but are not a catalog of all possible causal variants. Thus, these markers are not necessarily functional but may be in linkage disequilibrium with underlying causal variants. In this case, the power to identify relevant loci is a function of the linkage disequilibrium between the causative and measured variants.
Two general association strategies exist: simple statistical models to correlate risk genotypes with outcome (eg, contingency table analysis, logistic regression, linear regression, Cox proportional hazard models), which are typically applied to unrelated individuals, or family-based tests that rely on transmission patterns from parents to offspring. Quite different information is used in the latter, which seek to identify alleles with excess transmission to affected offspring compared with mendelian expectations (Spielman et al, 1993).16 The transmitted alleles to affected offspring form the "case" genotype, whereas the untransmitted alleles are the "control" genotype. Heterozygosity in both parents is necessary to render a particular trio fully informative for the transmission disequilibrium test, which means that, typically, some proportion of families is not used if allelic transmissions cannot be resolved. Although this leads to some sacrifice in power, these transmission tests are very robust to population stratification (see Glossary in the online-only Data Supplement) (Ewens and Spielman, 1995),17 which, if not accounted for, can lead to elevated type I error rates. The transmission disequilibrium test (TDT) is equivalent to the classic McNemar test, in which we look at the 2×2 table (Table) of transmitted versus untransmitted alleles in the parent-offspring trio in which the offspring is affected.
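[Table. 2×2 classification of transmitted versus untransmitted parental alleles (allele1, allele2) in trios with an affected offspring; cells A and D lie on the diagonal, and cells B and C lie off the diagonal.]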
For subjects in cells A and D, we cannot tell which of the alleles was transmitted from parents to children (identical by descent) because both are identical by state. Thus, no information is available on transmission in the diagonal cells, and all of the information is in B and C. Because the children are all affected, under the null hypothesis allele1 and allele2 are equally likely to be transmitted. Thus, the expected count for each of allele1 and allele2 is (B+C)/2. On the other hand, if a true genotype-phenotype association exists, then one allele will be preferentially transmitted. This hypothesis can be tested by an (O–E)²/E χ² approach, where O is the observed count and E is the expected count:

$$\chi^2 = \frac{\left[B - \frac{B+C}{2}\right]^2}{\frac{B+C}{2}} + \frac{\left[C - \frac{B+C}{2}\right]^2}{\frac{B+C}{2}} = \frac{(B-C)^2}{B+C},$$

which is the same as McNemar's formula. Note that the TDT is a type of "case-only design" in that the parental phenotypes are not used, and only their genotypes are relevant. The TDT is a test of both linkage and association because, in effect, the measured genotype being tested is marking transmission of a haplotype from parents to children. Thus, the TDT (and its extensions) can pick up a genetic association signal from far away (perhaps megabases) from the observed marker.
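The TDT calculation can be sketched in a few lines of Python. The sketch takes the off-diagonal counts B and C of the transmission table as given, computes the χ² statistic in its (O–E)²/E form, and converts it to a P value using the closed form for 1 degree of freedom. The function name `tdt` and the example counts are hypothetical.

```python
import math


def tdt(b, c):
    """Transmission disequilibrium test from the off-diagonal counts.

    b and c are the counts in the 2 informative (off-diagonal) cells.
    Returns (chi-square statistic with 1 df, P value).
    """
    if b + c == 0:
        raise ValueError("no informative transmissions")
    expected = (b + c) / 2.0
    chi2 = ((b - expected) ** 2 + (c - expected) ** 2) / expected
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With B=60 and C=40, the (O–E)²/E form gives (10²+10²)/50=4, which matches McNemar's (B–C)²/(B+C)=400/100.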
A related extension of the TDT is the Family Based Association Test (FBAT; Laird et al, 2000).18 The basic FBAT statistic is U=S–E[S], where

$$S = \sum_{i}\sum_{j} X_{ij} T_{ij},$$

and Xij is the genotype for the jth offspring in the ith family, Tij=(Yij–µij), and Yij denotes the phenotype. E[S] is calculated under the null hypothesis of no genotype-phenotype association, so that E[U]=0 under the null. Calculating V=Var(U)=Var(S) under the null, we get the standardized

$$Z = \frac{U}{\sqrt{V}}$$

as approximately distributed as N(0,1), which yields the χ² test: χ²=U′ · V⁻¹ · U with degrees of freedom equal to the rank of V. Like the TDT, FBAT does not use phenotypic information from parents and is actually a combined test of linkage and association because it also discounts families in which transmission from parents to children is ambiguously observed (because of homozygosity in the parents). This has caused much confusion in the literature because readers assume that FBAT must be a "pure" association test (it may be more properly termed FBLAT [Family Based Linkage and Association Test]).
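A minimal sketch of the FBAT statistic follows, restricted to a deliberately simple case that is an assumption of this example and not the general FBAT implementation: complete trios with both parents genotyped, a diallelic marker coded as the number of copies (0, 1, or 2) of the tested allele, and a single constant offset standing in for µij. Under mendelian transmission, the expected offspring genotype is the average of the parental codes, and each heterozygous parent contributes a variance of 1/4. The function name `fbat_trios` is hypothetical.

```python
import math


def fbat_trios(trios, offset=0.0):
    """FBAT-type score test for parent-offspring trios.

    trios: iterable of (father_g, mother_g, child_g, phenotype), with
    genotypes coded as copies (0, 1, 2) of the tested allele.
    offset: constant subtracted from the phenotype to form T = Y - offset.
    Returns (U, V, Z, chi-square with 1 df).
    """
    u = 0.0
    v = 0.0
    for father_g, mother_g, child_g, phenotype in trios:
        t = phenotype - offset
        # Mendelian expectation of the offspring genotype given the parents.
        expected_x = (father_g + mother_g) / 2.0
        # Only heterozygous parents contribute transmission variance.
        var_x = 0.25 * ((father_g == 1) + (mother_g == 1))
        u += t * (child_g - expected_x)
        v += t * t * var_x
    if v == 0.0:
        raise ValueError("no informative families")
    z = u / math.sqrt(v)
    return u, v, z, z * z
```

When all offspring are affected (phenotype 1, offset 0), this statistic reduces to the TDT: trios with homozygous parents add nothing to U or V, and the χ² value equals (B–C)²/(B+C) for the heterozygous-parent transmissions.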
Type I Errors and Association Tests in Families
Standard association tests also can be done in families. These approaches are generally more powerful because all subjects are informative regardless of genotype; however, a complication exists. If a standard genotype-phenotype generalized linear model holds in every family member, the residuals in this model are not independently and identically distributed, as is the case for unrelated individuals. Instead, the residual variance-covariance matrix is sparse (nonzero correlations only in family blocks). Ignoring the cluster correlation can inflate type I error, producing false inferences. The Huber-White "sandwich" estimator provides a robust variance-covariance matrix estimate for clustered sampling (Diggle et al, 1994).19 For S families, the sandwich variance estimator is as follows:

$$\widehat{\mathrm{Var}}(\hat{\beta}) = \left(X' V^{-1} X\right)^{-1} \left[\sum_{i=1}^{S} X_i' V_i^{-1} \hat{e}_i \hat{e}_i' V_i^{-1} X_i\right] \left(X' V^{-1} X\right)^{-1},$$

where X is the fixed effects design matrix, V is the variance-covariance matrix, matrices indexed by i are for the ith family, and the unindexed matrices are for the entire data set, so that

$$\hat{e}_i = Y_i - X_i \hat{\beta}$$

is the vector of ordinary least squares residuals for the ith family (ie, ignoring familial correlations), which gives an initial estimate of familial correlations. It is "sandwiched" like meat between 2 information matrices to give a more robust variance estimate.
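The sandwich calculation can be sketched in plain Python for the simplest case of a single genotype covariate plus an intercept. The sketch assumes a working-independence variance matrix (V proportional to the identity), in which case the scale factor cancels and the estimator reduces to (X′X)⁻¹[Σ Xi′êiêi′Xi](X′X)⁻¹; the function name `cluster_robust_ols` and its argument layout are hypothetical and do not correspond to any particular package.

```python
def cluster_robust_ols(y, x, family):
    """Fit y = alpha + beta * x by OLS; return (alpha, beta, cov).

    cov is the 2x2 Huber-White covariance of (alpha, beta), with the
    residual cross-products summed within each family (cluster).
    """
    n = len(y)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(v * w for v, w in zip(x, y))
    det = n * sxx - sx * sx
    # "Bread": (X'X)^-1 for the design columns (1, x).
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]
    alpha = bread[0][0] * sy + bread[0][1] * sxy
    beta = bread[1][0] * sy + bread[1][1] * sxy

    # Per-family score vectors X_i' e_i built from the OLS residuals.
    scores = {}
    for yi, xi, fam in zip(y, x, family):
        resid = yi - alpha - beta * xi
        score = scores.setdefault(fam, [0.0, 0.0])
        score[0] += resid
        score[1] += resid * xi

    # "Meat": sum over families of (X_i' e_i)(X_i' e_i)'.
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for score in scores.values():
        for r in range(2):
            for c in range(2):
                meat[r][c] += score[r] * score[c]

    def matmul(a, b):
        return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
                for r in range(2)]

    cov = matmul(matmul(bread, meat), bread)
    return alpha, beta, cov
```

A useful property to check is that entering every observation twice within the same family leaves the robust variance of β unchanged, whereas a naive variance that treats the copies as independent would shrink; perfectly redundant relatives add no information.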
The sandwich method allows for tests in family data without inflation of type I error arising from ignoring familial correlations (eg, Province et al, 2000)20 and uses all data from all subjects in all families, even those that FBAT (Laird et al, 2000)18 or quantitative TDT (QTDT; Abecasis et al, 2002)12 find are "uninformative" because vertical transmission is ambiguous. It can also be used for qualitative phenotypes (Liu, 1998).21
Another genotype-phenotype association strategy in family data is a bootstrap procedure (eg, Province et al, 2000)22 creating an independently and identically distributed subsample of unrelated individuals by randomly choosing 1 subject per family. However, this greatly reduces power because the effective sample size in each subsample is the number of pedigrees rather than subjects. Bootstrap theory (Efron, 1982)23 suggests that it is possible to get "good" parameter estimates by bootstrap sampling of entire families with replacement and averaging results across samples, which restores the effective sample size to the number of individuals and not pedigrees. Bootstrapping families preserves the dependencies between the family genotype and phenotype vectors.
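The family bootstrap can be sketched as follows. The sketch resamples whole families with replacement, refits a simple regression of phenotype on genotype in each resample, and summarizes the slopes by their mean and standard deviation. The data layout (a dictionary mapping a family identifier to a list of genotype-phenotype pairs), the function names, and the default of 1000 resamples are assumptions of this example.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope of the least squares regression of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    return sxy / sxx


def family_bootstrap(families, n_boot=1000, seed=0):
    """Bootstrap the genotype slope by resampling entire families.

    families: dict mapping family id -> list of (genotype, phenotype).
    Returns (mean slope, standard deviation of slopes across resamples).
    """
    rng = random.Random(seed)
    ids = list(families)
    slopes = []
    for _ in range(n_boot):
        # Draw as many families as in the original sample, with replacement.
        sample = rng.choices(ids, k=len(ids))
        x = [g for fid in sample for g, _ in families[fid]]
        y = [p for fid in sample for _, p in families[fid]]
        if len(set(x)) < 2:
            continue  # slope undefined when the genotype does not vary
        slopes.append(ols_slope(x, y))
    return statistics.mean(slopes), statistics.stdev(slopes)
```

Because each draw keeps a family's genotype and phenotype vectors together, the within-family dependence is carried into every resample, and the spread of the slopes across resamples serves as the standard error.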
With the use of SNP data from the National Heart, Lung, and Blood Institute (NHLBI) Family Heart Study (n=2753), 1 SNP was arbitrarily designated as "causative," and a phenotype, Y, was simulated via the regression Y=α+β×SNP+ε with parameters (α, β) and ε~N(0, Σ), where Σ is the family variance-covariance matrix for a polygenic trait with 40% heritability (SEGPOWER; Province et al, 2003).20 The simulated regression model errors are not independently and identically distributed but are correlated within families via polygenic transmission. In Figure 2, the nominal P values are plotted against their ranks. Under H0 (top panel), all methods give the correct null distribution, tracking the identity line. In the middle panel, β is set to a locus-specific heritability of 5%. Under this alternative, both the sandwich and family bootstrap yield more power than either QTDT or FBAT, which eliminate "uninformative" families with ambiguous transmission. In these cases, the family-based approaches are less powerful than the analysis of individual subjects' data.
Population stratification was generated in this simulated example by arbitrarily dividing our families into 2 equal strata, keeping β=0 for each stratum but offsetting the phenotypic means by strata-specific intercepts, αSTRATA, and also swapping the minor with the major alleles of the causative locus in 1 stratum only. Analyzing the data in the usual way or using a method that does not protect against hidden stratification produces an overall significant regression that is a false positive, because β=0 in each stratum (bottom panel, Figure 2). FBAT P values almost perfectly track the expected uniform distribution, whereas QTDT is actually slightly overly conservative. The sandwich estimator is slightly liberal but affords some protection. The family bootstrap provides almost no population stratification protection, resulting in serious type I error inflation. Thus, family-based transmission tests have the advantage in the presence of stratification.
Families Can Be More Powerful Than Unrelated Subjects for Association
The conventional wisdom is that unrelated individuals are more powerful for genetic associations than families, but several investigators are now finding the opposite to be true (eg, Krull, 2007; Wessel et al, 2007).24,25 The argument against families is that adding a nonindependent subject to a sample does not add a whole extra person's worth of information. It only adds a fraction of information depending on the degree of familial correlation. Indeed, if genotype, phenotype, and residuals were perfectly correlated, then all data on the 2 subjects are identical and therefore redundant. However, this never actually happens in families. Even in the case of identical twins, complex phenotypes are never identical, and neither are residuals. Error variance is a critical determinant of power. The smaller the unexplained error variance, the greater is the power to estimate all model parameters. Families can be more powerful than unrelated individuals because extra information exists with which to explain variation, thus reducing error variance.
Unrelated individuals are sampled from larger family units. A sample of J sib pairs, with phenotype Yij correlated to genotype Sij, will have the siblings correlated for reasons beyond the genotype Sij (on average, sibs share half their genome identical by descent, at least some of which may affect the phenotype), so the "error" in this regression model on 2J subjects comes from 2 sources: (1) a variance component γij that is pairwise correlated in siblings by ρ and (2) an independent residual eij:

$$Y_{ij} = \alpha + \beta S_{ij} + (\gamma_{ij} + e_{ij}), \qquad \mathrm{corr}(\gamma_{i1}, \gamma_{i2}) = \rho.$$

An equal sample of 2J unrelated individuals (selecting 1 from each of twice as many sib pairs, ie, 2J pairs comprising 4J siblings) will have the same genotype-phenotype correlation [same true (α,β)], but its residual will be the sum of the 2 variance components in parentheses in the first model. In the sib pairs, we can estimate the familial variance component γij (sandwich estimator, above), so the unexplained variance is only that of eij. Thus, power to estimate the gene effect of interest (β) is increased in the family sample over that in the unrelated subjects when explicitly modeling the correlation among family members. Intuitively, this makes sense. Extra information is available in the family design that is unavailable in unrelated individuals, and extra information always reduces error and boosts power at the same sample sizes.
| Meta-analysis Combining Linkage/Association Results |
|---|
The correlated meta-analysis is based on Fisher's method of combining P values (1925), in which independent studies test the same hypothesis. For nonparametric linkage, because all of the negative evidence is truncated at LOD=0, the P value distribution is a nonuniform discrete/continuous mixture, but this complication is overcome by interpreting LOD=0 as P=1/[2ln(2)]≈0.72 in Fisher's formula (Province, 2001).27 For nonindependent scans (as will occur when some of the same subjects have been used to generate both linkage and association), the mixture distribution is quite complex. The P values are transformed to a normal scale Zi=probit(pi), and the basic multivariate statistics theorem is used that if (Z1, Z2, ..., ZK)~N(0, ΣK×K), then

$$\sum_{i=1}^{K} Z_i \sim N\left[0, \mathrm{SUM}\left(\Sigma_{K\times K}\right)\right],$$

where SUM denotes the sum of all elements of the matrix. It is only necessary to estimate ΣK×K (the variance-covariance matrix of scans) to apply the method. Complications include the truncation of negative linkage evidence (LOD≤0) as well as the fact that at least some genomic regions in each scan should be under the alternative. But since the majority of the genome is under the null, the contamination should be relatively minor. Both complications are minimized by dichotomizing the evidence at each locus around its natural balance point. Under H0, we expect linkage scans to be approximately half positive and half negative (LOD>0 versus LOD≤0). Similarly, for a genomewide association scan under H0, we expect a 50:50 split between P<0.5 and P≥0.5. These critical points can be used to roughly dichotomize the evidence at each locus. Among K genomewide scans (linkage or association), for each pair the 2×2 table of dichotomized evidence is formed, and the underlying tetrachoric correlation is estimated to obtain estimates of ΣK×K to discount inflated meta-analysis evidence. The correlated meta method works well in simulation and is being used by the GeneLink consortium of genomewide linkage studies (sponsored by the NHLBI; https://genelink.nhlbi.nih.gov/index.jsp). This meta-analysis approach represents a method by which numerous lines of evidence from the analysis of family data (or from other study designs) can be combined to achieve optimal inferences about the presence and location of complex trait genes.
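The 2 combining rules discussed above can be sketched in Python using only the standard library. The first function applies Fisher's method to independent P values, using the closed-form upper tail of a χ² distribution with an even number of degrees of freedom. The second applies the probit transformation and divides the summed Z scores by the square root of the sum of all elements of the scan correlation matrix. The matrix is taken as given here; estimating it from tetrachoric correlations is not shown. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_combined(p_values):
    """Fisher's combined P value for independent tests."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # (chi-square) / 2
    # Upper tail of chi-square with 2k df:
    # exp(-x/2) * sum_{j<k} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total


def correlated_z_meta(p_values, sigma):
    """Combine possibly correlated P values on the probit scale.

    sigma: K x K variance-covariance (correlation) matrix of the scans,
    supplied as a list of rows. Small P values map to negative Z, so the
    combined P value is the lower tail of the standardized sum.
    """
    std_normal = NormalDist()
    z_sum = sum(std_normal.inv_cdf(p) for p in p_values)
    var_sum = sum(sum(row) for row in sigma)  # SUM of all matrix elements
    return std_normal.cdf(z_sum / math.sqrt(var_sum))
```

Two scans that each give P=0.05 combine to a smaller P value when they are independent, but to exactly 0.05 when they are perfectly correlated, which is the discounting of redundant evidence that the method is designed to provide.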
| Conclusions |
|---|
| Acknowledgments |
|---|
Dr Borecki is principal investigator of 2 R01s of family genetic studies: R01DK068336, Mapping Adiposity QTLS in the NHLBI Family Heart Study; and R01DK075681, Genetic Epidemiology of Metabolic Diseases of Obesity. Dr Province is Principal Investigator of 3 R01/U01s of family genetic studies: R01HL087700, Family Health Scan (FHS-SCAN) Genome Wide Association Scan for Atherosclerosis Pathway Genes; U01AG023746, Extreme Longevity Family Study–DMCC; and U01HL088655, Program for Genetic Interactions (PROGENI) Network Data Coordinating Center.
| Footnotes |
|---|
| References |
|---|
2. Mani A, Radhakrishnan J, Wang H, Mani A, Mani MA, Nelson-Williams C, Carew KS, Mane S, Najmabadi H, Wu D, Lifton RP. LRP6 mutation in a family with early coronary disease and metabolic risk factors. Science. 2007; 315: 1278–1282.
3. Hall JM, Lee MK, Newman B, Morrow JE, Anderson LA, Huey B, King MC. Linkage of early-onset familial breast cancer to chromosome 17q21. Science. 1990; 250: 1684–1689.
4. Bowden DW, Akots G, Rothschild CB, Falls KF, Sheehy MJ, Hayward C, Mackie A, Baird J, Brock D, Antonarakis SE, et al. Linkage analysis of maturity-onset diabetes of the young (MODY): genetic heterogeneity and nonpenetrance. Am J Hum Genet. 1992; 50: 607–618.
5. Badzioch MD, Igo RP Jr, Gagnon F, Brunzell JD, Krauss RM, Motulsky AG, Wijsman EM, Jarvik GP. Low-density lipoprotein particle size loci in familial combined hyperlipidemia: evidence for multiple loci from a genome scan. Arterioscler Thromb Vasc Biol. 2004; 24: 1942–1950.
6. Borecki IB, Province MA. Linkage and association: basic concepts. In: Rao DC, Gu CC, eds. Genetic Dissection of Complex Traits. 2nd ed. New York, NY: Academic Press (Elsevier); 2008.
7. Morton NE. Sequential tests for the detection of linkage. Am J Hum Genet. 1955; 7: 277–318.
8. Cottingham RW, Idury RM, Schaffer AA. Fast sequential genetic linkage computation. Am J Hum Genet. 1993; 53: 252–263.
9. Kruglyak L, Daly MJ, Reeve-Daly MP, Lander ES. Parametric and nonparametric linkage analysis: a unified multipoint approach. Am J Hum Genet. 1996; 58: 1347–1363.
10. Almasy L, Blangero J. Multipoint quantitative-trait linkage analysis in general pedigrees. Am J Hum Genet. 1998; 62: 1198–1211.
11. Province MA, Rice T, Borecki IB, Gu C, Rao DC. A multivariate and multilocus variance components approach using structural relationships to assess quantitative trait linkage via SEGPATH. Genet Epidemiol. 2003; 24: 128–138.
12. Abecasis GR, Cherny SS, Cookson WO, Cardon LR. Merlin: rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet. 2002; 30: 97–101.
13. Risch N. Linkage strategies for genetically complex traits, II: the power of affected relative pairs. Am J Hum Genet. 1990; 46: 229–241.
14. Boerwinkle E, Visvikis S, Welsh D, Steinmetz J, Hanash SM, Sing CF. The use of measured genotype information in the analysis of quantitative phenotypes in man, II: the role of the apolipoprotein E polymorphism in determining levels, variability, and covariability of cholesterol, betalipoprotein, and triglycerides in a sample of unrelated individuals. Am J Med Genet. 1987; 27: 567–582.
15. Wilson PW, Myers RH, Larson MG, Ordovas JM, Wolf PA, Schaefer EJ. Apolipoprotein E alleles, dyslipidemia, and coronary heart disease: the Framingham Offspring Study. JAMA. 1994; 272: 1666–1671.
16. Spielman RS, McGinnis RE, Ewens WJ. Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet. 1993; 52: 506–516.
17. Ewens WJ, Spielman RS. The transmission/disequilibrium test: history, subdivision, and admixture. Am J Hum Genet. 1995; 57: 455–464.
18. Laird NM, Horvath S, Xu X. Implementing a unified approach to family based tests of association. Genet Epidemiol. 2000; 19 (suppl 1): S36–S42.
19. Diggle PJ, Liang KY, Zeger SL. Analysis of Longitudinal Data. Oxford, UK: Clarendon Press; 1994.
20. Province MA, Rice TK, Borecki IB, Gu C, Kraja A, Rao DC. Multivariate and multilocus variance components method, based on structural relationships to assess quantitative trait linkage via SEGPATH. Genet Epidemiol. 2003; 24: 128–138.
21. Liu H. Robust standard error estimate for cluster sampling data: a SAS/IML macro procedure for logistic regression with huberization: SUGI23. 1998; 205.
22. Province MA, Arnett DK, Hunt SC, Leiendecker-Foster C, Eckfeldt JH, Oberman A, Ellison RC, Heiss G, Mockrin SC, Williams RR. Association between the alpha-adducin gene and hypertension in the HyperGEN Study. Am J Hypertens. 2000; 13: 710–718.
23. Efron B. The Jackknife, the Bootstrap, and Other Resampling Plans. Philadelphia, Pa: SIAM; 1982.
24. Krull JL. Using multilevel analyses with sibling data to increase analytic power: an illustration and simulation study. Dev Psychol. 2007; 43: 602–619.
25. Wessel J, Schork AJ, Tiwari HK, Schork NJ. Powerful designs for genetic association studies that consider twins and sibling pairs with discordant genotypes. Genet Epidemiol. 2007; 31: 789–796.
26. Roeder K, Bacanu SA, Wasserman L, Devlin B. Using linkage genome scans to improve power of association in genome scans. Am J Hum Genet. 2006; 78: 243–252.
27. Province MA. The significance of NOT finding a gene. Am J Hum Genet. 2001; 69: 660–663.
28. Province MA. Meta-analyses of correlated genomic scans. Genet Epidemiol. 2005; 29: 137.
29. Amos CI. Successful design and conduct of genome-wide association studies. Hum Mol Genet. 2007; 16: 220–225.