A Possible Explanation of Statistical Interaction, With Application to the Effects of Soluble Thrombomodulin and Soluble Intercellular Adhesion Molecule-1 on Coronary Heart Disease
Wu et al (1) estimated the relative risks of coronary heart disease (CHD) according to tertile of soluble thrombomodulin (sTM) and tertile of soluble intercellular adhesion molecule-1 (sICAM). High sTM and low sICAM were associated with lower risk, but there was an important interaction. The logarithms of the relative risks in the 9 groups are shown in Table 1. (The logarithm is used because a relative risk is a ratio: on the logarithmic scale, 0.5 is as far from 1 as 2 is.)
An interaction is a difference of differences; eg, the difference between the “Middle” and “Upper” columns is 0.48 in the “Upper” row but −0.34 in the “Lower” row.
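As a minimal sketch of this arithmetic, the contrast between the two quoted differences can be computed directly. The function name is illustrative only; the two inputs are the values quoted in the text.

```python
def difference_of_differences(diff_in_one_row, diff_in_other_row):
    """Interaction contrast: how much a column-to-column difference changes between rows."""
    return diff_in_one_row - diff_in_other_row

# Column-to-column differences on the log relative risk scale, as quoted in the text
upper_row_diff = 0.48
lower_row_diff = -0.34

contrast = difference_of_differences(upper_row_diff, lower_row_diff)
print(round(contrast, 2))  # prints 0.82
```

With no interaction, the two differences would be equal and the contrast would be 0.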
I suggest this interaction may arise from additivity plus nonlinearity, as follows. (1) Contributions from sTM and sICAM add together to give a total quantity x of the important independent variable. (2) Risk does not steadily decrease with x or steadily increase with x. Rather, it depends on x nonlinearly—there is an optimal level of x at which risk is a minimum, with increased risk associated with both lower and higher x.
Let us give scores of −1, 0, and +1 to the lower, middle, and upper tertiles of sTM, and scores of −a, 0, and +a to the lower, middle, and upper tertiles of sICAM, with a being an unknown that we will fit to the data. The total x for any combination of sTM level and sICAM level will be the sum of the appropriate two components. The dependent variable (logarithm of relative risk) will be assumed to have a quadratic dependence on this, ie, $b_0 + b_1 x + b_2 x^2$, where the b coefficients are unknowns that we will fit to the data.
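The model just described can be sketched in a few lines. This is an illustration under the scoring stated above, not the code used for the analysis; the function and variable names are hypothetical.

```python
# Tertile scores as defined in the text; the sICAM score is multiplied by a
TERTILE_SCORE = {"lower": -1, "middle": 0, "upper": 1}

def total_x(stm_tertile, sicam_tertile, a):
    """Total x: the sTM score plus a times the sICAM score."""
    return TERTILE_SCORE[stm_tertile] + a * TERTILE_SCORE[sicam_tertile]

def log_relative_risk(x, b0, b1, b2):
    """Quadratic dependence of the log relative risk on x."""
    return b0 + b1 * x + b2 * x ** 2

def predicted_grid(a, b0, b1, b2):
    """Predicted log relative risk for each of the 9 (sTM, sICAM) tertile pairs."""
    return {
        (stm, sicam): log_relative_risk(total_x(stm, sicam, a), b0, b1, b2)
        for stm in TERTILE_SCORE
        for sicam in TERTILE_SCORE
    }
```

Calling `predicted_grid` with any choice of the four unknowns returns the 9 predicted cells, which can then be compared with the observed table.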
What are the best-fitting values of a, b0, b1, and b2? How well do they reproduce the data? I found the best-fitting values were, respectively, −0.69, −0.07, −0.29, and 0.14. That a is negative means there is less of x (whatever this is) when sICAM is high. Put another way, what matters for the risk is the difference between sTM and sICAM, each in appropriately scaled units, not their sum. A difference suggests a chemical combination of an active molecule with a reagent to form an inactive product (and knowledge of the relative molecular weights would allow an estimate of how many molecules of one combine with each molecule of the other), but this is only speculation. The predicted values are shown in Table 2.
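The fitting criterion is not stated here. The sketch below assumes unweighted least squares over the 9 cells: for each trial value of a the model is linear in the b coefficients, so they are found from the normal equations, and a is chosen by a grid search. All names are hypothetical, and the demonstration data are synthetic (generated from arbitrary parameters), not the values of Table 1.

```python
SCORES = (-1, 0, 1)  # lower, middle, upper tertile

def solve_3x3(m, v):
    """Solve the 3x3 linear system m b = v by Gaussian elimination with pivoting."""
    aug = [row[:] + [rhs] for row, rhs in zip(m, v)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    b = [0.0] * n
    for r in reversed(range(n)):
        known = sum(aug[r][c] * b[c] for c in range(r + 1, n))
        b[r] = (aug[r][n] - known) / aug[r][r]
    return b

def fit_quadratic_for_a(observed, a):
    """Least-squares b0, b1, b2 for a fixed a.

    observed[i][j] is the log relative risk for sTM score SCORES[i]
    and sICAM score a * SCORES[j].
    """
    rows = []
    for i, s in enumerate(SCORES):
        for j, t in enumerate(SCORES):
            x = s + a * t
            rows.append(([1.0, x, x * x], observed[i][j]))
    # Normal equations (F^T F) b = F^T y
    m = [[sum(f[p] * f[q] for f, _ in rows) for q in range(3)] for p in range(3)]
    v = [sum(f[p] * y for f, y in rows) for p in range(3)]
    b = solve_3x3(m, v)
    sse = sum((y - sum(bk * fk for bk, fk in zip(b, f))) ** 2 for f, y in rows)
    return b, sse

def fit_model(observed, a_grid):
    """Grid search over a; returns (a, [b0, b1, b2], sum of squared errors)."""
    best = None
    for a in a_grid:
        b, sse = fit_quadratic_for_a(observed, a)
        if best is None or sse < best[2]:
            best = (a, b, sse)
    return best

# Synthetic check: generate a 3x3 table from arbitrary parameters and recover them
true_a, true_b = 0.5, (0.1, -0.2, 0.3)
synthetic = [
    [true_b[0] + true_b[1] * (s + true_a * t) + true_b[2] * (s + true_a * t) ** 2
     for t in SCORES]
    for s in SCORES
]
a_hat, b_hat, sse = fit_model(synthetic, [k / 100 for k in range(-200, 201)])
```

A general-purpose nonlinear optimizer would serve equally well; the grid search is used only to keep the sketch free of external libraries.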
The model has successfully reproduced the general features of the data. (If the scores for the tertiles of sTM and for the tertiles of sICAM are permitted to be unequally spaced, the fit is only slightly better.)
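For readers who wish to recompute the predictions, the following sketch evaluates the model at the rounded estimates quoted above (−0.69, −0.07, −0.29, and 0.14). Because those estimates are rounded to 2 decimal places, the output may differ slightly from the published table. The names are hypothetical.

```python
# Rounded best-fitting values quoted in the text
A, B0, B1, B2 = -0.69, -0.07, -0.29, 0.14
SCORES = {"lower": -1, "middle": 0, "upper": 1}

def predicted_log_rr(stm_tertile, sicam_tertile):
    """Predicted log relative risk for one combination of tertiles."""
    x = SCORES[stm_tertile] + A * SCORES[sicam_tertile]
    return B0 + B1 * x + B2 * x * x

# One printed row per sTM tertile; columns are sICAM lower, middle, upper
for stm in SCORES:
    print(stm, [round(predicted_log_rr(stm, sicam), 2) for sicam in SCORES])

# Value of x at which the fitted quadratic is lowest (B2 > 0, so this is a minimum)
x_optimal = -B1 / (2 * B2)
```

With these estimates the minimum lies at x of about 1.04, inside the range of x spanned by the 9 groups (−1.69 to 1.69), which is consistent with an optimal level of x rather than a steady trend.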
At best, the results suggest the scientific mechanism through which interaction arises. And at the least, they are a concise description of the interaction (there is only a single parameter more than with a noninteraction model).
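The parameter count can be checked by expanding the quadratic. Writing s and t for the tertile scores (−1, 0, or +1) of sTM and sICAM (symbols introduced here for illustration), so that x = s + at:

$$
b_0 + b_1 x + b_2 x^2 = b_0 + b_1 s + a b_1 t + b_2 s^2 + 2 a b_2\, s t + a^2 b_2 t^2 .
$$

Setting $b_2 = 0$ leaves the additive model $b_0 + b_1 s + a b_1 t$, which has 3 parameters; $b_2$ is the single additional one. The only term involving both scores is $2 a b_2\, s t$, so the difference of differences between adjacent tertiles is

$$
\bigl[f(s+1,\,t+1) - f(s,\,t+1)\bigr] - \bigl[f(s+1,\,t) - f(s,\,t)\bigr] = 2 a b_2 ,
$$

where $f(s,t)$ denotes the predicted log relative risk. With the rounded estimates quoted above this is $2 \times (-0.69) \times 0.14 \approx -0.19$.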
1. Wu KK, Aleksic N, Ballantyne CM, et al. Interaction between soluble thrombomodulin and intercellular adhesion molecule-1 in predicting risk of coronary heart disease. Circulation. 2003; 107: 1729–1732.
We thank Dr Hutchinson for his comment on the analysis of interaction between soluble thrombomodulin (sTM) and soluble intercellular adhesion molecule-1 (sICAM-1) in predicting risk of coronary heart disease (CHD). He suggests that the interaction may arise from additivity plus nonlinearity and has used a quadratic model to estimate the relative risk. The predicted values derived from his model are in general agreement with our results. Dr Hutchinson’s model may be useful in assessing relative risks of interactive variables that are independent of confounding effects and do not require weighted adjustment. However, we do not think that the model proposed by Dr Hutchinson is suitable for analyzing relative risk in our case-cohort study.
In our study, plasma sTM and sICAM-1 were determined in 317 incident CHD cases and 726 noncases from a random cohort sample. Eight strata were defined when sampling the cohort. To account for the stratified sampling design, we weighted each observation within each stratum, thereby recreating the original frequency distribution of the strata in the entire cohort. Furthermore, we adjusted for confounding factors. The model proposed by Dr Hutchinson takes into account neither the confounding risk factors nor the weighting scheme, and its estimates of relative risk would change once these were incorporated. The relative risk is best determined by a weighted proportional-hazard model, which incorporates the weighting scheme and adjusts for the effects of age, race, gender, and other confounding variables.
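The idea of weighting to recreate the cohort's stratum distribution can be illustrated with a minimal sketch. It assumes that each observation is weighted by the inverse of its stratum's sampling fraction, which is one common choice and not necessarily the exact scheme used in the study. The stratum labels and counts below are hypothetical.

```python
def stratum_weights(cohort_counts, sample_counts):
    """Weight per observation in each stratum: cohort size / sample size (assumed scheme)."""
    return {h: cohort_counts[h] / sample_counts[h] for h in sample_counts}

def weighted_stratum_shares(weights, sample_counts):
    """Share of each stratum in the sample after weighting."""
    totals = {h: weights[h] * sample_counts[h] for h in sample_counts}
    grand_total = sum(totals.values())
    return {h: totals[h] / grand_total for h in totals}

# Hypothetical counts: two strata sampled at different rates
cohort = {"stratum_A": 4000, "stratum_B": 1000}
sample = {"stratum_A": 100, "stratum_B": 100}

weights = stratum_weights(cohort, sample)        # 40.0 and 10.0
shares = weighted_stratum_shares(weights, sample)  # 0.8 and 0.2, as in the cohort
```

In the unweighted sample the two strata are equally represented; after weighting, their shares match those in the full cohort.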
His presentation underscores a well-known truism in applied statistics, ie, that multiple models will fit the data similarly. However, it is insufficient at this stage only to describe the relationship among CHD, sICAM, and sTM. Rather, it is important to understand the mechanisms by which sICAM and sTM are combining to influence disease risk. And here, there is considerable work to be done.