Imagine a hypothetical community trial in which a woefully underfunded and wholly ineffective smoking-cessation program in community A is compared with no intervention at all in community B. For simplicity, also assume that community A's program is evaluated based on a comparison of quit rates in a random sample of smokers in each of the two communities after the intervention in community A has been in place for a while. What is the probability of finding a statistically significant difference in quit rates between the two communities?
Given that the program in community A is ineffective, we might be tempted to think that this probability is simply $\alpha$, the probability of a type I error when the null hypothesis is true, usually chosen to be 0.05. In fact, the chance of finding a significant difference is probably considerably higher than $\alpha$. The reason is related to a very basic idea in epidemiology: namely, that diseases do not occur at random in populations but vary systematically in relation to personal characteristics, time, and (importantly) place. Often such geographic differences are the source of hypotheses, and often they remain unexplained, but there is generally no denying that they exist. Epidemiologists usually focus on disease occurrence, but the same observation applies to other health-related phenomena, including health behaviors. Diehr et al. (1993), for example, studied the prevalence of smoking, alcohol use, dietary fat consumption, and seatbelt use among 13 communities participating in a health promotion grants program before any intervention had been implemented. Many statistically significant differences among communities were found, and the variation between sites remained large and occasionally even increased after adjustment for a wide range of sociodemographic and health characteristics. LaPrelle et al. (1992) reported similar findings in a 10-community study on adolescent cigarette smoking.
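The inflation of the false-positive rate can be illustrated with a small simulation. The sketch below is not from the source: it assumes that each community's true quit rate is drawn from a normal distribution around a common mean, that 500 smokers are sampled per community, and that the two samples are compared with an ordinary pooled two-proportion z-test at the 0.05 level. The function names and all numeric settings are hypothetical choices made for illustration.

```python
import math
import random


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for the difference between two sample proportions."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        return 0.0
    return (x1 / n1 - x2 / n2) / se


def rejection_rate(sd_community, n_per_community=500, mean_quit=0.10,
                   n_trials=2000, seed=1):
    """Share of simulated two-community trials declared 'significant'.

    The intervention has no effect in any trial. sd_community is the assumed
    standard deviation of true quit rates between communities (hypothetical).
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        quit_counts = []
        for _ in range(2):  # community A and community B
            # Draw this community's true quit rate, kept inside (0, 1).
            p = rng.gauss(mean_quit, sd_community)
            p = min(max(p, 0.001), 0.999)
            quits = sum(rng.random() < p for _ in range(n_per_community))
            quit_counts.append(quits)
        z = two_proportion_z(quit_counts[0], n_per_community,
                             quit_counts[1], n_per_community)
        if abs(z) > 1.96:  # two-sided test at the 0.05 level
            rejections += 1
    return rejections / n_trials


if __name__ == "__main__":
    print("no community-level variation:", rejection_rate(0.0))
    print("community-level SD of 0.03:  ", rejection_rate(0.03))
```

Under these assumptions, the first call should land near the nominal 0.05, while the second should be several times larger, because the test treats ordinary between-community differences as if they were evidence of a program effect.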
Statistically, we can think of overall variability in some outcome measure, Y, in the large population formed by combining individuals across several study communities. The total variance of Y can be partitioned into two components: (1) the variance in Y between individuals within the same community, denoted $\sigma^2$ and assumed for simplicity to be the same for all communities; and (2) the variance in the community-specific mean value of Y among communities, denoted $\sigma_c^2$. The total variance of Y is $\sigma_c^2 + \sigma^2$.
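One common way to write this partition is as a one-way random-effects model. The formulation below is a sketch rather than the source's own notation: the symbols $\mu$, $c_i$, and $e_{ij}$ are introduced here for illustration, and the community effect and the individual deviation are assumed to be independent with mean zero.

```latex
% Y_{ij}: outcome for individual j in community i (illustrative notation)
Y_{ij} = \mu + c_i + e_{ij},
\qquad \operatorname{Var}(c_i) = \sigma_c^2,
\qquad \operatorname{Var}(e_{ij}) = \sigma^2

% With c_i and e_{ij} independent, the two variances add:
\operatorname{Var}(Y_{ij}) = \sigma_c^2 + \sigma^2
```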
In the community-trial literature, the phenomenon of community-level variation is discussed in two equivalent ways: (1) as greater variability in Y among community means than would be expected based on observed within-community variation, that is, $\sigma_c^2 > 0$; or (2) as less variability in Y within communities than would be expected from its total variation among individuals pooled across all communities, that is, $\sigma^2 < \sigma_c^2 + \sigma^2$, which just amounts to saying $\sigma_c^2 > 0$ in another way.
Depending on which perspective is taken, community-level variation may be quantified differently. One good measure of community-level variation is simply $\sigma_c^2$ itself. Another measure that appears commonly in the literature is the intraclass correlation coefficient (ICC), defined in our notation as
$$\mathrm{ICC} = \frac{\sigma_c^2}{\sigma_c^2 + \sigma^2}$$
which expresses the size of $\sigma_c^2$ as a proportion of the total variation in Y. The ICC can also be interpreted as the degree of correlation in Y among individuals within the same communities. Donner (1981) describes a variance inflation factor, calculated from the ICC, that reflects the amount of increase in the variance of a treatment-group-specific mean over what would have been observed if individuals had been randomized.
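As a minimal numerical sketch, the ICC can be computed directly from the two variance components. The text does not give the formula for the variance inflation factor; the version below assumes the widely used equal-cluster-size form, 1 + (m - 1) × ICC, where m is the number of individuals sampled per community. The function names and example values are hypothetical.

```python
def icc(var_between, var_within):
    """Intraclass correlation: between-community variance over total variance."""
    total = var_between + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_between / total


def variance_inflation_factor(rho, n_per_community):
    """Assumed equal-cluster-size form: 1 + (m - 1) * ICC."""
    if n_per_community < 1:
        raise ValueError("n_per_community must be at least 1")
    return 1 + (n_per_community - 1) * rho


if __name__ == "__main__":
    # Hypothetical values: 1% of total variance lies between communities.
    rho = icc(var_between=1.0, var_within=99.0)
    for m in (10, 101, 1001):
        print(m, variance_inflation_factor(rho, m))
```

Even a small ICC matters when many individuals are sampled per community, because the inflation factor grows in proportion to the sample size per community.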
Although epidemiologists involved in community trials must deal with community-level variation whether its causes are known or not, it helps to be aware of some generic mechanisms by which it can come about (Donner et al. 1990). One is self-selection: individuals often choose to reside in a given community because they have characteristics in common with other community residents and thus "fit in." Those characteristics may, in turn, be associated with the health behaviors of interest. Another mechanism is that residents of the same community share exposure to a common physical and sociocultural environment, which influences their behavior. Yet another mechanism is a type of contagion: just as infectious agents can be spread from person to person, so, too, may attitudes, norms, and behaviors be transmissible among people who are in regular contact, resulting in behavioral homogeneity. Heroin abuse, for example, has been investigated as a contagious disease (De Alarcón 1969).
The situation becomes a little more complex when changes over time are also considered. Figure 6-1 shows two contrasting patterns of change in the prevalence of smoking in three hypothetical communities. In panel A, the vertical separation of the three community-specific lines implies community-level variation in smoking prevalence (a main effect of community), and there is an obvious downward trend in smoking prevalence over time in each site (a main effect of time). However, the time trends in the three communities are parallel (no community-by-time interaction). Panel B, in contrast, illustrates substantial community-by-time interaction variance: it is the degree of non-parallelism in the time paths observed for different communities, sometimes denoted $\sigma_{ct}^2$. It, too, can arise for a variety of reasons, including different preexisting secular trends in different communities, sociocultural variation among communities with regard to amenability to behavior change, and local current events that may act as cues to behavior change.
Figure 6-1. Illustration of community-by-time variation
As described below, community-level variation plays an important role in the planning and analysis of community trials. Investigators are accustomed to estimating $\sigma^2$ when deciding on sample-size requirements for an ordinary clinical trial on individuals. In community trials, there are at least two kinds of sample sizes to consider (number of communities and number of individuals per community) and two kinds of variability to be estimated in study planning and accommodated during data analysis ($\sigma^2$ and either $\sigma_c^2$ or $\sigma_{ct}^2$, depending on the design). In addition, some study designs preclude accounting properly for community-level variation in the analysis. These designs are not well suited to evaluation of community-level interventions because they can lead to exaggerated claims of statistical significance and thus to erroneous conclusions about program effectiveness.
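To see why the two kinds of sample size play different roles, consider the variance of a treatment-group mean under a simple post-intervention-only design. The sketch below is an illustration under stated assumptions, not the source's own planning method: it assumes k communities per group with m individuals sampled in each, independent community and individual effects, and the standard result that the variance of the group mean is $\sigma_c^2/k + \sigma^2/(km)$. The function name and example values are hypothetical.

```python
def arm_mean_variance(var_between, var_within, n_communities, n_per_community):
    """Variance of a treatment-group mean with k communities of m individuals.

    Assumed form: var_between / k + var_within / (k * m).
    """
    if n_communities < 1 or n_per_community < 1:
        raise ValueError("sample sizes must be at least 1")
    return (var_between / n_communities
            + var_within / (n_communities * n_per_community))


if __name__ == "__main__":
    # Hypothetical variance components: 1.0 between, 99.0 within communities.
    for k, m in [(4, 100), (4, 10_000), (8, 100)]:
        print(k, m, arm_mean_variance(1.0, 99.0, k, m))
```

Under this form, sampling more individuals per community shrinks only the second term, so the variance cannot fall below $\sigma_c^2/k$ no matter how large m becomes; only adding communities reduces the first term.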