Epidemiologic Challenges in Cluster Investigations

Several of the difficulties encountered in cluster investigations parallel those discussed earlier for investigations of outbreaks and occupational clusters; others are more applicable to cluster studies due to their noninfectious nature. Limitations have been summarized by several researchers (CDC 1990; Rothman 1990; Neutra 1990). Rothman (1990) has suggested that there is "little scientific or public health purpose to investigate individual disease clusters at all." While it is likely that scientific potential is limited, cluster investigations can serve important public health functions. The following section outlines a few of the most important challenges one may encounter in a cluster investigation. The difficulties stem in part from the constraints of available information in a typical cluster investigation (Reynolds et al. 1996).

Rare Health Events. Cluster investigations tend to focus on relatively rare health events (e.g., those with incidence rates less than 10 per 100,000 persons). Due to these relatively small numbers, standard statistical methods that rely on normal or near-normal distributions cannot be used in most cluster investigations. Therefore, alternative statistical tests are needed (see the next section, "Statistical Methods for Cluster Investigations").
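As an illustration of why small counts call for exact methods, the sketch below computes the probability of seeing at least the observed number of cases when the case count is assumed to follow a Poisson distribution. The Poisson assumption, the function name, and the example counts (5 observed, 1.2 expected) are hypothetical choices made for illustration; they are not taken from the sources cited here, and the methods described in the next section may differ.

```python
import math

def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected).

    Assumption: cases arise independently at a constant rate, so the
    count in the study area is Poisson with mean `expected`.
    """
    # Sum the exact probabilities of 0 .. observed-1 cases,
    # then take the complement to get the upper tail.
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below

# Hypothetical cluster: 5 cases observed where 1.2 were expected.
p_value = poisson_upper_tail(5, 1.2)
print(f"P(X >= 5 | expected = 1.2) = {p_value:.4f}")
```

Because the probabilities are summed exactly rather than approximated by a normal curve, the calculation remains valid even when the expected count is close to zero.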

Vague Definition and/or Heterogeneity of Cases. Accurate definition and enumeration of cases are essential elements of cluster investigations. Frequently, reported clusters have vague case definitions, or cases that may appear to represent one disease may actually represent many diseases. For example, a citizen may be concerned that "we have an excess of cancer." After investigation, it may be revealed that many types of cancer are present, each with a distinct etiology.

Lack of a Population Base for Rate Calculation. A population at risk must be determined in order to calculate rates. In a cluster investigation, the geographic distribution of the reported cases commonly does not coincide with boundaries for population (denominator) data such as counties, zip codes, or census tracts. This makes accurate calculation of rates difficult and sometimes impossible.

Weak Associations and Multiple Risk Factors. The difficulties in measuring weak associations are discussed elsewhere in this book (Chapters 1 and 2). Many noninfectious disease clusters are purported to be environmentally related, and if an association exists, relative risk estimates are likely to be less than three. Statistical power calculations for many chronic disease clusters have suggested that relative risk estimates must be 8 or larger to achieve statistical significance (Neutra 1990). Such a large risk estimate in most cluster investigations is unlikely.

In addition, noninfectious diseases, in particular chronic diseases, have multifactorial etiologies (Brownson et al. 1993). As noted in Chapters 1 and 2, these risk factors can be especially subject to methodologic biases. In practice, this combination of weak, multiple risk factors means that an analytic study to assess causation may require hundreds or thousands of cases to detect a statistically significant relative risk estimate.
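The power argument above can be sketched numerically. The example below assumes a Poisson model and a one-sided exact test at the 0.05 level, and uses a hypothetical expected count of one case; none of these settings come from Neutra (1990), so the figures are illustrative rather than a reproduction of the published calculations.

```python
import math

def poisson_upper_tail(threshold: int, mean: float) -> float:
    """Return P(X >= threshold) for X ~ Poisson(mean)."""
    below = sum(
        math.exp(-mean) * mean**k / math.factorial(k)
        for k in range(threshold)
    )
    return 1.0 - below

def critical_count(expected: float, alpha: float = 0.05) -> int:
    """Smallest case count whose upper-tail probability under the
    null (relative risk = 1) is at most alpha."""
    count = 0
    while poisson_upper_tail(count, expected) > alpha:
        count += 1
    return count

def power(expected: float, relative_risk: float, alpha: float = 0.05) -> float:
    """Probability of reaching the critical count when the true mean
    is expected * relative_risk."""
    threshold = critical_count(expected, alpha)
    return poisson_upper_tail(threshold, expected * relative_risk)

# Hypothetical setting: one case expected in the study area.
for rr in (2, 3, 5, 8):
    print(f"relative risk {rr}: power = {power(1.0, rr):.2f}")
```

Under these assumptions, a relative risk of 3 yields power of roughly 0.35, while a relative risk of 8 yields roughly 0.96, which is in line with the point that only very large relative risks are reliably detectable when few cases are expected.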

Long Induction Periods. The induction period is the interval from the exposure and causal action of a risk factor to the initiation of the disease (Last 1995). With infectious disease and chemical outbreaks, the relevant exposures typically occurred only hours or days earlier. For many chronic diseases with long induction periods (years or decades), however, assessment of relevant exposure is extremely difficult.

Multiple Comparisons. An epidemiologic study has a higher probability of producing a statistically significant result when a large number of comparisons or associations are examined. It may therefore be relatively easy to detect statistical clusters because of the vast number of possible comparisons by geographic area, condition (e.g., type of cancer), and time period. As a result, active public health surveillance for clusters may lead to false positives.
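The effect of multiple comparisons can be illustrated with a short calculation. The sketch below assumes that the comparisons are independent and that each is tested at the same significance level; in practice, comparisons across neighboring areas or overlapping time periods are correlated, so the numbers are only a rough guide. The function names and the counts of comparisons are hypothetical.

```python
def prob_any_false_positive(n_comparisons: int, alpha: float = 0.05) -> float:
    """Chance of at least one falsely 'significant' result among
    n independent comparisons when no true excess exists."""
    return 1.0 - (1.0 - alpha) ** n_comparisons

def bonferroni_threshold(n_comparisons: int, alpha: float = 0.05) -> float:
    """Per-comparison significance level that keeps the overall
    false-positive probability at or below alpha (Bonferroni)."""
    return alpha / n_comparisons

# Hypothetical surveillance program screening many area/condition pairs.
for n in (1, 10, 100):
    print(
        f"{n:>3} comparisons: "
        f"P(any false positive) = {prob_any_false_positive(n):.3f}, "
        f"Bonferroni level = {bonferroni_threshold(n):.4f}"
    )
```

With 100 independent comparisons at the 0.05 level, at least one false positive is almost certain, which is why a statistically significant cluster found through broad screening should be interpreted cautiously.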

Low-Level, Long-Term, Heterogeneous Exposures. As noted earlier, relevant exposures in noninfectious disease clusters may have occurred years or decades prior to disease initiation. In addition, cluster investigations with a suspected environmental etiology commonly involve exposures that are much too low to produce measurable effects. For example, it is estimated that one would need about 70% of a maximum tolerated dose of a carcinogen for a full year to produce a sevenfold increase in a moderately rare cancer (Neutra et al. 1989; Neutra 1990). Advances in molecular epidemiology cited in Chapter 1 may assist in determining appropriate biomarkers to measure exposure.

Intense Publicity. As noted earlier for outbreak investigations, media publicity and public controversy can make the unbiased investigation of clusters difficult.

Resource Intensiveness of Full Investigations. Conducting an analytic epidemiologic study of a disease cluster requires considerable resources and takes months or years to complete. These investigations can seriously impact the resources of state and local health departments and are not always the best use of public resources. In addition, the time needed to complete a sound investigation may be perceived as "foot-dragging" by members of the public who desire a more immediate answer. It has been proposed that full investigations should be limited to clusters that have potentially large relative risk estimates, at least five confirmed cases, and good estimates of personal exposure (Neutra 1990).
