O'Brien, Chief of the Laboratory of Genomic Diversity at the National Cancer Institute in Frederick, Maryland. Dr. O'Brien, who leads a team that is developing a linkage map of cat genes (organizing their linear order on the cat chromosomes), had just published a paper characterizing the location of 400 DNA markers known as short tandem repeats (STRs) on the cat genome.
An STR is simply a short stretch of repeated units of DNA letters. For example, CACACACACA is a five-dinucleotide CA repeat (the C and A are the symbols for two of the four chemical building blocks of DNA). In cats, humans, and virtually every other higher organism, there are many stretches of road on the vast DNA highway that do not code for proteins (some people call these noncoding regions "junk" DNA), but which contain STRs. STR loci are highly variable. At a locus where you might have 9 CAs in a row, your next-door neighbor might have 12, and I might have 7. Because the technology to demonstrate the difference between samples due to variation in the length of the STR is well-established, this diversity offers a way to determine whether or not two samples derive from the same individual. Because there are many STR loci at different addresses on the chromosomes, it is possible to use a consensus set of them to make that determination. Much work has been done in assembling a set of STR loci to use in matching human samples. Except in the case of identical twins, the odds of two randomly selected persons having the same flavor of STR at just 5 loci are vanishingly small. In general, if two samples match at even 3 of the 13 loci currently used in human identification studies, they are highly likely to match at as many more as are tested. Conversely, if two samples do not show exactly the same STR number at even 1 locus, it is almost certain that they are not from the same individual.
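The idea of reading an STR as a repeat count, and comparing two samples locus by locus, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names, the locus labels, and the flanking DNA letters are all hypothetical, and each locus is reduced to a single repeat count, whereas a real profile records two alleles per locus and is produced by laboratory measurement of fragment length, not by string matching.

```python
import re


def count_repeats(sequence, unit="CA"):
    """Return the length, in repeat units, of the longest unbroken run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


def profiles_match(profile_a, profile_b):
    """True only if both profiles cover the same loci with identical repeat counts."""
    if profile_a.keys() != profile_b.keys():
        return False
    return all(profile_a[locus] == profile_b[locus] for locus in profile_a)


# Hypothetical samples: flanking letters around a CA repeat of varying length.
you = "GGT" + "CA" * 9 + "TT"
neighbor = "GGT" + "CA" * 12 + "TT"

# Hypothetical single-locus profiles built from those sequences.
your_profile = {"locus_1": count_repeats(you)}
neighbor_profile = {"locus_1": count_repeats(neighbor)}

print(count_repeats("CACACACACA"))                     # the five-unit repeat from the text
print(profiles_match(your_profile, neighbor_profile))  # 9 repeats versus 12
```

A single mismatched locus is enough for `profiles_match` to return `False`, which mirrors the point that one discordant STR all but rules out a common source.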
The harder question is to decide how many exact matches one must demonstrate to argue convincingly that the samples are identical. During the years from 1987 to 1996, this was the central question of DNA forensics. The debate heated up in 1992 when a Committee on DNA Forensic Science that was convened by the National Research Council (on which I served) issued a report that took an extremely cautious approach to the use of DNA evidence in criminal trials. At the time, we were concerned that we did not yet know enough about the independence of DNA loci used in forensic testing. In particular, we worried that not enough was yet known about the distribution of variations in the genome across different populations of the human family to permit us to assume that they were inherited independently. We worried that we might be relying too heavily on the so-called "product rule."
Simply put, the product rule supposes that the result from each DNA locus that is tested is truly independent of the results at every other. If that is the case, then one can determine the odds of a random match at each locus and multiply them together. Applying the product rule, the possibility that two DNA samples match merely by luck quickly gets very small, so the argument for the identity of the samples being compared becomes extremely strong. To illustrate, imagine that your DNA has been tested at five loci and that at these five points it exactly matches the DNA profile of some unknown sample. Must the unknown sample belong to you? It depends. If studies of a randomly selected reference population show that the particular type of DNA (in Snowball's case the length of the STRs) you have at each locus is only found in 1 out of every 100 persons, then the odds of the unknown sample deriving from someone else are very small. They are on the order of 1 in 10,000,000,000 (1/100 × 1/100 × 1/100 × 1/100 × 1/100), and 10 billion is more than the total number of people on the planet! Of course, if the set of independent STRs in question includes a set of flavors each of which is found randomly in one out of five persons, the odds of a chance match are much higher, about 1 in 3,125 (1/5 × 1/5 × 1/5 × 1/5 × 1/5). This would be a rare event, but, unlike the first calculation, it is imaginable.
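The arithmetic of the product rule is easy to check. The short Python sketch below reproduces the two five-locus examples from the paragraph above; the function name is hypothetical, the per-locus frequencies are the illustrative 1-in-100 and 1-in-5 figures from the text and not real population data, and the multiplication is valid only under the assumption that the loci really are independent.

```python
from fractions import Fraction
from math import prod


def random_match_probability(locus_frequencies):
    """Multiply per-locus frequencies together (assumes independent loci)."""
    return prod(locus_frequencies, start=Fraction(1))


# Illustrative figures from the text: five loci, each type seen in 1 of 100 persons.
rare_types = [Fraction(1, 100)] * 5
# Five loci, each type seen in 1 of 5 persons.
common_types = [Fraction(1, 5)] * 5

rare_rmp = random_match_probability(rare_types)
common_rmp = random_match_probability(common_types)

print(f"rare types:   1 in {rare_rmp.denominator:,}")
print(f"common types: 1 in {common_rmp.denominator:,}")
```

Exact fractions are used instead of floating-point numbers so that the "1 in N" figure can be read straight from the denominator without rounding error.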
How can one prove that the DNA loci selected for the test system are independent? It helps to use markers located on different chromosomes, which are by definition not linked, that is, are transmitted independently of each other during the formation of egg and sperm. The real task, however, is to amass a reference population in which one can study the distribution of the various loci and assess whether they are independent of each other. If they are independent, one can compile frequency data and use the product rule as above. The odds of your particular version of a DNA sequence matching that of a sequence taken from a randomly selected sample depend on what reference population was created. A key issue is whether the reference population is an appropriate one to use in determining the odds of a random match. At the least it is necessary to create several populations that account for (usually small) differences in the frequency of DNA loci among major racial groups. Put another way, if you are a member of a subset of the human family that has for centuries tended to marry within a particular subpopulation (say Finns), the standard reference population may not be appropriate to use in calculating the odds that your DNA and a crime scene sample match by chance.
The debate over the degree to which DNA varied at specific points among different ethnic groups waxed hot for several years and generated numerous papers by population geneticists. In the last few years we have learned a lot more about the remarkable similarity of human subgroups when they are compared at most DNA loci and, along the way, most of the concerns about reference populations have been laid to rest. Today, those who want to attack the use of DNA identification evidence in court must focus on whether the laboratory that did the analysis did it correctly (as in the Simpson case). Few, if any, courts will deny admission to DNA evidence based on concerns about population substructure.
When he agreed to attempt to perform DNA identity testing on blood from Snowball and DNA extracted from crime scene cat hairs, Dr. O'Brien realized right away that he faced a feline version of the population substructure problem. Prince Edward Island is a relatively unpopulated, geographically isolated community. O'Brien knew nothing about the history of the island's cat population. It was certainly possible that most of the cats descended from a small number of founding ancestors. If that was the case, then two randomly selected island cats could well match at several loci. If the island cats were highly inbred, finding a match between Snowball's DNA and that of the crime scene hairs would be of little forensic value because there would be a fair chance that Snowball's DNA would match that of most other local cats.
If he found that the two forensic samples matched, Dr. O'Brien's task was only beginning. He would still have to satisfy himself, so that he could eventually testify to it in court, that the chances of a coincidental match were so low as to allow a jury to conclude that the white hairs on the jacket came from Snowball. O'Brien was able to extract enough DNA from one cat hair to do the identity test. The two samples matched at 10 STRs, double the number normally used in human DNA testing. He next took on the problem of characterizing the genetic diversity of the island's cat population.
The only way to do this was to look at a lot of cat DNA. O'Brien collected blood samples from 19 cats on Prince Edward Island that were thought to be unrelated and from nine cats in different parts of the United States. By studying STRs in these two groups he was able to show that the cats on the island were not highly inbred and that their genetic structure was about as diverse as that of cats in the United States. Once he was convinced that there was sufficient diversity to justify using the product rule, O'Brien used seven newly discovered STRs to calculate the odds that Snowball's DNA matched the DNA from the white hairs found on the bloodstained jacket just by chance. He concluded that the chance was less than 1 in 40,000,000!
O'Brien's forensic report was the crucial piece of evidence in the prosecution's case. There were no witnesses to the crime, no murder weapon was found, and Beamish steadfastly proclaimed his innocence. The DNA evidence placing Shirley Duguay's blood and Snowball's hairs on the same coat was all that convincingly linked Beamish to the crime. The forensic evidence was evaluated by the Supreme Court of Prince Edward Island, which permitted its use. On July 19, 1996, Douglas Beamish was convicted of second-degree murder and sentenced to an 18-year prison term with no possibility of parole.