The development of STR multiplexes

The forensic community has selected STR loci to incorporate into multiplex reactions based on several features including:

• discrete and distinguishable alleles;

• robust amplification of the locus;

• a high power of discrimination;

• an absence of genetic linkage with other loci being analysed;

• low levels of artefact formation during the amplification (see Chapter 7);

• the ability to be amplified as part of a multiplex PCR.

Table 6.1 The development of STR systems. Two STR systems, the quadruplex (QUAD) and the second generation multiplex (SGM), were developed by the Forensic Science Service in the UK. The AmpFlSTR® SGM Plus® became commercially available in 1999 and has been adopted by a large number of laboratories for routine forensic casework. The AmpFlSTR® Identifiler® and PowerPlex® 16 both analyse 15 STR loci, including the 13 CODIS loci that are required for forensic casework in the USA. The two kits are used widely worldwide, particularly for kinship testing

QUAD  | SGM        | SGM Plus®  | Identifiler® | PowerPlex® 16
vWA   | Amelogenin | Amelogenin | Amelogenin   | Amelogenin
THO1  | vWA        | D3S1358    | D3S1358      | D3S1358
F13A1 | D8S1179    | vWA        | vWA          | vWA
FES   | D21S11     | D16S539    | D16S539      | D16S539
-     | D18S51     | D8S1179    | D8S1179      | D8S1179
-     | THO1       | D21S11     | D21S11       | D21S11
-     | FGA        | D18S51     | D18S51       | D18S51
-     | -          | THO1       | THO1         | THO1
-     | -          | FGA        | FGA          | FGA
-     | -          | -          | D13S317      | D13S317
-     | -          | -          | CSF1PO       | CSF1PO
-     | -          | -          | D7S820       | D7S820
-     | -          | -          | TPOX         | TPOX
-     | -          | -          | D5S818       | D5S818
-     | -          | D2S1338    | D2S1338      | Penta D
-     | -          | D19S433    | D19S433      | Penta E

An essential feature of any STR system used in forensic analysis is that biological material should give an identical profile regardless of the individual or laboratory that carries out the analysis. Without this standardization it would not be possible to compare results between laboratories, and developments such as national DNA databases would not be possible [7-11]. All new multiplexes have to be rigorously validated before they are used for the analysis of casework [12-19].

In the UK the Forensic Science Service (FSS) developed the first STR-based typing system that was designed for forensic analysis. Four STR loci were amplified in the same reaction [20-22]. This was replaced by the SGM (second generation multiplex), which was also developed by the FSS [23-25]. Two commercial companies, Applied Biosystems and Promega Corporation, have developed a series of multiplexes that are now used by most laboratories. The AmpFlSTR® SGM Plus®, which is produced by Applied Biosystems, replaced the SGM in the UK and has been adopted by many other countries around the world as one of their standard multiplex kits [17]. In the USA, STR technology was adopted into forensic casework following a survey of 17 previously characterized STR loci, and in 1997, 13 loci were selected as the CODIS (Combined DNA Index System) loci [8, 26]. These loci can be analysed in one PCR using one of two commercially available kits: the AmpFlSTR® Identifiler® produced by Applied Biosystems [27] and the PowerPlex® 16 produced by Promega Corporation [13]. The STR loci that are incorporated into different multiplexes are shown in Table 6.1.

THO1 - Simple repeat with a non-consensus allele

AATG | AATG | AATG | AATG | AATG | AATG | AATG | allele 7

AATG | AATG | AATG | AATG | AATG | AATG | ATG | AATG | AATG | AATG | allele 9.3

FGA - Compound repeat, allele 17

D21S11 - Complex repeat sequence, allele 30.2 (key: TCTA repeats, TCTG repeats, TA, and sequence not included in the repeat)

Figure 6.1 The structure of three commonly used STR loci, THO1, FGA and D21S11. The THO1 locus has a simple repeat with a non-consensus allele; in the example the 9.3 allele is missing an A from the seventh repeat*. The FGA locus is a compound repeat composed of several elements. The D21S11 allele is an example of a complex repeat; the three regions not included in the D21S11 nomenclature are an invariant TA, TCA and TCCATA sequence. *The AATG nomenclature is commonly used but breaks the adopted conventions for STR nomenclature, as it represents the bottom strand of the first sequence described in GenBank.
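The naming of the two THO1 alleles in Figure 6.1 can be illustrated with a short Python sketch. It assumes the widely used convention in which the number of complete repeats is followed, after a decimal point, by the number of bases in any incomplete repeat; the function name `allele_designation` is invented for this illustration, and the sketch does not attempt complex repeats such as D21S11, where some sequence is excluded from the count.

```python
def allele_designation(units, repeat_length=4):
    """Name an STR allele from its list of repeat units, e.g. '9.3'."""
    if any(len(u) == 0 or len(u) > repeat_length for u in units):
        raise ValueError("each unit must contain 1 to repeat_length bases")
    # Complete repeats give the whole-number part of the name.
    complete = sum(1 for u in units if len(u) == repeat_length)
    # Bases in incomplete units are reported after the decimal point.
    partial_bases = sum(len(u) for u in units if len(u) < repeat_length)
    if partial_bases == 0:
        return str(complete)
    return f"{complete}.{partial_bases}"


# The two THO1 alleles shown in Figure 6.1.
allele_7 = ["AATG"] * 7
allele_9_3 = ["AATG"] * 6 + ["ATG"] + ["AATG"] * 3

print(allele_designation(allele_7))
print(allele_designation(allele_9_3))
```

For the two alleles above the sketch returns 7 and 9.3, matching the labels in the figure.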

In addition to STR loci, the amelogenin locus, which is present on the X and Y chromosomes, has been incorporated into all commonly used STR multiplex kits. The amelogenin gene encodes a protein that is a major component of tooth enamel matrix. There are two versions of the gene: the copy on the X chromosome has a 6 bp deletion, and this length polymorphism allows the versions of the gene on the X and Y chromosomes to be differentiated (Figure 6.2) [28].

Figure 6.2 The amelogenin locus is present on both the X and Y chromosomes. The gene that is present on the X chromosome has a 6 bp deletion. The primers (schematically shown by the arrowed lines) that were reported by Sullivan et al. (1993) [28] lead to products of 106 bp from the X chromosome and 112 bp from the Y chromosome
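As a rough illustration of how the 6 bp length difference can be read from a profile, the Python sketch below classifies a set of amelogenin peak sizes using the 106 bp and 112 bp products given for the Sullivan et al. primers. The function name, the returned labels and the 0.5 bp tolerance are assumptions made for this example; the sizes measured with a particular commercial kit may differ from these values.

```python
X_SIZE_BP = 106.0  # product from the X chromosome (Sullivan et al. primers)
Y_SIZE_BP = 112.0  # product from the Y chromosome


def amelogenin_call(peak_sizes_bp, tolerance_bp=0.5):
    """Classify amelogenin peaks; tolerance_bp is an assumed sizing window."""
    has_x = any(abs(size - X_SIZE_BP) <= tolerance_bp for size in peak_sizes_bp)
    has_y = any(abs(size - Y_SIZE_BP) <= tolerance_bp for size in peak_sizes_bp)
    if has_x and has_y:
        return "X,Y"
    if has_x:
        return "X"
    if has_y:
        return "Y only"  # unexpected result that would need review
    return "no result"


print(amelogenin_call([106.1, 111.9]))
print(amelogenin_call([105.9]))
```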

Detection of STR polymorphisms

After STR polymorphisms have been amplified using PCR, the length of the products must be measured precisely, because some STR alleles differ by only one base pair. Gel electrophoresis of the PCR products through denaturing polyacrylamide gels can be used to separate DNA molecules between 20 and 500 nucleotides long with single base pair resolution [29]. Early systems detected the PCR products after electrophoresis on polyacrylamide slab-gels using silver staining [30, 31], but this limited the number of loci that could be incorporated into the multiplexes because the allelic size ranges of the different loci could not overlap. To overcome this limitation, fluorescence labelling of PCR products followed by multicolour detection has been adopted by the forensic community. A series of fluorescent dyes has been developed that can be covalently attached to the 5' end of one of the PCR primers in each primer pair and detected in real time during electrophoresis. Up to five different dyes can be used in a single analysis, which allows for considerable overlap of loci (Figure 6.3). The electrophoresis platforms have evolved from systems based on slab-gels to capillary electrophoresis (CE) systems that use a narrow glass tube filled with an entangled polymer solution to separate the DNA molecules [32-36]. Applied Biosystems provide the most commonly used capillary electrophoresis systems, all of which have multicolour detection capacity. These include the ABI PRISM® 310 Genetic Analyzer, which has a single capillary and analyses up to 48 samples per day; the ABI PRISM® 3100 and Applied Biosystems 3130xl Genetic Analyzers, which have 16 capillaries and can analyse over 1000 samples per day; and the ABI PRISM® 3700 and Applied Biosystems 3730xl Genetic Analyzers, which can have up to 96 capillaries and can analyse over 4000 samples per day.

Figure 6.3 PCR multiplexes use up to five different dyes to label PCR products. The allelic ranges of three commonly used multiplexes, the AmpFlSTR® SGM Plus®, AmpFlSTR® Identifiler® and the PowerPlex® 16, are shown on a scale of 100 to 500 bp. The use of multiple dyes allows the detection of the internal-lane size standard (ROX in SGM Plus®, LIZ in Identifiler® and CXR in PowerPlex® 16) and three to four overlapping STR loci, where the use of different dyes allows the alleles to be assigned to the correct locus

Before electrophoresis, the PCR sample is prepared by mixing approximately 1 µl of the reaction with 10-20 µl of deionized formamide. The internal-lane size standard is also added at this point. The deionized formamide denatures the DNA, and the samples are routinely heated to 95 °C to ensure that the PCR products are single stranded. The samples are transferred into the capillary using electrokinetic injection: a voltage is applied and charged molecules, including the amplified DNA fragments and the internal-lane size standards, migrate into the capillary. After injection, a constant voltage is applied across the capillary and the PCR products migrate towards the positively charged anode, travelling through the polymer, which fills the capillary and acts as the sieving matrix. Urea and 2-pyrrolidinone in the gel polymer and a temperature of 60 °C help to prevent the formation of any secondary structure during electrophoresis [37]. Throughout the period of electrophoresis, an argon ion laser is shone through a small glass window in the capillary. As PCR products labelled with fluorescent dyes travel past the window they are excited by the laser and emit fluorescence, which is detected by a charge-coupled device (CCD) camera and recorded by collection software [38] (Figure 6.4). The electrophoresis of a sample takes up to 30 minutes, after which the polymer in the capillary is replaced with fresh polymer and the next sample can be analysed.

Figure 6.4 During electrophoresis an argon laser is shone through the window in the capillary. As the labelled PCR products migrate through the gel towards the anode they are separated based on their size. When the laser hits the fluorescent label on the PCR products, the label is excited and emits fluorescent light that passes through a filter to remove any background noise, and then on to a charge-coupled device camera that detects the wavelength of the light and sends the information to a computer where software records the profile (see plate section for full-colour version of this figure)

Interpretation of STR profiles

The spectra of the dyes used to label the PCR products overlap and the raw data contains peaks that are composed of more than one dye colour. After data collection the GeneScan® or GeneMapper™ ID software removes spectral overlap in the profile and calculates the sizes of the amplified DNA fragments. The software calculates how much spectral overlap there is between each dye and subtracts this from the peaks within the profile (Figure 6.5). A good matrix file, which contains information on the amount of overlap in the spectra, will produce peaks within the profile that are composed of only one colour. The height of the peaks is measured in relative fluorescent units (rfu) - the height is proportional to the amount of PCR product that is detected.
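The subtraction of spectral overlap can be pictured as solving a small linear system: if each detection channel records a known fraction of every dye's signal, the dye-specific peak heights are recovered by inverting that relationship. The Python sketch below is a generic illustration of this idea and not the procedure implemented in GeneScan® or GeneMapper™ ID, which the text does not describe; the three-dye overlap values and all function names are invented for the example.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left unchanged.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("overlap matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


# Hypothetical overlap matrix: column j holds the response of each detection
# channel to one unit of signal from dye j (values invented for illustration).
OVERLAP = [
    [1.00, 0.20, 0.02],
    [0.15, 1.00, 0.25],
    [0.01, 0.10, 1.00],
]


def remove_spectral_overlap(raw_channels, overlap=OVERLAP):
    """Return dye-specific signals given the raw signal in each channel."""
    return solve_linear(overlap, raw_channels)


# Raw data for a single peak: mostly the first dye, with signal bleeding
# into the other two channels.
raw = [1000.0, 150.0, 10.0]
corrected = remove_spectral_overlap(raw)
print([round(value, 3) for value in corrected])
```

With a well-estimated overlap matrix the corrected peak is composed of a single colour, which is the behaviour the text attributes to a good matrix file.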

Figure 6.5 The application of a matrix file, using the GeneScan® or GeneMapper™ ID software, removes the spectral overlap from the raw data (a) to produce peaks within the profile that are composed of only one colour (b) (see plate section for full-colour version of this figure). The scale of the y-axis is relative fluorescent units (rfu)

To be able to size the PCR products an internal-lane size standard is used. The internal-lane size standards contain fragments of DNA of known lengths that are labelled with a fluorescent dye, and the fragments are detected along with the amplified PCR products during capillary electrophoresis [39]. Commonly used commercial internal-lane size standards are the GeneScan™-500 standards, which can be labelled with either ROX™ or LIZ™ dyes (Applied Biosystems), and the ILS600 (Promega Corporation) (Figure 6.6).

Figure 6.6 Internal-lane size standards are used to size the PCR products precisely. Two commonly used internal-lane size standards are (a) the GeneScan™-500 (Applied Biosystems) and (b) the ILS600 (Promega) (see plate section for full-colour version of this figure)

Because the internal-lane size standard is analysed along with each PCR product, any differences between runs that could affect the migration rates during electrophoresis, such as temperature, do not impact significantly on the analysis [40]. The software generates a size calling curve from the internal-lane size standards, and the data points of the unknown fragments are compared to this curve. Different algorithms have been developed to measure the size of DNA molecules; the most common one is the local Southern method [41] (Figure 6.7).
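The local Southern method can be sketched in Python as follows. The sketch assumes the usual description of the method: a reciprocal curve L = c / (m - m0) + L0 is fitted through three neighbouring standards, once using two standards below and one above the unknown fragment and once using one below and two above, and the two size estimates are averaged. The function names and the (data point, size) pairs for the size standard are invented for illustration and are not taken from any real run.

```python
import bisect


def southern_fit(p1, p2, p3):
    """Fit L = c / (m - m0) + L0 through three (data_point, size) standards."""
    (m1, l1), (m2, l2), (m3, l3) = p1, p2, p3
    a = ((l1 - l2) * (m3 - m2)) / ((l2 - l3) * (m2 - m1))
    if abs(a - 1.0) < 1e-12:
        raise ValueError("standards are collinear; reciprocal fit is undefined")
    m0 = (a * m1 - m3) / (a - 1.0)
    c = (l1 - l2) * (m1 - m0) * (m2 - m0) / (m2 - m1)
    l0 = l1 - c / (m1 - m0)
    return lambda m: c / (m - m0) + l0


def local_southern_size(standards, data_point):
    """Estimate fragment size (bp) from the four nearest size standards."""
    standards = sorted(standards)
    positions = [m for m, _ in standards]
    # Index of the first standard that migrates later than the fragment.
    i = bisect.bisect_right(positions, data_point)
    if i < 2 or i + 1 >= len(standards):
        raise ValueError("need two standards on each side of the fragment")
    below_curve = southern_fit(standards[i - 2], standards[i - 1], standards[i])
    above_curve = southern_fit(standards[i - 1], standards[i], standards[i + 1])
    return (below_curve(data_point) + above_curve(data_point)) / 2.0


# Hypothetical (data point, size in bp) pairs for an internal-lane standard.
STANDARDS = [
    (3000, 100), (3400, 139), (3520, 150), (3630, 160),
    (4060, 200), (4580, 250), (5090, 300),
]

print(round(local_southern_size(STANDARDS, 3877), 2))
```

Because only the nearest standards are used, a fragment near either end of the standard's range cannot be sized by this approach, which is why the sketch raises an error in that case.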

After analysing the raw data with the software, the end result is an electropherogram with a series of peaks that represent different alleles; the size, peak height and peak area are also measured by the software (Figure 6.8). The final stage of generating an STR profile is to assign specific alleles to the amplified PCR products. Each peak in the profile is given a number that is a description of the structure of that allele. This is straightforward when naming simple repeats but is more problematic with complex repeat sequences [6].

Figure 6.7 During electrophoresis the computer software records the fluorescence levels at regular time points and these are recorded as data points. The DNA fragments that make up the internal-lane size standards are plotted against the data points. An example of the sizing curves that are produced from (a) the GeneScan™-500 standard (Applied Biosystems) and (b) the ILS600 (Promega) are shown using the local Southern method to generate the size calling curve

The loci used in forensic casework have been well characterized and multiple alleles have been sequenced to determine the allelic structure and verify that the size of the peaks is a good indicator of the alleles they represent. However, because the migration of PCR products and internal-lane size standard varies slightly with factors such as temperature and the electrophoretic conditions, and because some STR alleles differ by only one base pair, the use of allelic ladders that contain all the common alleles (Figure 6.9) at each locus has been adopted by the forensic community to ensure accurate profiling [22, 42]. Unlike the internal-lane size standards the allelic ladders cannot be analysed in the same injection as the samples but are run periodically during the analysis of a batch of samples.

Figure 6.8 The green-labelled loci (amelogenin, D8S1179, D21S11 and D18S51) from a profile produced using the AmpFlSTR® SGM Plus® kit. The size of each peak has been calculated along with the peak heights and areas. The first amelogenin peak was detected after 13.13 minutes (which is when data point 3570 was taken) and is estimated to be 103.33 bp long; the peak area is 20712 rfu and the peak height 4058 rfu

When assigning the alleles, the unknown peaks are compared to the allelic ladder and should fall within a one base-pair window that is ±0.5 bp of the allelic ladder size. If the unknown alleles differ by more than this, they are classified as off-ladder (OL) and require further analysis. This comparison of unknown peaks to the allelic ladder can be done manually or by using the Genotyper® (Figure 6.10) or GeneMapper™ ID software (Applied Biosystems), which will compare all the unknown alleles in the profile to the allelic ladder.
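The comparison of an unknown peak with the allelic ladder using the ±0.5 bp window can be sketched in a few lines of Python. This is a simplified illustration and not the logic of the Genotyper® or GeneMapper™ ID software; the function name and the ladder sizes are invented for the example.

```python
def call_allele(peak_size_bp, ladder, window_bp=0.5):
    """Return the ladder allele within +/- window_bp of the peak, else 'OL'."""
    # Find the ladder allele whose measured size is closest to the peak.
    name, size = min(ladder.items(), key=lambda item: abs(item[1] - peak_size_bp))
    if abs(size - peak_size_bp) <= window_bp:
        return name
    return "OL"  # off-ladder: requires further analysis


# Hypothetical ladder sizes (bp) for a locus with a tetranucleotide repeat
# and a 9.3 microvariant that sits one base pair below allele 10.
LADDER = {"6": 160.1, "7": 164.0, "8": 168.1, "9": 172.0, "9.3": 175.0, "10": 176.0}

print(call_allele(175.2, LADDER))
print(call_allele(176.1, LADDER))
print(call_allele(170.0, LADDER))
```

Choosing the closest ladder allele before applying the window keeps alleles that differ by a single base pair, such as 9.3 and 10 here, from being confused.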

Figure 6.9 The allelic ladder of the AmpFlSTR® SGM Plus® kit contains all the common alleles; the panels shown include D3S1358, D16S539, D2S1338, D8S1179, D21S11 and D18S51 (see plate section for full-colour version of this figure)
