High Throughput Analysis of Alternative Splicing Using Microarrays

Introduction

The techniques reviewed so far for monitoring structural alterations in mRNA generated through alternative splicing cannot investigate large numbers of genes and variants. DNA microarrays have become a widely used technology for large-scale gene expression studies. Custom designs are becoming a routine and cost-effective means to investigate a large number of genes within a single experiment. Several companies have now established versatile products that are robust and sensitive. They have been used successfully for determining general expression profiles, which, in turn, have been shown to contain information for classifying tumors by their response to chemotherapeutics or by their ability to metastasize (Pomeroy et al. 2002; Hedenfalk et al. 2003; Brunet et al. 2004), for predicting drug response and on-target or off-target responses (Clarke et al. 2004; Robert et al. 2004), and for predicting the potential safety of compounds and the mechanisms of toxicity (Hamadeh et al. 2002; Liguori et al. 2005). However, the design of the microarrays used in all these studies contains one basic flaw: the majority of the probes are not specific for different products from the same gene. Probes are usually designed against the 3' region of the gene, and in many cases do not even cover the coding region in genes with large 3' untranslated regions.

Profiling of alternative splicing on microarray platforms was initially described in a limited number of studies (Hu et al. 2001; Clark et al. 2002; Modrek and Lee 2002; Yeakley et al. 2002; Castle et al. 2003; Johnson et al. 2003; Wang et al. 2003). These studies describe the feasibility of monitoring splicing events on microarray platforms, but quantification methods for the absolute and relative levels of expression of splice variants have not been extensively developed. Several microarray parameters need special consideration when the platform is applied to the problem of detecting alternative splicing, including probe design, target labeling approaches, and analytical methods. These topics are considered in this section as they specifically apply to the identification of alternative splice events on microarrays.

Microarray Configuration and Probe Design

Microarrays originally came in several different flavors, depending on what was actually placed on the support. The early prototypes contained full-length cDNAs, EST clones, or PCR products. These sequences were generally long and had the potential to cross-hybridize with gene family members or genes encoding similar protein domains. Oligonucleotides offered the ability to carefully define the probe sequence for a more specific hybridization. Subsequently, oligonucleotides were spotted onto slides, and binding chemistries became very important. More efficient methods were developed to build oligonucleotides onto the substrate through in situ synthesis (Chee et al. 1996; Hughes et al. 2001), and these constitute the basis for some of the major commercial products, such as the ones marketed by Affymetrix (Santa Clara, CA, USA) or Agilent Technologies (Palo Alto, CA, USA).

There are several potential strategies for designing oligonucleotides for the detection of alternative splicing (Fig. 6). The traditional labeling protocols necessitated the design of the probes toward the end of the transcript. Samples labeled with oligo dT protocols produced fluorescent targets from the transcripts present in the RNA sample that were biased toward the 3' end of the transcript. With this knowledge, the probes were also designed toward the 3' end of the target in order to optimize the match of the labeled targets with the probes (Fig. 6B) and thus were not at all suited to detect and monitor alternative splicing events. Another confounding factor was that the oligonucleotides were designed against EST sequence information. It is well established that EST data are biased toward the 5' and 3' ends of the transcript (Strausberg et al. 1999).

One of the earliest attempts to gain information on alternative splicing was based on a microarray with a probe design consisting of 25-mer oligonucleotides designed against the 3' end of the gene (Fig. 6B). Twenty probes per gene, each paired with a companion single-base mismatch control (forty probes per gene in total), were used to monitor 1,600 rat genes (Hu et al. 2001). The standard Affymetrix analysis calculated the average difference between the perfect match probe and the mismatch probe to determine the expression level. An algorithm was developed to analyze individual oligonucleotide probes, rather than the collective set of twenty probes, for differential expression. Differences in expression values would indicate different levels of expression for different sections of the 3' end of the gene, suggesting that an alternative splice event was present in the sequence monitored by the probes. RT-PCR studies confirmed 50% of the events detected by this method.
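The contrast between the probe-set average and a probe-by-probe comparison can be sketched in a few lines of Python. This is only an illustration of the idea, not the algorithm of Hu et al. (2001): the function names, the toy intensities, the floor value, and the one-log2-unit threshold are all assumptions made for the example.

```python
from math import log2

def average_difference(pm, mm):
    """Mean of (perfect match - mismatch) over all probe pairs of a probe set."""
    return sum(p - m for p, m in zip(pm, mm)) / len(pm)

def per_probe_log_ratios(pm_a, mm_a, pm_b, mm_b, floor=1.0):
    """log2(sample B / sample A) of the PM - MM signal, one value per probe pair.

    `floor` (an assumed value) keeps the signal positive before taking the log.
    """
    ratios = []
    for pa, ma, pb, mb in zip(pm_a, mm_a, pm_b, mm_b):
        signal_a = max(pa - ma, floor)
        signal_b = max(pb - mb, floor)
        ratios.append(log2(signal_b / signal_a))
    return ratios

def flag_outlier_probes(ratios, threshold=1.0):
    """Indices of probes whose log ratio departs from the probe-set median."""
    ordered = sorted(ratios)
    n = len(ordered)
    if n % 2:
        median = ordered[n // 2]
    else:
        median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
    return [i for i, r in enumerate(ratios) if abs(r - median) > threshold]

# Toy data: four probe pairs; only the third probe changes between samples.
pm_a, mm_a = [500, 520, 480, 510], [100, 120, 80, 110]
pm_b, mm_b = [500, 520, 180, 510], [100, 120, 80, 110]
ratios = per_probe_log_ratios(pm_a, mm_a, pm_b, mm_b)
print(average_difference(pm_a, mm_a), ratios, flag_outlier_probes(ratios))
```

A probe that behaves differently from the rest of its set is only a candidate: as the RT-PCR confirmation rate reported above shows, such calls need independent validation.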

Subsequent designs focused specifically on the process of splicing, with probes designed against exons and against specific exon-exon junctions at the splice sites. In one report, oligonucleotide probes were designed around the splice event to detect differences in splicing for intron-containing genes in yeast (Clark et al. 2002; Fig. 6A). Probes were 40 nucleotides long and were designed against an exon, the intron, and the junction of the two exons, giving three probes for each splice event. With specific interest in how splicing has been integrated into the genome through function, several yeast temperature-sensitive mutants with defects in splicing factors were used to determine the effect of these mutations on splicing from a genomic viewpoint. Again, RT-PCR was used to confirm the findings that there are specific factors required for the removal of introns. This study also demonstrated that oligonucleotide probes can be used to detect and identify splicing alterations on a microarray.

Fig. 6A-F. Comparison of existing microarray configurations and their ability to monitor splice variants. An example of probe location for a 5-exon transcript with 3 isoforms is shown (WT: wild-type; ES: exon skipping; IR: intron retention; ASD: alternative splice donor). A: Probes designed to specifically address splicing events. Note the presence of junction probes or exon probes to differentiate the wild-type and the known variants. B: Probes located on the 3' end of the transcripts (could be a single probe or a set of tiling probes). Most commercial arrays or custom arrays fit with this design and will miss detection of ES, IR, and ASD. C: Probes are designed in all known exons (single or multiple probes per exon). This configuration would only detect exon-skipping events by absence of hybridization for probes designed to this exon. D: Probes are designed only over the known exon junctions. Skipping of exon 2 could be detected by absence or weak binding of probes p1 and p2. ASD and IR would be more difficult to predict with this configuration. E: Exhaustive probe coverage of all exons and introns with all possible exon-exon junctions. (For display purposes, only junction probes between exon 1 and all other exons are indicated.) This approach monitors virtually any splicing event but requires a huge number of probes for the analysis of a single gene. F: Tiling probes designed to scan the whole locus for a given gene (exons + introns).

More recently, several groups have reported using more extensively designed probes around splice events (Castle et al. 2003; Johnson et al. 2003; Wang et al. 2003; Pan et al. 2004; Le et al. 2004; Fehlbaum et al. 2005) (configurations corresponding to Figs. 6A, 6D, 6E). In one case, two types of probes, exon-specific and exon-exon junction-specific, were designed for a set of 21 genes (Wang et al. 2003) or 316 human genes (Le et al. 2004). Several probes were designed over the same target sequence such that they overlapped each other by varying numbers of bases. In several other cases, single probes were designed across junctions or against exon or intron sequences for monitoring splice events on a genome-wide scale (Johnson et al. 2003; Pan et al. 2004). Johnson et al. (2003) used systematic exon-exon probes for more than 10,000 human genes, while Pan et al. (2004) combined information from exon and junction probes for 3,126 alternative splice events from 2,647 mouse genes. A common theme of all of these reports was that new analytical methods are required to analyze these new designs; they will be discussed in more detail below. In addition, these groups have found that there are significantly more alternatively spliced events than previously thought (about 70% of all genes appear to be susceptible to some form of alternative splicing).

Finally, several groups have used a tiling approach to monitor the expression of transcripts in the genome, in which probes have been designed against genomic regions linearly, over a predefined interval of bases (Shoemaker et al. 2001; Kapranov et al. 2002; Kampa et al. 2004; Bertone et al. 2004; Cheng et al. 2005). This approach requires a very large number of probes (Fig. 6F) and can only be applied to selected loci. This strategy will be very informative with respect to "insertion"-type splice events, such as intron retention, novel exons, or alternative usage of 3' or 5' splice sites creating exon extensions. It will not perform very well in characterizing "deletion"-type splice events such as exon skipping or alternative usage of 3' or 5' splice sites shortening exons.
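The probe burden of a tiling design follows directly from the locus length and the tiling interval, as the short Python sketch below illustrates. The probe length of 25 and step of 10 bases are assumed values chosen for the example, not parameters taken from the studies cited above, and the function names are hypothetical.

```python
def tiling_probe_starts(locus_length, probe_length=25, step=10):
    """Start offsets of probes laid end over end across a locus at a fixed step."""
    return list(range(0, locus_length - probe_length + 1, step))

def tiling_probes(locus_seq, probe_length=25, step=10):
    """Probe target sequences cut from a locus sequence at each tiling offset."""
    starts = tiling_probe_starts(len(locus_seq), probe_length, step)
    return [locus_seq[s:s + probe_length] for s in starts]

# A 10 kb locus (exons plus introns) tiled every 10 bases needs about a
# thousand probes for a single gene.
print(len(tiling_probe_starts(10_000)))
```

By comparison, a design restricted to known exons and junctions needs only a handful of probes per splice event, which is why tiling is reserved for selected loci.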

Which configuration is best to monitor alternative splicing? The selected configuration should include probes that are specific for every type of splicing event (Fig. 1). Exon-skipping events and the other "deletion"-type events described above can only be monitored by junction probes. For instance, a probe spanning the second and fourth exons of a gene will monitor the skipping of exon 3 in that gene. A probe designed within exon 3 would not be sufficient in most instances, because the wild-type isoform containing exon 3 will be co-expressed with the exon-3-deleted isoform. Some short insertional splice events (less than 30 nucleotides), such as the NAG insertion-deletions at splice acceptor sites (Hiller et al. 2004), will also require junction probes. Monitoring constitutive exons and larger insertional events will be best achieved by standard probes optimized on the full length of the exonic sequence.
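The exon 2 to exon 4 example can be made concrete with a small Python sketch that assembles the target sequence of a junction probe from two exon sequences. The exon sequences, the probe length of 30, and the function name are invented for illustration; whether the spotted oligonucleotide is this sequence or its reverse complement depends on the labeling protocol and is left out here.

```python
def junction_probe(upstream_exon, downstream_exon, length=30):
    """Target sequence of a probe centered on the junction of two joined exons.

    Takes the last half of the upstream exon and the first half of the
    downstream exon, so the splice junction falls in the middle of the probe.
    """
    left = length // 2
    right = length - left
    if len(upstream_exon) < left or len(downstream_exon) < right:
        raise ValueError("exon shorter than its half of the probe")
    return upstream_exon[-left:] + downstream_exon[:right]

# Hypothetical exon sequences (not from any real gene).
exons = {
    2: "GATTACAGGCTTCAGGATCCGTACGATG",
    3: "CCGGAATTCTTGACCATGGTACCAAGCT",
    4: "TTGCAGCTAGCTAGGATCGATCGGCTAA",
}

wild_type_2_3 = junction_probe(exons[2], exons[3])  # exon 3 included
skip_2_4 = junction_probe(exons[2], exons[4])       # exon 3 skipped
print(wild_type_2_3)
print(skip_2_4)
```

The two probes share their upstream half and differ only downstream of the junction, which is why junction probes are usually kept short and centered: a long shared half invites cross-hybridization between isoforms.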

Once the general location of the target sequences has been chosen, other parameters are considered in selecting probes. Most methods have included a masking step in which vector sequences, interspersed repeats, low-complexity sequences, and mitochondrial DNA contamination are identified (Zhang et al. 2004; Schadt et al. 2004). Probes are next selected according to traditional parameters such as melting temperature, GC content, and secondary structure formation. The selected sequences are usually then aligned against genomic sequences and ESTs to identify and exclude potential cross-hybridizing probes. Several reports have looked at the constraints generated by the junction probes (Castle et al. 2003; Wang et al. 2003; Le et al. 2004; Fehlbaum et al. 2005). They concluded that short probes (24-36 mers) more or less centered on the splice junction would work best to provide the required specificities.
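Two of the simpler screening criteria, GC content and low sequence complexity, can be sketched in Python as follows. The thresholds (35 to 65% GC, homopolymer runs of at most five bases) and the function names are assumptions chosen for illustration; the cited studies used their own criteria, and melting temperature, secondary structure, and cross-hybridization checks require dedicated tools that are not shown.

```python
def gc_content(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def passes_basic_filters(seq, gc_min=0.35, gc_max=0.65, max_run=5):
    """True if the candidate probe passes both (assumed) screening thresholds."""
    return (gc_min <= gc_content(seq) <= gc_max
            and longest_homopolymer(seq) <= max_run)

candidates = [
    "ATGCATGCATGCATGCATGCATGC",  # balanced composition
    "AAAAAAAAGCGCGCGCGCGCGCGC",  # long A run and high GC
]
print([passes_basic_filters(c) for c in candidates])
```

Junction probes leave little room for such optimization, since their position is fixed by the splice site; only the probe length and the offset around the junction can be varied.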

Labeling Protocols

It is generally acknowledged that standard labeling technologies are not sufficient to generate targets representative of complete transcripts. Current protocols are 3'-biased, and the probes on the array are designed to account for this bias. Unfortunately, these approaches severely limit the ability to detect alternative splicing events. One group specifically recognized this issue and developed a random-primed protocol that produced amplified material from mRNA (Castle et al. 2003). Other groups utilized a non-amplified, random-primed protocol that requires significantly more starting material (Zhang et al. 2004). Several commercial kits are now available or in development.

Data Analysis

Monitoring alternative splicing on microarrays presents specific challenges that are not present with microarray analysis as it has been performed so far. Probes for general expression studies are all considered equivalent, and are usually averaged to quantify overall gene expression. Assessment of alternatively spliced events requires a higher resolution at the sequence level than previously attainable by standard format designs, as only a few oligonucleotides monitor the entire transcript. As described in the design section, oligonucleotides can be designed to detect small changes in sequence.

The detection of alternative splice variants in a sample is a difficult problem as the transcripts produced from a single gene locus are structurally different, but contain large amounts of common sequence. This requires that the oligonucleotide probes are focused on the sequence of the transcripts that are different among variants. These designs were detailed in the section above. In addition, specific data analysis processes and algorithms are required to determine the level of expression and/or differential expression of splice variants within different samples.

In analyzing alternative splicing, there is a need to define the terminology of the comparison. Usually a RefSeq sequence is selected and comparisons of different RefSeqs, mRNAs or ESTs are made against the selected RefSeq, which will be named the reference sequence for this discussion. Any sequence that contains a different sequence structure from the reference is termed the variant. Sequence differences between the reference and the variant are termed splice events. Variants can contain one or more splice events that differentiate the variant from the reference. A splice event is defined as a single difference of sequence between the reference and the variant. Thus, a splice event may take the form of a novel exon, exon skip, intron retention, or an alternative usage of a donor or acceptor site. Each one of these is a distinct event, and the variants or spliced isoforms consist of one or more splice events when the sequence is compared to the reference sequence. An important factor in the analysis of alternatively spliced variants is that with short oligonucleotides as probes, only the splice event itself can be detected, and the presence of an isoform or variant must necessarily be inferred. In one approach, a matrix algorithm has been applied to determine the presence of variants from oligonucleotide probe expression data (Wang et al. 2003). However, the authors stated that there are limitations to the algorithm, particularly when the gene structure is not known or is incorrect. Importantly, they say that "the algorithm is intended for splice variant typing, not discovery." The matrix may also not yield a unique solution, in which case variant detection is not unique. The robustness of this method remains to be tested.
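The uniqueness problem can be illustrated with a simplified linear model in Python. This is a sketch under stated assumptions, not the algorithm of Wang et al. (2003): each probe signal is taken to be the sum of the abundances of the variants that contain the probe target, and the design matrices, abundances, and function names are invented for the example.

```python
from fractions import Fraction

def expected_signals(design, abundances):
    """Probe signals predicted by the model: signal = design x abundances."""
    return [sum(a * x for a, x in zip(row, abundances)) for row in design]

def rank(matrix):
    """Matrix rank by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def variants_identifiable(design):
    """Abundances are uniquely determined only if the columns are independent."""
    return rank(design) == len(design[0])

# Columns: reference (exons 1-2-3) and exon-2-skipping variant (exons 1-3).
# Rows: exon 2 probe, junction 1-2, junction 2-3, junction 1-3.
exon_and_junction = [[1, 0], [1, 0], [1, 0], [0, 1]]
# Rows: exon 1 probe and exon 3 probe only; both variants contain both targets.
shared_exons_only = [[1, 1], [1, 1]]

print(expected_signals(exon_and_junction, [3, 1]))
print(variants_identifiable(exon_and_junction), variants_identifiable(shared_exons_only))
```

With probes on shared exons only, any split of the total signal between the two variants fits the data equally well, which is the non-unique case mentioned above. The model also presupposes that every column of the matrix is a known variant, consistent with the authors' remark that the approach is meant for typing rather than discovery.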

Ratio calculations have been employed to determine the extent of splicing detected by oligonucleotide probes on microarrays. In one instance, a ratio of two test samples was calculated for the junction probe and for the exon probe (Clark et al. 2002). The log2 ratios were then subtracted to define a splicing index, which was used to determine that two yeast splicing factors, Prp17p and Prp18p, are required for intron removal in cases in which the distance from the branchpoint to the 3' splice site is short. In addition, a ratio method was used to calculate the relative amounts of different variants present between different samples to determine the relative abundance of the reference and variant (Fehlbaum et al. 2005). This was made possible by the design of both exon and junction probes, which represents the only configuration providing direct access to expression data related to the two molecular species generated by a splice event. Additional approaches have recently been applied to the detection of alternative splicing. One report included the experimental protocol in designing the algorithm for the detection of qualitative changes in alternative splicing (Le et al. 2004). The authors developed a theory that discerns between general gene expression and changes in alternative splicing, uses log ratio correlation coefficients for the designed probes, and analyzes the group with clustering and graphical methods. Finally, using a similar probe design, a complex analysis was performed based on Bayesian inference and unsupervised learning algorithms (Pan et al. 2004). These last two reports clearly indicate the complexity of analyzing alternative splicing and the need to develop specific tools.
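The splicing index described at the start of this section reduces to a difference of two log2 ratios, as in the Python sketch below. The order of subtraction (junction minus exon, sample B over sample A), the intensities, and the function name are assumptions made for the example; the original report should be consulted for the exact definition.

```python
from math import log2

def splicing_index(junction_a, junction_b, exon_a, exon_b):
    """log2 ratio of the junction probe minus log2 ratio of the exon probe.

    Ratios are sample B over sample A. Subtracting the exon ratio removes
    changes in overall transcript level, leaving the change in splicing.
    """
    return log2(junction_b / junction_a) - log2(exon_b / exon_a)

# Spliced product drops fourfold while the exon signal is unchanged:
# a splicing defect (index of -2).
print(splicing_index(1000, 250, 800, 800))

# Junction and exon signals both drop fourfold: lower expression of the
# gene, but no change in splicing (index of 0).
print(splicing_index(1000, 250, 800, 200))
```

The second case shows why a single probe is not enough: without the exon probe as a reference, a change in transcript abundance would be mistaken for a change in splicing.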

Commercially Available Products

Custom Arrays

The investigators who have published on splicing-related microarrays have mostly used the custom services of major chip manufacturers. After selection of the splice events and of one of the probe configurations described previously, a probe design file was sent to the manufacturer. One recent study describing the function of four splicing regulators on annotated alternative splicing events in Drosophila made use of a 44K custom array produced by Agilent Technologies (Blanchette et al. 2005). Other reports that have already been cited in this chapter are also based on microarrays produced by the same company (Le et al. 2004; Pan et al. 2004; Johnson et al. 2003; Shoemaker et al. 2001). Custom microarrays are also produced by Affymetrix custom services (Hu et al. 2001; Wang et al. 2003; Kampa et al. 2004). Custom arrays and design services are also available from other companies.

Catalog Arrays

The two major array manufacturers mentioned above have had an interest in alternative RNA splicing for some time now, and it is likely that this will translate into commercial products in the near future. Besides its publications in the field (Wang et al. 2003; Kampa et al. 2004; Cline et al. 2005), Affymetrix is working on a next generation of whole-genome arrays dedicated to providing exon expression data (Blume 2005). Probe sets have been designed against every known and predicted human exon to produce microarrays taking advantage of a reduced feature size of 5 microns (www.affymetrix.com).

In addition to being the provider of custom microarrays to several industrial and academic groups (see Custom Arrays above), Agilent Technologies has been collaborating with ExonHit Therapeutics (Paris) to produce splice arrays dedicated to monitoring all transcripts within specific gene families, such as G-protein-coupled receptors, ion channels, and nuclear receptors (www.splicearray.com). These arrays monitor the expression of the reference transcripts and of every known and potential splice event extracted from public databases via probe sets of exonic and junction probes specific for each event.

Conclusion

Microarrays are naturally evolving toward the inclusion of splice-related content. These products have the ambition of providing a global picture of the transcriptome. This ambition comes with several underlying challenges with respect to probe design and configuration and to data analysis. The tools are available today to address these challenges and to start providing more thorough pictures of transcriptomes. The combination of exon and junction probes will provide the most robust platforms to monitor known splice events. Microarrays could also be developed to identify novel splice variants. Tiling arrays fit in this category but suffer from the requirement of a large number of probes. A compromise may lie in a microarray that, in addition to probe sets designed against known splice events, would also include junction probes designed against every exon-intron region and against every neighboring exon. Such sets of probes could then identify intron retention events, single exon-skipping events, and exon extensions at the 5' or 3' ends via alternative splice site usage. The precise molecular structure of the novel splice events would then need to be determined by standard RT-PCR assays.
