“…Of the three major mechanisms that contribute to this complexity, alternative initiation of transcription, splicing, and polyadenylation, the latter seemed most immediately amenable to analysis because of the wealth of data about transcript 3′ ends provided by the expressed sequence tag (EST) sequences generated by the NCI Cancer Genome Anatomy Project (Strausberg et al 2000) (at Washington University, the NIH Intramural Sequencing Center, and Incyte Pharmaceuticals), the Merck Gene Index (Aaronson et al 1996), and the NIH Mammalian Gene Collection (Strausberg et al 1999). Although alternative polyadenylation of transcripts has been known to occur for a long time, the proportion of transcripts affected, the number of sites per transcript, and the distances over which alternative sites are spread have been explored exclusively using EST clustering techniques (Gautheret et al 1998; Beaudoing and Gautheret 2001; Pauws et al 2001) and have relied on the poly(A) being documented in the EST sequences. The most recent of these studies have concluded that >40% of human transcripts may undergo alternative polyadenylation, but that most of the observed variation is over a short range (<50 nt) and driven by a single polyadenylation signal (Beaudoing and Gautheret 2001; Pauws et al 2001).…”