Mavandadi, Sepand scite author profile

Alternative splicing (AS) plays a crucial role in the diversification of gene function and regulation. Consequently, the systematic identification and characterization of temporally regulated splice variants is of critical importance to understanding animal development. We have used high-throughput RNA sequencing and microarray profiling to analyze AS in C. elegans across various stages of development. This analysis identified thousands of novel splicing events, including hundreds of developmentally regulated AS events. To make these data easily accessible and informative, we constructed the C. elegans Splice Browser, a web resource in which researchers can mine AS events of interest and retrieve information about their relative levels and regulation across development. The data presented in this study, along with the Splice Browser, provide the most comprehensive set of annotated splice variants in C. elegans to date, and are therefore expected to facilitate focused, high resolution in vivo functional assays of AS function.

[Supplemental material is available for this article. The sequence data from this study have been submitted to the NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi) under accession no. SRA009279. The microarray data from this study have been submitted to the NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo) under accession no. GSE25927.]

Alternative splicing (AS) is the process by which multiple mRNA transcripts are produced from a single precursor transcript through the differential utilization of splice sites. Alternative splicing is one of the key mechanisms that have evolved in metazoans to generate increased transcriptome complexity and recent studies estimate that greater than 95% of human multi-exon genes express multiple splice isoforms (Pan et al. 2008). Moreover, alternatively spliced exons are often differentially regulated across tissues and during development, suggesting that individual isoforms may serve specific spatial or temporal roles (Hartmann and Valcarcel 2009; Licatalosi and Darnell 2010; Nilsen and Graveley 2010).

The importance of proper regulation of AS during development has been demonstrated in many different instances; one particularly well-studied example is that of the sex determination pathway in Drosophila. In this pathway, the female-specific expression of a splicing regulator transformer stimulates the inclusion of exons in transcripts of the doublesex and fruitless transcription factor genes (Lopez 1998; Forch and Valcarcel 2003). The female-specific isoforms of these transcription factors subsequently activate the expression of genes required for female development, while the male-specific variants induce a gene expression program important for male differentiation (Dulac 2005; Shirangi and McKeown 2007). Similar spatio-temporally regulated AS networks are likely to exist in metazoans. The characterization of these AS networks, and their integration with other layers of gene regulation, will be necessary for a more compl...

An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling

Sainath

Narayanan

et al. 2021

Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus

Peyser

Sepand

Sainath

et al. 2020

End-to-end (E2E) automatic speech recognition (ASR) systems lack the distinct language model (LM) component that characterizes traditional speech systems. While this simplifies the model architecture, it complicates the task of incorporating text-only data into training, which is important to the recognition of tail words that do not occur often in audio-text pairs. While shallow fusion has been proposed as a method for incorporating a pre-trained LM into an E2E model at inference time, it has not yet been explored for very large text corpora, and it has been shown to be very sensitive to hyperparameter settings in the beam search. In this work, we apply shallow fusion to incorporate a very large text corpus into a state-of-the-art E2E ASR model. We explore the impact of model size and show that intelligent pruning of the training set can be more effective than increasing the parameter count. Additionally, we show that incorporating the LM in minimum word error rate (MWER) fine-tuning makes shallow fusion far less dependent on optimal hyperparameter settings, reducing the difficulty of that tuning problem.
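
As a rough illustration of the shallow fusion idea mentioned in this abstract, the sketch below combines an E2E log-probability with a weighted LM log-probability and picks the best hypothesis. It is a simplification under stated assumptions: real systems typically apply the interpolation at each step of the beam search rather than on complete hypotheses, and the function names, the `lm_weight` parameter, and all scores are hypothetical values chosen for the example, not taken from the paper.

```python
from typing import List, Tuple

# (text, E2E log-probability, LM log-probability); all values are made up.
Hypothesis = Tuple[str, float, float]


def shallow_fusion_score(e2e_logp: float, lm_logp: float, lm_weight: float) -> float:
    """Log-linear interpolation: E2E score plus a weighted LM score."""
    return e2e_logp + lm_weight * lm_logp


def rerank(hyps: List[Hypothesis], lm_weight: float) -> List[Hypothesis]:
    """Sort hypotheses from best to worst under the fused score."""
    return sorted(
        hyps,
        key=lambda h: shallow_fusion_score(h[1], h[2], lm_weight),
        reverse=True,
    )


# Toy candidates: the E2E model slightly prefers the misspelled word,
# while the (hypothetical) LM strongly prefers the correct one.
candidates: List[Hypothesis] = [
    ("play sum music", -1.0, -9.0),
    ("play some music", -1.5, -3.0),
]

best_without_lm = rerank(candidates, lm_weight=0.0)[0][0]
best_with_lm = rerank(candidates, lm_weight=0.3)[0][0]
print(best_without_lm)
print(best_with_lm)
```

Changing `lm_weight` flips which hypothesis wins in this toy case, which gives a small-scale picture of why the choice of fusion weight matters.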

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification

Zhang

Li

Sainath

et al. 2022

Language identification is critical for many downstream tasks in automatic speech recognition (ASR), and is beneficial to integrate into multilingual end-to-end ASR as an additional task. In this paper, we propose to modify the structure of the cascaded-encoder-based recurrent neural network transducer (RNN-T) model by integrating a per-frame language identifier (LID) predictor. RNN-T with cascaded encoders can achieve streaming ASR with low latency using first-pass decoding with no right-context, and achieve lower word error rates (WERs) using second-pass decoding with longer right-context. By leveraging such differences in the right-contexts and a streaming implementation of statistics pooling, the proposed method can achieve accurate streaming LID prediction with little extra test-time cost. Experimental results on a voice search dataset with 9 language locales show that the proposed method achieves an average of 96.2% LID prediction accuracy and the same second-pass WER as that obtained by including oracle LID in the input.
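
The abstract does not spell out how its streaming statistics pooling is implemented. As a generic sketch only, statistics pooling can be made streaming by keeping running sums so that a mean and standard deviation over all frames seen so far are available after every frame. The class name `StreamingStatsPooling` and its interface are assumptions made for this illustration, not the paper's design.

```python
import math
from typing import List


class StreamingStatsPooling:
    """Running mean and standard deviation over the frames seen so far."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.count = 0
        self.total = [0.0] * dim      # running sum of each feature
        self.total_sq = [0.0] * dim   # running sum of each squared feature

    def update(self, frame: List[float]) -> List[float]:
        """Consume one frame and return [mean..., std...] up to this frame."""
        if len(frame) != self.dim:
            raise ValueError("frame has the wrong dimensionality")
        self.count += 1
        for k, x in enumerate(frame):
            self.total[k] += x
            self.total_sq[k] += x * x
        mean = [s / self.count for s in self.total]
        # Population variance E[x^2] - mean^2, clamped at zero for rounding.
        std = [
            math.sqrt(max(sq / self.count - m * m, 0.0))
            for sq, m in zip(self.total_sq, mean)
        ]
        return mean + std


pool = StreamingStatsPooling(dim=2)
first = pool.update([1.0, 2.0])
second = pool.update([3.0, 6.0])
print(first)
print(second)
```

Each call costs a constant amount of work per feature, so the pooled vector can feed a per-frame predictor without revisiting earlier frames.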

A Deliberation-Based Joint Acoustic and Text Decoder

Sepand

Sainath

Hu

et al. 2021

We propose a new two-pass E2E speech recognition model that improves ASR performance by training on a combination of paired data and unpaired text data. Previously, the joint acoustic and text decoder (JATD) has shown promising results through the use of text data during model training, and the recently introduced deliberation architecture has reduced recognition errors by leveraging first-pass decoding results. Our method, dubbed Deliberation-JATD, combines the spelling-correcting abilities of deliberation with JATD's use of unpaired text data to further improve performance. The proposed model produces substantial gains across multiple test sets, especially those focused on rare words, where it reduces word error rate (WER) by between 12% and 22.5% relative. This is done without increasing model size or requiring multi-stage training, making Deliberation-JATD an efficient candidate for on-device applications.
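
For readers unfamiliar with the metric, the sketch below shows how a word error rate and a relative WER reduction are commonly computed. The function names and the example sentences and WER values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline error removed by the new system."""
    return (baseline_wer - new_wer) / baseline_wer


# One substituted word out of three reference words.
example_wer = word_error_rate("play some music", "play sum music")

# Hypothetical WERs: moving from 10.0% to 8.8% is roughly a 12% relative reduction.
example_reduction = relative_reduction(10.0, 8.8)
print(example_wer, example_reduction)
```

A relative figure such as "12% relative" therefore describes the share of the baseline's errors that were eliminated, not a drop of 12 absolute WER points.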
