Chiyoun Park scite author profile

Architecture optimization, a technique for finding an efficient neural network that meets given requirements, generally reduces to a set of multiple-choice selection problems among alternative sub-structures or parameters. The discrete nature of these selection problems, however, makes the optimization difficult. To tackle this problem, we introduce the novel concept of a trainable gate function. The trainable gate function, which confers a differentiable property on discrete-valued variables, allows us to directly optimize loss functions that include non-differentiable discrete values such as 0-1 selections. The proposed trainable gate can be applied to pruning: it suffices to append a trainable gate function to each intermediate output tensor and then fine-tune the overall model using any gradient-based training method. The proposed method can therefore jointly optimize the selection of the pruned channels and the weights of the pruned model. Our experimental results demonstrate that the proposed method efficiently optimizes arbitrary neural networks in various tasks such as image classification, style transfer, optical flow estimation, and neural machine translation.
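The idea of passing gradients through a 0-1 gate can be illustrated with a minimal sketch. The abstract does not specify how the gate's gradient is defined, so this example swaps in a straight-through estimator as an assumed stand-in. The names `hard_gate` and `gated_loss_and_grad`, the toy channel outputs, and the sparsity weight are all hypothetical.

```python
from typing import List, Tuple


def hard_gate(w: float) -> float:
    """0-1 selection: keep a channel only when its gate parameter is positive."""
    return 1.0 if w > 0.0 else 0.0


def gated_loss_and_grad(
    gate_params: List[float],
    channel_outputs: List[float],
    target: float,
    sparsity_weight: float,
) -> Tuple[float, List[float]]:
    """Loss with discrete gates, plus a surrogate gradient for the gate parameters.

    Assumption: a straight-through estimator (d gate / d w treated as 1),
    not necessarily the gradient used in the paper.
    """
    gates = [hard_gate(w) for w in gate_params]
    # Forward pass uses the discrete gates: pruned channels contribute nothing.
    y = sum(g * c for g, c in zip(gates, channel_outputs))
    err = y - target
    # Task error plus a penalty on the number of channels kept.
    loss = err * err + sparsity_weight * sum(gates)
    # dL/dgate is exact; the step function's zero derivative is replaced by 1.
    grads = [2.0 * err * c + sparsity_weight for c in channel_outputs]
    return loss, grads


# Toy setting (hypothetical numbers): one useful channel, one nearly useless one.
loss, grads = gated_loss_and_grad(
    gate_params=[0.5, 0.5],
    channel_outputs=[2.0, 0.05],
    target=2.0,
    sparsity_weight=0.1,
)
print(loss, grads)
```

Although the step function itself has zero derivative almost everywhere, the surrogate gradient is nonzero for every gate parameter, so a standard gradient-based optimizer can update the gate parameters together with the ordinary weights.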

Accelerating Recurrent Neural Network Language Model Based Online Speech Recognition System

Lee

Park

Kim

et al. 2018

This paper presents methods to accelerate recurrent neural network based language models (RNNLMs) for online speech recognition systems. First, a lossy compression of the past hidden layer outputs (the history vector), combined with caching, is introduced to reduce the number of LM queries. Next, RNNLM computations are deployed in a CPU-GPU hybrid manner, in which each layer of the model is computed on the more advantageous platform. The overhead added by data exchanges between CPU and GPU is compensated for through a frame-wise batching strategy. Evaluation on the LibriSpeech test sets indicates that reducing the history vector precision improves the average recognition speed by 1.23 times with minimal degradation in accuracy, while the CPU-GPU hybrid parallelization enables RNNLM-based real-time recognition with a fourfold improvement in speed.
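The caching idea can be sketched as follows: if history vectors are stored at reduced precision, nearby histories map to the same cache key, so fewer calls reach the language model. This is only an illustration under stated assumptions. Rounding to a fixed grid stands in for the paper's lossy compression, which the abstract does not detail, and `quantize`, `CachedLM`, `toy_lm`, and the step size are hypothetical names and values.

```python
from typing import Callable, Dict, Sequence, Tuple

Key = Tuple[Tuple[int, ...], str]


def quantize(history: Sequence[float], step: float = 0.1) -> Tuple[int, ...]:
    """Lossy compression (assumed scheme): snap each component to a grid of width `step`."""
    return tuple(round(h / step) for h in history)


class CachedLM:
    """Wraps a language-model scoring function with a cache keyed on the quantized history."""

    def __init__(self, lm: Callable[[Sequence[float], str], float], step: float = 0.1):
        self.lm = lm
        self.step = step
        self.cache: Dict[Key, float] = {}
        self.queries = 0  # number of calls that actually reached the LM

    def score(self, history: Sequence[float], word: str) -> float:
        key = (quantize(history, self.step), word)
        if key not in self.cache:
            self.queries += 1
            self.cache[key] = self.lm(history, word)
        return self.cache[key]


def toy_lm(history: Sequence[float], word: str) -> float:
    """Hypothetical stand-in for an RNNLM query."""
    return sum(history) + len(word)


cached = CachedLM(toy_lm, step=0.1)
cached.score([0.12, -0.48], "a")  # cache miss, queries the LM
cached.score([0.14, -0.51], "a")  # same quantized key, served from the cache
cached.score([0.31, -0.48], "a")  # different key, queries the LM again
print(cached.queries)
```

A coarser step yields more cache hits at the cost of returning scores computed for a slightly different history, which mirrors the speed and accuracy trade-off described in the abstract.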

Integration of sporadic noise model in POMDP-based voice activity detection

Park

Kim

Cho

et al. 2010

A partially observable Markov decision process (POMDP) has recently been applied to a voice activity detector (VAD), which makes it possible to incorporate knowledge about the recording environment into the decision process and thereby achieve more stable performance in noisy conditions. In this paper, the model is extended to use prior knowledge about possible intermittent noise signals, such as breath or click sounds, in addition to the stationary background noise types. The experimental results show that applying sporadic noise models in a POMDP-based VAD reduces the equal error rate of the voice activity decision by about 7% in relative terms.
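To make the role of a sporadic noise state concrete, here is a minimal sketch of the standard POMDP belief update over three states: silence, speech, and sporadic noise. The state set, the transition matrix, and the observation likelihoods are hypothetical values chosen for illustration and are not taken from the paper.

```python
from typing import List

STATES = ["silence", "speech", "sporadic_noise"]

# Hypothetical transition probabilities: TRANSITION[s][s2] = P(next = s2 | current = s).
TRANSITION = [
    [0.80, 0.10, 0.10],
    [0.10, 0.85, 0.05],
    [0.60, 0.10, 0.30],
]


def belief_update(
    belief: List[float],
    transition: List[List[float]],
    likelihood: List[float],
) -> List[float]:
    """Standard belief update: b'(s2) is proportional to O(o | s2) * sum_s T(s2 | s) * b(s)."""
    n = len(belief)
    # Prediction step: propagate the belief through the transition model.
    predicted = [sum(belief[s] * transition[s][s2] for s in range(n)) for s2 in range(n)]
    # Correction step: weight by how well each state explains the observation.
    unnormalized = [likelihood[s2] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


prior = [0.6, 0.3, 0.1]
# Hypothetical likelihoods of a short click-like frame under each state.
click_likelihood = [0.05, 0.30, 0.90]
posterior = belief_update(prior, TRANSITION, click_likelihood)
print(dict(zip(STATES, posterior)))
```

With a dedicated sporadic noise state, probability mass from a click-like frame can be assigned to that state instead of being forced onto speech or silence.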

Accelerating recurrent neural network language model based online speech recognition system

Lee

Park

Kim

et al. 2018

Preprint


Chiyoun Park

Applying GPGPU to recurrent neural network language model based fast network search in the real-time LVCSR

Plug-in, Trainable Gate for Streamlining Arbitrary Neural Networks

