Speeding Up HMM Decoding and Training by Exploiting Sequence Repetitions

Mozes, Shay; Weimann, Oren; Ziv-Ukelson, Michal

doi:10.1007/978-3-540-73437-6_4

Cited by 17 publications

(16 citation statements)

References 24 publications


“…Nevertheless, MCMC remains substantially slower than training one model and running Viterbi once, and the loss in reliability introduced by relying on one ML or MAP model is ignored in practice. For discrete emissions, compressing sequences and computing forward and backward variables and Viterbi paths on the compressed sequences yields impressive speed-ups [19]. However, discretization of continuous emissions, similar to vector quantization used in speech recognition [18], is not viable as the separation between the different classes of observations is low since the observations are one-dimensional.…”

Section: Introduction (mentioning)

confidence: 99%

Fast MCMC sampling for hidden markov models to determine copy number variations

Mahmud

Schliep

2011

BMC Bioinformatics


Background: Hidden Markov Models (HMM) are often used for analyzing Comparative Genomic Hybridization (CGH) data to identify chromosomal aberrations or copy number variations by segmenting observation sequences. For efficiency reasons the parameters of an HMM are often estimated with maximum likelihood and a segmentation is obtained with the Viterbi algorithm. This introduces considerable uncertainty in the segmentation, which can be avoided with Bayesian approaches integrating out parameters using Markov Chain Monte Carlo (MCMC) sampling. While the advantages of Bayesian approaches have been clearly demonstrated, the likelihood-based approaches are still preferred in practice for their lower running times; datasets coming from high-density arrays and next generation sequencing amplify these problems.

Results: We propose an approximate sampling technique, inspired by compression of discrete sequences in HMM computations and by kd-trees to leverage spatial relations between data points in typical data sets, to speed up the MCMC sampling.

Conclusions: We test our approximate sampling method on simulated and biological ArrayCGH datasets and high-density SNP arrays, and demonstrate speed-ups of 10 to 60 and of 90, respectively, while achieving competitive results with the state-of-the-art Bayesian approaches.

Availability: An implementation of our method will be made available as part of the open source GHMM library from http://ghmm.org.


“…Researchers have mapped HMM based applications to GPU and achieved order of magnitude speedup. They have applied task parallel [19]-[23], data parallel [24]-[27], and combination of task and data parallel [28]-[32] approaches for HMM. Similar approaches can be adopted to improve the performance of stochastic automata.…”

Section: Accelerating Forward Algorithm (mentioning)

confidence: 99%

Accelerating Forward Algorithm for Stochastic Automata on Graphics Processing Units

et al. 2020


A stochastic automaton is a non-deterministic automaton with input and output behavior that works serially and synchronously. Stochastic automata are used in different application areas. For large state spaces and sequence lengths, the performance of stochastic automata is a major concern, and graphics processing units can be employed to improve it. In this study, a parallel version of the inference algorithm for stochastic automata is designed and mapped to a graphics processing unit using dynamic parallelism. The performance of the parallel version is compared across different realizations and parameters. The parallel implementation of the inference algorithm achieved a speedup factor of approximately 50 for 256 states.


“…Mozes et al. presented a method [18] to speed up the dynamic program algorithms used for solving the HMM decoding and training problems for discrete time-independent HMMs and discussed the application of this method to Viterbi's decoding and training algorithms [23], as well as to the forward-backward and Baum-Welch [5] algorithms. The presented approach was based on identifying repeated substrings in the observed input sequence.…”

Section: Introduction (mentioning)

confidence: 99%

Hierarchical-based parallel technique for HMM 3D MRI brain segmentation algorithm

El-Moursy

Saif

Younis

2012

International Journal of Parallel, Emergent and Distributed Sys


This paper proposes a hidden Markov model (HMM) algorithm for 3D MRI brain segmentation using a hierarchical/multi-level parallel implementation. The new technique is implemented using standard message passing interface (MPI). Two platforms are used to test the proposed technique namely PC-cluster system and IBM Blue Gene (BG)/L system. On PC-cluster system, hierarchical-based parallel HMM algorithm achieves a twofold speedup on a three nodes cluster and a threefold speedup on a six nodes cluster. Communication overhead and data dependency nullify any speedup beyond six nodes. On IBM BG/L system, the high-speed communication network and optimised MPI allow more efficient processing nodes utilisation although the algorithm data dependency limits the net speedup achieved.
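The citation statement above describes speeding up Viterbi decoding by identifying repeated substrings in the observation sequence. The following is a minimal Python sketch of that general idea, not the authors' implementation: each Viterbi step is written as a max-times matrix product, so the product for a repeated substring can be computed once and reused at every occurrence. The function names, the toy two-state model, and the hand-chosen block partition are all assumptions made for illustration; an actual method would derive the blocks from a compression scheme applied to the sequence.

```python
import math
from functools import reduce


def maxtimes(a, b):
    """Max-times matrix product: c[i][j] = max over k of a[i][k] * b[k][j]."""
    return [[max(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def step_matrix(trans, emit, sym):
    """M[i][j] = P(move from state i to state j) * P(state j emits sym)."""
    n = len(trans)
    return [[trans[i][j] * emit[j][sym] for j in range(n)] for i in range(n)]


def viterbi_prob(init, trans, emit, obs):
    """Plain Viterbi: probability of the best state path, one symbol per step."""
    v = [[init[j] * emit[j][obs[0]] for j in range(len(init))]]  # 1 x n row vector
    for sym in obs[1:]:
        v = maxtimes(v, step_matrix(trans, emit, sym))
    return max(v[0])


def viterbi_prob_blocks(init, trans, emit, first, blocks):
    """Same value for the sequence [first] + all blocks concatenated.

    Each distinct block (a non-empty tuple of symbols) has its matrix
    product computed once and cached, then applied in a single step.
    """
    cache = {}
    v = [[init[j] * emit[j][first] for j in range(len(init))]]
    for block in blocks:
        if block not in cache:
            cache[block] = reduce(
                maxtimes, (step_matrix(trans, emit, s) for s in block))
        v = maxtimes(v, cache[block])
    return max(v[0])


# Toy model (hypothetical numbers): 2 states, symbols 0 and 1.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]

block = (0, 1, 1)                 # substring repeated four times
obs = [0] + list(block) * 4
p_plain = viterbi_prob(init, trans, emit, obs)
p_fast = viterbi_prob_blocks(init, trans, emit, 0, [block] * 4)
print(math.isclose(p_plain, p_fast))
```

The two functions agree because the max-times product is associative for non-negative entries, so grouping the per-symbol matrices of a block changes only the order of evaluation. In this sketch a block of length k costs one vector-matrix product per occurrence instead of k, after a one-time cost to build the block matrix; recovering the actual state path would need additional bookkeeping that is omitted here.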
