Lingang Jiang scite author profile

Lingang Jiang

4 Publications

54 Citation Statements Received

97 Citation Statements Given

How they've been cited

149

How they cite others

100

Affiliations

Hunan University

Publications


Hadoop Recognition of Biomedical Named Entity Using Conditional Random Fields

Tang et al. 2015

IEEE Trans. Parallel Distrib. Syst.


Processing large volumes of data has presented a challenging issue, particularly in data-redundant systems. As one of the most recognized models, the conditional random fields (CRF) model has been widely applied in biomedical named entity recognition (Bio-NER). Because the model is inherently sequential, improving its performance is nontrivial and requires new parallelized solutions. By combining and parallelizing the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) and Viterbi algorithms, we propose a parallel CRF algorithm called MRCRF (MapReduce CRF), which contains two parallel sub-algorithms to handle the two time-consuming steps of the CRF model. The MRLB (MapReduce L-BFGS) algorithm uses the MapReduce framework to speed up parameter estimation. The MRVtb (MapReduce Viterbi) algorithm infers the most likely state sequence by extending the Viterbi algorithm with another MapReduce job. Experimental results show that the MRCRF algorithm outperforms other competing methods, with a significant improvement in time efficiency while preserving a guaranteed level of correctness.

Index Terms: Biomedical named entity recognition, conditional random fields, MapReduce, parallel algorithm.
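For readers unfamiliar with the inference step named in this abstract, the following is a minimal single-machine sketch of Viterbi decoding. It is not the paper's MRVtb algorithm and does not use MapReduce; the function name, the score tables, and the label names are illustrative assumptions, and scores are treated as additive (log-domain) values.

```python
from typing import Dict, List


def viterbi(obs_scores: List[Dict[str, float]],
            trans: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the highest-scoring label sequence.

    obs_scores[t][y] is the (hypothetical) score of label y at position t;
    trans[p][y] is the score of moving from label p to label y.
    """
    labels = list(obs_scores[0])
    # Best score of any path ending in each label at position 0.
    best = {y: obs_scores[0][y] for y in labels}
    back = []  # back[t][y] = best predecessor of label y at position t + 1
    for scores in obs_scores[1:]:
        prev_best, best, ptr = best, {}, {}
        for y in labels:
            p = max(labels, key=lambda q: prev_best[q] + trans[q][y])
            best[y] = prev_best[p] + trans[p][y] + scores[y]
            ptr[y] = p
        back.append(ptr)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[y])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

Calling `viterbi` with per-position label scores and a transition table returns one label per position; a distributed variant would have to split this work across jobs, which the sketch does not attempt.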


A self-adaptive scheduling algorithm for reduce start time

Tang, Jiang, Zhou, et al. 2015

Future Generation Computer Systems


CRFs based parallel biomedical named entity recognition algorithm employing MapReduce framework

et al. 2015


With the rapid growth of the biomedical literature, model training time in biomedical named entity recognition increases sharply when dealing with large-scale training samples. Increasing the efficiency of named entity recognition on biomedical big data has become one of the key problems in biomedical text mining. To improve recognition performance and reduce training time, this paper proposes an optimization method for two-phase recognition using conditional random fields. In the first phase, each named entity boundary is detected to distinguish all real entities. In the second phase, we label the semantic class of each detected entity. To speed up training in both phases, we implement the model training process on a parallel optimization framework based on MapReduce. The training set is divided into several parts, and the iterations of the training algorithm are designed as map tasks that can be executed simultaneously in a cluster, where each map function computes a gradient vector component for one part of the training set. Our experiments show that the proposed method achieves high performance with short training time, which has important implications for current biological big data processing.
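The data-parallel training step described in this abstract (one map task per part of the training set, each producing a partial gradient that is then combined) can be sketched roughly as below. This is an illustration under stated assumptions, not the paper's implementation: it uses a logistic-regression gradient as a stand-in for the CRF gradient, plain Python functions instead of a Hadoop cluster, and hypothetical function names.

```python
import math
from functools import reduce
from typing import List, Tuple

Example = Tuple[List[float], int]  # (feature vector, label in {0, 1})


def map_partial_gradient(part: List[Example], w: List[float]) -> List[float]:
    """Map task: gradient of the negative log-likelihood over one data part."""
    grad = [0.0] * len(w)
    for x, y in part:
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of label 1
        for j, xj in enumerate(x):
            grad[j] += (p - y) * xj
    return grad


def reduce_sum(a: List[float], b: List[float]) -> List[float]:
    """Reduce step: element-wise sum of two partial gradients."""
    return [ai + bi for ai, bi in zip(a, b)]


def full_gradient(parts: List[List[Example]], w: List[float]) -> List[float]:
    """Combine per-part gradients; each map call is independent of the others."""
    return reduce(reduce_sum, (map_partial_gradient(p, w) for p in parts))
```

Because the gradient is a sum over training examples, splitting the data into parts and summing the partial results gives the same vector as a single pass over the whole set, which is what makes the map tasks safe to run simultaneously.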


Time–Space Scheduling in the MapReduce Framework

Qi, Jiang, Li, et al. 2015


