With the generation and accumulation of massive volumes of electronic health records (EHRs), how to effectively extract valuable medical information from them has become a popular research topic. Named entity recognition (NER) is an essential natural language processing (NLP) task in medical information extraction. This paper presents our efforts to apply neural network approaches to this task. Using Chinese EHRs provided by CCKS 2019 and the Second Affiliated Hospital of Soochow University (SAHSU), we compared several neural models for NER, including BiLSTM, along with two pretrained language models, word2vec and BERT. We found that the BERT-BiLSTM-CRF model achieves an F1 score of approximately 75%, outperforming all other models in our tests.
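For readers unfamiliar with how an F1 score is typically obtained for NER, the following is a minimal sketch. It assumes BIO-style tags and entity-level exact matching (an entity counts as correct only if both its span and its type match); the abstract does not state which scoring scheme was used, and the `DISEASE` and `DRUG` labels are hypothetical.

```python
def extract_entities(tags):
    """Return a set of (start, end, type) spans from a BIO tag sequence."""
    entities = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def entity_f1(gold_seqs, pred_seqs):
    """Entity-level precision, recall, and F1 over paired tag sequences."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_entities(gold), extract_entities(pred)
        tp += len(g & p)      # exact span-and-type matches
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the prediction finds one of the two gold entities.
gold = [["B-DISEASE", "I-DISEASE", "O", "B-DRUG"]]
pred = [["B-DISEASE", "I-DISEASE", "O", "O"]]
precision, recall, f1 = entity_f1(gold, pred)
```

In this toy case precision is 1.0 and recall is 0.5, so F1 is their harmonic mean, about 0.67.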
Background:
Transforming growth factor-beta-induced (TGFBI) is an exocrine protein that has been found to promote the development of nasopharyngeal carcinoma, glioma, pancreatic cancer, and other tumors. However, no report has yet addressed the relationship between TGFBI and the invasive progression of bladder cancer (BCa).
Methods:
IHC staining, qRT-PCR, and Western blot were used to analyze the levels of TGFBI and EMT markers. In vivo tumorigenesis was assessed using a xenograft tumor model.
Results:
We found that both mRNA and protein levels of TGFBI were significantly up-regulated in muscle-invasive bladder cancer (MIBC) tissues compared with non-muscle-invasive bladder cancer (NMIBC) tissues. High TGFBI expression was positively correlated with high histological grade and advanced clinical stage, and BCa patients with high TGFBI levels had poor prognoses. We further confirmed that high TGFBI expression promotes proliferation, invasive progression, and epithelial-to-mesenchymal transition (EMT) of BCa cells in vitro, and promotes tumor growth and EMT in vivo, whereas silencing TGFBI inhibited these malignant phenotypes. TGFBI contributed to the up-regulation of EMT by inducing the expression of the Slug, Vimentin, Snail, MMP2, and MMP9 genes while down-regulating the expression of E-cadherin. Moreover, Western blot analysis confirmed the expression of TGFBI, a secreted protein, in stably transfected BCa cell lines. Furthermore, conditioned medium containing TGFBI protein also enhanced EMT and the malignant phenotype of BCa cells.
Conclusion:
Our results indicate that high TGFBI expression promotes EMT, proliferation, and invasive progression of BCa cells, and that TGFBI is a potential therapeutic target and prognostic marker for BCa.
Text classification is important in natural language processing because massive amounts of valuable text need to be sorted into categories for further use. To classify text better, our paper aims to build a deep learning model that achieves better classification results on Chinese text than other researchers’ models. After comparing different methods, we selected long short-term memory (LSTM) and convolutional neural network (CNN) as the deep learning methods for classifying Chinese text. LSTM is a special kind of recurrent neural network (RNN) that can process sequential information through its recurrent structure. By contrast, CNN has shown its ability to extract features from visual imagery. Therefore, two LSTM layers and one CNN layer were integrated into our new model, BLSTM-C (BLSTM stands for bi-directional long short-term memory, and C stands for CNN). The LSTM layers produce a sequence output based on past and future contexts, which is then fed to the convolutional layer for feature extraction. In our experiments, the proposed BLSTM-C model was evaluated in several ways. The model exhibited remarkable performance in text classification, especially on Chinese texts.
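The data flow described above (a forward and a backward LSTM whose outputs are concatenated per time step and then passed to a convolutional layer) can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation: the weights are random and untrained, the ReLU activation, max-over-time pooling, layer sizes, and all function names are assumptions, and the final classifier is omitted. A practical model would be built with a deep learning framework.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One stacked weight matrix for the four gates, applied to [x; h].
        self.w = make_matrix(4 * hidden_size, input_size + hidden_size, rng)
        self.b = [0.0] * (4 * hidden_size)

    def step(self, x, h, c):
        xh = x + h  # list concatenation: input followed by previous hidden state
        z = [sum(w * v for w, v in zip(row, xh)) + b
             for row, b in zip(self.w, self.b)]
        n = self.hidden_size
        i = [sigmoid(v) for v in z[0:n]]            # input gate
        f = [sigmoid(v) for v in z[n:2 * n]]        # forget gate
        g = [math.tanh(v) for v in z[2 * n:3 * n]]  # candidate cell state
        o = [sigmoid(v) for v in z[3 * n:4 * n]]    # output gate
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new


def run_lstm(cell, seq):
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    outputs = []
    for x in seq:
        h, c = cell.step(x, h, c)
        outputs.append(h)
    return outputs


def blstm(fwd, bwd, seq):
    """Concatenate forward and backward hidden states at each time step."""
    f_out = run_lstm(fwd, seq)
    b_out = run_lstm(bwd, seq[::-1])[::-1]  # run reversed, then re-align
    return [f + b for f, b in zip(f_out, b_out)]


def conv1d_maxpool(features, filters, width):
    """Slide each filter over windows of `width` steps, then max-pool over time."""
    pooled = []
    for filt in filters:
        acts = []
        for t in range(len(features) - width + 1):
            window = [v for vec in features[t:t + width] for v in vec]
            acts.append(max(0.0, sum(w * x for w, x in zip(filt, window))))  # ReLU
        pooled.append(max(acts))
    return pooled


def blstm_c_features(seq, hidden_size=4, num_filters=3, width=2, seed=0):
    """Map a sequence of input vectors to one pooled feature per filter."""
    rng = random.Random(seed)
    dim = len(seq[0])
    fwd = LSTMCell(dim, hidden_size, rng)
    bwd = LSTMCell(dim, hidden_size, rng)
    context = blstm(fwd, bwd, seq)  # each step now carries past and future context
    filters = make_matrix(num_filters, width * 2 * hidden_size, rng)
    return conv1d_maxpool(context, filters, width)


# Hypothetical input: three time steps of two-dimensional embeddings.
features = blstm_c_features([[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]])
```

The returned vector has one entry per convolutional filter and would normally feed a softmax classifier over the text categories.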