“…In this review, some studies reported that the application or combination of BERT could significantly improve the result of entity recognition or relation extraction [18,58,59,61]. In the last decade, the proposed deep learning models for IE tasks include BERT-convolutional neural network (CNN) [18], convolutional neural network with segment attention mechanism (SEGATT-CNN) [63], K-nearest neighbor (KNN) [53], long short-term memory (LSTM) [52,53], bidirectional long short-term memory (BiLSTM) [17], structural BiLSTM [31], LSTM-CRF [54,58], BiLSTM-CRF [22,28,55,58,62,64], BERT-BiLSTM-CRF [59,61,66], graph neural networks [21], and a nested NER model based on LSTM-CRF [29]. Among the above-mentioned models, the "BiLSTM-CRF" and "BERT-BiLSTM-CRF" have become popular deep learning models because of their good extraction performance: the BiLSTM model can capture more context information than the LSTM model.…”