Traffic accidents and the casualties they cause are frequent, so the ability to predict the number of accidents in a given period is important for transportation departments that need to make scientifically grounded decisions. However, because many variables in the road traffic system affect accidents, traffic accident prediction faces two critical challenges. The first is how to evaluate the weight of each variable's impact on accidents. The second is how to model the prediction process for multiple interrelated variables. To address the first challenge, we use grey correlation analysis to measure how strongly each factor correlates with accident occurrence. To address the second, we select the main factors identified by the correlation analysis and build a multivariable grey model, MGM(1,N), to model the prediction process. We further examine the collinearity between variables to optimize the predictive model. The experimental results show that our approach outperforms four general-purpose comparison algorithms on the traffic accident prediction task.
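As a rough illustration of how grey correlation analysis can rank factors, the sketch below computes Deng's grey relational grade, one common formulation of the technique. The abstract does not say which variant, normalisation, or distinguishing coefficient the authors used, so the mean normalisation, the value rho = 0.5, the function name, and all the numbers are assumptions made for illustration only.

```python
def grey_relational_grades(reference, factors, rho=0.5):
    """Return Deng's grey relational grade of each factor against the reference.

    reference: list of observations of the target series (e.g. accident counts).
    factors:   dict mapping a factor name to a series of the same length.
    rho:       distinguishing coefficient; 0.5 is a conventional choice (assumed).
    """
    def mean_normalise(series):
        # Divide by the series mean so that series with different units compare.
        mean = sum(series) / len(series)
        return [value / mean for value in series]

    ref = mean_normalise(reference)
    # Absolute difference between each normalised factor and the reference.
    deltas = {
        name: [abs(r - v) for r, v in zip(ref, mean_normalise(series))]
        for name, series in factors.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every factor matches the reference exactly after normalisation.
        return {name: 1.0 for name in deltas}

    grades = {}
    for name, ds in deltas.items():
        # Grey relational coefficient at each time point, then its average.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


# Made-up illustrative data, not taken from the paper.
accidents = [10, 12, 15, 18]
candidate_factors = {
    "proportional": [20, 24, 30, 36],  # moves exactly with the reference
    "unrelated": [5, 9, 4, 11],        # moves differently
}
grades = grey_relational_grades(accidents, candidate_factors)
ranking = sorted(grades, key=grades.get, reverse=True)
```

A grade closer to 1 means the factor's trend tracks the reference more closely, so `ranking` orders candidate factors from most to least related. Factors with the highest grades would be the natural candidates to keep as inputs to a multivariable model.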
Background
Building large-scale medical knowledge graphs requires automatically extracting the relations between entities from electronic medical records (EMRs). The main challenges are the scarcity of labeled corpora and the difficulty of identifying complex semantic relations in the text of Chinese EMRs. We propose a hybrid method based on semi-supervised learning to extract medical entity relations from small-scale, complex Chinese EMRs.
Methods
The semantic features of sentences are extracted by a residual network, and long-range dependency information is captured by a bidirectional gated recurrent unit. An attention mechanism is then applied to each set of extracted features to assign weights, and the outputs of the two attention mechanisms are integrated for relation prediction. The training process is adjusted using a small, manually annotated relational corpus and a bootstrapping semi-supervised learning algorithm, and the dataset is continuously expanded during training.
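The bootstrapping step can be pictured as a self-training loop: train on the labeled seed set, label the unlabeled pool, keep only confident predictions, add them to the training set, and retrain. The sketch below is a generic version of that loop and not the authors' implementation. A toy one-dimensional nearest-centroid classifier stands in for the neural model, and the confidence threshold, round limit, function names, and data are all hypothetical.

```python
def train_centroids(labeled):
    """Toy stand-in model: the mean feature value of each label."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_with_confidence(centroids, x):
    """Return the nearest label and a confidence in [0.5, 1] (two or more labels)."""
    ranked = sorted((abs(x - c), label) for label, c in centroids.items())
    (d_near, label), (d_next, _) = ranked[0], ranked[1]
    total = d_near + d_next
    confidence = d_next / total if total else 0.5
    return label, confidence


def bootstrap_self_training(labeled, unlabeled, threshold=0.9, max_rounds=5):
    """Grow the labeled set with confident pseudo-labels, retraining each round."""
    labeled, pool = list(labeled), list(unlabeled)
    model = train_centroids(labeled)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for x in pool:
            label, confidence = predict_with_confidence(model, x)
            if confidence >= threshold:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to add, so further rounds would change nothing
        labeled.extend(confident)
        pool = remaining
        model = train_centroids(labeled)
    return model, labeled, pool


# Made-up seed and pool data, for illustration only.
seed = [(0.0, "neg"), (1.0, "neg"), (9.0, "pos"), (10.0, "pos")]
pool = [0.8, 9.2, 5.1, 2.5, 7.5]
model, expanded, leftover = bootstrap_self_training(seed, pool)
```

With these toy values, the two points closest to the seed clusters are pseudo-labeled and added, while the ambiguous points stay in `leftover`. The threshold controls the usual trade-off in bootstrapping: a lower value grows the dataset faster but admits more labeling errors.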
Results
We constructed a small corpus for relation extraction from Chinese EMRs, based on the EMR datasets released at the China Conference on Knowledge Graph and Semantic Computing. The experimental results show that the best F1-score of the proposed method over all relation categories reaches 89.78%, which is 13.07% higher than that of the baseline CNN.