Background
The widespread adoption of health and medical informatics yields huge amounts of data. Extracting clinical events along a timeline, a structure that presents information in chronological order, is the foundation for advanced applications and research. Manual extraction is extremely challenging because of the quantity and complexity of the records.
Methods
We present a recurrent neural network-based architecture that automatically extracts clinical event expressions along with each event’s temporal information. The system is built upon attention-based and recursive neural networks and introduces a piecewise representation, in which each input sentence is divided into three pieces to make better use of the information it contains. It also incorporates semantic information by using word representations obtained from BioASQ and Wikipedia.
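As a rough illustration of the piecewise idea, the sketch below divides a tokenized sentence into three pieces. The abstract does not state where the boundaries fall, so cutting after the two mentions of interest (here an event and a time expression) is an assumption, and the function name `split_into_pieces` is hypothetical rather than taken from the system.

```python
from typing import List, Tuple


def split_into_pieces(
    tokens: List[str], first: int, second: int
) -> Tuple[List[str], List[str], List[str]]:
    """Divide a token list into three consecutive, non-overlapping pieces.

    ASSUMPTION: the cut points are placed after the two mention indices;
    the source only says that sentences are divided into three pieces.
    """
    if first > second:
        first, second = second, first
    if not (0 <= first <= second < len(tokens)):
        raise ValueError("mention indices out of range")
    left = tokens[: first + 1]             # start of sentence through the first mention
    middle = tokens[first + 1 : second + 1]  # after the first mention through the second
    right = tokens[second + 1 :]           # remainder of the sentence
    return left, middle, right


sentence = "The patient underwent colonoscopy yesterday without complications".split()
# Hypothetical mention positions: "colonoscopy" (index 3) and "yesterday" (index 4).
left, middle, right = split_into_pieces(sentence, 3, 4)
print(left, middle, right)
```

In a model of this kind, each piece could then be encoded separately (for example by a recurrent layer with attention) before the three encodings are combined; that step is not shown here.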
Results
The system is evaluated on the THYME corpus, a set of manually annotated clinical records from Mayo Clinic. To further verify its effectiveness, the system is also evaluated on the TimeBank-Dense corpus. The experiments demonstrate that the system outperforms the current state-of-the-art models. The system also supports domain adaptation, i.e., a model trained on colon cancer data may be applied to brain cancer data.
Conclusion
Our system extracts temporal expressions and event expressions and links them in the order in which they actually occurred, which may structure the key information in complicated, unstructured clinical records. Furthermore, we demonstrate that combining the piecewise representation method with an attention mechanism can capture more complete features. The system is flexible and can be extended to handle other document types.