Objective: Automated syndrome classification aims to aid near real-time syndromic
surveillance to serve as an early warning system for disease outbreaks,
using Emergency Department (ED) data. We present a system that improves the
automatic classification of an ED record with a triage note into one or more
syndrome categories using the vector space model coupled with a
‘learning’ module that employs a pseudo-relevance feedback
mechanism. Materials and Methods: Terms from standard syndrome
definitions are used to construct an initial reference dictionary for
generating the syndrome and triage note vectors. Based on cosine similarity
between the vectors, each record is classified into a syndrome category. We
then take terms from the top-ranked records that belong to the syndrome of
interest as feedback. These terms are added to the reference dictionary and
the process is repeated to determine the final classification. The system
was tested on two different datasets for each of three syndromes:
Gastro-Intestinal (GI), Respiratory (Resp) and Fever-Rash (FR). Performance
was measured in terms of sensitivity (Se) and specificity (Sp).
Results: The use of relevance feedback produced high values
of sensitivity and specificity for all three syndromes in both test sets:
GI: 90% and 71%, Resp: 97% and 73%, FR: 100% and 87%, respectively, in test
set 1, and GI: 88% and 69%, Resp: 87% and 61%, FR: 97% and 71%,
respectively, in test set 2. Conclusions: The new system for
pre-processing and syndromic classification of ED records with triage notes
achieved improvements in Se and Sp. Our results also demonstrate that the
system can be tuned to achieve different levels of performance based on user
requirements.
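
The classification loop described in Materials and Methods can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (tokenize, score, classify_with_feedback, sens_spec), the stopword list, the term weighting (binary for the syndrome vector, raw counts for the note vector), the top_k and threshold parameters, and the example triage notes are all assumptions made for illustration. The top-ranked records are simply assumed to belong to the syndrome, which is the usual pseudo-relevance feedback simplification.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a curated one.
STOPWORDS = {"and", "with", "for", "of", "the", "a", "x"}

def tokenize(text):
    """Lowercase a triage note and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dict-like)."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def score(notes, dictionary):
    """Score each note against the syndrome vector built from the dictionary."""
    syndrome_vec = Counter(dictionary)  # every dictionary term has weight 1
    scores = []
    for note in notes:
        # Only dictionary terms contribute to the note vector.
        note_vec = Counter(t for t in tokenize(note) if t in dictionary)
        scores.append(cosine(note_vec, syndrome_vec))
    return scores

def classify_with_feedback(notes, seed_terms, top_k=2, threshold=0.5):
    """One round of pseudo-relevance feedback, then the final classification."""
    dictionary = set(seed_terms)
    first_pass = score(notes, dictionary)
    ranked = sorted(range(len(notes)), key=lambda i: first_pass[i], reverse=True)
    # Assume the top-ranked records with a non-zero score are relevant
    # and add their terms to the reference dictionary.
    for i in ranked[:top_k]:
        if first_pass[i] > 0:
            dictionary.update(tokenize(notes[i]))
    final = score(notes, dictionary)
    return [s >= threshold for s in final], dictionary

def sens_spec(predicted, truth):
    """Sensitivity and specificity from boolean predictions and labels."""
    tp = sum(p and t for p, t in zip(predicted, truth))
    tn = sum((not p) and (not t) for p, t in zip(predicted, truth))
    fn = sum((not p) and t for p, t in zip(predicted, truth))
    fp = sum(p and (not t) for p, t in zip(predicted, truth))
    se = tp / (tp + fn) if tp + fn else 0.0
    sp = tn / (tn + fp) if tn + fp else 0.0
    return se, sp

# Invented example notes (not from the study data).
notes = [
    "vomiting and diarrhea x 2 days",
    "n/v abd pain",
    "diarrhea abd pain",
    "cough and fever",
    "ankle injury after fall",
]
gi_seed = {"vomiting", "diarrhea", "nausea"}
labels, expanded = classify_with_feedback(notes, gi_seed)
```

In this toy example the second note shares no term with the seed dictionary, so it scores zero on the first pass; it is only picked up after feedback adds terms such as "abd" and "pain" from the top-ranked notes. Raising or lowering the hypothetical threshold trades sensitivity against specificity, which is one way the tuning mentioned in the Conclusions could be realized.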