Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), 2014
DOI: 10.3115/v1/s14-2127

ULisboa: Identification and Classification of Medical Concepts

Abstract: This paper describes our participation in Task 7 of SemEval 2014, which focused on the recognition and disambiguation of medical concepts. We used an adapted version of the Stanford NER system to train CRF models to recognize textual spans denoting diseases and disorders within clinical notes. We considered an encoding that accounts for non-continuous entities, together with a rich set of features (i) based on domain-specific lexicons like SNOMED CT, or (ii) leveraging Brown clusters inferred from a large col…
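
As a rough illustration of the kind of setup the abstract describes, the sketch below wires a gazetteer feature (e.g., a disorder term list exported from SNOMED CT) and a distributional-similarity lexicon (e.g., Brown cluster paths) into a Stanford NER CRF training run. The property names are real CRFClassifier flags, but all file paths and the exact flag combination are assumptions, not the ULisboa configuration.

```java
import java.util.Properties;
import edu.stanford.nlp.ie.crf.CRFClassifier;
import edu.stanford.nlp.ling.CoreLabel;

public class TrainDisorderCRF {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Tab-separated training file: token <TAB> label (paths are hypothetical).
        props.setProperty("trainFile", "clinical-notes.sbieon.tsv");
        props.setProperty("serializeTo", "disorder-ner.ser.gz");
        props.setProperty("map", "word=0,answer=1");

        // Standard lexical and context features.
        props.setProperty("useWord", "true");
        props.setProperty("useNGrams", "true");
        props.setProperty("maxNGramLeng", "6");
        props.setProperty("usePrev", "true");
        props.setProperty("useNext", "true");
        props.setProperty("useSequences", "true");
        props.setProperty("usePrevSequences", "true");
        props.setProperty("wordShape", "chris2useLC");

        // (i) Domain-lexicon feature: one SNOMED CT term per line (file is hypothetical).
        props.setProperty("useGazettes", "true");
        props.setProperty("gazette", "snomed-disorders.txt");
        props.setProperty("sloppyGazette", "true");

        // (ii) Distributional similarity: word -> Brown cluster path (file is hypothetical).
        props.setProperty("useDistSim", "true");
        props.setProperty("distSimLexicon", "mimic-brown-clusters.txt");

        CRFClassifier<CoreLabel> crf = new CRFClassifier<>(props);
        crf.train();
        crf.serializeClassifier(props.getProperty("serializeTo"));
    }
}
```
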

Cited by 3 publications (7 citation statements) · References 9 publications

“…We applied the same type of approach used in our system from last year (Leal et al., 2014) for entity recognition. The Stanford NER software (Finkel et al., 2005) was used to train Conditional Random Fields (CRF) models using labelled data as input.…”
Section: Entity Recognition
confidence: 99%
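
For the complementary step, here is a minimal sketch of loading such a serialized CRF model and tagging raw text with Stanford NER; the model path is the hypothetical one from the training sketch above, not an artifact published with the paper.

```java
import edu.stanford.nlp.ie.crf.CRFClassifier;
import edu.stanford.nlp.ling.CoreLabel;

public class TagClinicalNote {
    public static void main(String[] args) throws Exception {
        // Load a previously serialized CRF model (path is hypothetical).
        CRFClassifier<CoreLabel> crf =
            CRFClassifier.getClassifier("disorder-ner.ser.gz");
        String note = "The left atrium is moderately dilated.";
        // Emits one token/label pair per token, e.g. "atrium/B".
        System.out.println(crf.classifyToString(note));
    }
}
```
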
“…All input text had to be tokenized and encoded according to a named entity recognition scheme that encodes entities as token classifications. To be able to recognize non-continuous entities, the SBIEON encoding (Leal et al., 2014) was used. Besides the tags defined in the SBIEO encoding, a new tag N was added to identify words that do not belong to the entity but are inside the continuous span that contains the recognized entity.…”
Section: Entity Recognition
confidence: 99%
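
The quoted description is enough to reconstruct the scheme mechanically. The sketch below is one possible reading of SBIEON, not code from the ULisboa system: given the token indices of a single (possibly discontiguous) entity, it labels every token in the sentence.

```java
import java.util.*;

/** Illustrative SBIEON tagger, based on our reading of the scheme above. */
public class SbieonEncoder {
    static String[] encode(String[] tokens, SortedSet<Integer> entityIdx) {
        String[] tags = new String[tokens.length];
        Arrays.fill(tags, "O");
        if (entityIdx.isEmpty()) return tags;
        int first = entityIdx.first(), last = entityIdx.last();
        if (first == last) { tags[first] = "S"; return tags; } // single-token entity
        for (int i = first; i <= last; i++) {
            if (!entityIdx.contains(i)) tags[i] = "N";         // gap inside the span
            else if (i == first)        tags[i] = "B";         // entity begins
            else if (i == last)         tags[i] = "E";         // entity ends
            else                        tags[i] = "I";         // entity continues
        }
        return tags;
    }

    public static void main(String[] args) {
        String[] toks = {"left", "atrium", "is", "mildly", "dilated"};
        // Hypothetical discontiguous disorder mention: "atrium ... dilated".
        SortedSet<Integer> entity = new TreeSet<>(Arrays.asList(1, 4));
        System.out.println(Arrays.toString(encode(toks, entity)));
        // -> [O, B, N, N, E]
    }
}
```
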
“…A novel aspect of the SemEval-2014 shared task that differentiates it from the ShARe/CLEF task (other than the fact that it used more data and a new test set) is that SemEval-2014 allowed the use of a much larger set of unlabeled MIMIC notes to inform the models. Surprisingly, only two of the systems (ULisboa (Leal et al., 2014) and UniPi (Attardi et al., 2014)) used the unlabeled MIMIC corpus to generalize the lexical features. Another team, UTH CCB (Zhang et al., 2014), used off-the-shelf Brown clusters as opposed to training them on the unlabeled MIMIC II data.…”
Section: Participants
confidence: 99%
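
To make the "generalize the lexical features" step concrete, here is a hedged sketch of turning Brown cluster paths into word-class features at several prefix lengths. It assumes the three-column output format of Liang's brown-cluster tool (bit-path, word, count); the file name and prefix lengths are illustrative, not the participants' settings.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;

/** Illustrative Brown-cluster feature lookup (not the participants' code). */
public class BrownClusterFeatures {
    private final Map<String, String> pathByWord = new HashMap<>();

    BrownClusterFeatures(String clusterFile) throws IOException {
        // Each line: <bit-path> \t <word> \t <count>
        for (String line : Files.readAllLines(Paths.get(clusterFile))) {
            String[] cols = line.split("\t");
            if (cols.length >= 2) pathByWord.put(cols[1], cols[0]);
        }
    }

    /** Bit-path prefixes act as word classes at several granularities. */
    List<String> features(String word) {
        String path = pathByWord.getOrDefault(word.toLowerCase(), "UNK");
        List<String> feats = new ArrayList<>();
        for (int len : new int[] {4, 6, 10, 20}) {
            feats.add("brown" + len + "=" + path.substring(0, Math.min(len, path.length())));
        }
        return feats;
    }
}
```
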
“…The prefix H indicates that the word or word sequence is the shared head, and the prefix D indicates otherwise. Another intermediate approach was used by the ULisboa team (Leal et al., 2014), with the tag set S, B, I, O, E, and N. Here, S represents a single-token entity, E represents the end of an entity (as in one of the prior IOB variations), and the N tag identifies non-contiguous mentions. They do not provide an explicit example of this tag set's usage in their paper.…”
Section: Participants
confidence: 99%
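
A hypothetical illustration of that tag set (not taken from their paper): in "left atrium is mildly dilated", a discontiguous mention covering "atrium" and "dilated" would be tagged left/O atrium/B is/N mildly/N dilated/E, while a single-token mention such as "fever" would simply be tagged S.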