2020
DOI: 10.1093/bioinformatics/btaa995

LexExp: a system for automatically expanding concept lexicons for noisy biomedical texts

Abstract: Summary: LexExp is an open-source, data-centric lexicon expansion system that generates spelling variants of lexical expressions in a lexicon using a phrase embedding model, lexical similarity-based natural language processing methods, and a set of tunable threshold decay functions. The system is customizable, can be optimized for recall or precision, and can generate variants for multi-word expressions. Availability and implementation: …
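To make the abstract's description more concrete, the following is a minimal Python sketch of the general idea of filtering candidate variants by lexical similarity against a tunable, decaying threshold. It is not LexExp's code or API. Everything in it is an assumption for illustration: the function names `lexical_similarity`, `decayed_threshold`, and `expand` are hypothetical, the candidate list stands in for the neighbours a phrase embedding model would return, `difflib` stands in for the system's lexical similarity measure, and the linear decay by term length is only one plausible shape for a threshold decay function.

```python
from difflib import SequenceMatcher


def lexical_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], using difflib as a stand-in measure."""
    return SequenceMatcher(None, a, b).ratio()


def decayed_threshold(term: str, base: float = 0.9,
                      decay: float = 0.02, floor: float = 0.6) -> float:
    """Hypothetical linear decay: longer terms get a lower (more lenient) threshold.

    Raising `base` or `floor` favours precision; lowering them favours recall.
    """
    return max(floor, base - decay * len(term))


def expand(term: str, candidates: list[str], **threshold_kwargs: float) -> list[str]:
    """Keep candidates whose similarity to `term` meets the decayed threshold.

    `candidates` is assumed to come from an embedding model's nearest
    neighbours; here the caller supplies a toy list instead.
    """
    threshold = decayed_threshold(term, **threshold_kwargs)
    return sorted(
        c for c in candidates
        if c != term and lexical_similarity(term, c) >= threshold
    )


if __name__ == "__main__":
    # Toy neighbours: two misspellings, two related but distinct terms, one noisy phrase.
    neighbours = ["oxycodon", "oxycodne", "oxycontin", "percocet", "oxy clean"]
    print(expand("oxycodone", neighbours))
```

With these toy settings the two misspellings pass the threshold, while semantically related but lexically distant terms such as "percocet" are dropped. Lowering `base` or `floor` admits more candidates, which trades precision for recall.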

Cited by 16 publications (12 citation statements)
References: 16 publications
“…Since drug names and related expressions are often misspelled on social media, we generated commonly used lexical variants of the terms using the LexExp tool. 20 We found that some non-standard terms and lexical variants tend to have high noise associated with them (i.e., expressions not actually referring to a stimulant or opioid, e.g., oxy clean). Thus, we included additional filters for the terms stimulant, meth, and oxy (Appendix A3).…”
Section: Identifying Substance Mentions
Citation type: mentioning
confidence: 81%
“…The first application of NLP was to generate lexical variants (e.g., misspellings) of the substances. Since drug names and related expressions are often misspelled on social media, we generated commonly used lexical variants of the terms using the LexExp tool [27]. We found that some non-standard terms and lexical variants tend to have high noise associated with them (i.e., expressions not actually referring to a stimulant or opioid, e.g., oxy clean).…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…Key phrases included in this study along with their types and the date of the first letter mentioning each. Since product and entity names are often misspelled by social media subscribers, we generated potential spelling variants or misspellings of the products and entities using a data-centric tool [9]. The variant generation tool uses a combination of semantic and lexical similarity measures to automatically identify common misspellings and spelling variants of terms/phrases, including multi-word expressions.…”
Section: Product Detection
Citation type: mentioning
confidence: 99%