2018
DOI: 10.1609/aaai.v32i1.12056

Content and Context: Two-Pronged Bootstrapped Learning for Regex-Formatted Entity Extraction

Abstract: Regular expressions are an important building block of rule-based information extraction systems. Regexes can encode rules to recognize instances of simple entities, which can then feed into the identification of more complex cross-entity relationships. Manually crafting a regex that recognizes all possible instances of an entity is difficult, since an entity can manifest in a variety of different forms. Thus, the problem of automatically generalizing manually crafted seed regexes to improve the recall of IE sys…
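To make the recall problem in the abstract concrete, here is a minimal sketch of a narrow seed regex missing other surface forms of the same entity. Everything in it is an assumption made for illustration: the phone-number entity, the `SEED` and `GENERALIZED` patterns, the `MENTIONS` list, and the `recall` helper are hypothetical and do not come from the paper. The broader pattern is written by hand here; it is not the output of the paper's bootstrapped learning method.

```python
import re

# Hypothetical seed regex: matches only the dash-separated form.
SEED = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

# Hypothetical broader pattern, hand-written purely for comparison.
# It also accepts a parenthesized area code and dot or space separators.
GENERALIZED = re.compile(r"(?:\(\d{3}\)\s?|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b")

# Made-up mentions of the same entity in different surface forms.
MENTIONS = [
    "555-123-4567",
    "(555) 123-4567",
    "555.123.4567",
    "555 123 4567",
]


def recall(pattern, mentions):
    """Fraction of known mentions that the pattern matches in full."""
    hits = sum(1 for m in mentions if pattern.fullmatch(m))
    return hits / len(mentions)


if __name__ == "__main__":
    print("seed recall:       ", recall(SEED, MENTIONS))
    print("generalized recall:", recall(GENERALIZED, MENTIONS))
```

On this toy list the seed pattern matches one of four mentions while the broader pattern matches all four, which is the kind of gap that motivates generalizing seed regexes automatically rather than by hand.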

Cited by 4 publications (1 citation statement)
References 13 publications
“…Current research in entity extraction using regexes within the machine learning community relies on a human in the loop (e.g., [4,5,27,37,47,56]). Nevertheless, a fully automated holistic approach that first determines francetelecom refers to the operator of the router could allow an automated geolocation method to not consider that portion of a hostname as a possible geohint.…”
Section: Limitations (mentioning)
confidence: 99%