Stanley Simoes scite author profile

Stanley Simoes

4 Publications

1 Citation Statement Received

44 Citation Statements Given

Affiliations

Queen's University Belfast, Indian Institute of Technology Madras

Publications

Content and Context: Two-Pronged Bootstrapped Learning for Regex-Formatted Entity Extraction

Simoes, Deepak, Sairamesh, et al. 2018. AAAI

Regular expressions are an important building block of rule-based information extraction systems. Regexes can encode rules to recognize instances of simple entities which can then feed into the identification of more complex cross-entity relationships. Manually crafting a regex that recognizes all possible instances of an entity is difficult since an entity can manifest in a variety of different forms. Thus, the problem of automatically generalizing manually crafted seed regexes to improve the recall of IE systems has attracted research attention. In this paper, we propose a bootstrapped approach to improve the recall for extraction of regex-formatted entities, with the only source of supervision being the seed regex. Our approach starts from a manually authored high precision seed regex for the entity of interest, and uses the matches of the seed regex and the context around these matches to identify more instances of the entity. These are then used to identify a set of diverse, high recall regexes that are representative of this entity. Through an empirical evaluation over multiple real world document corpora, we illustrate the effectiveness of our approach.
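
The idea of using the context around seed-regex matches to find further instances can be illustrated with a minimal sketch. This is not the authors' method: the function name `context_candidates`, the whitespace tokenization, and the use of the immediately neighbouring words as "context" are all assumptions made for illustration only.

```python
import re

def context_candidates(text, seed_pattern):
    """Return tokens that the seed regex misses but that occur in a
    (left word, right word) context also seen around a seed match.
    Hypothetical helper; whitespace tokenization is an assumption."""
    tokens = text.split()
    seed = re.compile(seed_pattern)

    def context(i):
        left = tokens[i - 1] if i > 0 else ""
        right = tokens[i + 1] if i + 1 < len(tokens) else ""
        return (left, right)

    # Contexts observed around tokens the seed regex fully matches.
    seed_contexts = {context(i) for i, tok in enumerate(tokens)
                     if seed.fullmatch(tok)}

    # Non-matching tokens that share one of those contexts.
    return [tok for i, tok in enumerate(tokens)
            if not seed.fullmatch(tok) and context(i) in seed_contexts]

text = "call 555-1234 now . call 555.9876 now ."
print(context_candidates(text, r"\d{3}-\d{4}"))
```

In this toy input the seed regex only matches the hyphenated number, while the dotted variant is surfaced because it shares the same neighbouring words. Candidates found this way would still need to be generalized into new regexes, which the sketch does not attempt.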

AI and Core Electoral Processes: Mapping the Horizons

Deepak, Simoes, MacCárthaigh. 2023. Preprint

Exploring Rawlsian Fairness for K-Means Clustering

Simoes, Deepak, MacCárthaigh. 2022

We conduct an exploratory study that looks at incorporating John Rawls' ideas on fairness into existing unsupervised machine learning algorithms. Our focus is on the task of clustering, specifically the k-means clustering algorithm. To the best of our knowledge, this is the first work that uses Rawlsian ideas in clustering. Towards this, we attempt to develop a postprocessing technique, i.e., one that operates on the cluster assignment generated by the standard k-means clustering algorithm. Our technique perturbs this assignment over a number of iterations to make it fairer according to Rawls' difference principle while minimally affecting the overall utility. As the first step, we consider two simple perturbation operators, R1 and R2, that reassign examples in a given cluster assignment to new clusters; R1 assigning a single example to a new cluster, and R2 a pair of examples to new clusters. Our experiments on a sample of the Adult dataset demonstrate that both operators make meaningful perturbations in the cluster assignment towards incorporating Rawls' difference principle, with R2 being more efficient than R1 in terms of the number of iterations. However, we observe that there is still a need to design operators that make significantly better perturbations. Nevertheless, both operators provide good baselines for designing and comparing any future operator, and we hope our findings would aid future work in this direction.
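
A single-example reassignment in the spirit of R1 can be sketched as follows. The abstract does not say how fairness is measured, so this sketch assumes, purely for illustration, that each example's cost is its squared distance to its cluster centroid and that a perturbation is accepted when it lowers the cost of the worst-off example. The names `centroids`, `worst_cost`, and `r1_step` are hypothetical, and the paper's requirement of minimally affecting overall utility is not modelled.

```python
def centroids(points, labels, k):
    """Mean of the points in each cluster (None for an empty cluster)."""
    dims = len(points[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dims):
            sums[c][d] += p[d]
    return [[s / counts[c] for s in sums[c]] if counts[c] else None
            for c in range(k)]

def worst_cost(points, labels, k):
    """Largest squared distance from any example to its own centroid
    (assumed stand-in for the worst-off example's cost)."""
    cents = centroids(points, labels, k)
    return max(sum((x - m) ** 2 for x, m in zip(p, cents[c]))
               for p, c in zip(points, labels))

def r1_step(points, labels, k):
    """Try every move of one example to another cluster and keep the move
    that most reduces the worst cost; return the input labels if none helps."""
    best_labels, best = labels, worst_cost(points, labels, k)
    for i in range(len(points)):
        for c in range(k):
            if c == labels[i]:
                continue
            trial = labels[:i] + [c] + labels[i + 1:]
            cost = worst_cost(points, trial, k)
            if cost < best:
                best_labels, best = trial, cost
    return best_labels, best

points = [(0.0,), (1.0,), (5.0,), (10.0,)]
labels = [0, 0, 0, 1]
print(r1_step(points, labels, 2))
```

Calling `r1_step` repeatedly until the labels stop changing gives an iterative postprocessing loop over an initial k-means assignment. An R2-style variant would try pairs of reassignments in the inner loop instead of single ones.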

Exploring Rawlsian Fairness for K-Means Clustering

Simoes, Deepak, MacCárthaigh. 2022. Preprint
