Researchers have investigated whether machine learning (ML) can help resolve one of the most fundamental concerns in personnel selection: reducing the subgroup differences (and resulting adverse impact) by race and gender in selection procedure scores. This article presents three such investigations. The findings show that the growing practice of making statistical adjustments to (nonlinear) ML algorithms to reduce subgroup differences must create predictive bias (differential prediction) as a mathematical certainty, which may reduce validity and inadvertently penalize high-scoring racial minorities. Similarly, one approach that adjusts the ML input data reduces subgroup differences only slightly, and at the cost of slightly reduced model accuracy. Other emerging tactics weight predictors to balance, or find a compromise between, the competing goals of reducing subgroup differences and maintaining validity, but to date they have been limited to two outcomes. The third investigation extends this to three outcomes (e.g., validity, subgroup differences, and cost) and presents an online tool. Collectively, the studies in this article illustrate that ML is unlikely to resolve the issue of adverse impact, but it may assist in finding incremental improvements.
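The predictor-weighting tactic described above can be illustrated with a small sketch. The code below is not the authors' method; it is a minimal, hypothetical example (simulated data, simple grid search) of trading off composite validity against the standardized subgroup difference (Cohen's d) and keeping the non-dominated weightings, in the spirit of Pareto-optimal weighting approaches.

```python
# Illustrative sketch only (simulated data, hypothetical variable names):
# search predictor weights that trade off composite validity against the
# subgroup difference (Cohen's d), and keep the Pareto-optimal weightings.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
group = rng.integers(0, 2, n)            # 0 = majority, 1 = minority (simulated)
x1 = rng.normal(0, 1, n) - 0.8 * group   # predictor with a subgroup gap
x2 = rng.normal(0, 1, n)                 # predictor without a gap
y = 0.5 * x1 + 0.3 * x2 + rng.normal(0, 1, n)  # simulated criterion (performance)

def evaluate(w1, w2):
    """Return (validity, subgroup d) for a weighted composite of x1 and x2."""
    score = w1 * x1 + w2 * x2
    validity = np.corrcoef(score, y)[0, 1]
    d = (score[group == 0].mean() - score[group == 1].mean()) / score.std(ddof=1)
    return validity, d

# Grid search over relative weights; keep points not dominated on both
# objectives (higher validity, lower d) -- a simple Pareto frontier.
candidates = [(w, 1 - w, *evaluate(w, 1 - w)) for w in np.linspace(0, 1, 101)]
pareto = [c for c in candidates
          if not any(o[2] >= c[2] and o[3] <= c[3] and o != c for o in candidates)]
for w1, w2, validity, d in pareto[::20]:
    print(f"w1={w1:.2f} w2={w2:.2f}  validity={validity:.3f}  subgroup d={d:.3f}")
```

Each surviving weighting represents one possible compromise; moving along the frontier lowers the subgroup difference only by giving up some validity, which is the tradeoff the abstract describes.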
The use of machine learning (ML)-based language models (LMs) to monitor content online is on the rise. For toxic text identification, task-specific fine-tuning of these models is performed using datasets labeled by annotators who provide ground-truth labels in an effort to distinguish between offensive and normal content. These projects have led to the development, improvement, and expansion of large datasets over time, and have contributed immensely to research on natural language. Despite these achievements, existing evidence suggests that ML models built on these datasets do not always produce desirable outcomes. Therefore, using a design science research (DSR) approach, this study examines selected toxic text datasets with the goal of shedding light on some of their inherent issues and contributing to discussions on navigating these challenges in existing and future projects. To achieve the goal of the study, we re-annotate samples from three toxic text datasets and find that a multi-label approach to annotating toxic text samples can help to improve dataset quality. While this approach may not improve the traditional metric of inter-annotator agreement, it may better capture dependence on context and diversity in annotators. We discuss the implications of these results for both theory and practice.

CCS Concepts: • Applied computing → Annotation.
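To make the agreement point concrete, the sketch below uses invented annotations (not data from the three datasets in the study) to contrast a traditional single-label agreement statistic (Cohen's kappa) with a set-overlap measure (mean Jaccard) that is arguably better suited to multi-label toxicity annotations.

```python
# Illustrative sketch with hypothetical annotations (not the study's pipeline):
# compare single-label agreement (Cohen's kappa) with a per-item set-overlap
# measure (Jaccard) for multi-label toxicity annotations.
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa for two annotators' single labels."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n                  # observed agreement
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)  # chance agreement
    return (po - pe) / (1 - pe)

def mean_jaccard(a, b):
    """Average Jaccard overlap between two annotators' label *sets* per item."""
    return sum(len(x & y) / len(x | y) if (x | y) else 1.0
               for x, y in zip(a, b)) / len(a)

# Hypothetical annotations for five comments.
single_1 = ["toxic", "ok", "toxic", "ok", "ok"]
single_2 = ["toxic", "ok", "ok",    "ok", "toxic"]

multi_1 = [{"insult"}, set(), {"insult", "profanity"}, set(), {"threat"}]
multi_2 = [{"insult"}, set(), {"profanity"},           set(), set()]

print("single-label kappa :", round(cohen_kappa(single_1, single_2), 3))
print("multi-label Jaccard:", round(mean_jaccard(multi_1, multi_2), 3))
```

The multi-label measure gives partial credit when annotators agree on some toxicity categories but not others, which a single binary label (and kappa computed on it) cannot capture.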