Maria Ryskina scite author profile

Maria Ryskina

5 Publications

33 Citation Statements Received

146 Citation Statements Given

Affiliations

Carnegie Mellon University

Publications

Phonetic and Visual Priors for Decipherment of Informal Romanization

Ryskina

Gormley

Berg-Kirkpatrick

2020

Informal romanization is an idiosyncratic process used by humans in informal digital communication to encode non-Latin script languages into Latin character sets found on common keyboards. Character substitution choices differ between users but have been shown to be governed by the same main principles observed across a variety of languages: namely, character pairs are often associated through phonetic or visual similarity. We propose a noisy-channel WFST cascade model for deciphering the original non-Latin script from observed romanized text in an unsupervised fashion. We train our model directly on romanized data from two languages: Egyptian Arabic and Russian. We demonstrate that adding inductive bias through phonetic and visual priors on character mappings substantially improves the model's performance on both languages, yielding results much closer to the supervised skyline. Finally, we introduce a new dataset of romanized Russian, collected from a Russian social network website and partially annotated for our experiments.

NoiseQA: Challenge Set Evaluation for User-Centric Question Answering

Ravichander

Dalmia

Ryskina

et al. 2021

When Question-Answering (QA) systems are deployed in the real world, users query them through a variety of interfaces, such as speaking to voice assistants, typing questions into a search engine, or even translating questions to languages supported by the QA system. While there has been significant community attention devoted to identifying correct answers in passages assuming a perfectly formed question, we show that components in the pipeline that precede an answering engine can introduce varied and considerable sources of error, and performance can degrade substantially based on these upstream noise sources even for powerful pre-trained QA models. We conclude that there is substantial room for progress before QA systems can be effectively deployed, highlight the need for QA evaluation to expand to consider real-world use, and hope that our findings will spur greater community interest in the issues that arise when our systems actually need to be of utility to humans.

SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages

Pimentel

Ryskina

Mielke

et al. 2021

Learning Mathematical Properties of Integers

Ryskina

Knight

2021

Embedding words in high-dimensional vector spaces has proven valuable in many natural language applications. In this work, we investigate whether similarly-trained embeddings of integers can capture concepts that are useful for mathematical applications. We probe the integer embeddings for mathematical knowledge, apply them to a set of numerical reasoning tasks, and show that by learning the representations from mathematical sequence data, we can substantially improve over number embeddings learned from English text corpora.

NoiseQA: Challenge Set Evaluation for User-Centric Question Answering

Ravichander

Dalmia

Ryskina

et al. 2021

Preprint
