Pavel Sofroniev scite author profile

Pavel Sofroniev

4 Publications

41 Citation Statements Received

73 Citation Statements Given

Publications

Using support vector machines and state-of-the-art algorithms for phonetic alignment to identify cognates in multi-lingual wordlists

Jäger, List, Sofroniev (2017)

Most current approaches in phylogenetic linguistics require as input multilingual word lists partitioned into sets of etymologically related words (cognates). Cognate identification is so far done manually by experts, which is time-consuming and as of yet only available for a small number of well-studied language families. Automatizing this step will greatly expand the empirical scope of phylogenetic methods in linguistics, as raw wordlists (in phonetic transcription) are much easier to obtain than wordlists in which cognate words have been fully identified and annotated, even for under-studied languages. A couple of different methods have been proposed in the past, but they are either disappointing regarding their performance or not applicable to larger datasets. Here we present a new approach that uses support vector machines to unify different state-of-the-art methods for phonetic alignment and cognate detection within a single framework. Training and evaluating this method on a typologically broad collection of gold-standard data shows it to be superior to the existing state of the art.

Phonetic Vector Representations for Sound Sequence Alignment

Sofroniev, Çöltekin (2018)

This study explores a number of data-driven vector representations of the IPA-encoded sound segments for the purpose of sound sequence alignment. We test the alternative representations based on the alignment accuracy in the context of computational historical linguistics. We show that the data-driven methods consistently do better than linguistically motivated articulatory-acoustic features. The similarity scores obtained using the data-driven representations in a monolingual context, however, perform worse than the state-of-the-art distance (or similarity) scoring methods proposed in earlier studies of computational historical linguistics. We also show that adapting representations to the task at hand improves the results, yielding alignment accuracy comparable to the state-of-the-art methods.

Computational analysis of Gondi dialects

Rama, Çöltekin, Sofroniev (2017)

The parse is darc and full of errors: Universal dependency parsing with transition-based and graph-based algorithms

Yu, Sofroniev, Schill, et al. (2017)
