Spellchecker

Dembitz, Šandor; Gledec, Gordan; Randić, Mirko

doi:10.1002/9780470050118.ecse414

Cited by 4 publications

(4 citation statements)

References 32 publications

“…The advantages of online spellchecking specifically in the Croatian context, emphasizing the relevance and impact of spellchecking tools for the Croatian language, are described in [33]. This highlights the growing significance of spellchecking technologies in addressing linguistic challenges unique to Croatian.…”

Section: Croatian Language (mentioning)

confidence: 99%

Error Pattern Discovery in Spellchecking Using Multi-Class Confusion Matrix Analysis for the Croatian Language

Gledec, Sokele, Horvat et al. 2024

Computers

This paper introduces a novel approach to the creation and application of confusion matrices for error pattern discovery in spellchecking for the Croatian language. The experimental dataset has been derived from a corpus of mistyped words and user corrections collected since 2008 using the Croatian spellchecker available at ispravi.me. The important role of confusion matrices in enhancing the precision of spellcheckers, particularly within the diverse linguistic context of the Croatian language, is investigated. Common causes of spelling errors, emphasizing the challenges posed by diacritic usage, have been identified and analyzed. This research contributes to the advancement of spellchecking technologies and provides a more comprehensive understanding of linguistic details, particularly in languages with diacritic-rich orthographies, like Croatian. The presented user-data-driven approach demonstrates the potential for custom spellchecking solutions, especially considering the ever-changing dynamics of language use in digital communication.

“…False negatives are a much more serious problem than false positives, because a spellchecker with high recall, but low precision, does not serve its intended purposes. The problem can partially be resolved by introducing grammar checking and contextual spellchecking, but this is a privilege of a few central languages only [14].…”

Section: Remarks on the Croatian Language and Hascheck Architecture (mentioning)

confidence: 99%

“…Spellchecking is a privilege of approximately 100 languages (writing systems) of the world [14]. This is a small number, considering that more than 1000 languages (writing systems) appear on the World Wide Web (WWW), a number which is increasing every day.…”

Section: Introduction (mentioning)

confidence: 99%

Advantages of online spellchecking: a Croatian example

2010

Self Cite

Online spellchecking is commonly regarded as an auxiliary way of performing spellchecking. However, it offers a unique opportunity to constantly improve spellchecker linguistic functionality through interaction with the community of spellchecker users. Such a possibility is crucial for spellchecking in non‐central and under‐resourced languages, in order to overcome gaps in NLP tools between them and central languages. The paper describes Hascheck, a Croatian online spellchecker able to learn words from texts it receives. It started as the first Croatian spellchecker, hence as a basic NLP tool for an under‐resourced language, but due to its learning ability it demonstrates linguistic functionality comparable to that of conventional central‐language spellcheckers. Based on these experiences we also discuss the future of online spellchecking in the context of global NLP tasks. Copyright © 2010 John Wiley & Sons, Ltd.

“…The increasing prevalence of digital communication has highlighted the importance of accurate spelling and grammar in written text. Spellchecking services play a crucial role in ensuring the quality and readability of digital content, aiding both native speakers and language learners in producing error-free text [1]. This is particularly crucial for languages such as Croatian, which exhibit complex morphological and orthographic rules [2].…”

Section: Introduction (mentioning)

confidence: 99%

A Comprehensive Dataset of Spelling Errors and Users’ Corrections in Croatian Language

Gledec, Horvat, Mikuc et al. 2023

Data

This paper presents a unique and extensive dataset containing over 33 million entries with pairs in the form “spelling error → correction” from ispravi.me, the most popular Croatian online spellchecking service, collected since 2008. The dataset, compiled from the contribution of nearly 900,000 users, is a valuable resource for researchers and developers in the field of natural language processing (NLP), improving spellcheck accuracy, and language learning applications. The dataset may be used to accomplish several goals: (1) improving spellchecking accuracy by incorporating common user corrections and reducing false positives and negatives; (2) helping language learners identify common errors and learn correct spelling through targeted feedback; (3) analyzing data trends and patterns to uncover the most common spelling errors and their underlying causes; (4) identifying and evaluating factors that influence typing input; (5) improving NLP applications such as text recognition and machine translation. Tasks specific to the Croatian language include the creation of a letter-level confusion matrix and the refinement of word suggestions based on historical usage of the service. This comprehensive dataset provides researchers and practitioners with a wealth of information, opening the path for advancements in spellchecking, language learning, and NLP applications in the Croatian language.
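The letter-level confusion matrix mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the data is available as plain (error, correction) string pairs and counts only substitutions in pairs of equal length; insertions and deletions would need an alignment step that is omitted here. The function name `letter_confusions` and the sample pairs are hypothetical and are not taken from the dataset or from the authors' method.

```python
import unicodedata
from collections import Counter


def letter_confusions(pairs):
    """Count (typed, intended) letter substitutions in equal-length pairs."""
    counts = Counter()
    for error, correction in pairs:
        # Normalize so that a letter with a diacritic is a single code point.
        error = unicodedata.normalize("NFC", error)
        correction = unicodedata.normalize("NFC", correction)
        if len(error) != len(correction):
            # Insertions and deletions need alignment; skipped in this sketch.
            continue
        for typed, intended in zip(error, correction):
            if typed != intended:
                counts[(typed, intended)] += 1
    return counts


# Hypothetical sample pairs in the form "spelling error -> correction".
sample = [
    ("casa", "čaša"),
    ("kuca", "kuća"),
    ("zivot", "život"),
    ("rjec", "riječ"),  # different lengths, so this pair is skipped
]
confusions = letter_confusions(sample)
for (typed, intended), n in sorted(confusions.items()):
    print(f"{typed} -> {intended}: {n}")
```

Each key of the resulting counter corresponds to one cell of a confusion matrix (row: typed letter, column: intended letter). In this sample every counted substitution is a missing diacritic, which is the kind of pattern the cited papers report as a common error source.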
