We investigate both rule-based and machine learning methods for the task of compound error correction and evaluate their efficiency for North Sámi, a low-resource language. The lack of error-free data needed for a neural approach is a challenge to the development of these tools, one that better-resourced languages do not face. To compensate, we used a rule-based grammar checker to filter out erroneous sentences and then inserted compound errors by splitting correct compounds. We describe how we set up the error detection rules and how we train a bi-RNN-based neural network. The precision of the rule-based model tested on a corpus with real errors (81.0%) is slightly better than that of the neural model (79.4%). The rule-based model is also more flexible with regard to fixing specific errors requested by the user community. However, the neural model has better recall (98%). The results suggest that an approach combining the advantages of both models would be desirable in the future. Our tools and data sets are open-source and freely available on GitHub and Zenodo.
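The synthetic-error step described above (splitting correct compounds to produce erroneous training sentences) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name `insert_compound_errors`, the `COMPOUND_PARTS` lookup table, and the English placeholder words are all hypothetical, and a real system would take compound boundaries from a morphological analyser rather than a hand-written dictionary.

```python
def insert_compound_errors(tokens, compound_parts):
    """Return a copy of `tokens` in which every known compound is split.

    `compound_parts` is a hypothetical lookup table mapping a lowercased
    compound to the parts it should be (wrongly) written as.
    """
    corrupted = []
    for token in tokens:
        parts = compound_parts.get(token.lower())
        if parts:
            # Writing the parts as separate words creates the compound error.
            corrupted.extend(parts)
        else:
            corrupted.append(token)
    return corrupted


# Hypothetical English placeholders standing in for North Sámi compounds.
COMPOUND_PARTS = {
    "notebook": ("note", "book"),
    "sunlight": ("sun", "light"),
}

clean = "the notebook lay in sunlight".split()
noisy = insert_compound_errors(clean, COMPOUND_PARTS)

# One (erroneous input, correct target) pair for training a correction model.
training_pair = (" ".join(noisy), " ".join(clean))
```

Each clean sentence that passes the grammar checker would yield one such pair, so the amount of training data depends on how many verified error-free sentences are available.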
The main objective of our study is to determine whether subtraction of the regular quiet daily curve (QDC) is a necessary procedure in quantifying the irregular geomagnetic variations at auroral latitudes. We define the hourly ΔH index, the absolute hour‐to‐hour deviation in nanotesla of the hourly geomagnetic horizontal component, which counts each sample-to-sample deviation as geomagnetic activity without separating the “regular” and “irregular” parts of the daily magnetic field evolution. We demonstrate that the hourly gradient of the regular Sq variation is very small with respect to the irregular part, and that the bulk of the nominal daily variation actually belongs to the variation that is driven by the solar wind and interplanetary magnetic field and is traditionally classified as irregular. Therefore, attempts to subtract the QDC can lead to a larger error, often caused by residual deviations among the different mathematical and methodological tools used and their underlying assumptions. We show that ΔH provides the best and most consistent results at most timescales, with the highest effective resolution among the studied indices. We also demonstrate that the ΔH index may be equally useful as a quick‐look near‐real‐time index of space weather and as a long‐term index derived from hourly magnetometer data for space climate studies.
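Read literally, the definition above amounts to the absolute difference between consecutive hourly values of the horizontal component H. A minimal sketch under that reading follows; the function name `hourly_delta_h` is hypothetical, the input values are made-up numbers for illustration only, and assigning each difference to the later of the two hours is an assumption, since the abstract does not specify it. Handling of data gaps is not addressed here.

```python
def hourly_delta_h(h_hourly):
    """Absolute hour-to-hour change of the horizontal component, in nT.

    `h_hourly` is a sequence of consecutive hourly H values in nanotesla.
    The result has one element fewer than the input; element i is
    |H[i+1] - H[i]| (assumed here to belong to hour i+1).
    No quiet daily curve is subtracted beforehand.
    """
    return [abs(later - earlier) for earlier, later in zip(h_hourly, h_hourly[1:])]


# Made-up hourly H values in nT, purely illustrative.
h_values = [100.0, 103.0, 101.0, 101.0]
delta_h = hourly_delta_h(h_values)
```

Because only first differences are used, any slowly varying baseline contributes little to the result, which is consistent with the abstract's argument that the hourly gradient of the regular Sq variation is small.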