With over 1.4 million Bodo speakers, there is a need for automated language processing systems such as machine translation, part-of-speech tagging, speech recognition, and named entity recognition. Developing such systems requires a sufficient amount of data. In this paper, we present a detailed description of the primary resources available for the Bodo language that can be used as datasets for studying Natural Language Processing and its applications. We list the following resources: a lexicon dataset of 8,005 entries collected from the agriculture and health domains, a raw corpus of 2,915,544 words, a tagged corpus of 30,000 sentences, a parallel corpus of 28,359 sentences from the tourism, agriculture, and health domains, and a tagged and parallel corpus of 37,768 sentences. We further discuss the challenges and opportunities for the Bodo language.
Among all the food produced annually, fruits and vegetables have the highest loss rate. This is due to the inability to detect critical ambient environmental parameters in cold storage. The shelf life of food is an important factor in minimizing loss. In this work, we present IntelliStore, an intelligent Machine Learning (ML) and Internet of Things (IoT) powered storage monitoring system that enables real-time monitoring of temperature, humidity, and CO2 concentration, as well as pest detection using a Passive Infrared (PIR) sensor and a microphone. We collected a dataset of 10 different fruits and vegetables from the Food Quality and Analysis lab. We experimented with the SVM, Decision Tree, AdaBoost, and Gradient Boosting machine learning algorithms and achieved the highest accuracy, 88%, with SVM. Moreover, the proposed system can update the dataset with actual observations in the future and retrain the models, which would allow ML performance to improve over time.
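The update-and-retrain behaviour described above can be illustrated with a minimal sketch. It is not the IntelliStore implementation: the class names, the "fresh"/"spoiled" labels, and the sensor values are hypothetical, and a simple nearest-centroid classifier stands in for the SVM and boosting models named in the abstract so that the example needs only the Python standard library.

```python
from statistics import mean

# Hypothetical reading: (temperature in C, relative humidity in %, CO2 in ppm).
Reading = tuple[float, float, float]


class CentroidClassifier:
    """Nearest-centroid stand-in for the classifiers named in the abstract."""

    def __init__(self) -> None:
        self.centroids: dict[str, tuple[float, ...]] = {}

    def fit(self, samples: list[Reading], labels: list[str]) -> None:
        # Recompute one mean feature vector per label from scratch.
        self.centroids = {}
        for label in set(labels):
            rows = [s for s, l in zip(samples, labels) if l == label]
            self.centroids[label] = tuple(mean(col) for col in zip(*rows))

    def predict(self, sample: Reading) -> str:
        # Pick the label whose centroid is nearest (squared Euclidean distance).
        # Features are unscaled here, so CO2 dominates; a real system would normalize.
        def dist(centroid: tuple[float, ...]) -> float:
            return sum((a - b) ** 2 for a, b in zip(sample, centroid))

        return min(self.centroids, key=lambda label: dist(self.centroids[label]))


class RetrainableStore:
    """Keeps the training set and refits the model when a new observation arrives."""

    def __init__(self, samples: list[Reading], labels: list[str]) -> None:
        self.samples = list(samples)
        self.labels = list(labels)
        self.model = CentroidClassifier()
        self.model.fit(self.samples, self.labels)

    def add_observation(self, sample: Reading, actual_label: str) -> None:
        # Append the observed outcome, then retrain on the enlarged dataset.
        self.samples.append(sample)
        self.labels.append(actual_label)
        self.model.fit(self.samples, self.labels)


# Illustrative values only, not data from the paper.
store = RetrainableStore(
    [(4.0, 90.0, 400.0), (25.0, 50.0, 1500.0)],
    ["fresh", "spoiled"],
)
before = store.model.predict((6.0, 85.0, 900.0))
store.add_observation((6.0, 85.0, 900.0), "spoiled")
after = store.model.predict((6.0, 85.0, 900.0))
```

In this sketch the model is refit on every new observation, which is only practical for small datasets; a deployed system would more plausibly batch observations and retrain periodically.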