People do not always use Unicode; rather, they mix multiple languages. Processing code-mixed data is challenging because of its linguistic complexity, and noisy text makes language identification harder still. The dataset used in this article contains Facebook and Twitter messages collected through the Facebook Graph API and the Twitter API. A model was trained on the annotated English-Punjabi code-mixed dataset using a pipeline that combines a Dictionary Vectorizer and an N-gram approach with some additional features. Logistic Regression, Decision Tree, and Gaussian Naïve Bayes classifiers are used to perform language identification at the word level. The results show that Logistic Regression performs best, with an accuracy of 86.63 and an F-1 measure of 0.88. The success of machine learning approaches depends on the quality of labeled corpora.
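The word-level feature step described in this abstract can be illustrated with a minimal sketch. The abstract does not list the actual features, so everything here is an assumption: the function name `word_features`, the boundary markers, and the choice of character 1- to 3-grams plus a few surface cues are hypothetical. The output is one plain dict per token, which is the input shape that a dictionary vectorizer such as scikit-learn's `DictVectorizer` accepts before a classifier is fitted.

```python
def word_features(word, n_values=(1, 2, 3)):
    """Build a feature dict for one token (hypothetical feature set)."""
    lowered = word.lower()
    # Boundary markers let prefix and suffix n-grams differ from inner ones.
    padded = f"^{lowered}$"
    features = {
        "length": len(word),
        "has_digit": int(any(ch.isdigit() for ch in word)),
        "is_ascii": int(word.isascii()),
        "is_title": int(word.istitle()),
    }
    for n in n_values:
        for i in range(len(padded) - n + 1):
            # Count each character n-gram; keys are namespaced by n.
            key = f"{n}gram={padded[i:i + n]}"
            features[key] = features.get(key, 0) + 1
    return features


def sentence_features(tokens):
    """Return one feature dict per token, for word-level labeling."""
    return [word_features(tok) for tok in tokens]
```

Each dict from `sentence_features` would be paired with a per-word language label during training. Whether the authors also used neighboring-word context is not stated, so none is included here.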
Detection and classification of mine-like objects in side-scan sonar images need to compensate for the variability of objects, noise, and background signatures. The unsupervised algorithm presented in this paper improves on previous work and focuses on object and shadow detection based on morphological operators. Feature extraction from the detected objects and their classification into two classes, namely mine-like or non-mine-like objects, are described. A row-wise processing technique is applied to decrease computational cost and memory usage, which allows easy porting of the algorithm to an embedded architecture. The performance of the algorithm is measured against the obtained ground truth.
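To make the idea of row-wise morphological detection concrete, the following is a rough sketch and not the paper's algorithm. The abstract names neither the operators nor the parameters, so the thresholds `high` and `low`, the window `width`, and the use of a 1D binary opening are all assumptions. Each image row is processed on its own, so only one row needs to be held in memory at a time.

```python
def erode_row(row, width):
    """1D binary erosion: keep a pixel only if its whole window is set.
    Windows are clipped at the row borders."""
    half = width // 2
    n = len(row)
    return [
        int(all(row[j] for j in range(max(0, i - half), min(n, i + half + 1))))
        for i in range(n)
    ]


def dilate_row(row, width):
    """1D binary dilation: set a pixel if any pixel in its window is set."""
    half = width // 2
    n = len(row)
    return [
        int(any(row[j] for j in range(max(0, i - half), min(n, i + half + 1))))
        for i in range(n)
    ]


def open_row(row, width):
    """Opening (erosion then dilation) removes runs shorter than the window."""
    return dilate_row(erode_row(row, width), width)


def detect_row(intensities, high, low, width=3):
    """Return (highlight_mask, shadow_mask) for a single sonar image row."""
    highlights = [int(v >= high) for v in intensities]  # strong echoes
    shadows = [int(v <= low) for v in intensities]      # acoustic shadows
    return open_row(highlights, width), open_row(shadows, width)
```

For example, `detect_row([5, 5, 9, 5, 9, 9, 9, 5, 1, 1, 1, 1, 5], high=8, low=2)` discards the isolated bright pixel at index 2 and keeps the three-pixel highlight followed by the four-pixel shadow.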
Background: Maternal COVID-19 infection acquired during late pregnancy carries a potential risk of adverse neonatal outcomes. There is still a paucity of data on its effect on the transition from intrauterine to extrauterine life. Objectives: To determine the impact of maternal COVID-19 infection on the need for neonatal resuscitation at birth, Apgar scores at 1 and 5 minutes, and the need for NICU admission during the early neonatal period. Materials and methods: In this hospital-based prospective matched cohort study, 100 COVID-positive pregnant women presenting for delivery were enrolled. We also included 100 non-COVID pregnant women after the best possible matching of their major baseline parameters with the study group. Neonates of both groups were followed up until 7 days of life. Results: The two groups were comparable for all baseline variables except the mode of delivery. The requirement for neonatal resuscitation was 30% and 21% in the study and control groups, respectively (RR = 1.429; 95% CI 0.88–2.32; p = 0.149). Apgar scores at 1 and 5 minutes were also unaffected by maternal COVID-19 infection, with mean scores of 8.8 ± 0.651 vs. 8.87 ± 0.562 (p = 0.42) in the study and control groups, respectively. COVID-exposed neonates had a higher incidence of NICU admission than the unexposed group (RR = 1.616; 95% CI 1.002–2.606; p = 0.047). Among neonates born to COVID-positive mothers, 11% showed evidence of SARS-CoV-2 positivity within the first 5 days of life. The need for resuscitation and mean Apgar scores were comparable between SARS-CoV-2 positive and negative neonates (p > 0.05). Conclusion: COVID-19 infection in pregnant women is not associated with an increased risk of neonatal resuscitation.
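The reported relative risk for resuscitation can be checked with a short calculation. This is a sketch under two assumptions: the event counts are taken as 30 and 21 out of 100 neonates per group (inferred from the stated percentages), and the interval uses the standard log-scale Wald method, which the abstract does not name but which reproduces the published figures. The function name `relative_risk` is illustrative only.

```python
import math


def relative_risk(events_exposed, n_exposed, events_control, n_control, z=1.96):
    """Relative risk with a Wald confidence interval on the log scale."""
    risk_exposed = events_exposed / n_exposed
    risk_control = events_control / n_control
    rr = risk_exposed / risk_control
    # Standard error of ln(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Resuscitation: 30/100 in the COVID-exposed group vs. 21/100 in controls.
rr, lower, upper = relative_risk(30, 100, 21, 100)
print(f"RR = {rr:.3f}, 95% CI {lower:.2f} to {upper:.2f}")
# prints "RR = 1.429, 95% CI 0.88 to 2.32"
```

Because the interval spans 1, the difference in resuscitation rates is not statistically significant at the 5% level, which agrees with the reported p = 0.149. The NICU admission counts are not given in the abstract, so that estimate cannot be recomputed the same way.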