The rapid growth of coastal observation sensors produces large volumes of data that carry weather information. The main difficulties with sensor-generated big data are heterogeneity and interpretation, which motivate high-end Information Retrieval (IR) systems. The Semantic Web (SW) can address these difficulties by integrating data into a single platform for information exchange and knowledge retrieval. This paper exploits an SW-based system to provide interoperability through ontologies by mapping data concepts to ontology classes. It presents a four-phase weather data model: data processing, ontology creation, SW processing, and a query engine. The developed Oceanographic Weather Ontology helps to enhance data analysis, discovery, IR, and decision making. The paper also evaluates the developed ontology against other state-of-the-art ontologies. The proposed ontology improves completeness by 39.28% and reduces structural complexity by 45.29%; the reported changes in Precision and Accuracy are 11% and 37.7%, respectively. Ocean data from the Indian meteorological satellite INSAT-3D serve as a test case for the proposed model. The experimental results show the effectiveness of the proposed data model and its advantages for machine understanding and IR.
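To make the idea of mapping data concepts to ontology classes and then querying them more concrete, the following is a minimal sketch in plain Python, not the paper's implementation. The namespace `OWO`, the class name `Observation`, the property names, and the sample values are all hypothetical placeholders; they are not taken from the Oceanographic Weather Ontology or from INSAT-3D data.

```python
# Hypothetical namespace; the paper's actual ontology IRIs are not given here.
OWO = "http://example.org/owo#"


def record_to_triples(record_id, record):
    """Map one cleaned sensor record (a dict) to subject-predicate-object triples."""
    subject = OWO + record_id
    # Type the record with an (assumed) ontology class.
    triples = [(subject, "rdf:type", OWO + "Observation")]
    # Each data field becomes a property of that individual.
    for field, value in record.items():
        triples.append((subject, OWO + field, value))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return the triples that match a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Illustrative values only, standing in for the output of the data-processing phase.
records = {
    "obs1": {"seaSurfaceTemperature": 28.4, "windSpeed": 5.1},
    "obs2": {"seaSurfaceTemperature": 30.2, "windSpeed": 7.8},
}

store = []
for record_id, record in records.items():
    store.extend(record_to_triples(record_id, record))

# Query step: which observations report a sea surface temperature above 29.0?
warm = [s for s, _, o in match(store, p=OWO + "seaSurfaceTemperature") if o > 29.0]
print(warm)  # -> ['http://example.org/owo#obs2']
```

A real Semantic Web stack would typically store such triples as RDF and query them with SPARQL; the list-based store above only illustrates the shape of the data and of a pattern query.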
The prevalence of diabetes, a condition in which the human body cannot properly metabolize glucose, is increasing. Accurate prediction of diabetes in patients is therefore an important research area, and many researchers have proposed data mining and machine learning techniques for it. Feature selection is a key preprocessing step in prediction: only the features that are relevant to the disease are used, which improves prediction accuracy. Selecting the right features from the whole feature set is a complicated process, and many researchers are working on it to produce predictive models with high accuracy. In this work, a wrapper-based feature selection method called recursive feature elimination is combined with ridge regression (L2) to form a hybrid L2-regularized feature selection algorithm that addresses overfitting on the data set. Overfitting is a major problem in feature selection: the model fits new data poorly because the training set is small. Ridge regression is mainly used to reduce overfitting. The features are selected with the proposed feature selection method, and a random forest classifier then classifies the data on the basis of the selected features. This work uses the Pima Indians Diabetes data set, and the results are compared with those of existing algorithms. The reported accuracy of the proposed algorithm in predicting diabetes is 100%, and its area under the curve is 97%. The proposed algorithm outperforms the existing algorithms it is compared with.
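The combination of recursive feature elimination with ridge regression described above can be sketched as follows. This is a minimal, standard-library-only illustration and not the authors' implementation: ranking features by the absolute value of their ridge coefficients, the value of `alpha`, and the toy data (a small balanced design, not the Pima Indians Diabetes data set) are all assumptions made for the example. The random forest classification step is omitted.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_weights(X, y, alpha):
    """Closed-form ridge solution w = (X^T X + alpha I)^-1 X^T y."""
    d = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    Xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve(XtX, Xty)


def standardize(X):
    """Scale each column to zero mean and unit variance so weights are comparable."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = [(sum((row[j] - means[j]) ** 2 for row in X) / n) ** 0.5 or 1.0
            for j in range(d)]
    return [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]


def rfe_ridge(X, y, n_keep, alpha=1.0):
    """Repeatedly fit ridge regression and drop the feature with the smallest |weight|."""
    Z = standardize(X)
    y_mean = sum(y) / len(y)
    yc = [t - y_mean for t in y]  # centre the target so no intercept is needed
    kept = list(range(len(X[0])))
    while len(kept) > n_keep:
        Xs = [[row[j] for j in kept] for row in Z]
        w = ridge_weights(Xs, yc, alpha)
        weakest = min(range(len(kept)), key=lambda i: abs(w[i]))
        del kept[weakest]
    return kept


# Toy data: the label depends on features 0 and 2 only; feature 1 is irrelevant.
X = [[a, b, c] for a in (-1.0, 1.0) for b in (-1.0, 1.0) for c in (-1.0, 1.0)]
y = [1.0 if row[0] > 0 and row[2] > 0 else 0.0 for row in X]

selected = rfe_ridge(X, y, n_keep=2)
print(selected)  # -> [0, 2]
```

In practice the same pattern is usually written with a library; scikit-learn, for example, provides `sklearn.feature_selection.RFE`, which accepts a ridge estimator, and the selected columns could then be passed to a random forest classifier as the abstract describes.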