This study considers the problem of clustering climate observation sites and grouping them by climate type. Machine learning–based clustering algorithms are used to analyze climate time series from more than 3,000 observation sites in the United States, with the objective of classifying the climate type of regions across the country. Understanding the climate type of a region has applications in public health, environment, actuarial science, insurance, agriculture, and engineering.
In this study, daily temperature and precipitation measurements from the period 1946–2015 were used. The daily observations were grouped into three derived data sets: a monthly data set (daily data aggregated by month), an annual data set (daily data aggregated by year), and a threshold-exceedance frequency data set (the frequency with which certain climate extremes occur). Three existing clustering algorithms from the literature, namely k‐means, density‐based spatial clustering of applications with noise (DBSCAN), and balanced iterative reducing and clustering using hierarchies (BIRCH), were each applied to each of the data sets, and the resulting clusters were assessed using standardized clustering indices. The results from these unsupervised learning techniques revealed the suitability and applicability of these algorithms in the climate domain. The clusters identified by these techniques were also compared with existing climate classification types such as the Köppen classification system. The work also developed an interactive web and map‐based data visualization system that uses efficient big data management techniques to provide clustering solutions in real time and to display the results of the clustering analysis.
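The three derived data sets described above can be illustrated with a minimal sketch. The abstract does not specify the record layout, the aggregation statistics, or the thresholds, so everything here is an assumption made for illustration: the `(year, month, day, tmax_celsius, precip_mm)` tuple layout, the use of a mean for monthly temperature and a sum for annual precipitation, the 30 °C threshold, and the function names themselves.

```python
from collections import defaultdict
from statistics import mean

# Assumed record layout for one site: (year, month, day, tmax_celsius, precip_mm).


def monthly_means(records):
    """Monthly data set: mean daily maximum temperature per (year, month)."""
    groups = defaultdict(list)
    for year, month, _day, tmax, _precip in records:
        groups[(year, month)].append(tmax)
    return {key: mean(values) for key, values in groups.items()}


def annual_precip_totals(records):
    """Annual data set: total precipitation per year."""
    totals = defaultdict(float)
    for year, _month, _day, _tmax, precip in records:
        totals[year] += precip
    return dict(totals)


def exceedance_frequency(records, tmax_threshold=30.0):
    """Threshold data set: fraction of days per year with tmax above the threshold."""
    hot_days = defaultdict(int)
    all_days = defaultdict(int)
    for year, _month, _day, tmax, _precip in records:
        all_days[year] += 1
        if tmax > tmax_threshold:
            hot_days[year] += 1
    return {year: hot_days[year] / all_days[year] for year in all_days}


# Tiny synthetic sample (not real observations).
records = [
    (2000, 1, 1, 5.0, 2.0),
    (2000, 1, 2, 7.0, 0.0),
    (2000, 7, 1, 32.0, 0.0),
    (2000, 7, 2, 28.0, 10.0),
]

print(monthly_means(records))         # {(2000, 1): 6.0, (2000, 7): 30.0}
print(annual_precip_totals(records))  # {2000: 12.0}
print(exceedance_frequency(records))  # {2000: 0.25}
```

Each site's derived values could then be assembled into a feature vector and passed to a clustering algorithm such as k‐means, DBSCAN, or BIRCH; how the study itself built those vectors is not stated in the abstract.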
We present a study of the role of user profiles using fuzzy logic in web retrieval processes. Flexibility for user interaction and for adaptation in profile construction becomes an important issue. We focus our study on user profiles, including their creation, modification, storage, clustering, and interpretation. We also consider the role of fuzzy logic and other soft computing techniques in improving user profiles. Extended profiles contain additional information related to the user that can be used to personalize and customize the retrieval process as well as the web site. Web mining processes can be carried out by means of fuzzy clustering of these extended profiles and fuzzy rule construction. Fuzzy inference can be used to modify queries and to extract knowledge from profiles for marketing purposes within a web framework. An architecture of a portal that could support web mining technology is also presented.
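To make the idea of fuzzy clustering of profiles concrete, the following is a minimal sketch using the standard fuzzy c‐means update rules. The abstract does not name a specific fuzzy clustering algorithm, so fuzzy c‐means is an assumption here, as are the two-dimensional profile vectors (hypothetical interest weights for two topics), the initial centers, and the fuzzifier `m = 2`.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Degree of membership of each point in each cluster; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        # A small floor on the distance avoids division by zero when a point
        # coincides with a center.
        dists = [max(math.dist(p, c), 1e-12) for c in centers]
        row = [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Cluster centers as membership-weighted means of the points."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=20):
    """Alternate membership and center updates for a fixed number of iterations."""
    for _ in range(iterations):
        memberships = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, memberships, m)
    return fcm_memberships(points, centers, m), centers


# Hypothetical profiles: (interest in topic A, interest in topic B).
profiles = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8), (0.5, 0.5)]
memberships, centers = fuzzy_c_means(profiles, [(1.0, 0.0), (0.0, 1.0)])

for profile, row in zip(profiles, memberships):
    print(profile, [round(u, 3) for u in row])
```

Unlike a hard partition, each profile receives a degree of membership in every cluster, so a user with mixed interests, such as the last profile above, belongs partly to both groups. Membership degrees of this kind are what fuzzy rules over profiles could then be built on.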