This article attempts to derive optimal data-driven machine learning methods for forecasting average daily and monthly rainfall in Fukuoka city, Japan. The comparative study concentrates on three aspects: modelling inputs, modelling methods, and pre-processing techniques. Linear correlation analysis and average mutual information are compared to find an optimal input selection technique. For modelling the rainfall, a novel hybrid multi-model method is proposed and compared with its constituent models: the artificial neural network, multivariate adaptive regression splines, the k-nearest neighbour, and radial basis support vector regression. Each of these methods is applied to model the daily and monthly rainfall, coupled with pre-processing techniques including moving average and principal component analysis. In the first stage of the hybrid method, sub-models from each of the above methods are constructed with different parameter settings. In the second stage, the sub-models are ranked with a variable selection technique, and the higher-ranked models are selected based on the leave-one-out cross-validation error. The forecast of the hybrid model is the weighted combination of the finally selected models.
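The two-stage selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the sub-model predictions are assumed to be already available as the columns of a matrix `P`, absolute correlation with the target stands in for the unspecified variable selection technique, and the weights are fitted by ordinary least squares. The function names `loo_mse`, `select_and_weight`, and `combine` are hypothetical.

```python
import numpy as np

def loo_mse(X, y):
    """Exact leave-one-out MSE of a least-squares fit, via the hat matrix."""
    H = X @ np.linalg.pinv(X)            # hat matrix of the linear fit
    resid = y - H @ y                    # in-sample residuals
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)

def select_and_weight(P, y):
    """P: (n_samples, n_submodels) training predictions of the sub-models.

    Assumption: sub-models are ranked by |correlation| with the target,
    a stand-in for the variable selection technique named in the abstract.
    """
    scores = np.array([abs(np.corrcoef(P[:, j], y)[0, 1])
                       for j in range(P.shape[1])])
    order = np.argsort(-scores)          # best-ranked sub-model first
    # Keep the top-k sub-models, with k chosen by leave-one-out error.
    best_k = min(range(1, len(order) + 1),
                 key=lambda k: loo_mse(P[:, order[:k]], y))
    chosen = order[:best_k]
    # Assumption: combination weights come from ordinary least squares.
    weights, *_ = np.linalg.lstsq(P[:, chosen], y, rcond=None)
    return chosen, weights

def combine(P_new, chosen, weights):
    """Hybrid forecast: weighted combination of the selected sub-models."""
    return P_new[:, chosen] @ weights
```

In use, each column of `P` would hold the training-set predictions of one fitted sub-model (for example one k-nearest-neighbour model per value of k), and `combine` would be applied to the same sub-models' predictions on new inputs.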
Matrix decomposition is one of the most powerful methods in recommendation systems. In a recommendation system, we assume an incomplete matrix consisting of observed evaluation values indexed by users and items, and then predict the vacant elements of the matrix from the observed values. This method is applied in a variety of fields, e.g., movie, music, and book recommendations. In this paper, we apply matrix decomposition to predict the spread of seasonal infectious diseases. Applying the method to the case of infectious gastroenteritis caused by Norovirus in Japan, we have found that early detection and accurate prediction of the prevalence of the disease can be expected. Infectious disease spread prediction using matrix decomposition is new. To demonstrate the advantages and effectiveness of the matrix decomposition method, we also applied it to influenza spread prediction in Japan, where, unlike other prediction methods, missing observations are admitted in the computation.
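The idea of predicting vacant matrix elements from observed ones can be illustrated with a small low-rank factorization. The sketch below is an assumption-laden example rather than the method used in the paper: it uses regularized alternating least squares, marks missing entries with NaN, and the function name `complete_matrix` and its parameters `rank`, `reg`, and `iters` are hypothetical.

```python
import numpy as np

def complete_matrix(M, rank=1, reg=0.01, iters=100, seed=0):
    """Fill NaN entries of M with a low-rank fit U @ V.T.

    Assumption: alternating least squares with ridge regularization,
    fitted on the observed (non-NaN) entries only.
    """
    mask = ~np.isnan(M)                  # True where a value was observed
    n_rows, n_cols = M.shape
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(n_rows, rank))
    V = rng.normal(scale=0.1, size=(n_cols, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        # Update each row factor from the observed entries of that row.
        for i in range(n_rows):
            Vo = V[mask[i]]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ M[i, mask[i]])
        # Update each column factor from the observed entries of that column.
        for j in range(n_cols):
            Uo = U[mask[:, j]]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ M[mask[:, j], j])
    return U @ V.T
```

For disease surveillance data, the rows and columns might correspond to, say, seasons and weeks, with weeks not yet observed left as NaN; that layout is an assumption made here for illustration and is not specified in the abstract.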