The use of data analytics applications and services is growing rapidly, driven by the increasing volume of data produced by diverse sources. QoS properties of these services, such as response time and latency, are important factors in deciding which services to select. As IT infrastructure expands, energy consumption has also become a major concern. Establishing a QoS-based web service recommender system that treats energy consumption as an essential QoS property is therefore a significant step towards selecting energy-efficient web services.
This dissertation presents an experimental study of the energy consumption and latency of a set of data mining web services running on different datasets. The study shows a strong relationship between dataset properties and QoS properties. Based on these findings, we build a recommender system that considers three dimensions: user, service, and dataset. For a given dataset, the system predicts the energy consumption of the candidate services invoked by a specific user, ranks the services by their predicted energy values, and presents the ranking to the user.
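The ranking step described above can be sketched as follows. This is only a minimal illustration, assuming that lower predicted energy is preferred; the function name `rank_by_predicted_energy`, the `predict_energy` callback, and the toy service names and values are hypothetical and do not come from the dissertation.

```python
from typing import Callable, Hashable, Iterable, List, Tuple


def rank_by_predicted_energy(
    user: Hashable,
    dataset: Hashable,
    candidates: Iterable[Hashable],
    predict_energy: Callable[[Hashable, Hashable, Hashable], float],
) -> List[Tuple[Hashable, float]]:
    """Score each candidate service for (user, dataset) and sort ascending.

    Assumption: a lower predicted energy value means a better recommendation.
    """
    scored = [(service, predict_energy(user, service, dataset)) for service in candidates]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical predicted energy values keyed by (user, service, dataset).
toy_predictions = {
    ("u1", "kmeans", "iris"): 3.2,
    ("u1", "apriori", "iris"): 5.1,
    ("u1", "naive_bayes", "iris"): 1.7,
}

ranking = rank_by_predicted_energy(
    "u1",
    "iris",
    ["kmeans", "apriori", "naive_bayes"],
    lambda u, s, d: toy_predictions[(u, s, d)],
)
```

In a real system the lookup table would be replaced by the trained prediction model; the sort itself is independent of how the predictions are produced.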
We treat the task as a context-aware recommendation problem and propose three approaches to building the recommender system. The dataset is considered contextual information, and we use a context-aware matrix factorization model to predict energy values. In the first approach, we adopt the pre-filtering model, in which the contextual information serves as a query for filtering the relevant rating data. In the second approach, we propose a new method for implementing pre-filtering. In the third approach, we adopt the contextual modeling method and explore different ways of representing dataset information as contextual factors in order to investigate their impact on recommendation accuracy.
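The general pre-filtering idea behind the first approach can be sketched as follows: keep only the observations recorded on the target dataset, then fit an ordinary two-dimensional (user, service) matrix factorization on what remains. This is a generic illustration, not the dissertation's model or its new pre-filtering method; the function names, hyperparameters, and toy energy values are all assumptions.

```python
import numpy as np


def prefilter(records, dataset):
    """Use the dataset as a query: keep (user, service, energy) observed on it."""
    return [(u, s, e) for (u, s, d, e) in records if d == dataset]


def fit_mf(triples, n_users, n_services, rank=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit energy ~ mu + P[u] . Q[s] by stochastic gradient descent.

    Plain regularized matrix factorization; hyperparameters are illustrative.
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, rank))      # user factors
    Q = 0.1 * rng.standard_normal((n_services, rank))   # service factors
    mu = float(np.mean([e for _, _, e in triples]))     # global mean energy
    for _ in range(epochs):
        for u, s, e in triples:
            err = e - (mu + P[u] @ Q[s])
            pu = P[u].copy()  # keep the old user factors for the Q update
            P[u] += lr * (err * Q[s] - reg * P[u])
            Q[s] += lr * (err * pu - reg * Q[s])
    return mu, P, Q


def predict(model, u, s):
    mu, P, Q = model
    return float(mu + P[u] @ Q[s])


# Hypothetical observations: (user index, service index, dataset, energy value).
records = [
    (0, 0, "A", 2.0), (0, 1, "A", 4.0),
    (1, 0, "A", 2.5), (1, 1, "A", 4.5),
    (0, 0, "B", 9.0), (1, 1, "B", 8.0),
]

filtered = prefilter(records, "A")
model = fit_mf(filtered, n_users=2, n_services=2)
```

Exact-match filtering of this kind leaves less training data per context, which is one common motivation for contextual modeling, where the context enters the model directly instead of selecting the training data.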
We compare the proposed approaches with baseline approaches, and the results show that both the prediction accuracy and the recommendation accuracy of the proposed approaches are significantly better than those of the baseline models. We also compare the three approaches with one another to identify which fits best under different evaluation metrics.