Acknowledgments

This path would not have been possible without the help of my supervisor, Fidel Cacheda, who gave me the opportunity to dive again into the Information Retrieval field, after an unforgettable start in the Information Retrieval Lab. Thank you for being such a great motivator, for guiding and advising me in the best way and, above all, for always letting me choose the next step.

All my gratitude goes to the colleagues who have shared great moments with me in the Telematic Engineering Lab. Thank you for making our office a warm place to spend many hours and for building such a great friendship even beyond our blind walls.

I would like to acknowledge Iadh Ounis and the Information Retrieval Group from the University of Glasgow. They gave me the opportunity to spend three months in 2012 in one of the leading groups in IR, where I came into contact with an impeccable way of working. My special gratitude goes to Craig Macdonald: thank you for becoming my best teacher and reference throughout my PhD. I will never forget some of your encouraging words, which became my motto: Don't think "is this sufficient?" but "how can we do better?". All this work would not have been possible without your valuable help. Special gratitude also goes to Silvia Lorenzo Freire, for sharing her vast knowledge with me and offering her useful help.

I would like to especially thank all the members of the High

I cannot forget Roi Blanco, who was discreetly present at every milestone of my career. Thank you also for opening up the next learning opportunity for me; I will make the most of it.

A great deal of gratitude is due to all the people who have walked with me during this period. Your encouraging and warm-hearted words were the best incentive to finish this work. Those who kept me away from this thesis, sharing awesome moments and messages, have also done a good job.

Abstract

Web search engines have to deal with a rapid increase of information, demanded by high incoming query traffic. This situation has driven companies to build geographically distributed data centres housing thousands of computers, consuming enormous amounts of electricity and requiring a huge surrounding infrastructure. At this scale, even minor efficiency improvements result in large financial savings.

This thesis represents a novel contribution to the state of the art in query scheduling and power consumption, helping large-scale data centres build more efficient search engines.

On the one hand, this thesis proposes new scheduling techniques to decrease the response time of queries, by estimating which server will become idle soonest.

On the other hand, this thesis defines a simple mathematical model that establishes a threshold between the power and latency of a search engine. Using historical and current data, the model estimates the incoming query traffic and automatically increases or decreases the number of active machines in the system as needed. We achieve high energy savings throughout the whole day, without degrading the latency.

Our experiments have attested th...