Abstract-This paper uses Big Data and Machine Learning for the real-time management of Internet-scale Quality-of-Service (QoS) route optimisation with an overlay network. Based on data sampled every 2 minutes over a large number of source-destination pairs, we show that intercontinental Internet Protocol (IP) paths are far from optimal with respect to QoS metrics such as end-to-end round-trip delay. We therefore develop a machine-learning-based scheme that exploits large-scale data collected from communicating node pairs in a multi-hop overlay network that uses IP between the overlay nodes, and selects paths that provide substantially better QoS than IP. Inspired by the Cognitive Packet Network protocol, it uses Random Neural Networks with Reinforcement Learning, based on the massive data that is collected, to select intermediate overlay hops. The routing scheme is illustrated on a 20-node intercontinental overlay network that collects some 2 × 10^6 measurements per week and makes scalable distributed routing decisions. Experimental results show that this approach improves QoS significantly and efficiently.
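To make the idea of routing through an intermediate overlay hop concrete, the following minimal sketch compares the direct IP path between two overlay nodes with every one-hop detour through a third node and keeps the lowest round-trip delay. It is only an illustration under stated assumptions: the node names, the delay values in `RTT_MS`, and the function `best_overlay_path` are hypothetical, and the exhaustive comparison stands in for, and is not, the Random Neural Network with Reinforcement Learning scheme described in this paper.

```python
from typing import Dict, List, Tuple

# Hypothetical round-trip delays in milliseconds between three overlay nodes.
# These values are illustrative only; they are not measurements from the paper.
RTT_MS: Dict[str, Dict[str, float]] = {
    "A": {"B": 300.0, "C": 80.0},
    "B": {"A": 300.0, "C": 90.0},
    "C": {"A": 80.0, "B": 90.0},
}


def best_overlay_path(
    rtt: Dict[str, Dict[str, float]], src: str, dst: str
) -> Tuple[List[str], float]:
    """Return the lowest-delay path from src to dst using at most one relay.

    Assumes rtt holds a delay for every ordered pair of distinct nodes.
    """
    # Start from the direct IP path as the baseline.
    best_path = [src, dst]
    best_delay = rtt[src][dst]

    # Try every other overlay node as a single intermediate hop.
    for relay in rtt:
        if relay in (src, dst):
            continue
        delay = rtt[src][relay] + rtt[relay][dst]
        if delay < best_delay:
            best_path, best_delay = [src, relay, dst], delay

    return best_path, best_delay


path, delay = best_overlay_path(RTT_MS, "A", "B")
print(path, delay)  # ['A', 'C', 'B'] 170.0
```

With these assumed delays, the detour through node C (80 ms + 90 ms) beats the direct A to B path (300 ms). An exhaustive search like this requires measuring every candidate relay, which is the cost a learning-based selection of intermediate hops aims to avoid at larger scale.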