In this work, we consider the problem of fault localization in transparent optical networks. We localize single-link failures using statistical machine learning techniques trained on data that describe the network state during current and past failure incidents. In particular, a Gaussian process classifier is trained on historical data extracted from the examined network to model and predict the failure probability of each link. To limit the set of suspect links for each failure incident, the classifier is complemented by a graph-based correlation heuristic. The proposed approach is tested on several datasets generated for an optical network based on orthogonal frequency-division multiplexing and achieves high localization accuracy (91%-99%), which is only marginally affected as the size of the historical dataset is reduced. The approach is also compared with a conventional fault localization method based on monitoring information. The conventional method is shown to significantly increase the network cost, measured as the number of monitoring nodes required to match the accuracy of the proposed approach. Service providers can therefore use the proposed scheme to reduce the network cost of fault localization. Because the approach is generic and does not depend on a specific network technology, it can be applied to other network types, e.g., fixed-grid or space-division multiplexing elastic optical networks.
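The two-stage idea (narrow the suspect set on the graph, then rank the remaining links by predicted failure probability) can be illustrated with a minimal sketch. The abstract does not specify the correlation heuristic, so the version below is an assumption: it keeps only links shared by all failed lightpaths and discards links carried by any lightpath that is still working. The function names, the toy topology, and the probability values are hypothetical; the probabilities stand in for the output of a trained Gaussian process classifier and are not results from this work.

```python
from typing import Dict, Iterable, List, Sequence, Set, Tuple

Link = Tuple[str, str]  # undirected link, stored with its endpoints sorted


def path_links(path: Sequence[str]) -> Set[Link]:
    """Return the undirected links traversed by a path given as a node sequence."""
    return {tuple(sorted(pair)) for pair in zip(path, path[1:])}


def suspect_links(
    failed_paths: Iterable[Sequence[str]],
    working_paths: Iterable[Sequence[str]],
) -> Set[Link]:
    """Assumed single-link-failure heuristic: a suspect link lies on every
    failed path and on no working path."""
    failed_sets = [path_links(p) for p in failed_paths]
    if not failed_sets:
        return set()
    suspects = set.intersection(*failed_sets)
    for p in working_paths:
        suspects -= path_links(p)
    return suspects


def rank_suspects(suspects: Set[Link], failure_prob: Dict[Link, float]) -> List[Link]:
    """Order suspects by descending failure probability (ties broken by name)."""
    return sorted(suspects, key=lambda link: (-failure_prob.get(link, 0.0), link))


# Toy incident: two lightpaths fail, one keeps working.
failed = [["A", "B", "C", "D"], ["E", "B", "C", "D"]]
working = [["A", "B", "E"]]

# Placeholder per-link probabilities (hypothetical classifier output).
failure_prob = {("B", "C"): 0.2, ("C", "D"): 0.7}

suspects = suspect_links(failed, working)
ranked = rank_suspects(suspects, failure_prob)
print(ranked)  # [('C', 'D'), ('B', 'C')]
```

In this sketch the graph step reduces the candidates to the two links shared by both failed lightpaths, and the probability ranking then decides which of them to inspect first.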