The trade-off between growing user bandwidth demand and quality of service requirements introduces unprecedented challenges to next-generation smart optical networks. In this regard, optical performance monitoring (OPM) and modulation format identification (MFI) techniques have become a common need to enable the development of next-generation autonomous optical networks with ultra-low latency and self-adaptability. Recently, machine learning (ML)-based techniques have emerged as a vital solution to many challenging aspects of OPM and MFI in terms of reliability, quality, and implementation efficiency. This paper surveys ML-based OPM and MFI techniques proposed in the literature. First, we address the key advantages of employing ML algorithms in optical networks. Then, we review the main optical impairments and modulation formats that are monitored and classified, respectively, using ML algorithms. Additionally, we discuss the current status of optical networks in terms of MFI and OPM, including standards, monitoring parameters, and the available commercial products with their limitations. Second, we provide a comprehensive review of the available ML-based techniques for MFI, OPM, and joint MFI/OPM, describing their performance, advantages, and limitations. Third, we give an overview of the existing ML-based OPM and MFI techniques for emerging optical networks, such as the new fiber-based networks that use future space-division multiplexing techniques (e.g., few-mode fiber), hybrid radio-over-fiber networks, and free-space optical networks. Finally, we discuss the open issues, potential future research directions, and recommendations for the potential implementation of ML-based OPM and MFI techniques. Lessons learned are presented after each section throughout the paper to help the reader identify the gaps, weaknesses, and strengths in this field.