With the amount of global network traffic steadily increasing, mainly due to video streaming services, network operators face the challenge of managing their resources efficiently while meeting customer demands and expectations. A prerequisite for such Quality of Experience (QoE)-driven network traffic management is the monitoring and inference of application-level performance in terms of video Key Performance Indicators (KPIs) that directly influence end-user QoE. Given the persistent adoption of end-to-end encryption, operators lack direct insight into video quality metrics such as start-up delays, resolutions, or stalling events, which are needed to adequately estimate QoE and drive resource management decisions. Numerous solutions have been proposed to tackle this challenge for individual use cases, most of them relying on machine learning (ML) to infer KPIs from observable traffic patterns and statistics. In this paper, we summarize the key findings of state-of-the-art research on the topic. Going beyond previous work, we devise the concept of a generic framework for ML-based QoE/KPI monitoring of HTTP adaptive streaming (HAS) services, including model training, deployment, and re-evaluation. Components of the framework are designed in a generic way, independent of any particular streaming service and platform. The methodology for applying the different framework components is discussed across various use cases. In particular, we demonstrate the applicability of the framework in a concrete use case involving the YouTube service delivered to smartphones via the mobile YouTube app, as this is one of the most prominent ways of accessing YouTube. We tackle both QoE/KPI estimation on a per-video-session level (utilizing the validated ITU-T P.1203 QoE model) and "real-time" KPI estimation over short time intervals.
The obtained results provide important insights into, and highlight challenges related to, the deployment of a generic in-network QoE monitoring framework for encrypted video streams.

INDEX TERMS Quality of Experience (QoE), video streaming, in-network QoE estimation, machine learning, encrypted traffic.