Network operators require real-time traffic monitoring insights to provide high performance and security to their customers. Artificial intelligence and machine learning (ML) have been shown to improve the visibility of telemetry systems, especially for encrypted traffic. However, current solutions cannot cope with the high traffic rates and volumes of large-scale networks. To realize the ML-driven network intelligence paradigm at terabit scale, we design Marina, a system that splits monitoring between a highly efficient data plane, which extracts traffic statistics at line rate, and a powerful ML server, which runs monitoring inference using complex ML models. We apply temporal microaggregation into sub-second time slots and extract moment-based statistics. These statistics make it possible to flexibly obtain accurate ML-based monitoring decisions during the next time slot. To demonstrate the scalability of our design, we implement and evaluate a Marina data plane prototype on a Barefoot Wedge 100BF-65X P4 switch, which can monitor more than 520,000 concurrent flows at the full switching capacity of 6.4 Tbps. We validate the analytics capabilities enabled by our Marina implementation on four ML-driven real-time monitoring tasks with a broad set of standard ML models, achieving results comparable to or better than the state of the art.
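
As an illustration of temporal microaggregation with moment-based statistics, the following minimal Python sketch accumulates per-flow power sums of packet sizes in sub-second slots and then derives the mean, variance, and skewness from them. It is not the Marina implementation: the slot length of 0.1 s, the choice of packet size as the measured quantity, the maximum moment order of 3, and the function names `aggregate_power_sums` and `moments_from_power_sums` are all assumptions made for this example.

```python
from collections import defaultdict

SLOT_SECONDS = 0.1  # assumed sub-second slot length (hypothetical value)
MAX_ORDER = 3       # assumed highest power sum kept per flow (hypothetical value)


def aggregate_power_sums(packets, slot_seconds=SLOT_SECONDS, max_order=MAX_ORDER):
    """Accumulate power sums of packet sizes per (slot, flow).

    packets: iterable of (timestamp_seconds, flow_id, size_bytes).
    Returns {(slot_index, flow_id): [n, sum(x), sum(x^2), ..., sum(x^max_order)]}.
    Only additions and multiplications are needed per packet.
    """
    sums = defaultdict(lambda: [0] * (max_order + 1))
    for timestamp, flow_id, size in packets:
        slot = int(timestamp // slot_seconds)
        acc = sums[(slot, flow_id)]
        for k in range(max_order + 1):
            acc[k] += size ** k  # k = 0 counts packets
    return dict(sums)


def moments_from_power_sums(acc):
    """Turn power sums [n, s1, s2, s3] into count, mean, variance, skewness."""
    n, s1, s2, s3 = acc[:4]
    mean = s1 / n
    variance = s2 / n - mean ** 2  # population variance
    # Third central moment expressed through raw moments.
    m3 = s3 / n - 3 * mean * (s2 / n) + 2 * mean ** 3
    skewness = m3 / variance ** 1.5 if variance > 0 else 0.0
    return {"count": n, "mean": mean, "variance": variance, "skewness": skewness}


if __name__ == "__main__":
    # Synthetic packets: (timestamp in seconds, flow id, size in bytes).
    packets = [(0.01, "a", 100), (0.05, "a", 300), (0.12, "a", 200), (0.03, "b", 50)]
    for key, acc in sorted(aggregate_power_sums(packets).items()):
        print(key, moments_from_power_sums(acc))
```

In this sketch the per-packet work consists only of integer additions and multiplications, while the divisions and the square root are deferred to a later step. That split is one plausible reading of a data plane feeding an ML server, not a detail confirmed by the text above.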