Traffic Classification (TC) systems identify the applications that generate network traffic. Recent advances in TC leverage Deep Learning (DL) techniques, which surpass traditional methods in complex scenarios, including those with encrypted traffic. Notably, state-of-the-art DL-based TC systems have been developed for wireless networks using Physical Layer (L1) packets. This approach overcomes the common assumption in TC research that traffic flows within a wired network under a single network management domain. Despite their benefits, DL-based TC systems often demand significant computational resources, typically available only in cloud environments. Because these models are originally trained and optimized for high-resource environments, deploying them at the edge is often infeasible. The challenge is therefore to adapt these systems to edge computing scenarios, including deployment at access points. In this paper, we propose a novel methodology that combines expert knowledge with recent advances in Multi-Task Learning (MTL) and Deep Neural Network (DNN) optimization to allow spectrum-based TC systems to run on constrained devices. Performance evaluations on an NVIDIA Jetson TX2 demonstrate that our most optimized MTL model, handling four TC tasks, reduces memory requirements by a factor of 2.65 and execution time by a factor of 3.6 compared to sequential execution of four Single-Task Learning (STL) models in a server-grade configuration, with minimal accuracy impact (a drop of less than 0.5%) and an energy efficiency of 0.97 millijoules per sample at inference. Compared to other edge platforms, such as the Raspberry Pi model 3B+ (RPI3B+) with a low-power Artificial Intelligence (AI) accelerator such as the Coral Tensor Processing Unit (TPU), the NVIDIA Jetson achieves a 12-fold improvement in energy efficiency with no impact on accuracy.
These are the first available results to benchmark multiple performance metrics (memory, computation, and energy) across heterogeneous constrained devices for this type of TC system.