In conventional cloud systems, large volumes of streamed data are sent to data centres for monitoring, storage, and analytics. However, migrating all the data to the cloud is often not feasible due to cost, privacy, and performance concerns. Deep neural network (DNN) based applications typically require a tremendous amount of computation and hence cannot be directly deployed on resource-constrained edge devices for learning and analytics. Edge-enhanced compressive offloading offers a sustainable solution that allows data to be compressed at the edge and offloaded to the cloud for further analytics, thereby reducing bandwidth consumption and communication latency. However, it poses a unique challenge: decoding the data and performing inference at the server side within an acceptable quality-of-service (QoS) limit. This paper makes a first contribution towards addressing this gap by designing and implementing a principled compression learning method for discovering compression models that offer the right QoS for an application. The method relies on a novel modularisation approach that maps features to models and classifies them against a range of QoS models. An automated QoS-aware orchestrator selects the best autoencoder model in real time for compressive offloading in edge-enhanced clouds, based on changing QoS requirements, and has diagnostic capabilities to search for the parameters that give the best compression. To our knowledge, this is one of the first attempts at harnessing the capabilities of autoencoders for edge-enhanced compressive offloading based on portable encodings, latent space splitting, and fine-tuning of network weights. Because the discoverable pool of features offers a variety of QoS models, the system can process a large number of QoS requests in a given time.
The search strategy narrows the neural architecture search space, reducing the computational cost of an exhaustive search by up to 89%. When deployed on an edge-enhanced cloud in an Azure IoT testbed, the approach saves up to 70% in data transfer costs and completes jobs in 32% less time. It also eliminates the additional computational cost of decompression, thereby reducing processing costs by up to 30%.
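To make the offloading pattern concrete, the following is a minimal sketch of splitting an autoencoder so the encoder runs on the edge device and the decoder runs on the server. The dense layers, dimensions, and random weights here are illustrative assumptions, not the trained models or parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

INPUT_DIM = 128   # size of one data frame at the edge (assumed)
LATENT_DIM = 16   # compressed encoding sent over the network (assumed)

# Latent space splitting: the edge device holds only the encoder weights...
W_enc = rng.standard_normal((INPUT_DIM, LATENT_DIM)) * 0.1
# ...while the cloud server holds only the decoder weights.
W_dec = rng.standard_normal((LATENT_DIM, INPUT_DIM)) * 0.1

def encode_at_edge(frame: np.ndarray) -> np.ndarray:
    """Compress a frame to its latent code before offloading."""
    return np.tanh(frame @ W_enc)

def decode_at_server(code: np.ndarray) -> np.ndarray:
    """Reconstruct on the server, or feed the code directly to inference."""
    return np.tanh(code @ W_dec)

frame = rng.standard_normal(INPUT_DIM)
code = encode_at_edge(frame)            # only this crosses the network
restored = decode_at_server(code)

# Bandwidth saved per frame: LATENT_DIM floats instead of INPUT_DIM.
ratio = 1 - LATENT_DIM / INPUT_DIM
print(f"compression ratio: {ratio:.0%}")
```

Server-side inference can also consume the latent code directly, which is what removes the decompression step counted in the 30% processing-cost saving.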