Ensemble-based collaborative inference systems, Edge Ensembles, are deep learning edge inference systems that enhance accuracy by aggregating the predictions of models deployed on each device. They offer several advantages, including scalability with task complexity and decentralized operation without dependence on a centralized server. Ensemble methods are generally effective at improving the accuracy of deep learning, and prior research has developed several model integration techniques for deep learning ensembles. Some of these existing integration methods are more effective than those used in previous Edge Ensembles. However, it remains unclear whether these methods can be applied directly to cooperative inference systems composed of multiple edge devices. This study investigates the effectiveness of conventional model integration techniques, namely cascade, weighted averaging, and test-time augmentation (TTA), when applied to Edge Ensembles to enhance their performance. Furthermore, we propose enhancements of these techniques tailored to Edge Ensembles. The cascade reduces the number of models required for inference but degrades latency because the models are processed sequentially. To address this latency issue, we propose the m-parallel cascade, which sets the number of models processed simultaneously to m. We also propose learning TTA policies and the weights for weighted averaging using ensemble prediction labels instead of ground-truth labels. In the experiments, we verified the effectiveness of each technique for Edge Ensembles. The proposed m-parallel cascade achieved a 2.8 times reduction in latency compared with the conventional cascade, at the cost of only a 1.06 times increase in computational cost. In addition, ensemble-label-based learning was as effective as the approach using ground-truth labels.

INDEX TERMS Ensemble, edge computing, collaborative inference, neural networks, cascade, test-time augmentation.
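To make the m-parallel cascade idea concrete, the following Python sketch (not taken from the paper; the function name m_parallel_cascade, the confidence threshold tau, and softmax averaging as the aggregation rule are assumptions for illustration) evaluates the ensemble's models in groups of m and stops as soon as the running averaged prediction is confident enough.

```python
import numpy as np

def m_parallel_cascade(models, x, m=2, tau=0.8):
    """Illustrative m-parallel cascade: evaluate models in groups of m,
    average their softmax outputs, and stop once the running ensemble's
    top-class confidence reaches the threshold tau."""
    probs_sum = None
    n_used = 0
    avg = None
    for start in range(0, len(models), m):
        group = models[start:start + m]
        # In an Edge Ensemble these m inferences would run concurrently on
        # different devices; here they are evaluated in a simple loop.
        for model in group:
            p = model(x)  # assumed to return a softmax probability vector
            probs_sum = p if probs_sum is None else probs_sum + p
            n_used += 1
        avg = probs_sum / n_used
        if avg.max() >= tau:  # early exit when confident enough
            break
    return avg, n_used

# Toy usage with stand-in "models" that return fixed class probabilities.
dummy = [lambda x, p=p: np.asarray(p) for p in
         ([0.6, 0.3, 0.1], [0.7, 0.2, 0.1], [0.9, 0.05, 0.05], [0.8, 0.1, 0.1])]
pred, used = m_parallel_cascade(dummy, x=None, m=2, tau=0.75)
```

Under these assumptions, m = 1 recovers the conventional sequential cascade and m equal to the ensemble size reduces to a plain full ensemble; intermediate values trade a modest increase in computation for fewer sequential rounds and hence lower latency.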