Autonomous, self-driving vehicles have become a popular choice among customers for their convenience. Real-time vehicle angle prediction is one of the most prominent topics in the autonomous driving industry. However, existing vehicle angle prediction methods rely on single-modal data alone, such as images captured by a camera, which limits the performance and efficiency of the prediction system. In this paper, we present Emma, a novel vehicle angle prediction strategy that performs multi-modal prediction and is more efficient. Specifically, Emma exploits both images and inertial measurement unit (IMU) signals, using a fusion network for multi-modal data fusion and vehicle angle prediction. Moreover, we design and implement a few-shot learning module in Emma for fast domain adaptation to varied scenarios (e.g., different vehicle models). Evaluation results demonstrate that Emma achieves an overall accuracy of 97.5% in predicting three vehicle angle parameters (yaw, pitch, and roll), outperforming traditional single-modality approaches by approximately 16.7%-36.8%. Additionally, the few-shot learning module shows promising adaptive ability, reaching overall accuracies of 79.8% and 88.3% in the 5-shot and 10-shot settings, respectively. Finally, empirical results show that Emma reduces energy consumption by 39.7% when running on an Arduino UNO board.
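To make the idea of fusing image and IMU inputs concrete, the following is a minimal sketch of one common pattern, feature-level fusion by concatenation. It is not Emma's actual architecture, which the abstract does not specify. The class name FusionSketch, the layer sizes, the randomly initialized (untrained) weights, and the choice of a three-output regression head for yaw, pitch, and roll are all assumptions made for illustration.

```python
import random


def linear(weights, bias, x):
    """Apply a dense layer y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def init_layer(rng, n_out, n_in):
    """Return (weights, bias) with small random weights and zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class FusionSketch:
    """Hypothetical two-branch model: encode each modality, then concatenate."""

    def __init__(self, image_dim, imu_dim, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        self.image_enc = init_layer(rng, hidden_dim, image_dim)
        self.imu_enc = init_layer(rng, hidden_dim, imu_dim)
        # The head sees both encodings side by side (assumed fusion scheme).
        self.head = init_layer(rng, 3, 2 * hidden_dim)

    def predict(self, image_features, imu_features):
        img = relu(linear(*self.image_enc, image_features))
        imu = relu(linear(*self.imu_enc, imu_features))
        fused = img + imu  # list concatenation, not element-wise addition
        yaw, pitch, roll = linear(*self.head, fused)
        return {"yaw": yaw, "pitch": pitch, "roll": roll}


# Placeholder inputs: a 16-value image feature vector and a 6-axis IMU sample
# (3 accelerometer + 3 gyroscope readings); the values are made up.
model = FusionSketch(image_dim=16, imu_dim=6)
angles = model.predict([0.5] * 16, [0.1, -0.2, 9.8, 0.01, 0.0, -0.01])
```

Because the weights are untrained, the returned angles are meaningless. The sketch only shows how two modalities with different dimensionality can feed a single prediction head.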