Metaverses embedded in our lives create virtual experiences within the physical world. Moving towards metaverses in aircraft maintenance, mixed reality (MR) creates enormous opportunities for interaction with virtual aircraft (digital twins) that deliver a near-real experience while maintaining physical distancing during pandemics. 3D twins of modern machines exported to MR can be easily manipulated, shared, and updated, which offers substantial benefits for aviation colleges that still rely on retired aircraft for practical training. Therefore, we propose a mixed reality education and training system for Boeing 737 aircraft maintenance on smart glasses, enhanced with a deep learning speech interaction module that allows trainee engineers to control virtual assets and workflow using speech commands, keeping both hands free for the task. Using a convolutional neural network (CNN) architecture for audio feature extraction together with classification parts for command and language identification, the speech module handles intermixed requests in English and Korean and provides corresponding feedback. Evaluation on test data showed high prediction accuracy, with average F1-scores of 95.7% and 99.6% for command and language prediction, respectively. The proposed speech interaction module further improves education and training in the aircraft maintenance metaverse, providing intuitive and efficient control over operations and enhancing interaction with virtual objects in mixed reality.
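The abstract describes a CNN that extracts audio features and feeds two classification parts, one for the spoken command and one for the language (English or Korean). The sketch below is a minimal illustration of that two-head pattern, not the authors' exact architecture: the framework (PyTorch), the log-mel input shape, the layer sizes, and the class counts (`n_commands`, `n_languages`) are all illustrative assumptions.

```python
# Minimal sketch of a CNN speech classifier with shared audio features
# and two output heads: command prediction and language identification.
# All shapes and hyperparameters are assumptions, not the paper's values.
import torch
import torch.nn as nn


class SpeechCommandNet(nn.Module):
    def __init__(self, n_mels: int = 40, n_commands: int = 20, n_languages: int = 2):
        super().__init__()
        # Shared convolutional feature extractor over log-mel spectrograms
        # shaped (batch, 1, n_mels, time_frames).
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),  # fixed-size features for variable-length utterances
        )
        # Two classification heads share the same audio features.
        self.command_head = nn.Linear(32 * 4 * 4, n_commands)
        self.language_head = nn.Linear(32 * 4 * 4, n_languages)

    def forward(self, spectrogram: torch.Tensor):
        x = self.features(spectrogram).flatten(1)
        return self.command_head(x), self.language_head(x)


if __name__ == "__main__":
    model = SpeechCommandNet()
    dummy = torch.randn(8, 1, 40, 101)  # batch of 8 short utterances (assumed length)
    command_logits, language_logits = model(dummy)
    print(command_logits.shape, language_logits.shape)  # torch.Size([8, 20]) torch.Size([8, 2])
```

A shared feature extractor with separate heads is one common way to predict the command and the language from a single forward pass; the paper's actual module may differ in structure and training details.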