As Extended Reality (XR) continues to grow, new possibilities arise to provide users with novel ways to experience cultural heritage (CH). In particular, applications based on Virtual Reality (VR), such as virtual museums, have gained increasing popularity, since they can offer ways of preserving and presenting CH content that are not feasible in physical museums. Despite the numerous benefits, the level of immersion and presence provided by VR experiences still presents challenges that could hinder the effectiveness of this technology in the CH context. From this perspective, it is crucial to provide users with high-fidelity experiences in which the interaction with the objects and characters populating the virtual environments is also realistic and natural. This paper focuses on this challenge and specifically investigates how the combined use of tangible and speech interfaces can help improve the overall experience. To this aim, an immersive VR experience is proposed, which allows users to manipulate virtual objects belonging to a museum collection (in this specific case, Ancient Egyptian artifacts) by physically operating on 3D-printed replicas, and to talk with a curator’s avatar using their voice to get explanations. A user study was conducted to evaluate the impact of the considered interfaces on immersion, presence, user experience, usability, and intention to visit, comparing the richest configuration against simpler setups obtained by removing the tangible interface, the speech interface, or both (and using only handheld controllers). The results showed that the combined use of the two interfaces can effectively contribute to making the CH experience in VR more engaging.