We propose a three-dimensional (3D) multimodal medical imaging system that combines freehand ultrasound and structured light 3D reconstruction in a single coordinate system without requiring registration. To the best of our knowledge, these techniques have not been combined as a multimodal imaging technique. The system complements the internal 3D information acquired with ultrasound with the external surface measured with the structured light technique. Moreover, the ultrasound probe's optical tracking for pose estimation was implemented based on a convolutional neural network. Experimental results show the system's high accuracy and reproducibility, as well as its potential for preoperative and intraoperative applications. The experimental multimodal error, or the distance from two surfaces obtained with different modalities, was 0.12 mm. The code is available in a Github repository. © 2021 Society of Photo-Optical Instrumentation Engineers (SPIE)