Finger-vein recognition, which is one of the conventional biometrics, hinders fake attacks, is cheaper, and it features a higher level of user-convenience than other biometrics because it uses miniaturized devices. However, the recognition performance of finger-vein recognition methods may decrease due to a variety of factors, such as image misalignment that is caused by finger position changes during image acquisition or illumination variation caused by non-uniform near-infrared (NIR) light. To solve such problems, multimodal biometric systems that are able to simultaneously recognize both finger-veins and fingerprints have been researched. However, because the image-acquisition positions for finger-veins and fingerprints are different, not to mention that finger-vein images must be acquired in NIR light environments and fingerprints in visible light environments, either two sensors must be used, or the size of the image acquisition device must be enlarged. Hence, there are multimodal biometrics based on finger-veins and finger shapes. However, such methods recognize individuals that are based on handcrafted features, which present certain limitations in terms of performance improvement. To solve these problems, finger-vein and finger shape multimodal biometrics using near-infrared (NIR) light camera sensor based on a deep convolutional neural network (CNN) are proposed in this research. Experimental results obtained using two types of open databases, the Shandong University homologous multi-modal traits (SDUMLA-HMT) and the Hong Kong Polytechnic University Finger Image Database (version 1), revealed that the proposed method in the present study features superior performance to the conventional methods.