At present, many diseases are diagnosed by computer tomography (CT) image technology, which affects the health of the lives of millions of people. In the process of disease confrontation, it is very important for patients to detect diseases in the early stage by deep learning of 3D CT images. The paper offers a hybrid multi-instance learning model (HLFSRNN-MIL), which hybridizes high-low frequency feature fusion (HLFFF) with sequential recurrent neural network (SRNN) for CT image classification tasks. Firstly, the hybrid model uses Resnet-50 as the deep feature. The main feature of the HLFSRNN-MIL lies in its ability to make full use of the advantages of the HLFFF and SRNN methods to make up for their own weakness; i.e., the HLFFF can extract more targeted feature information to avoid the problem of excessive gradient fluctuation during training, and the SRNN is used to process the time-related sequences before classification. The experimental study of the HLFSRNN-MIL model is on two public CT datasets, namely, the Cancer Imaging Archive (TCIA) dataset on lung cancer and the China Consortium of Chest CT Image Investigation (CC-CCII) dataset on pneumonia. The experimental results show that the model exhibits better performance and accuracy. On the TCIA dataset, HLFSRNN-MIL with Residual Network (ResNet) as the feature extractor achieves an accuracy (ACC) of 0.992 and an area under curve (AUC) of 0.997. On the CC-CCII dataset, HLFSRNN-MIL achieves an ACC of 0.994 and an AUC of 0.997. Finally, compared with the existing methods, HLFSRNN-MIL has obvious advantages in all aspects. These experimental results demonstrate that HLFSRNN-MIL can effectively solve the disease problem in the field of 3D CT images.