Medical datasets are usually imbalanced, with negative cases severely outnumbering positive cases, so it is essential to address this data skew when training machine learning algorithms. This study uses two representative lung cancer datasets, PLCO and NLST, with imbalance ratios (the number of samples in the majority class divided by the number in the minority class) of 24.7 and 25.0, respectively, to predict lung cancer incidence. It compares the performance of 23 class imbalance methods (resampling and hybrid systems) combined with three classical classifiers (logistic regression, random forest, and LinearSVC) to identify the imbalance techniques best suited to medical datasets. The resampling methods comprise ten under-sampling methods (RUS, etc.), seven over-sampling methods (SMOTE, etc.), and two integrated sampling methods (SMOTEENN and SMOTE-Tomek). The hybrid systems include Balanced Bagging, among others. The results show that class imbalance learning can improve the classification ability of the model. Compared with the other imbalance techniques, under-sampling techniques have the highest standard deviation (SD) and over-sampling techniques have the lowest. Over-sampling is a stable approach, and its AUC is generally higher than that of the other techniques. Combined with random over-sampling (ROS), the random forest achieves the best predictive performance and is the most suitable choice for the lung cancer datasets used in this study.
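
To make the imbalance ratio and random over-sampling (ROS) concrete, the following is a minimal sketch in plain Python. It is not the study's implementation: the function names `imbalance_ratio` and `random_over_sample` are hypothetical, and the toy data (50 negatives, 2 positives) is invented for illustration and has no connection to PLCO or NLST. ROS is taken here in its usual sense of duplicating randomly chosen minority-class samples until the classes are balanced.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_over_sample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many samples as the majority class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows belonging to this class in the original data.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw with replacement; the majority class draws zero extras.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy data (assumption, for illustration only): 50 negatives, 2 positives.
X = [[i] for i in range(52)]
y = [0] * 50 + [1] * 2

ratio_before = imbalance_ratio(y)
X_res, y_res = random_over_sample(X, y)
ratio_after = imbalance_ratio(y_res)
```

In a real experiment the resampling step would normally be applied only to the training split, so that the classifier is evaluated on data with the original class distribution.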