Ear images are easy to capture, and ear features are relatively stable and can be used for identification. The ear images are all asymmetric, and the asymmetry of the ear images collected in the unconstrained environment will be more pronounced, increasing the recognition difficulty. Most recognition methods based on hand-crafted features perform poorly in terms of recognition performance in the face of ear databases that vary significantly in terms of illumination, angle, occlusion, and background. This paper proposes a feature fusion human ear recognition method based on channel features and dynamic convolution (CFDCNet). Based on the DenseNet-121 model, the ear features are first extracted adaptively by dynamic convolution (DY_Conv), which makes the ear features of the same class of samples more aggregated and different types of samples more dispersed, enhancing the robustness of the ear feature representation. Then, by introducing an efficient channel attention mechanism (ECA), the weights of important ear features are increased and invalid features are suppressed. Finally, we use the Max pooling operation to reduce the number of parameters and computations, retain the main ear features, and improve the model’s generalization ability. We performed simulations on the AMI and AWE human ear datasets, achieving 99.70% and 72.70% of Rank-1 (R1) recognition accuracy, respectively. The recognition performance of this method is significantly better than that of the DenseNet-121 model and most existing human ear recognition methods.