Photoacoustic computed tomography (PACT) offers centimeter‐level imaging capability and can be used to image the human body. However, strong photoacoustic signals from the skin obscure information from deep tissue, hindering the frontal display and analysis of photoacoustic images of deep regions of interest. We therefore propose a 2.5D deep learning model, based on a feature pyramid structure and single‐type skin annotation, to extract the skin region, and design a mask generation algorithm to remove the skin automatically. PACT imaging experiments on human peripheral blood vessels verified the correctness of the proposed skin‐removal method. Compared with previous studies, our method is highly robust to uneven illumination, irregular skin boundaries, and reconstruction artifacts in the images; the reconstruction errors of PACT images decreased by 20% to 90%, with a simultaneous 1.65 dB improvement in the signal‐to‐noise ratio. This study may provide a promising route to high‐definition PACT imaging of deep tissues.
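To illustrate why removing the skin matters for frontal display, the following is a minimal sketch and not the mask generation algorithm of this work. It assumes a binary skin segmentation is already available (for example, from a segmentation model), that axis 0 of the volume is depth with index 0 nearest the surface, and that the frontal view is a maximum amplitude projection along depth. The function names `skin_removal_mask` and `frontal_map` and the `margin` parameter are hypothetical.

```python
import numpy as np

def skin_removal_mask(skin_seg: np.ndarray, margin: int = 0) -> np.ndarray:
    """Build a keep-mask from a binary skin segmentation (illustrative only).

    skin_seg: bool array of shape (depth, height, width), True where skin
    was labelled. Axis 0 is assumed to be depth, index 0 nearest the surface.
    Returns a bool array of the same shape, True for voxels to keep.
    """
    depth = skin_seg.shape[0]
    has_skin = skin_seg.any(axis=0)
    # Index of the deepest skin voxel in each (y, x) column.
    deepest = depth - 1 - np.argmax(skin_seg[::-1], axis=0)
    # Columns without any skin label keep every voxel (cutoff of -1).
    cutoff = np.where(has_skin, deepest + margin, -1)
    z = np.arange(depth)[:, None, None]
    # Keep only voxels strictly below the skin layer (plus optional margin).
    return z > cutoff[None, :, :]

def frontal_map(volume: np.ndarray, keep: np.ndarray) -> np.ndarray:
    """Maximum amplitude projection along depth over the kept voxels."""
    return np.max(np.where(keep, np.abs(volume), 0.0), axis=0)

# Toy volume: column 0 has a bright skin layer at depths 0-1, column 1 has none.
volume = np.array([[[10.0, 1.0]],
                   [[9.0, 5.0]],
                   [[2.0, 0.0]],
                   [[3.0, 0.0]]])
skin_seg = np.zeros(volume.shape, dtype=bool)
skin_seg[0:2, 0, 0] = True

keep = skin_removal_mask(skin_seg)
with_skin = np.max(np.abs(volume), axis=0)   # projection dominated by skin
without_skin = frontal_map(volume, keep)     # projection after skin removal
```

In the toy volume, the projection of column 0 is dominated by the skin signal until the mask is applied, after which the weaker deep signal becomes visible; column 1, which has no skin label, is left unchanged.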