Edible Bird’s Nest (EBN), a costly food product made from swiftlet saliva, has long posed the problem of removing swiftlet feathers from the nests. The destructive and inefficient manual process of plucking the feathers can be replaced with a serine protease enzyme treatment. Accurate detection of the enzyme dosage is crucial for efficient feather degradation with cost-effective enzyme usage. This research applied transfer learning with pretrained Convolutional Neural Networks (Pt-CNNs) to detect the enzyme dosage from EBN images. The study aimed to compare the image classification mechanisms, architectures, and performance of three Pt-CNNs: ResNet50, InceptionResNetV2, and EfficientNetV2S. InceptionResNetV2, which combines parallel convolutions with residual connections, learns rich, informative features. Consequently, the InceptionResNetV2 model achieved the highest accuracy of 96.18%, while ResNet50 and EfficientNetV2S attained only 30.44% and 17.82%, respectively. Differences in architectural complexity, parameter count, dataset characteristics, and image resolution also contribute to the performance disparities among the models. By explaining the reasons for each model’s performance, the findings help future researchers streamline model selection when working with limited datasets, and they contribute to a non-destructive and quick solution for the EBN cleaning process.
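The transfer learning setup described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: it assumes TensorFlow/Keras, a frozen ImageNet-pretrained InceptionResNetV2 backbone, and a single new softmax layer. The function name `build_transfer_model`, the number of dosage classes, the input size, and the optimizer are all hypothetical choices that the abstract does not specify.

```python
import tensorflow as tf


def build_transfer_model(num_classes, input_shape=(299, 299, 3), weights="imagenet"):
    """Frozen pretrained backbone plus a new classification head (illustrative only)."""
    # Pretrained feature extractor without its original ImageNet classifier.
    base = tf.keras.applications.InceptionResNetV2(
        include_top=False,
        weights=weights,
        input_shape=input_shape,
        pooling="avg",  # global average pooling yields one feature vector per image
    )
    base.trainable = False  # keep the pretrained weights fixed

    inputs = tf.keras.Input(shape=input_shape)
    # Scale raw pixel values from [0, 255] to [-1, 1], the range this backbone expects.
    x = tf.keras.layers.Rescaling(scale=1.0 / 127.5, offset=-1.0)(inputs)
    x = base(x, training=False)  # run batch normalization in inference mode
    # New head: one output per enzyme dosage class (class count is an assumption).
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Swapping `InceptionResNetV2` for `ResNet50` or `EfficientNetV2S` from `tf.keras.applications` would give the other two backbones compared here, although each expects its own input preprocessing, so the rescaling step would need to be adjusted accordingly.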