Traditional phenotyping relies on experts visually examining plants for physical traits like size, color, or disease presence. Measurements are taken manually using rulers, scales, or color charts, with all data recorded by hand. This labor-intensive and time-consuming process poses a significant obstacle to the efficient breeding of new cultivars. Recent innovations in computer vision and machine learning offer potential solutions for accelerating the development of robust and highly effective plant phenotyping. This study introduces an efficient plant recognition framework that leverages the power of the Segment Anything Model (SAM) guided by Explainable Contrastive Language–Image Pretraining (ECLIP). This approach can be applied to a variety of plant types, eliminating the need for labor-intensive manual phenotyping. To enhance the accuracy of plant phenotype measurements, a B-spline curve is incorporated during the plant component skeleton extraction process. The effectiveness of our approach is demonstrated through experimental results, which show that the proposed framework achieves a mean absolute error (MAE) of less than 0.05 for the majority of test samples. Remarkably, this performance is achieved without the need for model training or labeled data, highlighting the practicality and efficiency of the framework.