“…As deep learning technology has been developed and used for fast and efficient analysis of medical images acquired by techniques such as X-ray, computed tomography (CT), and magnetic resonance imaging (MRI) [18, 19, 20], recent studies have tried to apply deep learning to automate VFSS analysis [21, 22, 23, 24, 25, 26, 27]. However, we found only two studies that proposed deep learning models to detect the hyoid bone or track its movement in VFSS images [21, 27]. Zhang et al. proposed a single shot multibox detector (SSD) model that detects the hyoid bone fully automatically, but its accuracy was less than perfect (mAP of the SSD-500 model = 89.14%), and it did not attempt to track the whole movement of the hyoid bone [27].…”