Estimating word difficulty in listening tasks is challenging because of its high subjectivity, high dimensionality, and low generalizability. We propose a word listening difficulty score, defined as a linear combination of several complementary features. We prepare a dataset of expert-annotated partial and synchronized captions for TED talks, in which only the words that are difficult for a target language proficiency level are shown. A linear SVM is trained on this dataset, and its learned parameters are transferred to the proposed score as the feature weights. This data-driven score demonstrates higher accuracy on the annotated dataset and facilitates model and feature expansion.
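The transfer of SVM parameters to a linear score can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three feature names, the toy feature values and labels, the hyperparameters, and the simple subgradient trainer, which stands in for whichever SVM solver was actually used. The point is only that the learned weight vector and bias of a linear SVM can be reused directly as the coefficients of a linear difficulty score.

```python
# Illustrative sketch only: feature names, toy data, and hyperparameters
# are assumptions, not values from the original work.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def difficulty_score(features, w, b):
    """Linear score whose coefficients are the transferred SVM parameters."""
    return dot(w, features) + b

# Hypothetical features per word: [rarity, local speech rate, word length],
# each scaled to [0, 1]. Label +1 = shown in the partial caption (difficult).
hard = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6]]
easy = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.2]]
X = hard + easy
y = [1, 1, -1, -1]

w, b = train_linear_svm(X, y)
for features in X:
    print(features, round(difficulty_score(features, w, b), 3))
```

Because the score is linear, adding a feature in this sketch only means appending a column to the feature vectors and retraining; the scoring function itself does not change.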