Since the emergence of wireless communication networks, a plethora of research papers have focused on the quality aspects of wireless links. An analysis of the rich body of existing literature on link quality estimation using models developed from data traces indicates that the techniques used for modeling link quality are becoming increasingly sophisticated. A number of recent estimators leverage Machine Learning (ML) techniques that require a sophisticated design and development process, each step of which has the potential to significantly affect overall model performance. In this paper, we provide a comprehensive survey of link quality estimators developed from empirical data and then focus on the subset that uses ML algorithms. We analyze ML-based Link Quality Estimation (LQE) models from two perspectives using performance data. First, we focus on how they address quality requirements that are important from the perspective of the applications they serve. Second, we analyze how they approach the standard design steps commonly used in the ML community. Having analyzed the scientific body of the survey, we review existing open source datasets suitable for LQE research. Finally, we round off our survey with the lessons learned and design guidelines for ML-based LQE development and dataset collection.