“…Also, with the use of supervised methods, it is very unlikely that more than a few thousand tweets are relevant to a given discussion topic [ 38 ]. Still, the massive amount of data, in conjunction with the small amount of ground truth, poses a real challenge for the classification task [ 39 , 40 ]. Consequently, to overcome these challenges, a heuristic method using unsupervised learning techniques can deliver promising results in the health-related domain.…”