“…In addition to the stopwords supplied in the library, the twelve most frequent tokens were excluded as custom stopwords: data, article, personal, protection, processing, company, authority, regulation, information, case, art, and page. After this pre-processing, the token-based term frequency (TF) and term frequency-inverse document frequency (TF-IDF) were calculated over the entire constructed corpus (for the exact formulas used, see, e.g., [19]). These common information retrieval statistics are used for evaluating the other part of Q2.…”
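The pre-processing and weighting described in the passage can be illustrated with a minimal sketch. It is not the authors' implementation: the exact formulas are those of [19], which are not reproduced here, so the sketch assumes one common variant (raw counts for TF and idf = log(N / df) for a corpus of N documents). The function names `tokenize`, `term_frequencies`, and `tf_idf`, the simple whitespace tokenizer, and the three sample documents are all hypothetical, and the library-supplied stopword list is omitted.

```python
import math
from collections import Counter

# The twelve custom stopwords named in the text. The library-supplied
# stopword list is not reproduced in this sketch.
CUSTOM_STOPWORDS = {
    "data", "article", "personal", "protection", "processing", "company",
    "authority", "regulation", "information", "case", "art", "page",
}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop custom stopwords."""
    tokens = (t.strip(".,;:()[]\"'").lower() for t in text.split())
    return [t for t in tokens if t and t not in CUSTOM_STOPWORDS]


def term_frequencies(docs):
    """Raw token counts accumulated over the whole corpus."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


def tf_idf(docs):
    """Per-document TF-IDF, assuming tf = raw count and idf = log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each token.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return scores


# Hypothetical sample documents, used only to exercise the functions.
docs = [
    "The company received a fine for missing consent.",
    "The authority imposed a fine on the controller.",
    "Consent was missing in the case.",
]

corpus_tf = term_frequencies(docs)
weights = tf_idf(docs)
```

Under this variant, a token that occurs in every document receives a weight of zero, which is why a stopword list still matters for the plain TF counts but matters less for TF-IDF. Other variants (smoothed IDF, length-normalized TF) would give different values.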