QuaPy: A Python-Based Framework for Quantification

Moreo, Alejandro; Esuli, Andrea; Sebastiani, Fabrizio

doi:10.1145/3459637.3482015

Cited by 12 publications

(8 citation statements)

References 26 publications


“…SLD is the strongest baseline; this is true in all four subtasks, in which SLD, while never being the best performer, is always in the top ranks. This confirms the fact (already recorded in previous work; see e.g., [34][35][36]) that SLD is a very strong performer when the APP is used for generating the dataset, i.e., when the test data contain many samples characterized by substantial distribution shift.…”

Section: Results (supporting)

confidence: 88%
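The statement above refers to the APP (artificial prevalence protocol), which builds test samples whose class prevalence is deliberately varied so that many samples exhibit distribution shift. A minimal sketch of the idea for the binary case follows. It is an illustration under stated assumptions, not QuaPy's implementation: the function name `app_samples`, its parameters, and the prevalence grid are hypothetical, and sampling is done with replacement for simplicity.

```python
import random


def app_samples(labels, sample_size, prevalences, repeats=1, seed=0):
    """Yield (target_prevalence, indices) pairs drawn from a binary-labelled pool.

    Hypothetical helper: for each target prevalence p of the positive class,
    draw round(p * sample_size) positives and fill the rest with negatives
    (both with replacement), so each sample has a controlled prevalence.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    for p in prevalences:
        n_pos = round(p * sample_size)
        for _ in range(repeats):
            idx = rng.choices(pos, k=n_pos) + rng.choices(neg, k=sample_size - n_pos)
            rng.shuffle(idx)
            yield p, idx


# Toy pool with 30% positives; the grid below is an arbitrary choice.
labels = [1] * 30 + [0] * 70
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
samples = list(app_samples(labels, sample_size=20, prevalences=grid))

for target, idx in samples:
    observed = sum(labels[i] for i in idx) / len(idx)
    print(f"target prevalence {target:.2f} -> observed {observed:.2f}")
```

Each generated sample would then be passed to a quantifier, and the predicted prevalence compared with the target one; samples far from the pool's natural 30% prevalence are the ones with substantial shift.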

“…CC and PCC obtain very low quantification accuracy; this is the case in all four subtasks, where these two methods are always near the bottom of the ranking. This confirms the fact (already recorded in previous work; see e.g., [34][35][36]) that they are not good performers when the APP is used for generating the dataset, i.e., they are not good performers when there is substantial distribution shift. Interestingly enough, CC always outperforms PCC, which was somehow unexpected.…”

Section: Results (supporting)

confidence: 86%

“…In order to set a sufficiently high bar for the participants to overcome, we made them aware of the availability of QuaPy [34], a library of quantification methods that contains, among others, implementations of a number of methods that have performed well in recent comparative evaluations. QuaPy is a publicly available, open-source, Python-based framework that we have recently developed, and that implements not only learning methods, but also evaluation measures, parameter optimisation routines, and evaluation protocols, for LQ.…”

Section: Baselines (mentioning)

confidence: 99%


A Concise Overview of LeQua@CLEF 2022: Learning to Quantify

Esuli

Moreo

Sebastiani

et al. 2022

Lecture Notes in Computer Science

Self Cite


LeQua 2022 is a new lab for the evaluation of methods for "learning to quantify" in textual datasets, i.e., for training predictors of the relative frequencies of the classes of interest Y = {y1, ..., yn} in sets of unlabelled textual documents. While these predictions could be easily achieved by first classifying all documents via a text classifier and then counting the numbers of documents assigned to the classes, a growing body of literature has shown this approach to be suboptimal, and has proposed better methods. The goal of this lab is to provide a setting for the comparative evaluation of methods for learning to quantify, both in the binary setting and in the single-label multiclass setting; this is the first time that an evaluation exercise solely dedicated to quantification is organized. For both the binary setting and the single-label multiclass setting, data were provided to participants both in ready-made vector form and in raw document form. In this overview article we describe the structure of the lab, we report the results obtained by the participants in the four proposed tasks and subtasks, and we comment on the lessons that can be learned from these results.

The LeQua 2022 lab (https://lequa2022.github.io/) at CLEF 2022 has a "shared task" format; it is a new lab, in two important senses:

- No labs on LQ have been organized before at CLEF conferences.
- Even outside the CLEF conference series, quantification has surfaced only episodically in previous shared tasks. The first such shared task was SemEval 2016 Task 4 "Sentiment Analysis in Twitter" [37], which comprised a binary quantification subtask and an ordinal quantification subtask (these two subtasks were offered again in the 2017 edition). Quantification also featured in the Dialogue Breakdown Detection Challenge [23], in the Dialogue Quality subtasks of the NTCIR-14 Short Text Conversation task [46], and in the NTCIR-15 Dialogue Evaluation task [47].

However, quantification was never the real focus of these tasks. For instance, the real focus of the tasks described in [37] was sentiment analysis on Twitter data, to the point that almost all participants in the quantification subtasks used the trivial "classify and count" method, and focused, instead of optimising the quantification component, on optimising the sentiment analysis component, or on picking the best-performing learner for training the classifiers used by "classify and count". Similar considerations hold for the tasks discussed in [23,46,47].
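To make the "classify and count" (CC) method discussed in this abstract concrete, here is a minimal sketch for the binary case, together with its probabilistic variant (PCC) named in the citation statements. This is an illustration only and does not use QuaPy's API: the function names, the 0.5 threshold, and the toy posterior probabilities are all assumptions.

```python
def classify_and_count(posteriors, threshold=0.5):
    """CC: turn each posterior into a hard label, then return the fraction
    of items labelled positive as the estimated prevalence."""
    hard_labels = [1 if p >= threshold else 0 for p in posteriors]
    return sum(hard_labels) / len(hard_labels)


def probabilistic_classify_and_count(posteriors):
    """PCC: estimate the positive prevalence as the mean posterior
    probability, without thresholding."""
    return sum(posteriors) / len(posteriors)


# Toy posterior probabilities P(positive | document) from some classifier
# (made-up values for illustration).
posteriors = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1, 0.1, 0.2]

cc = classify_and_count(posteriors)
pcc = probabilistic_classify_and_count(posteriors)
print(f"CC estimate:  {cc:.3f}")
print(f"PCC estimate: {pcc:.3f}")
```

Both estimators inherit whatever bias the underlying classifier has, which is one reason the literature cited here regards them as weak baselines when the test prevalence differs from the training prevalence.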


“…In the first such talk, Alejandro Moreo presented (joint work with Andrea Esuli and Fabrizio Sebastiani) QuaPy (https://github.com/HLT-ISTI/QuaPy), an open-source Python-based software library for LQ. QuaPy (which was also the object of a presentation in the main CIKM 2021 conference; see [7]) provides implementations of both baseline and advanced LQ methods, of routines for LQ-oriented model selection, of several broadly accepted evaluation measures, and of robust evaluation protocols routinely used in the field. QuaPy also makes available datasets commonly used for testing quantifiers, and offers visualization tools for facilitating the analysis and interpretation of the results.…”

Section: The Workshop (mentioning)

confidence: 99%

Report on the 1st International Workshop on Learning to Quantify (LQ 2021)

Coz

González

Moreo

et al. 2022

SIGKDD Explor. Newsl.

Self Cite


The 1st International Workshop on Learning to Quantify (LQ 2021 - https://cikmlq2021.github.io/), organized as a satellite event of the 30th ACM International Conference on Information and Knowledge Management (CIKM 2021), took place on two separate days, November 1 and 5, 2021. Like the main CIKM 2021 conference, the workshop was held entirely online, due to the COVID-19 pandemic. This report presents a summary of each keynote speech and contributed paper presented in this event, and discusses the issues that were raised during the workshop.


“…We have recently developed (and made publicly available) QuaPy, an open-source, Python-based framework that implements several learning methods, evaluation measures, parameter optimisation routines, and evaluation protocols, for LQ [16]. Among other things, QuaPy contains implementations of the baseline methods and evaluation measures officially adopted in LeQua 2022.…”

Section: Baselines (mentioning)

confidence: 99%