Nowadays, neural network algorithms, such as those based on attention and Transformers, have excelled at Automatic Text Classification (ATC). However, this improved effectiveness comes at a high computational cost. Stacking simpler classifiers that exploit algorithmic and representational complementarity has also been shown to achieve high effectiveness in ATC, potentially at a much lower computational cost than complex neural networks. In this master's thesis, we present the first and largest comparative study on the cost-effectiveness of Stacking in ATC, combining Transformer and non-neural algorithms. In particular, we are interested in answering the following research question: Is it possible to obtain an effective ensemble at significantly lower computational cost than the single best learning model for a given dataset? Besides answering that question, another main contribution of this thesis is a low-cost oracle-based method that predicts the best ensemble for each scenario using only a fraction of the training data.
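To make the stacking setup concrete, the sketch below assembles a minimal stacking ensemble of non-neural text classifiers with scikit-learn. The choice of base learners, meta-learner, and dataset (20 Newsgroups) is purely illustrative and is not the configuration evaluated in this thesis.

```python
# Minimal sketch of a stacking ensemble for text classification.
# Assumption: scikit-learn is used; the thesis' actual base learners,
# meta-learner, and datasets may differ.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB

# Base classifiers with complementary inductive biases.
base_learners = [
    ("svm", LinearSVC(C=1.0)),
    ("nb", MultinomialNB()),
]

# A meta-learner (here, logistic regression) combines the
# out-of-fold predictions of the base classifiers.
stack = make_pipeline(
    TfidfVectorizer(sublinear_tf=True),
    StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    ),
)

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")
stack.fit(train.data, train.target)
print("accuracy:", stack.score(test.data, test.target))
```

In this setup, the base classifiers are trained on the same representation but differ in their learning algorithms, which is one source of the complementarity that stacking exploits.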