In this paper we present an analysis on the usage of Deep Neural Networks for extreme multi-label and multiclass text classification. We will consider two network models: the first one is formed by a word embeddings (WEs) stage followed by two dense layers, hereinafter Dense, and a second model with a convolution stage between the WEs and the dense layers, hereinafter CNN-Dense. We will take into account classification problems characterized by different number of labels, ranging from an order of 10 to an order of 30, 000, showing the different performances of the neural networks varying the total label number and the average number of labels for sample, exploiting the hierarchical structure of the label space of the dataset used for experimental assessment. It is worth noting that multi-label classification is an harder problem if compared to multi-class, due to the variable number of labels associated to each sample. We will even investigate on the behaviour of the neural networks as function of the training hyperparameters, analysing the link between them and the dataset complexity. All the result will be evaluated using the PubMed scientific articles collection as test case.