Extreme classification covers multi-class and multi-label problems with extremely large label sets, where the objective is to learn classifiers that tag each data point with its most relevant labels. Recently, approaches such as the one-vs-all method have been proposed to accomplish this task. However, their training time scales linearly with the number of labels, which makes them impractical for real-world applications such as text and image tagging. In this work, we present a two-stage thread-level parallelization scheme based on Partitioned Label Trees for Extreme Classification (Parabel). Our method parallelizes the training of tree nodes in different ways depending on the number of labels they contain. We compare our algorithm with recent state-of-the-art approaches on publicly available real-world datasets with up to 670,000 labels. The experimental results demonstrate that our algorithm achieves the shortest training time among the compared methods.
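As a rough illustration of the idea, the sketch below shows one possible two-stage thread-level scheme: nodes holding few labels are trained in parallel across nodes, while nodes holding many labels are trained one at a time with parallelism across their labels. The `Node` struct, the `train_label` stub, and the `kThreshold` switch point are illustrative assumptions for this sketch, not the implementation described in the paper.

```cpp
// Minimal sketch of a two-stage thread-level parallelization over a label
// tree, assuming OpenMP and a simplified Node type (hypothetical, not the
// paper's actual code).
#include <omp.h>
#include <cstdio>
#include <vector>

struct Node {
    int num_labels;  // number of labels assigned to this node
};

// Hypothetical per-label training step (e.g. fitting one 1-vs-all classifier
// for this label on the node's training points).
void train_label(const Node& node, int label_idx) {
    (void)node;
    (void)label_idx;
}

// Stage 1: many nodes with few labels each -> parallelize across nodes.
void train_across_nodes(std::vector<Node>& nodes) {
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < static_cast<int>(nodes.size()); ++i)
        for (int l = 0; l < nodes[i].num_labels; ++l)
            train_label(nodes[i], l);
}

// Stage 2: a few nodes with many labels each -> parallelize within a node,
// across its labels.
void train_within_node(Node& node) {
    #pragma omp parallel for schedule(dynamic)
    for (int l = 0; l < node.num_labels; ++l)
        train_label(node, l);
}

int main() {
    const int kThreshold = 1000;  // assumed switch point between the two stages
    std::vector<Node> level = {{10}, {20}, {5000}, {8}};

    // Split the nodes of one tree level by label count.
    std::vector<Node> small_nodes, large_nodes;
    for (const auto& n : level)
        (n.num_labels < kThreshold ? small_nodes : large_nodes).push_back(n);

    train_across_nodes(small_nodes);                       // stage 1
    for (auto& n : large_nodes) train_within_node(n);      // stage 2

    std::printf("trained %zu small and %zu large nodes\n",
                small_nodes.size(), large_nodes.size());
    return 0;
}
```

Splitting the work this way keeps all threads busy in both regimes: across-node parallelism avoids idle threads when individual nodes are cheap, while within-node parallelism avoids a single large node becoming a serial bottleneck.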