Industry increasingly seeks to know the condition of its assets in real time. Tool wear is a critical case, since it requires real-time monitoring to reduce costs and scrap in machining processes. Traditionally, tool wear in machining has been predicted with mathematical models that extract information from the signals of sensors attached to the machines. Because developing physical models requires in-depth knowledge of the system being modelled, the current trend is to use machine-learning (ML) models based on tool wear data instead. The acoustic emission (AE) technique has been widely used to capture data from industrial assets such as cutting tools and to understand their real-time condition. However, interpreting and processing AE signals is rather complex. One of the features most commonly extracted from AE signals to predict tool wear is the counts parameter, defined as the number of times that the amplitude of the signal exceeds a predefined threshold. A recurrent problem with this feature is defining an adequate threshold that yields consistent wear predictions. Additionally, the AE signal bandwidth is rather wide, and many authors have pointed out that selecting the optimum frequency band for feature extraction is critical and complex. To overcome these problems, this paper proposes a methodology that applies multi-threshold count feature extraction at multiresolution level using the wavelet packet transform, which extracts a redundant and non-optimal feature map from the AE signal. Next, recursive feature elimination is performed to reduce and optimize the vast number of predictive features generated in the previous step, and random forest regression provides the estimated tool wear. The methodology was tested using data captured when turning 19NiMoCr6 steel under pre-established cutting conditions. The results were compared with those of several ML algorithms such as k-nearest neighbors, support vector machines, artificial neural networks and decision trees. Experimental results show that the proposed method can reduce the root mean squared error of the predictions by 36.53%.
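The feature-extraction idea described above can be illustrated with a minimal sketch, which is not the implementation used in this work. It assumes a Haar wavelet for the packet decomposition, thresholds expressed as fractions of each node's peak amplitude, and counts taken as upward threshold crossings; the function names (`count_crossings`, `haar_step`, `wavelet_packet_nodes`, `multi_threshold_counts`) and the synthetic test signal are hypothetical and chosen only for illustration.

```python
import math


def count_crossings(signal, threshold):
    """Number of times the signal rises above `threshold` (upward crossings)."""
    return sum(
        1 for prev, cur in zip(signal, signal[1:]) if prev <= threshold < cur
    )


def haar_step(signal):
    """One Haar analysis step: (approximation, detail), each at half length."""
    s = 1.0 / math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * s for a, b in pairs]
    detail = [(a - b) * s for a, b in pairs]
    return approx, detail


def wavelet_packet_nodes(signal, level):
    """Full Haar wavelet packet tree; returns the 2**level leaf nodes.

    Nodes come out in natural (filter-bank) order, not frequency order.
    The signal length is assumed to be a multiple of 2**level.
    """
    nodes = [list(signal)]
    for _ in range(level):
        nodes = [half for node in nodes for half in haar_step(node)]
    return nodes


def multi_threshold_counts(signal, level, fractions):
    """Feature map with one count per (packet node, threshold) pair.

    Thresholds are fractions of each node's peak absolute amplitude,
    an illustrative assumption rather than the rule used in the paper.
    """
    features = []
    for node in wavelet_packet_nodes(signal, level):
        peak = max(abs(v) for v in node) or 1.0  # avoid a zero threshold scale
        features.extend(count_crossings(node, f * peak) for f in fractions)
    return features


# Synthetic stand-in for an AE signal (hypothetical data).
signal = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
features = multi_threshold_counts(signal, level=2, fractions=(0.25, 0.5, 0.75))
print(len(features))  # 4 nodes x 3 thresholds = 12 features
```

The resulting feature vector is deliberately redundant: every node and threshold contributes a count, and a later selection step such as recursive feature elimination would be expected to keep only the informative ones.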