In the algorithm selection problem, where the task is to identify the most suitable solving technique for a particular problem instance, most performance mapping mechanisms have been relatively simple models such as logistic regression or neural networks. The neural networks used for this purpose tend to have shallow, straightforward architectures and thus a limited ability to extract relevant patterns. This research explores attention-based neural networks as meta-learners to improve the performance mapping mechanism and take fuller advantage of such models’ capacity for pattern extraction. We compare the proposed attention-based meta-learner against five models from the literature: multi-layer perceptron, k-nearest neighbors, softmax regression, support vector machines, and decision trees. For testing, we use a meta-dataset obtained by solving the vehicle routing problem with time windows (VRPTW) instances of the Solomon benchmark with three different configurations of the simulated annealing meta-heuristic. Overall, the attention-based meta-learner selects the algorithm that best solves a given VRPTW instance more consistently than the benchmark methods. Moreover, because it significantly outperforms the multi-layer perceptron, our findings suggest promising potential in exploring more recent advances in neural network architectures.
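To make the notion of a performance mapping mechanism concrete, the following is a minimal sketch of one of the baseline mappers named above, k-nearest neighbors, written in plain Python. It is not the attention-based meta-learner and not the authors' implementation. The feature vectors, the labels `sa_config_a`, `sa_config_b`, and `sa_config_c`, and the function name `select_algorithm` are all hypothetical and chosen only for illustration; real meta-data would hold features extracted from VRPTW instances and the simulated annealing configuration that solved each one best.

```python
import math
from collections import Counter

# Hypothetical meta-dataset: each row pairs an instance's feature vector with
# the label of the configuration assumed to have performed best on it.
# All numbers and labels are invented for illustration.
META_DATA = [
    ((0.10, 0.20), "sa_config_a"),
    ((0.15, 0.25), "sa_config_a"),
    ((0.20, 0.10), "sa_config_a"),
    ((0.80, 0.90), "sa_config_b"),
    ((0.85, 0.80), "sa_config_b"),
    ((0.90, 0.95), "sa_config_b"),
    ((0.50, 0.10), "sa_config_c"),
    ((0.55, 0.05), "sa_config_c"),
    ((0.60, 0.15), "sa_config_c"),
]


def select_algorithm(features, meta_data=META_DATA, k=3):
    """Return the label most common among the k nearest meta-data rows."""
    # Rank stored instances by Euclidean distance to the query features.
    neighbors = sorted(meta_data, key=lambda row: math.dist(features, row[0]))[:k]
    # Majority vote over the labels of the k closest instances.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Query with the (hypothetical) features of an unseen instance.
    print(select_algorithm((0.12, 0.22)))
```

Any of the compared models fills the same role: it receives the features of an unseen instance and returns the algorithm it predicts will perform best, so swapping the mapper changes only the body of `select_algorithm`.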