Predicting ship fuel consumption is a prerequisite for speed, trim, and voyage optimization. Despite the rise of deep learning and transformers in many domains, existing research works mostly train shallow machine learning (ML) algorithms to predict ship fuel oil consumption (FOC). Although the auxiliary machinery supports the main propulsion engines, and emissions from ships' auxiliary engines contribute to environmental pollution, most existing studies train ML algorithms to predict only the main engine FOC. Additionally, all existing studies use the mean squared error (MSE) as the loss function. However, recent work has shown that neural network models tend to replicate the last observed value of a time series, which limits their applicability to real-world data. To address these limitations, this is the first study to propose transformer-based approaches and a multitask learning (MTL) framework. First, the authors introduce single-task learning (STL) models consisting of BiLSTMs and multi-head self-attention for predicting the main and auxiliary engine FOC. Second, the authors introduce the first MTL setting, which predicts the main and auxiliary engine FOC simultaneously, allowing one task to inform the other. A loss function is introduced that includes a regularization term penalizing the replication of previously seen values. The authors evaluate the proposed approaches on data from three fishing ships and compare them with traditional ML algorithms. Extensive experiments show that the introduced MTL models improve the R² score, mean bias error, root mean squared error, and mean absolute error compared with shallow ML algorithms.
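The abstract does not specify the exact form of the regularization term that discourages replication of previously seen values, so the following is only an illustrative sketch: it adds to the MSE a penalty that grows as predictions approach the last observed value (the naive persistence forecast). The exponential penalty shape and the weight `lam` are assumptions, not the authors' formulation.

```python
import numpy as np

def anti_persistence_loss(y_pred, y_true, y_last, lam=0.1):
    """MSE plus a hypothetical regularizer penalizing persistence.

    y_pred : model predictions
    y_true : ground-truth FOC values
    y_last : last observed value of each series (naive forecast)
    lam    : assumed regularization weight
    """
    mse = np.mean((y_pred - y_true) ** 2)
    # Penalty approaches its maximum (1.0) when predictions simply
    # copy the last observed value, and decays as they move away.
    penalty = np.mean(np.exp(-np.abs(y_pred - y_last)))
    return mse + lam * penalty
```

Under this sketch, a model that merely copies the last observed value incurs the full penalty even when its MSE is moderate, so gradient descent is pushed away from the persistence solution.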