Due to sensor failures and other issues, real-world time series may contain missing values, often in consecutive segments. Classifying such time series is an important task with prominent applications in domains such as medicine, manufacturing, social networks, and environmental sciences. In this paper, we consider several approaches that have been designed for this task, in particular fully convolutional neural networks (FCNs) with sparsity-invariant convolution and dynamic time warping convolution. We compare their performance to that of TARNet, a standard transformer that has not been tailored to the classification of time series with missing values. Our results indicate that even this simple transformer may outperform the aforementioned models, which were designed to deal with missing values. As this observation is consistent across many datasets from various domains and across various distributions of missing values, we conclude that transformers are an exceptionally strong baseline for the classification of time series with missing values. To support the reproduction of our results and follow-up work, we performed all experiments on publicly available time series datasets using a publicly available implementation of TARNet.