“…However, in the last couple of years, a structure first introduced in natural language processing,15 known as Transformers,16 has successfully been employed within bioinformatics, e.g., for structure prediction,17,18 gene expression prediction,19 and even within MS-based proteomics, e.g., for the peptide detection problem,20 DIA library generation for the phosphoproteome,21 and de novo interpretation of MS2 spectra.22 Transformers are, like RNNs, designed to handle sequential input data, and they do so through attention mechanisms, i.e., mechanisms that emphasize the parts of the input sequence most relevant to the output. However, unlike RNNs, Transformers do not use recurrence, which enables a significant speed-up by parallelizing their training.…”
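To make the attention idea concrete, the following is a minimal sketch of scaled dot-product self-attention, the core operation inside Transformers. All names, shapes, and the random projection matrices are illustrative assumptions, not taken from any of the cited works:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence.

    X          : (seq_len, d_model) input embeddings, e.g. one vector per token.
    Wq, Wk, Wv : (d_model, d_k) projection matrices; learned in practice,
                 randomly initialized here purely for illustration.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # pairwise relevance, (seq_len, seq_len)
    weights = softmax(scores, axis=-1)          # attention weights per position
    return weights @ V                          # weighted sum emphasizing relevant inputs

# Toy usage: a "sequence" of 5 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 4): one context-aware vector per input position
```

Note that the attention weights for every position are computed in a single matrix product rather than one step at a time, which is what allows Transformers to be trained in parallel where an RNN must process the sequence recurrently.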