Although limited research has explored the integration of electroencephalography (EEG) and deep learning for attention deficit hyperactivity disorder (ADHD) detection, applying deep learning models to real-world data such as EEG recordings remains a difficult endeavour. The purpose of this work was to evaluate how different attention mechanisms affect the performance of well-established deep learning models for the identification of ADHD. Two specific architectures were compared: long short-term memory (LSTM)+attention (Att) and convolutional neural network (CNN)+Att. The CNN+Att model consists of a CNN layer merged with the convolutional block attention module (CBAM) structure, a dropout layer, an LSTM layer, and a dense layer. For the LSTM+Att model, an extra LSTM layer comprising T LSTM cells was added on top of the first LSTM layer; the output of this stacked LSTM structure was then passed to a dense layer, which, in turn, was connected to a classification layer of two neurons. Experimental results showed that the best classification performance was achieved by the LSTM+Att model, with 98.91% accuracy, 99.87% sensitivity, 97.79% specificity and 98.87% F1-score. The LSTM, CNN+Att, and CNN models followed, classifying ADHD and normal EEG signals with 98.45%, 97.74% and 97.16% accuracy, respectively. By investigating the application of attention mechanisms and the precise position of the attention layer inside the deep learning model, the information in the data was exploited more effectively. This promising finding creates opportunities for further study on large-scale EEG datasets and more reliable information extraction from massive data sets, ultimately allowing links to be drawn between brain activity and specific behaviours or task execution.
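The stacked LSTM-with-attention classifier described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the channel count, hidden size, dense width, and additive attention scoring are all assumed placeholder choices; only the overall shape (two stacked LSTM layers, a temporal attention step, a dense layer, and a two-neuron classification layer) follows the description.

```python
import torch
import torch.nn as nn


class LSTMAtt(nn.Module):
    """Sketch of an LSTM+Att binary classifier for EEG segments.

    Hyper-parameters (19 channels, 64 hidden units, 32 dense units)
    are illustrative assumptions, not values from the paper.
    """

    def __init__(self, n_channels: int = 19, hidden: int = 64):
        super().__init__()
        # Two stacked LSTM layers: the second sits on top of the first.
        self.lstm = nn.LSTM(n_channels, hidden, num_layers=2, batch_first=True)
        # Simple additive attention scoring over the time axis (assumed form).
        self.score = nn.Linear(hidden, 1)
        self.dense = nn.Linear(hidden, 32)
        # Classification layer with two neurons: ADHD vs. normal.
        self.out = nn.Linear(32, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, channels)
        h, _ = self.lstm(x)                        # (batch, time, hidden)
        w = torch.softmax(self.score(h), dim=1)    # attention weights over time
        ctx = (w * h).sum(dim=1)                   # weighted sum -> (batch, hidden)
        return self.out(torch.relu(self.dense(ctx)))  # (batch, 2) logits


model = LSTMAtt()
logits = model(torch.randn(4, 128, 19))  # 4 segments, 128 time steps, 19 channels
print(tuple(logits.shape))
```

Placing the attention step between the stacked LSTM output and the dense layer, as here, lets the model weight informative time steps before classification; moving that layer elsewhere in the stack is the kind of positional variation the study evaluates.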