Objective. Auditory attention decoding (AAD) determines which speaker a listener is attending to by analyzing the listener's EEG. A convolutional neural network (CNN) was previously adopted to extract spectro-spatial features (SSF) from short EEG intervals to detect auditory spatial attention without access to the stimuli. However, the SSF-CNN scheme does not account for the following factors: i) single-band frequency analysis cannot represent the EEG pattern precisely; ii) power cannot represent the EEG features related to the dynamic patterns of the attended auditory stimulus; iii) the temporal features of EEG, which represent the relationship between the EEG and the attended stimulus, are not extracted. To address these problems, the SSF-CNN scheme was modified. Approach. i) Multiple frequency bands of EEG, rather than the alpha band alone, were analyzed to represent the EEG pattern more precisely. ii) Differential entropy (DE), rather than power, was extracted from each frequency band to represent the degree of disorder of the EEG, which is related to the dynamic patterns of the attended auditory stimulus. iii) A CNN and convolutional long short-term memory (ConvLSTM) were combined to extract spectro-spatial-temporal features from the 3-D descriptor sequence constructed from the topographical activity maps of the multiple frequency bands. Main results. Experiments on the KUL, DTU, and PKU datasets with 0.1 s, 1 s, 2 s, and 5 s decision windows demonstrated that: i) The proposed model outperformed SSF-CNN and state-of-the-art AAD models. Specifically, when the auditory stimulus was unavailable, AAD accuracy improved by at least 3.25%, 3.96%, and 5.08% on KUL, DTU, and PKU, respectively, compared with the baselines. On KUL, longer decision windows yielded smaller improvements, whereas on DTU and PKU, longer decision windows yielded larger improvements, with two exceptions: the 2 s window on PKU and the 5 s window on DTU. ii) Each modification contributed to the performance improvement. Significance.
The DE feature, multi-band frequency analysis, and ConvLSTM-based temporal analysis each help improve AAD accuracy.
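
As an illustration of the per-band DE feature described in the Approach, the following is a minimal sketch rather than the authors' implementation. It assumes that band-limited EEG is approximately Gaussian, in which case the differential entropy of a signal with variance sigma^2 is 0.5 * ln(2 * pi * e * sigma^2). The band edges, the sampling rate, the FFT-mask band-pass filter, the small constant added to the variance, and the function names bandpass_fft and differential_entropy are all assumptions introduced here for illustration; none of them is specified in the abstract.

```python
import numpy as np

# Hypothetical band edges in Hz; the abstract does not list the bands used.
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 50.0),
}


def bandpass_fft(eeg, fs, low, high):
    """Keep only FFT bins in [low, high) Hz along the last (time) axis.

    This crude FFT mask stands in for whatever filter the paper used.
    """
    n = eeg.shape[-1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(eeg, axis=-1)
    mask = (freqs >= low) & (freqs < high)
    return np.fft.irfft(spectrum * mask, n=n, axis=-1)


def differential_entropy(eeg, fs, bands=BANDS):
    """Gaussian DE per band and channel.

    eeg has shape (n_channels, n_samples); the result has shape
    (n_bands, n_channels).
    """
    features = []
    for low, high in bands.values():
        filtered = bandpass_fft(eeg, fs, low, high)
        # The small constant avoids log(0) when a band contains no energy.
        variance = np.var(filtered, axis=-1) + 1e-12
        features.append(0.5 * np.log(2.0 * np.pi * np.e * variance))
    return np.stack(features)


# Usage on synthetic data: 64 channels, one 1 s decision window at 128 Hz.
rng = np.random.default_rng(0)
fs = 128
eeg = rng.standard_normal((64, fs))
eeg[0] = np.sin(2.0 * np.pi * 10.0 * np.arange(fs) / fs)  # pure 10 Hz tone
de = differential_entropy(eeg, fs)
print(de.shape)  # (5, 64)
```

In a pipeline like the one summarized above, each row of such a matrix would then be projected onto a 2-D topographical map, and the maps stacked across bands and time to form the 3-D descriptor sequence. Note that with very short windows (for example 0.1 s) an FFT mask has too few frequency bins to resolve the lower bands, so a time-domain filter applied before windowing would be the more sensible choice.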