Lung diseases have long been a significant health concern, necessitating early disease prediction. Deep learning models have proven effective in diagnosing lung disorders from clinical imaging modalities such as Computed Tomography (CT) and Chest X-Ray (CXR). The Ensemble Deep Lung Disease Predictor (EDEPLDP) framework was previously proposed for the rapid detection of various diseases from CT and CXR images. However, the U-Net model it uses for segmentation lacks sufficient low-level localization ability. To address this, this article proposes a Semantic Location enhanced Swin Transformer-based U-Net (SLST-U-Net)+EDEPLDP model. The model uses Location Attention (LA), Progressive Enhancement (De Mejora Progresiva, DMP), and Contextual Guidance Attention (CGA) to enhance feature discrimination at the level of spatial information and semantic position. The LA mechanism strengthens the computation of semantic features and the precision of semantic position information, enabling the retrieval of long-range contextual information in both the channel and spatial dimensions. The DMP enhances feature discrimination by increasing edge information inference and providing a richer depiction of the target position. The CGA reduces the semantic gap and effectively fuses spatial texture information with semantic information. Additionally, a Swin Transformer (ST) is added to the encoder and decoder sections of the U-Net to capture finer details of spatial and semantic information. Finally, the feature extraction and classification part of EDEPLDP is employed to detect and classify lung diseases. Experimental results show that the proposed SLST-U-Net+EDEPLDP model outperforms the CNN, E2E-DNN, LungNet22, EfficientNet-SE, LDDNet, and EDEPLDP models, achieving accuracies of 94.94% on CXR images and 95.42% on CT images.
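The abstract does not say how the Swin Transformer blocks are configured. As a minimal illustration of the general idea behind Swin Transformers, which compute self-attention inside non-overlapping local windows of a feature map, the pure-Python sketch below splits a 2D grid into windows and reassembles it. The function names `window_partition` and `window_reverse`, the list-of-rows representation, and the window size are illustrative assumptions and are not taken from the proposed model.

```python
def window_partition(feature_map, window_size):
    """Split an H x W grid (a list of rows) into non-overlapping square windows.

    Windows are returned in row-major order; each window is a list of rows.
    Hypothetical helper for illustration only.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    if height % window_size or width % window_size:
        raise ValueError("height and width must be multiples of window_size")
    windows = []
    for top in range(0, height, window_size):
        for left in range(0, width, window_size):
            # Take window_size rows, and window_size columns from each row.
            windows.append([row[left:left + window_size]
                            for row in feature_map[top:top + window_size]])
    return windows


def window_reverse(windows, window_size, height, width):
    """Inverse of window_partition: reassemble windows into an H x W grid."""
    windows_per_row = width // window_size
    grid = [[None] * width for _ in range(height)]
    for index, window in enumerate(windows):
        top = (index // windows_per_row) * window_size
        left = (index % windows_per_row) * window_size
        for offset, row in enumerate(window):
            grid[top + offset][left:left + window_size] = row
    return grid


# Usage: a 4 x 4 grid split into four 2 x 2 windows, then restored.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
windows = window_partition(grid, 2)
restored = window_reverse(windows, 2, 4, 4)
```

In a full model, attention would be computed separately inside each window before the windows are merged back; that step is omitted here because the abstract gives no details about it.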