Developing a reliable and accurate indoor localization system is a crucial step toward a seamless and interactive user-device experience in nearly all intelligent Internet of Things (IIoT) and smart applications. Indoor localization systems based on WiFi fingerprinting are considered a promising alternative to model-based approaches owing to their accuracy, low cost, availability, and ease of configuration. However, recent studies have revealed that in complex environments, WiFi fingerprinting techniques face several challenges as the coverage area increases. These challenges include fingerprint spatial uncertainty, instability in the received signal strength indicator (RSSI), and discrepancy in fingerprint distribution. Furthermore, the fingerprint database must frequently be upgraded, or even recreated, whenever the architecture of the location changes. These challenges call into question the robustness and efficiency of most existing schemes. In this paper, we present an indoor localization architecture for complex multi-building, multi-floor location prediction and subsequently propose SALLoc (SAE-ALSTM Localization), a WiFi fingerprinting indoor localization scheme based on a Stacked Autoencoder (SAE) and an Attention-based Long Short-Term Memory (ALSTM) framework. First, a stratified sampling technique is used to separate the validation set from the entire uneven RSSI training set, which ensures that the same proportion of RSSI samples is present in both sets. Second, the SAE is utilized to select core features and reduce the dimensionality of the RSSI samples. Finally, the ALSTM is trained to focus on these features to achieve robust location prediction. Extensive experiments were conducted using the UJIIndoorLoc, Tampere, and UTSIndoorLoc datasets, and the results demonstrate the superiority of the proposed scheme in terms of prediction accuracy, robustness, and generalization when compared to state-of-the-art methods.
The mean localization errors (MLE) on the UJIIndoorLoc, Tampere, and UTSIndoorLoc datasets are 8.28 m, 9.52 m, and 6.48 m, respectively. We therefore conclude that the proposed scheme is accurate and well suited to location prediction in large-scale indoor environments.
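As an illustration of two steps named above, the stratified validation split and the mean localization error, the following is a minimal Python sketch and not the authors' implementation. It assumes that the strata are (building, floor) labels and that the MLE is the mean Euclidean distance between predicted and true coordinates; the abstract states neither explicitly. The function names `stratified_split` and `mean_localization_error` and the parameters `val_fraction` and `seed` are hypothetical.

```python
import math
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.1, seed=0):
    """Split sample indices so each label keeps roughly the same share in both sets.

    `labels` holds one hashable stratum label per sample; (building, floor)
    tuples are an assumption made for this sketch.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train_idx, val_idx = [], []
    for indices in by_label.values():
        rng.shuffle(indices)  # random choice within each stratum
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


def mean_localization_error(predicted, actual):
    """Mean Euclidean distance between predicted and true positions.

    The result has the same unit as the coordinates (metres in the paper).
    """
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    return sum(distances) / len(distances)


# Usage on a deliberately uneven toy set: 80 samples in one stratum, 20 in another.
labels = [(0, 0)] * 80 + [(1, 2)] * 20
train_idx, val_idx = stratified_split(labels, val_fraction=0.25)
error = mean_localization_error([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])
```

Sampling within each stratum, rather than across the whole set, is what keeps a sparsely surveyed floor represented in the validation set even when the training data are uneven.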