Depth estimation from images is fundamental to the autonomous navigation of robots, vehicles, and drones, and to navigation aid systems for people with visual impairments. Although obtaining depth information from complex scenes is challenging, advances in Deep Learning have opened new possibilities. This work therefore introduces an approach based on recent Convolutional Neural Network architectures and attention mechanisms to improve monocular image depth estimation, with potential applications in navigation aid systems for the visually impaired. The proposal implements a Convolutional Neural Network with an attention-mechanism configuration not yet tested in the literature, integrating the Convolutional Block Attention Module into the encoder and the Modified Global Context Network into the decoder. Unlike stereo camera-based systems, which require complex setups and image pairs, this model simplifies data collection and processing, although it still demands large datasets and substantial computational capacity. Experiments demonstrate, however, that these limitations can be mitigated by using reduced-resolution images and resizing techniques. The evaluation of the proposed model showed satisfactory performance against state-of-the-art works that use images of the same resolution as this work, validating the comparative tests: the model improved the Absolute Relative Error by 25.22% and the Root Mean Squared Error by 6.28%. These results highlight the feasibility of conducting Deep Learning research even with limited hardware resources.
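To make the described configuration concrete, the sketch below shows one way a CBAM block could sit in the encoder and a simplified global-context block (GCNet-style) in the decoder of a monocular depth network. This is a minimal, hypothetical illustration in PyTorch: the channel counts, layer depths, and module internals are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical sketch: dimensions and layer choices are illustrative,
# not the exact model evaluated in the paper.
import torch
import torch.nn as nn


class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel then spatial attention."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel attention from global average- and max-pooled descriptors
        avg = self.channel_mlp(x.mean(dim=(2, 3)))
        mx = self.channel_mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # Spatial attention from channel-wise mean and max maps
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial_conv(s))


class GlobalContextBlock(nn.Module):
    """Simplified global-context (GCNet-style) block for the decoder."""
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Conv2d(channels, 1, kernel_size=1)
        self.transform = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=1),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        # Global context: softmax-weighted pooling over all spatial positions
        weights = torch.softmax(self.attn(x).view(b, 1, h * w), dim=-1)
        ctx = torch.bmm(weights, x.view(b, c, h * w).transpose(1, 2))
        ctx = ctx.transpose(1, 2).view(b, c, 1, 1)
        return x + self.transform(ctx)


class DepthNet(nn.Module):
    """Toy encoder-decoder: CBAM in the encoder, global context in the decoder."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            CBAM(32),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            CBAM(64),
        )
        self.decoder = nn.Sequential(
            GlobalContextBlock(64),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 1, 3, padding=1),  # single-channel depth map
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


model = DepthNet()
depth = model(torch.randn(1, 3, 64, 64))  # reduced-resolution input, as in the paper
print(depth.shape)  # torch.Size([1, 1, 64, 64])
```

Note that the reduced 64x64 input here echoes the paper's point about working with low-resolution images on limited hardware; a real model would use deeper backbones and skip connections.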