Traditional image retrieval methods often struggle to adapt to varying user preferences and dynamic datasets. To address these limitations, this research introduces a novel image retrieval framework based on the deep deterministic policy gradient (DDPG) algorithm augmented with a self-adaptive reward mechanism (SARM). The DDPG-SARM framework dynamically adjusts rewards based on user feedback and retrieval context, improving the agent's learning efficiency and retrieval accuracy. Key innovations include dynamic reward adjustment driven by user feedback, context-aware reward structuring that accounts for the specific characteristics of each retrieval task, and an adaptive learning rate strategy that promotes robust and efficient model convergence. Extensive experiments on three distinct datasets demonstrate that the proposed framework significantly outperforms traditional methods, achieving the highest retrieval accuracy with improvements of 3.38%, 5.26%, and 0.21% over mainstream models on the DermaMNIST, PneumoniaMNIST, and OrganMNIST datasets, respectively. These findings advance the application of reinforcement learning to image retrieval, providing a user-centric solution adaptable to various dynamic environments. The proposed method also offers a promising direction for future developments in intelligent image retrieval systems.
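To make the idea of a feedback- and context-dependent reward more concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the SARM formulation used in this work: the class name `AdaptiveReward`, the linear blending rule, and the parameters `feedback_weight`, `adapt_rate`, and `context_scale` are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveReward:
    """Hypothetical reward shaper; NOT the paper's SARM definition."""

    feedback_weight: float = 0.5  # assumed: how much user feedback counts
    adapt_rate: float = 0.1       # assumed: speed at which the weight adapts

    def reward(self, relevance: float, feedback: float,
               context_scale: float = 1.0) -> float:
        """Blend a relevance score in [0, 1] with user feedback in [-1, 1].

        `context_scale` is an assumed per-task multiplier standing in for
        context-aware reward structuring.
        """
        w = self.feedback_weight
        return context_scale * ((1.0 - w) * relevance + w * feedback)

    def update(self, relevance: float, feedback: float) -> None:
        """Shift weight toward user feedback when it disagrees with relevance."""
        # Map relevance from [0, 1] to [-1, 1] so it is comparable to feedback.
        disagreement = abs(feedback - (2.0 * relevance - 1.0)) / 2.0
        # Exponential moving average of the disagreement, clamped to [0, 1].
        w = self.feedback_weight + self.adapt_rate * (disagreement - self.feedback_weight)
        self.feedback_weight = min(1.0, max(0.0, w))


shaper = AdaptiveReward()
r = shaper.reward(relevance=0.8, feedback=1.0)   # user agrees with the result
shaper.update(relevance=0.8, feedback=-1.0)      # user rejects a "relevant" hit
```

In a DDPG training loop, a value like `r` would take the place of a fixed environment reward when transitions are stored in the replay buffer. The adaptive learning rate strategy mentioned above is not covered by this sketch.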