The framework of reinforcement-learning-based optimal control rests on a mathematical formulation of intelligent decision making. In this article, we present a comprehensive design framework for offline reinforcement learning algorithms that exploit a sparse, discrete data space for efficient decision making. Learning is often difficult with a sparse reward function in the absence of optimization; hence, an optimized map can be incorporated into the reward function to improve efficacy. Some reward functions outperform the sparse reward, such as "map completeness" and "information gain": "map completeness" is proportional to the difference in map coverage between the current and previous time steps, while "information gain" uses the entropy of the map to measure its uncertainty. Overall, this article proposes a framework for developing a reinforcement learning control strategy based on an optimized reward function.
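The two shaped rewards described above can be sketched as follows. This is a minimal illustration assuming an occupancy-grid map representation; the function names, the boolean "explored" mask, and the per-cell occupancy-probability encoding are assumptions for the sketch, not the article's implementation.

```python
import numpy as np


def map_completeness_reward(prev_explored, curr_explored):
    """Reward proportional to the change in map completeness between
    the previous and current time steps.

    Both arguments are boolean grids where True marks an explored cell.
    """
    return float(np.mean(curr_explored) - np.mean(prev_explored))


def cell_entropy(p):
    """Elementwise Shannon entropy (in bits) of binary occupancy
    probabilities; clipping avoids log(0)."""
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))


def information_gain_reward(prev_probs, curr_probs):
    """Reward equal to the reduction in total map entropy, i.e. the
    drop in uncertainty after the latest observation."""
    return float(np.sum(cell_entropy(prev_probs)) - np.sum(cell_entropy(curr_probs)))


# Example on a 2x2 grid: the agent explores two new cells and
# sharpens its occupancy estimates for two of them.
prev_explored = np.array([[True, False], [False, False]])
curr_explored = np.array([[True, True], [True, False]])
r_complete = map_completeness_reward(prev_explored, curr_explored)  # 0.5

prev_p = np.full((2, 2), 0.5)                   # fully uncertain: 1 bit/cell
curr_p = np.array([[0.05, 0.95], [0.5, 0.5]])   # two cells nearly resolved
r_info = information_gain_reward(prev_p, curr_p)  # positive: entropy dropped
```

Both quantities are dense: they give the agent a nonzero learning signal at almost every step, which is exactly what the raw sparse reward lacks.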