This paper addresses optimal attitude tracking control of rigid bodies via a reinforcement learning-based control scheme, in which a constrained parameter estimator is designed to accurately compensate for system uncertainties. The estimator guarantees exponential convergence of the estimation errors and keeps every instantaneous estimate strictly within predetermined bounds. Building on this estimator, a critic-only adaptive dynamic programming (ADP) control strategy is proposed to learn the optimal control policy with respect to a user-defined cost function. The proposed scheme does not require the matching condition on reference control signals that is commonly imposed in related ADP designs. Using Lyapunov-based analysis, we prove that the tracking errors and the critic weight estimation errors are uniformly ultimately bounded under finite excitation conditions. Moreover, an easy-to-implement initial control policy is designed to trigger the real-time learning process. The effectiveness and advantages of the proposed method are verified by both numerical simulations and hardware-in-the-loop experimental tests.