High Utility Itemset (HUI) mining is an important area of data mining for which numerous methodologies have been proposed. When the diversity and size of the items in a dataset are large, the search space that conventional exact approaches to High Utility Itemset Mining (HUIM) must explore grows exponentially. This has led researchers to adopt alternative, efficient approaches based on Evolutionary Computation (EC) to solve the HUIM problem. Particle Swarm Optimization (PSO) is an EC-based approach that has drawn the attention of many researchers for solving various NP-hard problems in real time. Variants of PSO have been developed in recent years to increase the efficiency of the HUI mining process. In PSO, both the execution time and the quality of the solutions generated are strongly influenced by the PSO control parameters, namely the Acceleration Coefficients and the Inertia Weight. The proposed approach, called Adaptive Particle Swarm Optimization using Reinforcement Learning with Off Policy (APSO-RLOFF), employs Reinforcement Learning (RL) to calibrate the PSO control parameters adaptively online and, in turn, to improve the performance of PSO. APSO-RLOFF uses the state-of-the-art RL algorithm Q-Learning, which estimates state-action utility values during each episode. Extensive tests are carried out on four benchmark datasets to evaluate the performance of the proposed technique. The proposed approach is compared against an exact approach, HUP-Miner, and three EC-based approaches, namely HUPEUMU-GRAM, HUIM-BPSO, and AGA_RLOFF. The results show that APSO-RLOFF outperforms the EC-based approaches considered on both performance metrics, namely the number of discovered HUIs and the execution time.
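As an illustration of how Q-Learning could drive the online calibration of the PSO control parameters, the following is a minimal sketch and not the APSO-RLOFF implementation itself. The state set, the candidate parameter settings in `ACTIONS`, the learning rate `alpha`, the discount factor `gamma`, and the epsilon-greedy behaviour policy are all assumptions introduced here for illustration; the abstract does not specify them.

```python
import random

# Hypothetical action set: each action is one (inertia weight, c1, c2) setting.
ACTIONS = [(0.9, 2.0, 2.0), (0.7, 1.5, 1.5), (0.4, 1.0, 2.0)]
# Hypothetical states: did the swarm's best fitness improve in the last iteration?
STATES = ["improved", "stalled"]


def make_q_table():
    """Create a Q-table with one zero-initialised value per (state, action)."""
    return {state: [0.0] * len(ACTIONS) for state in STATES}


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy behaviour policy: explore with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = q[state]
    return row.index(max(row))


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Off-policy Q-Learning update.

    The target bootstraps from the greedy (maximum) value of next_state,
    regardless of which action the behaviour policy takes next.
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]
```

In a PSO loop, one would call `choose_action` at the start of each iteration, run the velocity and position updates with the selected `ACTIONS[i]`, derive a reward from the change in fitness (for example, the number of newly discovered HUIs, which is again an assumption), and then call `q_update`.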