“…The state and action spaces are discrete, and the corresponding value function is stored in what is known as a Q-table. To use Q-learning with continuous systems (continuous state and action spaces), one can discretize the state and action spaces [21,31][32][33][34][35][36], use some type of function approximation such as FISs [26][27][28] or NNs [12,19,37], or use some type of optimization technique such as genetic algorithms [38,39].…”
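
As an illustration of the discretization route mentioned in the passage, the following is a minimal sketch of a Q-table over a binned one-dimensional state, not the method of any of the cited works. The helper names (`discretize`, `action_from_index`, `make_q_table`, `q_update`), the uniform binning, the ranges, and the values of `alpha` and `gamma` are all assumptions chosen for the example.

```python
def discretize(x, low, high, n_bins):
    """Map a continuous value x to a bin index in 0..n_bins-1 (uniform bins, assumed)."""
    x = min(max(x, low), high)                      # clip to the assumed range
    idx = int((x - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)                     # x == high falls into the last bin


def action_from_index(a, low, high, n_actions):
    """Map a discrete action index back to an evenly spaced continuous action."""
    return low + a * (high - low) / (n_actions - 1)


def make_q_table(n_states, n_actions):
    """Q-table as a list of rows: one row per discrete state, one entry per action."""
    return [[0.0] * n_actions for _ in range(n_states)]


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.99):
    """Tabular Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])
    return q[s][a]


# Hypothetical usage: state in [0, 1] split into 10 bins, 3 actions in [-1, 1].
q = make_q_table(n_states=10, n_actions=3)
s = discretize(0.27, 0.0, 1.0, 10)
s_next = discretize(0.95, 0.0, 1.0, 10)
u = action_from_index(1, -1.0, 1.0, 3)              # continuous action applied to the system
new_value = q_update(q, s, 1, reward=1.0, s_next=s_next)
```

With uniform bins the table grows as the product of the bin counts over all state dimensions, which is one reason the passage lists function approximation as an alternative.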