Owing to the rapidly growing volume of road traffic, the number of daily road accidents is also rising at an alarming rate. In India, one death occurs every 4 minutes due to road accidents. One of the foremost causes of these accidents is poor road conditions. In the present work, reinforcement learning is used to classify road quality. Reinforcement learning is a branch of machine learning in which an agent takes actions so as to maximize a reward. It differs from supervised learning in that there is no labeled training dataset; instead, the agent learns from its own experience and decides which actions to perform for the given task in order to achieve the maximum reward. We use Q-learning, a reinforcement learning algorithm that finds the best action for the current state of an environment. It explores by choosing actions at random and aims to maximize the reward. An environment is created in which the policy, actions, and rewards for classifying road structure quality are defined. The dataset has 15 features and 5 classes, on the basis of which a policy is defined that determines the next action to take given the state. Training and testing are performed with an 80:20 split. The accuracy obtained is 93.62% for Q-learning and 93.74% for enhanced Q-learning. A confusion matrix is computed, along with precision, recall, F1 score, and support. The accuracies obtained for training-to-testing ratios of 60:40, 70:30, 80:20, and 90:10 are 93.19%, 91.06%, 91.51%, and 91.86%, respectively. The average accuracy for the training and testing model is 91.86%.
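The way Q-learning can be applied to a classification task may be illustrated with a minimal sketch. It is not the implementation used in this work; it rests on several assumptions that the abstract does not specify: each sample is reduced to a discrete state index (the mapping from the 15 features to a state is hypothetical), each action is a predicted class, the reward is +1 for a correct class and -1 otherwise, each sample forms a one-step episode, and the values of `alpha`, `gamma`, and `epsilon` are arbitrary.

```python
import random

N_CLASSES = 5  # the dataset described above has 5 classes


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    next_state=None marks a terminal step (no future value)."""
    best_next = max(q[next_state]) if next_state is not None else 0.0
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: random class with probability epsilon, else best known."""
    if rng.random() < epsilon:
        return rng.randrange(N_CLASSES)
    row = q[state]
    return row.index(max(row))


def train(samples, n_states, episodes=50, epsilon=0.1, seed=0):
    """samples: list of (state_index, class_label) pairs.
    Assumption: reward is +1 for a correct class, -1 otherwise."""
    rng = random.Random(seed)
    q = [[0.0] * N_CLASSES for _ in range(n_states)]
    for _ in range(episodes):
        order = samples[:]
        rng.shuffle(order)
        for state, label in order:
            action = choose_action(q, state, epsilon, rng)
            reward = 1.0 if action == label else -1.0
            q_update(q, state, action, reward, None)  # one-step episode
    return q


def predict(q, state):
    """Greedy policy: the class with the highest learned Q-value."""
    row = q[state]
    return row.index(max(row))


if __name__ == "__main__":
    # Hypothetical toy data: five states, each belonging to a different class.
    toy = [(s, s) for s in range(5)]
    q_table = train(toy, n_states=5)
    print([predict(q_table, s) for s in range(5)])
```

In this sketch the learned Q-table plays the role of the policy: after training, the predicted class for a state is the action with the largest Q-value. Because every episode has a single step, the discount factor `gamma` has no effect here; it would matter only if classification were framed as a multi-step task.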