Reinforcement learning is a branch of machine learning in which an agent learns to solve problems by trial and error. The agent interacts with an environment and attempts to achieve a multi-step goal within it. The environment is characterized by a state, which the agent observes. In turn, the agent's actions change the state of the environment. As the agent moves closer to its goal, it receives reward signals, which it uses to determine which actions were successful and which were not. This cycle of state, action, and reward repeats until the agent has learned to operate effectively within the environment. The agent's main objective is to learn to choose, in any state of the environment, the action that brings it closer to its goal. In this paper, we survey the methods used in the literature to solve reinforcement learning problems, including multi-armed bandits, Markov decision processes, Monte Carlo methods, dynamic programming, and temporal-difference learning. The paper is organized as follows. After this introduction, the second section discusses the reinforcement learning methods and techniques used in the literature. The third section covers deep reinforcement learning techniques. The fourth section summarizes the reinforcement learning and deep reinforcement learning algorithms. Finally, we close the article with a discussion and a conclusion.
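
As an illustration of the trial-and-error loop described above, the following minimal sketch trains a tabular Q-learning agent (one temporal-difference method) on a hypothetical five-cell corridor. The environment, the reward of 1 at the goal, the function names (step, choose_action, train), and the hyperparameter values are all assumptions chosen for this example; they are not taken from the surveyed literature.

```python
import random

N_STATES = 5          # assumed corridor with cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # index 0 moves left, index 1 moves right


def step(state, action):
    """Hypothetical environment: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    # Break ties at random so the untrained agent does not always go left.
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Run the state-action-reward loop and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = choose_action(q, state, epsilon, rng)           # agent acts
            next_state, reward, done = step(state, ACTIONS[a])  # environment responds
            # Temporal-difference target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])       # learn from the reward
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-goal cells: index of the highest-valued action.
policy = [max(range(len(ACTIONS)), key=lambda a: q_table[s][a])
          for s in range(N_STATES - 1)]
```

Under these assumptions, the greedy policy read from the learned table selects "move right" (action index 1) in every non-goal cell, which is the behavior that leads the agent to its goal.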