Designing trading strategies that maximize profit by tracking and responding to dynamic stock market variations is a complex task. This paper proposes using a multilayer perceptron (MLP), a class of artificial neural network (ANN), as the basis for deep reinforcement learning strategies that learn to predict and analyze stock market products with the aim of maximizing profit. We trained a deep reinforcement learning agent using four algorithms: proximal policy optimization (PPO), deep Q-network (DQN), deep deterministic policy gradient (DDPG), and advantage actor-critic (A2C). The proposed system, comprising these algorithms, was tested on real-time stock data for two products: the Dow Jones Industrial Average (DJIA) index and Qualcomm shares. The performance of the agent under each algorithm was evaluated, compared, and analyzed using the Sharpe ratio, Sortino ratio, skewness, and kurtosis. Based on these metric values, we identified the algorithm that maximizes profit for each financial product. We also extended the same approach to assess the predictive performance of the algorithms when trading under a highly volatile scenario, namely the coronavirus disease 2019 (COVID-19) pandemic.
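To make the four evaluation metrics concrete, the following is a minimal sketch of how they could be computed from a series of periodic returns. It is not the implementation used in this work. The function names, the annualization factor of 252 trading days, the zero risk-free rate, and the zero target return are all assumptions made for illustration.

```python
import numpy as np

TRADING_DAYS = 252  # assumed annualization factor for daily returns


def sharpe_ratio(returns, risk_free=0.0):
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = np.asarray(returns, dtype=float) - risk_free
    return np.sqrt(TRADING_DAYS) * excess.mean() / excess.std(ddof=1)


def sortino_ratio(returns, target=0.0):
    """Like the Sharpe ratio, but penalizes only returns below the target."""
    excess = np.asarray(returns, dtype=float) - target
    downside = np.minimum(excess, 0.0)  # keep shortfalls, zero out gains
    downside_dev = np.sqrt(np.mean(downside ** 2))
    return np.sqrt(TRADING_DAYS) * excess.mean() / downside_dev


def skewness(returns):
    """Third standardized moment (population form); 0 for a symmetric sample."""
    r = np.asarray(returns, dtype=float)
    z = (r - r.mean()) / r.std()
    return np.mean(z ** 3)


def excess_kurtosis(returns):
    """Fourth standardized moment minus 3, so a normal distribution gives 0."""
    r = np.asarray(returns, dtype=float)
    z = (r - r.mean()) / r.std()
    return np.mean(z ** 4) - 3.0


if __name__ == "__main__":
    # Hypothetical daily returns, used only to exercise the functions.
    daily = [0.012, -0.004, 0.007, -0.015, 0.003, 0.009, -0.002]
    print("Sharpe:  ", sharpe_ratio(daily))
    print("Sortino: ", sortino_ratio(daily))
    print("Skew:    ", skewness(daily))
    print("Kurtosis:", excess_kurtosis(daily))
```

Under these assumptions, higher Sharpe and Sortino ratios indicate better risk-adjusted returns, while skewness and kurtosis describe the asymmetry and tail weight of the return distribution. Note that the Sortino ratio is undefined when no return falls below the target, because the downside deviation is then zero.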