This study proposes different machine learning-based solutions for both single-agent and multi-agent systems, implemented on a 2-D simulation platform, namely Robocode. This dynamic and programmable platform allows agents to interact with the environment and with each other by employing a variety of battling strategies. Q-Learning is one of the most popular machine learning-based solutions applied to such problems. However, the control problem becomes harder, especially in continuous spaces. One of the main difficulties of reinforcement learning (RL) is designing an appropriate reward function: for simple tasks the function can be described with only a few parameters, whereas for more complex tasks specifying a reward that captures the goal may be challenging. Recent studies show that neural network-based approaches can handle these challenges and learn control strategies from 2-D or 1-D data. Beyond these problems of RL algorithms for single robots, once the number of robots increases and the system must behave as a multi-agent system, the overall design requirements become more complex. Accordingly, the proposed system is validated on different battle scenarios, in which the performance of the Q-Learning-based system is compared with that of the supervised learning techniques. Results indicate that the artificial neural network (ANN)-based approach outperforms the other methods.