Abstract-Formation control is an important subtask for autonomous robots. From flying drones to swarm robotics, many applications require agents to coordinate their group behavior. Especially when robots move autonomously in human-robot teams, motion and formation control of a group of agents is a critical and challenging task. In this work, we propose a method for applying the GQ(λ) reinforcement learning algorithm to a leader-follower formation control scenario on the e-puck robot platform. To enable control via classical reinforcement learning, we show how we modeled the formation control problem as a Markov decision process. This allows us to use the Greedy-GQ(λ) algorithm to learn a leader-follower control law. The applicability and performance of this control approach are investigated in simulation as well as on real robots. In both experiments, the followers are able to move behind the leader. Additionally, the algorithm improves the smoothness of the follower's path online, which is beneficial in the context of human-robot interaction.