Board games such as chess are an excellent testbed for human–robot interaction, and advances made there can carry over to broader human–robot cooperation systems. This paper presents a chess-playing robotic system that demonstrates controlled pick-and-place operations using a 3-DoF manipulator with image and speech recognition. The system identifies the coordinates of the chessboard squares through image processing and centroid detection, then maps them onto the physical board. The user's voice input is transcribed into a string from which the system extracts the current and destination locations of a chess piece; the speech recognition achieves a word error rate of 8.64%. Using an inverse-kinematics algorithm, the system calculates the joint angles needed to position the end effector at the desired coordinates and actuates the robot accordingly. The system was evaluated experimentally on the 3-DoF manipulator, with a voice command directing the robot to grasp a chess piece. The evaluation covered both moving the player's own pieces and capturing an opponent's piece, in which case the captured piece is moved outside the board workspace.
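
As an illustration of the step that extracts the current and destination locations from the transcribed string, the following is a minimal sketch and not the implementation used in this work. It assumes that the transcript names squares in algebraic notation (for example "e2" and "e4") and that the first square mentioned is the source. The names `SQUARE`, `extract_move`, and `square_to_indices` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: a file letter a-h followed by a rank digit 1-8,
# optionally separated by one space (e.g. "e2" or "e 2").
SQUARE = re.compile(r"\b([a-h])\s?([1-8])\b", re.IGNORECASE)


def extract_move(transcript: str) -> Optional[Tuple[str, str]]:
    """Return (source, destination) squares, or None if fewer than two are found."""
    squares = [f"{file}{rank}".lower() for file, rank in SQUARE.findall(transcript)]
    if len(squares) < 2:
        return None
    # Assumption: the first square spoken is the source, the second the destination.
    return squares[0], squares[1]


def square_to_indices(square: str) -> Tuple[int, int]:
    """Map a square such as 'e2' to zero-based (column, row) board indices."""
    return ord(square[0]) - ord("a"), int(square[1]) - 1
```

In a pipeline of this kind, the indices returned by `square_to_indices` could be used to look up the detected centroid of each square, which would then serve as the target position for the inverse-kinematics step.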