Quantum computing is expected to provide promising new approaches to some of the most challenging problems in materials science, communication, search, machine learning, and other domains. However, because of decoherence and gate imperfections, present-day quantum computers constitute a complex, dynamic, uncertain, and fluctuating computational environment. We develop an autonomous agent that interacts effectively with such an environment to solve problems in magnetism. Using reinforcement learning, the agent is trained through self-play on quantum devices to find the best possible approximation of the ground state of a spin Hamiltonian. We show that the agent can learn the entanglement needed to imitate the ground state of the quantum spin dimer. The experiments were conducted on quantum computers provided by IBM. To compensate for decoherence, we use a local spin correction procedure derived from a general sum rule for the spin-spin correlation functions of a quantum system with an even number of antiferromagnetically coupled spins in its ground state. Our study paves the way for a new family of neural network eigensolvers for quantum computers.
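
For orientation, the target state and the kind of sum rule mentioned above can be checked classically for the smallest case. The sketch below is not the agent or the correction procedure itself. It assumes a Heisenberg form for the dimer Hamiltonian, H = J S_1·S_2 with J > 0, which is not stated in this abstract, and it assumes the sum rule takes the singlet form sum_j <S_i^z S_j^z> = 0, which holds whenever the ground state has zero total spin. All variable and function names are illustrative.

```python
import numpy as np

# Spin-1/2 operators (hbar = 1).
sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
identity = np.eye(2, dtype=complex)


def two_site(op_a, op_b):
    """Operator acting with op_a on spin 1 and op_b on spin 2."""
    return np.kron(op_a, op_b)


def expectation(op, state):
    """Real part of <state| op |state> for a normalized state vector."""
    return float(np.real(np.vdot(state, op @ state)))


# Assumption: antiferromagnetic Heisenberg dimer, H = J * S_1 . S_2 with J > 0.
J = 1.0
hamiltonian = J * sum(two_site(s, s) for s in (sx, sy, sz))

# Exact diagonalization; eigh returns eigenvalues in ascending order.
energies, states = np.linalg.eigh(hamiltonian)
ground_energy = energies[0]
ground_state = states[:, 0]

# Spin-spin correlation between the two sites and the on-site term for spin 1.
corr_zz = expectation(two_site(sz, sz), ground_state)
local_zz = expectation(two_site(sz @ sz, identity), ground_state)

# Assumed singlet sum rule: <S_1^z S_1^z> + <S_1^z S_2^z> = 0.
sum_rule_residual = local_zz + corr_zz

print(f"ground-state energy: {ground_energy:.4f}")
print(f"<S1z S2z>:           {corr_zz:.4f}")
print(f"sum-rule residual:   {sum_rule_residual:.2e}")
```

On a noisy device the measured correlations would generally violate such a rule, and the size of the violation is the kind of quantity a correction procedure could use. How the correction is actually constructed is not specified in this abstract.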