An experiment is performed to reconstruct an unknown photonic quantum state with a limited amount of copies. A semiquantum reinforcement learning approach is employed to adapt one qubit state, an “agent,” to an unknown quantum state, an “environment,” by successive single‐shot measurements and feedback, in order to achieve maximum overlap. The experimental learning device herein, composed of a quantum photonics setup, can adjust the corresponding parameters to rotate the agent system based on the measurement outcomes “0” or “1” in the environment (i.e., reward/punishment signals). The results show that, when assisted by such a quantum machine learning technique, fidelities of the deterministic single‐photon agent states can achieve over 88% under a proper reward/punishment ratio within 50 iterations. This protocol offers a tool for reconstructing an unknown quantum state when only limited copies are provided, and can also be extended to higher dimensions, multipartite, and mixed quantum state scenarios.