Machine learning techniques based on artificial neural networks have been successfully applied to many problems in science. One of the most interesting domains of machine learning, reinforcement learning, is naturally suited to optimization problems in physics. In this work we use deep reinforcement learning and Chopped Random Basis optimization to solve an optimization problem based on the insertion of an off-center barrier in a quantum Szilard engine. We show that, using designed protocols for the time dependence of the barrier strength, we can achieve an equal splitting of the wave function (probability 1/2 of finding the particle on either side of the barrier) even for an asymmetric Szilard engine, in such a way that no information is lost when measuring on which side the particle is found. This implies that the asymmetric non-adiabatic Szilard engine can operate with the same efficiency as the traditional Szilard engine with adiabatic insertion of a central barrier. We compare the two optimization methods and demonstrate the advantage of reinforcement learning in constructing robust and noise-resistant protocols.
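
As a rough illustration of why equal splitting matters, the information gained by measuring on which side the particle is found can be quantified by the Shannon entropy of the two-outcome measurement, which reaches its maximum of one bit only when both outcomes have probability 1/2. The sketch below assumes this standard information-theoretic reading; the function name `outcome_entropy_bits` and the sample probabilities are hypothetical choices for illustration and are not taken from this work.

```python
import math


def outcome_entropy_bits(p_left: float) -> float:
    """Shannon entropy (in bits) of the left/right measurement outcome.

    p_left is the probability of finding the particle to the left of
    the barrier; the right-side probability is 1 - p_left.
    (Hypothetical helper, introduced only for this illustration.)
    """
    if not 0.0 <= p_left <= 1.0:
        raise ValueError("p_left must lie in [0, 1]")
    entropy = 0.0
    for p in (p_left, 1.0 - p_left):
        if p > 0.0:  # the term 0 * log(0) is taken as 0
            entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    # Illustrative splitting probabilities, not results from the paper.
    for p_left in (0.5, 0.3, 0.1):
        bits = outcome_entropy_bits(p_left)
        print(f"p_left = {p_left:.1f}: {bits:.3f} bits")
```

Any unequal splitting yields less than one bit per measurement, which is one way to read the statement that a protocol restoring the 1/2 splitting loses no information.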