Three-dimensional (3D) charge-trap-based solid-state drives (SSDs) have emerged as a storage solution in recent years. One-shot programming in 3D charge-trap-based SSDs can maximize system input/output (I/O) throughput, but at the cost of degraded Quality-of-Service (QoS) performance. This paper proposes reinforcement-learning-based one-shot programming (RLOSP), a reinforcement learning approach that improves the QoS performance of 3D charge-trap-based SSDs. By learning the I/O patterns of the workload environment as well as the internal status of the device, the proposed approach chooses which requests in the device queue to serve and allocates physical addresses for these requests during one-shot programming. In this manner, the storage device delivers improved QoS performance. Experimental results show that the proposed approach reduces the worst-case latency at the 99.9th percentile by 37.5% to 59.2% while delivering optimal system I/O throughput.