Server load levels affect the performance of cloud task execution, which is rooted in the impact of server performance on cloud task execution. Traditional cloud task scheduling methods usually only consider server load without fully considering the server’s real-time load-performance mapping relationship, resulting in the inability to evaluate the server’s real-time processing capability accurately. This deficiency directly affects the efficiency, performance, and user experience of cloud task scheduling. Firstly, we construct a performance platform model to monitor server real-time load and performance status information in response to the above problems. In addition, we propose a new deep reinforcement learning task scheduling method based on server real-time performance (SRP-DRL). This method introduces a real-time performance-aware strategy and adds status information about the real-time impact of task load on server performance on top of considering server load. It enhances the perception capability of the deep reinforcement learning (DRL) model in cloud scheduling environments and improves the server’s load-balancing ability under latency constraints. Experimental results indicate that the SRP-DRL method has better overall performance regarding task average response time, success rate, and server average load variance compared to Random, Round-Robin, Earliest Idle Time First (EITF), and Best Fit (BEST-FIT) task scheduling methods. In particular, the SRP-DRL is highly effective in reducing server average load variance when numerous tasks arrive within a unit of time, ultimately optimizing the performance of the cloud system.