When a server cluster processes concurrent task requests without fully accounting for the performance differences among its servers, tasks are allocated unreasonably, which increases the task makespan and lowers cluster resource utilization. Load balancing, one of the core technologies of server clusters, evens out the load across servers by using an algorithm to assign tasks to servers before the tasks are processed. To address this problem, this paper proposes a dynamic load balancing algorithm based on optimal matching of a weighted bipartite graph. First, a bipartite graph is constructed with servers and tasks as vertices. The management server collects the load indicator of each server in the cluster in real time, using each server's real-time processing rate as the indicator. The edges of the bipartite graph are determined by comparing the expected completion time of each task with the load of each server. The degree of matching between each task amount and each server's load capacity defines the weight matrix of the edges, which turns the bipartite graph into a weighted bipartite graph. The Kuhn-Munkres algorithm is then used to solve for the optimal matching of the weighted bipartite graph, and tasks are assigned to servers according to that matching. The proposed algorithm fully accounts for the differences among task amounts and among server load capacities. Comparison experiments on an example server cluster demonstrate that the algorithm achieves load balancing across the cluster and improves its resource utilization, while offsetting the extra time overhead that the algorithm itself introduces.
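
To make the matching step concrete, the following is a minimal Python sketch of Kuhn-Munkres assignment on a task-server weight matrix. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the function names `match_weights` and `km_max_matching` are hypothetical, and the weight used here (the negated expected completion time, task amount divided by processing rate) is only a stand-in for the matching degree, whose exact definition is not given in this abstract. The sketch also treats the bipartite graph as complete, so it does not model the edge-selection step, and it assumes there are no more tasks than servers in a scheduling round.

```python
def match_weights(task_amounts, processing_rates):
    """Build a task-by-server weight matrix.

    Assumption for illustration only: weight = -(task amount / processing rate),
    so a larger weight means a shorter expected completion time.
    """
    return [[-amount / rate for rate in processing_rates]
            for amount in task_amounts]


def km_max_matching(weights):
    """Kuhn-Munkres (Hungarian) algorithm, O(n^2 * m), maximizing total weight.

    weights[i][j] is the weight of assigning task i to server j.
    Returns a list where entry i is the server index matched to task i.
    Requires len(weights) <= len(weights[0]).
    """
    n, m = len(weights), len(weights[0])
    if n > m:
        raise ValueError("this sketch needs at least as many servers as tasks")

    # Maximizing weight is the same as minimizing the negated weight.
    cost = [[-w for w in row] for row in weights]
    inf = float("inf")

    u = [0.0] * (n + 1)   # potentials (labels) of tasks, 1-based
    v = [0.0] * (m + 1)   # potentials (labels) of servers, 1-based
    p = [0] * (m + 1)     # p[j] = task currently matched to server j (0 = none)
    way = [0] * (m + 1)   # back-pointers along the alternating path

    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Adjust the labels so that at least one new tight edge appears.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:      # reached a free server: augmenting path found
                break
        # Flip the matching along the augmenting path.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break

    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    return assignment


# Hypothetical example: three task amounts and three server processing rates.
weights = match_weights([40, 10, 25], [5, 20, 10])
print(km_max_matching(weights))  # [1, 0, 2]
```

In this example the largest task (amount 40) is matched to the fastest server (rate 20) and the smallest task to the slowest server, which is the pairing that minimizes the summed expected completion time under the assumed weight. A production system could replace `km_max_matching` with a library routine such as SciPy's `linear_sum_assignment`, which solves the same assignment problem.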