To address the increased response delay of microservice clusters caused by the exponential growth in the number of Internet users, a load balancing algorithm suitable for microservice clusters is proposed. The algorithm introduces dynamic weights into the least active number algorithm. When the load balancer receives a user request, it selects the server with the smallest active number to execute the request. If multiple servers share the least active number, the CRITIC method is used to recalculate the weights of six indicators for those servers: memory size, memory usage, number of processor cores, processor usage, disk size, and disk usage. The weighted indicators are combined to obtain a real-time performance quantification value for each server, and the server with the best value is selected to serve the request. Experimental results show that, compared with the unimproved least active number algorithm, the proposed algorithm can reduce the overall response delay of the microservice cluster.