Cloud services are widely used in manufacturing, logistics, digital applications, and document processing. They must handle tens of thousands of concurrent requests, let servers seamlessly provide the load-balancing capacity that incoming application traffic requires, and allow users to obtain information quickly and accurately. Earlier studies proposed using static load balancing or server response times to evaluate load-balancing capacity, but both approaches cause servers to be loaded unevenly. This study uses a dynamic annexed balance method to solve this problem.