Recent years have seen rapid growth in online services that rely on large-scale server clusters to handle high volumes of requests. Such clusters must adaptively control the CPU utilization of many processors to maintain the desired soft real-time performance and prevent system overload in the face of unpredictable workloads. This paper presents DUC-LB, a novel distributed utilization control algorithm for cluster-based soft real-time applications. What distinguishes DUC-LB from earlier work on utilization control is its ability to handle the system dynamics caused by load balancing, a common and essential component of most clusters today.