Recent collapses of SIP servers (e.g., the Skype outage) indicate that the built-in SIP overload control mechanism cannot mitigate overload effectively. We introduce our analytical approach by investigating an overloaded tandem server scenario. Our analytical model: (1) considers the general case in which both the arrival rate and the service rate of signaling messages are generic random processes; (2) analyzes the departure processes in detail; (3) allows us to run fluid-based simulations to observe and analyze SIP system performance under specific scenarios. This approach is much faster than event-driven simulation, which must track thousands of retransmission timers for outstanding messages and may crash the simulator when computing resources are limited. Our numerical results lead to a counterintuitive conclusion: a SIP system with a large buffer may remain overloaded and suffer long queuing delays after a short demand burst or a temporary server slowdown. A small buffer, on the other hand, can mitigate overload quickly by rejecting a large portion of the requests in a demand burst, so that the system resumes normal operation after a short period of time. Furthermore, the numerical results demonstrate that overload at a downstream server may propagate or migrate to its upstream servers and thereby cause widespread server crashes in a real SIP network.
ACM Reference Format: Hong, Y., Huang, C., and Yan, J. 2011. Modeling and simulation of SIP tandem server with finite buffer. ACM Trans. Model.