Fog computing has recently emerged as an extension of cloud computing in providing high-performance computing services for delay-sensitive Internet of Things (IoT) applications. By offloading tasks to a geographically proximal fog computing server instead of a remote cloud, the delay performance can be greatly improved. However, some IoT applications may still experience considerable delays, including queuing and computation delays, when huge amounts of tasks instantaneously feed into a resource-limited fog node. Accordingly, the cooperation among geographically close fog nodes and the cloud center is desired in fog computing with the ever-increasing computational demands from IoT applications. This paper investigates a workload allocation scheme in an IoT–fog–cloud cooperation system for reducing task service delay, aiming at satisfying as many as possible delay-sensitive IoT applications’ quality of service (QoS) requirements. To this end, we first formulate the workload allocation problem in an IoT-edge-cloud cooperation system, which suggests optimal workload allocation among local fog node, neighboring fog node, and the cloud center to minimize task service delay. Then, the stability of the IoT-fog-cloud queueing system is theoretically analyzed with Lyapunov drift plus penalty theory. Based on the analytical results, we propose a delay-aware online workload allocation and scheduling (DAOWA) algorithm to achieve the goal of reducing long-term average task serve delay. Theoretical analysis and simulations have been conducted to demonstrate the efficiency of the proposal in task serve delay reduction and IoT-fog-cloud queueing system stability.