An intelligent reflecting surface (IRS) is an emerging technology that can reconfigure the wireless channel via tunable passive signal reflection and thereby enhance the spectral and energy efficiency of wireless networks cost-effectively. In this paper, we study an IRS-aided multiuser multiple-input single-output (MISO) wireless system and adopt two-timescale (TTS) transmission to reduce the signal processing complexity and channel training overhead compared with existing schemes based on instantaneous channel state information (I-CSI), while at the same time exploiting multiuser channel diversity in transmission scheduling. Specifically, the long-term passive beamforming (i.e., the IRS phase shifts) is designed based on the statistical CSI (S-CSI) of all links, while the short-term active beamforming (i.e., the transmit precoding vectors at the access point (AP)) is designed to cater to the I-CSI of all users' reconfigured channels under the optimized IRS phase shifts. We aim to minimize the average transmit power at the AP, subject to the users' individual quality-of-service (QoS) constraints on the achievable long-term average rate. The formulated stochastic optimization problem is nonconvex and difficult to solve because the long-term and short-term design variables are intricately coupled in the QoS constraints. To tackle this problem, we propose an efficient algorithm, called primal-dual decomposition based TTS joint active and passive beamforming (PDD-TJAPB), in which the original problem is decomposed into a long-term passive beamforming problem and a family of short-term active beamforming problems, and the deep unfolding technique is employed to extract gradient information from the short-term problems to construct a convex surrogate problem for the long-term problem.
We show that both the long-term and short-term problems can be solved efficiently, and we prove that the proposed algorithm converges almost surely to a stationary solution of the original problem. Simulation results demonstrate the advantages and effectiveness of the proposed algorithm compared with benchmark schemes.