Summary

Next‐generation networks are expected to support diverse latency‐sensitive services such as autonomous driving and telemedicine. Deploying these services on current hardware‐based networks is challenging. The emergence of network function virtualization (NFV) and software‐defined networking (SDN) makes network deployment flexible and agile. To meet strict latency constraints, virtual network function (VNF) deployment and resource allocation must be optimized jointly. In this paper, we consider the resource‐delay interdependency and formulate the joint VNF deployment and resource allocation problem as a linear programming (LP) problem. We propose a discrete spider monkey optimization (DSMO) algorithm to solve this problem; the algorithm imitates the fission‐fusion social behavior of spider monkeys during foraging. Its two‐phase decision update mechanism greatly increases the diversity of the spider monkey decision population and drives the algorithm toward a global search of the solution space. Evaluation results show that the proposed solution achieves superior performance in terms of latency, CPU utilization, and service acceptance rate.