This study focuses on the scheduling problem of heterogeneous unmanned surface vehicles (USVs) with obstacle avoidance pretreatment. The goal is to minimize the overall maximum completion time of USVs. First, we develop a mathematical model for the problem. Second, with obstacles, an A* algorithm is employed to generate a path between two points where tasks need to be performed. Third, three meta-heuristics, i.e., simulated annealing (SA), genetic algorithm (GA), and harmony search (HS), are employed and improved to solve the problems. Based on problem-specific knowledge, nine local search operators are designed to improve the performance of the proposed algorithms. In each iteration, three Q-learning strategies are used to select high-quality local search operators. We aim to improve the performance of meta-heuristics by using Q-learning-based local search operators. Finally, 13 instances with different scales are adopted to validate the effectiveness of the proposed strategies. We compare with the classical meta-heuristics and the existing meta-heuristics. The proposed meta-heuristics with Q-learning are overall better than the compared ones. The results and comparisons show that HS with the second Q-learning, HS + QL2, exhibits the strongest competitiveness (the smallest mean rank value 1.00) among 15 algorithms.