Intelligent metro systems produce massive amounts of passenger flow and traffic data every day, of which route, station, and operation data are important for optimizing train operation schemes. In this paper, we collect passenger flow information from the Shenzhen metro, analyze passenger flow patterns and their distribution characteristics using a data warehouse built on the Hadoop platform, and optimize the train operation scheme. Using dynamic passenger flow data, we develop an optimization model with train departure and dwell times as decision variables and with passenger waiting time, passenger ride time, train full-load ratio, and train operation balance as objectives. An improved parallel genetic algorithm (GA) that incorporates a simulated annealing algorithm (SAA) and an optimal individual retention strategy is used to search for the optimal solution. To verify the method, simulation experiments are conducted on the optimization model using real passenger flow and train operation data from the Shenzhen metro, and the simulation results are compared with the original plan.
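
To illustrate the general idea of combining a GA with simulated annealing and optimal individual retention, the following is a minimal serial sketch, not the implementation used in this paper. The function names (`toy_cost`, `ga_with_sa`), the stand-in objective, and all parameter values (population size, temperature schedule, mutation width) are assumptions chosen for illustration only. Parallel evaluation and the actual multi-objective passenger flow model are omitted.

```python
import math
import random


def toy_cost(x):
    # Hypothetical stand-in objective: squared distance from an arbitrary target.
    # It is NOT the waiting-time / ride-time / load-ratio / balance model.
    target = [3.0, 0.5, 3.0, 0.5]
    return sum((a - b) ** 2 for a, b in zip(x, target))


def ga_with_sa(cost, dim, bounds, pop_size=30, generations=200,
               t0=1.0, cooling=0.98, seed=0):
    """Minimize `cost` with a GA that uses SA-style acceptance and elitism."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=cost)
    history = [cost(best)]
    temp = t0
    for _ in range(generations):
        # Optimal individual retention: the best individual survives unchanged.
        new_pop = [best[:]]
        while len(new_pop) < pop_size:
            # Binary tournament selection of two parents.
            p1 = min(rng.sample(pop, 2), key=cost)
            p2 = min(rng.sample(pop, 2), key=cost)
            # Uniform crossover, then Gaussian mutation clipped to the bounds.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [min(hi, max(lo, g + rng.gauss(0.0, 0.1))) for g in child]
            # Metropolis rule: a worse child is kept with probability
            # exp(-delta / temp); otherwise the first parent is copied.
            delta = cost(child) - cost(p1)
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                new_pop.append(child)
            else:
                new_pop.append(p1[:])
        pop = new_pop
        temp *= cooling  # geometric cooling schedule (an assumed choice)
        best = min(pop, key=cost)
        history.append(cost(best))
    return best, history
```

Calling `ga_with_sa(toy_cost, dim=4, bounds=(0.0, 10.0))` returns the best vector found and the best cost per generation. Because the best individual is always retained, that cost sequence never increases, while the annealing step lets worse offspring enter the population early in the search when the temperature is high.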