High‐angle cameras are commonly used to collect trajectory data in transportation research. However, without refinement and validation, trajectory data obtained through video processing software may be unreliable, inaccurate, or incomplete. This paper addresses a critical issue in trajectory data acquisition and analysis: the research community still lacks a reliable and fully vetted trajectory dataset. Current practice for validating video‐based trajectories can be classified into indirect and direct methods. Indirect methods use algorithms to correct data anomalies efficiently without human intervention but may overlook detailed driving behaviors, whereas direct methods rely on meticulous manual verification to preserve data fidelity but are labor‐intensive and less scalable. The spatial‐temporal map (STMap) method offers an additional layer of verification to affirm the accuracy and reliability of trajectory data. To enhance its performance, a deep spatial‐temporal embedding model is proposed for trajectory instance segmentation on STMaps using a contrastive learning framework. Parity constraints at both the pixel and instance levels guide the deep neural network to learn embedding spaces that can be built on different backbone networks. The reconstructed Next Generation Simulation (NGSIM) highway trajectory dataset is thoroughly validated against manually processed ground truth, and the error‐free NGSIM data are refined into a reliable resource for transportation research, assessed in terms of car‐following behaviors, lane‐change frequency, consistency, and jerk values.