Event cameras have the potential to complement standard cameras in a range of visual tasks, especially in environments with changing illumination or in situations that require high temporal resolution. Herein, an event‐based stereo visual odometry (VO) system built on an adaptive time‐surface (TS) and a truncated signed distance function (TSDF), named T‐ESVO, is proposed. The system consists of three components: an event processing unit, a mapping unit, and a tracking unit. The event processing unit adopts a novel spatial–temporal adaptive TS that copes with different camera motions in various environments. The mapping unit uses a TSDF as the 3D representation of the environment and estimates depth from the global historical depth information contained in that representation. The tracking unit estimates the 6‐DoF pose through a 3D–2D registration method based on a left/right TS selection mechanism and a depth point selection mechanism. The system is evaluated on various datasets, and the experimental results show that T‐ESVO achieves good accuracy and robustness compared with other state‐of‐the‐art event‐based stereo VO systems.
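
For readers unfamiliar with the TS representation named above, the following is a minimal sketch of a conventional time surface with a fixed exponential decay, in which each pixel stores exp(-(t_ref - t_last) / tau) for its most recent event. It is not the spatial–temporal adaptive TS proposed here, whose formulation is not given in this abstract. The function name `time_surface`, the `(x, y, t)` event layout, and the decay constant `tau` are assumptions made for illustration only.

```python
import math


def time_surface(events, width, height, t_ref, tau=0.03):
    """Build a fixed-decay time surface (illustrative, not the adaptive TS).

    events: iterable of (x, y, t) tuples with t in seconds; polarity is ignored.
    Returns a height x width nested list. Each pixel holds
    exp(-(t_ref - t_last) / tau), where t_last is the timestamp of the most
    recent event at that pixel not later than t_ref. Pixels without such an
    event are set to 0.0.
    """
    last = [[None] * width for _ in range(height)]
    for x, y, t in events:
        # Keep only the latest event per pixel that is not in the future.
        if t <= t_ref and (last[y][x] is None or t > last[y][x]):
            last[y][x] = t
    return [
        [0.0 if t is None else math.exp(-(t_ref - t) / tau) for t in row]
        for row in last
    ]


# Hypothetical toy input: three events on a 2 x 2 sensor.
events = [(0, 0, 0.00), (1, 0, 0.03), (1, 1, 0.06)]
ts = time_surface(events, width=2, height=2, t_ref=0.06, tau=0.03)
```

With a fixed `tau`, fast camera motion tends to smear the surface and slow motion tends to leave it sparse, which is the kind of sensitivity an adaptive decay is meant to reduce.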