Time series data with missing values are ubiquitous in real applications due to various unforeseen faults during data generation, storage, and transmission. Time-Series Data Imputation (TSDI) is thus crucial to many temporal data analysis tasks. However, existing works usually consider only one of the following two issues: (1) intra-feature temporal dependency, and (2) inter-feature correlation, thereby overlooking the complex coupling information during imputation. To achieve more accurate TSDI, we design a novel imputation model called TABiG, which carefully preserves short-term, long-term, and inter-feature dependencies via attention mechanisms in a bi-directional architecture that reduces delay errors. Specifically, it leverages a GRU to model short-term temporal dependencies and hierarchically applies self-attention mechanisms to capture long-term temporal dependencies and inter-feature correlations. These self-attention mechanisms are nested in a bi-directional structure to alleviate the delay errors inherent to RNN-like structures. To train the model with higher generalization, a masking strategy that mimics various extreme real-world missing patterns, beyond simple random ones, is adopted to generate self-supervised learning tasks. Comprehensive experiments demonstrate that TABiG significantly outperforms state-of-the-art imputation counterparts. Complementary results and source code are available at https://github.com/Zhang2112105189/TABiG.
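To illustrate the idea behind the masking strategy, the following sketch contrasts simple point-wise random masking with a block-masking scheme that drops consecutive time steps of a feature, mimicking extreme real-world patterns such as sensor outages. This is a minimal illustration with assumed function names and parameters, not the exact procedure used by TABiG.

```python
import numpy as np

def random_mask(shape, miss_rate, rng):
    """Point-wise random missingness: each entry is dropped independently."""
    return rng.random(shape) < miss_rate

def block_mask(shape, miss_rate, max_block, rng):
    """Block missingness: consecutive time steps of one feature are dropped
    together, mimicking bursty faults beyond simple random missing."""
    T, D = shape
    mask = np.zeros(shape, dtype=bool)
    target = int(miss_rate * T * D)
    while mask.sum() < target:
        d = rng.integers(D)                 # pick a feature
        t = rng.integers(T)                 # pick a start time
        length = rng.integers(1, max_block + 1)
        mask[t:t + length, d] = True        # drop a contiguous block
    return mask

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))               # toy multivariate series (T=100, D=8)
m = block_mask(X.shape, 0.2, 10, rng)       # hide ~20% of entries in bursts
X_masked = np.where(m, np.nan, X)           # masked entries become self-supervised
                                            # imputation targets; X holds the labels
```

Training on such artificially masked entries lets the model learn to impute without any externally labeled ground truth.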