Graph neural networks (GNNs) have revolutionized drug discovery in chemistry and biology, enhancing efficiency and reducing resource demands. However, classical GNNs often struggle to capture long-range dependencies due to challenges like oversmoothing and oversquashing. Graph Transformers address these issues by employing global self-attention mechanisms that allow direct information exchange between any pair of nodes, enabling the modeling of long-range interactions. Nevertheless, Graph Transformers often struggle to capture the nuanced structural information encoded in graphs. To overcome these challenges, we introduce the CurvFlow-Transformer, a novel Graph Transformer model incorporating a curvature flow-based masked attention mechanism. By leveraging a topologically enhanced mask matrix, the attention layer can effectively detect subtle structural differences within graphs, balancing the focus between the global mutual information and the local structural details of molecules. The CurvFlow-Transformer demonstrates superior performance on the MoleculeNet data set, surpassing several state-of-the-art models across a variety of tasks. Moreover, the model provides unique insights into the relationship between molecular structure and chemical properties by analyzing the attention heat coefficients of individual atoms.
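The abstract does not specify how the curvature-based mask matrix is constructed, so the sketch below is only a minimal illustration of the general idea: self-attention whose scores receive an additive, topology-derived bias. It assumes a simplified edge-level Forman curvature, F(u, v) = 4 − deg(u) − deg(v), as the topological signal, and a hypothetical scaling parameter `alpha`; the names `forman_curvature` and `curvature_masked_attention` are illustrative, not the paper's API.

```python
import torch

def forman_curvature(adj: torch.Tensor) -> torch.Tensor:
    """Simplified edge-level Forman curvature for an unweighted graph:
    F(u, v) = 4 - deg(u) - deg(v). Returns an (n, n) matrix with
    zeros at non-edges. (Illustrative assumption, not the paper's
    exact curvature-flow construction.)"""
    deg = adj.sum(dim=1)
    curv = 4.0 - deg.unsqueeze(0) - deg.unsqueeze(1)
    return curv * adj  # keep curvature values only on existing edges

def curvature_masked_attention(x, adj, w_q, w_k, w_v, alpha=1.0):
    """Single-head self-attention with a curvature-derived additive bias.
    Non-edges receive no bias, so attention stays global while edges
    with distinctive curvature are emphasized or suppressed."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d = q.shape[-1]
    scores = (q @ k.transpose(-2, -1)) / d ** 0.5
    bias = alpha * forman_curvature(adj)  # topology-aware mask matrix
    attn = torch.softmax(scores + bias, dim=-1)
    return attn @ v, attn

# Toy usage: a 4-node path graph with 8-dimensional node features.
n, d = 4, 8
adj = torch.tensor([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=torch.float32)
x = torch.randn(n, d)
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
out, attn = curvature_masked_attention(x, adj, w_q, w_k, w_v)
```

Using the curvature as a soft additive bias rather than a hard -inf mask is one way to realize the balance the abstract describes: every node pair can still exchange information globally, while the bias steers attention toward locally distinctive structure.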