Low visibility is a major cause of serious traffic accidents worldwide, and visibility estimation remains a challenging problem despite extensive research in meteorology. We therefore propose a novel end-to-end network, FGS-Net, for visibility estimation, which combines "engineered features" and "learned features" to achieve higher accuracy. Specifically, we propose a novel and effective fog region segmentation method, Auto Seed Region Segmentation (ASRS), to segment the fog regions in the input image. Two "specific features" (the transmittance matrix and the dark channel matrix) and three "common features" (contrast, average gradient, and brightness) are then extracted from the fog region, and their statistics serve as the "engineered features" for visibility estimation. In addition, our approach uses the Transformer, an architecture widely adopted in Natural Language Processing (NLP), to obtain the "learned features"; to make these features more effective, we embed the Coordinate Attention (CA) module in FGS-Net. Finally, to verify the effectiveness and superiority of our method, we evaluate it on two visibility datasets: Visibility Image Dataset I (VID I), a real-scene dataset, and Visibility Image Dataset II (VID II), a synthetic dataset. Experimental results show that our method outperforms classical approaches on both datasets; compared with the runner-up, it achieves 2.2% and 0.9% higher accuracy on VID I and VID II, respectively.
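The features named above can be illustrated with a minimal NumPy sketch: the dark channel matrix follows the standard dark channel prior (per-pixel minimum over RGB, then a local-patch minimum), and the three common features are computed with conventional definitions (contrast as standard deviation, average gradient from finite differences, brightness as the mean). Function names, the patch size, and these exact formulas are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def dark_channel(img, patch=15):
    """Dark channel prior: min over RGB channels, then min over a local patch.

    `img` is an (H, W, 3) float array in [0, 1]; `patch` is an illustrative
    window size, not necessarily the one used by FGS-Net.
    """
    h, w, _ = img.shape
    min_rgb = img.min(axis=2)                      # per-pixel minimum over channels
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")     # replicate borders for the window
    dark = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            dark[i, j] = padded[i:i + patch, j:j + patch].min()
    return dark

def common_features(gray):
    """Contrast, average gradient, and brightness of a grayscale fog region.

    Conventional definitions assumed here: contrast = standard deviation,
    average gradient = mean magnitude of finite-difference gradients,
    brightness = mean intensity.
    """
    gy, gx = np.gradient(gray)
    avg_grad = float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))
    return {
        "contrast": float(gray.std()),
        "avg_gradient": avg_grad,
        "brightness": float(gray.mean()),
    }

# Example on a random image standing in for a segmented fog region.
rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
dc = dark_channel(img)                 # "specific feature": dark channel matrix
feats = common_features(img.mean(axis=2))  # three "common features"
```

Statistics of these matrices (e.g., their means over the segmented fog region) would then form the "engineered feature" vector fed to the estimator alongside the learned Transformer features.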