Daily inspection of ground cracks over mine goafs is essential for environmental protection and mining safety. However, traditional manual inspection is time-consuming, prone to misjudgement, and potentially dangerous, because goafs are usually located in remote regions and inevitably produce complicated landforms crisscrossed by gullies. Unmanned Aerial Vehicles (UAVs) are therefore adopted to capture aerial images of ground cracks, which provides a convenient means of goaf inspection. Although aerial images give a full view of the region containing cracks, the rugged terrain introduces abundant noise, such as shadows, cliffs and terraces, whose appearance in an image resembles that of cracks and may lead to misclassification. To overcome these obstacles, this paper proposes an improved semantic segmentation framework called MSI-FCN. First, a statistical pre-processing method removes useless patches before training in order to reduce training complexity. Next, a multi-scale input method integrates the patches with the output tensor of each stage of the FCN, enabling precise pixel-level crack recognition. Furthermore, a multi-scale connection module is constructed to select the most important contextual information from features at multiple scales and to help recover high-dimensional features. In particular, a statistically weighted softmax cross-entropy loss function is presented to classify crack pixels more rationally. Experimental results on three groups of images captured under various environmental conditions indicate that MSI-FCN outperforms FCN-8, FCD-56 and DeepCrack-GF. The proposed method also performs well on the CFD and DeepCrack datasets. Thus, the multi-scale information embedded in the proposed method provides more local detail from low-level features to high-level stages and improves recognition performance.
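
As an illustration of the statistically weighted softmax cross-entropy loss mentioned above, the following is a minimal NumPy sketch and not the formulation used in MSI-FCN. It assumes that the class weights are derived from inverse pixel frequencies in the label map, which is one common way to counter the imbalance between crack and background pixels; the weighting scheme and the function names `class_weights_from_labels` and `weighted_softmax_cross_entropy` are hypothetical and chosen only for this example.

```python
import numpy as np


def class_weights_from_labels(labels, num_classes=2):
    """Inverse-frequency class weights (an assumed scheme, not taken from the paper)."""
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(np.float64)
    freq = counts / counts.sum()
    # Classes absent from the label map receive zero weight.
    weights = np.where(freq > 0, 1.0 / np.maximum(freq, 1e-12), 0.0)
    return weights / weights.sum()  # normalise so the weights sum to 1


def weighted_softmax_cross_entropy(logits, labels, weights):
    """Class-weighted cross-entropy averaged over pixels.

    logits:  float array of shape (H, W, C)
    labels:  int array of shape (H, W) with values in [0, C)
    weights: float array of shape (C,)
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Log-probability assigned to the true class of each pixel.
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]
    # Per-pixel weight looked up from the class of that pixel.
    pixel_w = weights[labels]
    return float(-(pixel_w * picked).sum() / pixel_w.sum())
```

Under this assumed scheme, the rare crack class receives a larger weight than the background, so errors on crack pixels contribute more to the loss than errors on the far more numerous background pixels.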