Magnetic flux leakage (MFL) is a widely used nondestructive evaluation (NDE) method for inspecting pipelines to prevent potential long-term failures. During field testing, uncertainties can affect both the accuracy of the inspection and the decisions made about damage conditions, so identifying and quantifying them is essential to ensuring inspection reliability. This study focuses on the uncertainties that arise in the inverse NDE process from dynamic magnetization, which is affected by the relative motion between the MFL sensor and the material under test. Specifically, it investigates the uncertainty caused by sensor liftoff, which alters the output signal of the sensing system. Because the forward uncertainty propagation process is difficult to describe, this study compares two typical machine learning-based approximate Bayesian inference methods, Convolutional Neural Network (CNN) and Deep Ensemble (DE), for handling the input uncertainty in the MFL response data. In addition, to compensate for the scarcity of experimental training data, an Autoencoder built on a pre-trained model via transfer learning is used to augment the dataset. Prior knowledge learned from a large set of simulated MFL signals is used to fine-tune the Autoencoder, which speeds up generalization in the subsequent learning on experimental MFL data. The augmented data from the fine-tuned Autoencoder are then used for machine learning-based defect size classification. Prediction accuracy and calibrated uncertainty are analyzed to evaluate prediction performance and to reveal the relationship between liftoff uncertainty and prediction accuracy. 
Further, to strengthen the trustworthiness of the predictions, an uncertainty-guided decision-making process is applied to indicate how reliable each final prediction is. Overall, the proposed uncertainty quantification framework supports reliability assessment in MFL-based decision-making and inverse problems.
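
As a rough illustration of the uncertainty-guided decision step described above, the following minimal sketch averages the class probabilities produced by the members of a deep ensemble, scores each sample by the entropy of the averaged distribution, and flags high-entropy predictions for review. It is not the implementation used in this study: the function names, the array layout, and the `max_entropy` threshold are all hypothetical, and predictive entropy is only one possible uncertainty score.

```python
import numpy as np

def ensemble_mean(member_probs):
    """Average class probabilities over ensemble members.

    member_probs: array-like of shape (n_members, n_samples, n_classes),
    an assumed layout for the softmax outputs of each ensemble member.
    """
    return np.mean(np.asarray(member_probs, dtype=float), axis=0)

def predictive_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of class probabilities."""
    p = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=-1)

def uncertainty_guided_decision(member_probs, max_entropy):
    """Return predicted labels, entropies, and an accept/review mask.

    max_entropy is a hypothetical, user-chosen threshold: predictions
    whose entropy exceeds it are marked as not accepted.
    """
    mean_probs = ensemble_mean(member_probs)
    entropy = predictive_entropy(mean_probs)
    labels = np.argmax(mean_probs, axis=-1)
    accept = entropy <= max_entropy
    return labels, entropy, accept

# Toy example: 3 ensemble members, 2 samples, 3 defect-size classes.
toy_probs = [
    [[0.90, 0.05, 0.05], [0.8, 0.1, 0.1]],
    [[0.90, 0.05, 0.05], [0.1, 0.8, 0.1]],
    [[0.90, 0.05, 0.05], [0.1, 0.1, 0.8]],
]
labels, entropy, accept = uncertainty_guided_decision(toy_probs, max_entropy=0.7)
```

In this toy input the members agree on the first sample and disagree on the second, so the second sample receives the higher entropy and falls outside the acceptance threshold. In practice the threshold would need to be chosen from calibration results rather than fixed by hand.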