Load forecasting with behind‐the‐meter distributed energy resources (DERs) is more challenging because DERs transform the observed load patterns. Traditional forecasting methods, which are built only on unmasked‐load, are poorly suited to the limited masked‐load data now available. To bridge the divergence between unmasked‐load and masked‐load, this article proposes a masked‐load forecasting (MLF) method based on transfer learning and Bayesian optimization, named Maximum Mean Discrepancy‐Neural Network with Bayesian optimization (MMD‐NNb). First, common feature vectors shared by unmasked‐load and masked‐load are extracted, and an outcome predictor is established from the feature vectors of historical unmasked‐load. Because these features are common to both, the feature vectors from masked‐load can be passed to the same outcome predictor, and the masked‐load can be forecast. Because training involves many hyperparameters, Bayesian optimization is adopted to tune them. MMD‐NNb was tested and compared with four related models, and it showed improvements in all comparison scenarios. MMD‐NNb also proved highly resilient to different DERs and does not require additional DER data.
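To make the divergence measure concrete, the following is a minimal sketch of a squared Maximum Mean Discrepancy estimate between two sets of feature vectors. It is illustrative only and is not the article's implementation: the Gaussian kernel, the fixed bandwidth, the biased estimator, the synthetic data, and all names (`gaussian_kernel`, `mean_kernel`, `mmd2`, `sample`) are assumptions. How MMD enters the network's training objective is not stated in the abstract and is not modeled here.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(a, b, bandwidth):
    """Average kernel value over all pairs drawn from sets a and b."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in a for y in b)
    return total / (len(a) * len(b))


def mmd2(source, target, bandwidth=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors."""
    return (
        mean_kernel(source, source, bandwidth)
        + mean_kernel(target, target, bandwidth)
        - 2.0 * mean_kernel(source, target, bandwidth)
    )


def sample(rng, n, dim, mean):
    """Synthetic stand-in for extracted feature vectors (hypothetical data)."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
unmasked = sample(rng, 60, 4, 0.0)        # features from unmasked-load
masked_aligned = sample(rng, 60, 4, 0.0)  # masked-load features, same distribution
masked_shifted = sample(rng, 60, 4, 1.5)  # masked-load features, shifted distribution

mmd_aligned = mmd2(unmasked, masked_aligned, bandwidth=2.0)
mmd_shifted = mmd2(unmasked, masked_shifted, bandwidth=2.0)
print(f"aligned: {mmd_aligned:.4f}, shifted: {mmd_shifted:.4f}")
```

In this sketch, the estimate is small when the two feature sets follow the same distribution and grows when they diverge, which is the property that makes MMD a candidate criterion for judging whether features are common to unmasked‐load and masked‐load.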