Breast cancer is the most common cancer in the world. With a 5-year survival rate of over 90% for patients diagnosed at early disease stages, managing the side effects of breast cancer treatment has become a pressing issue. Observational, real-world data such as electronic health records, insurance claims, or data from wearable devices have the potential to support research on the quality of life (QoL) of breast cancer patients (BCPs), but care must be taken to avoid errors arising from poor data quality and bias. This paper proposes a causal inference methodology for using observational data to support research on the QoL of BCPs, focusing on osteopenia in patients undergoing treatment with aromatase inhibitors (AIs). We propose a machine learning-based pipeline to estimate the average and conditional average treatment effects (ATE and CATE). For evaluation, we develop a Structural Causal Model for osteopenia in BCPs and rely on synthetically generated data to study the effectiveness of the proposed methodology under various data challenges. A set of studies was designed to estimate the effect of high-intensity exercise on bone mineral density loss using synthetic datasets of BCPs under AI treatment. Four observational study scenarios were evaluated, corresponding to synthetically generated data of 1000 BCPs with (a) no bias, (b) sampling bias, (c) hidden confounder bias, and (d) bias due to an unobserved mediator. In all cases, evaluations were performed on both complete data and data with missing values. Machine learning models based on tree ensembles and neural networks reduced the estimation error by 23.8–51.3% for ATE and by 32.4–89.3% for CATE, compared to direct estimation using sample averages. The proposed approach is more effective at estimating treatment effects in the presence of missing values and sampling bias than a “traditional” statistical analysis workflow.
This suggests that the application of causal effect estimation methods for the study of BCPs’ quality of life using real-world data is promising and worth pursuing further.
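
The contrast between direct estimation from sample averages and model-based estimation of the ATE and CATE can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data-generating process, the variable names (`age`, `t`, `y`), and the coefficients are hypothetical and are not the Structural Causal Model developed in this paper. A pair of linear regressions (a simple "T-learner") stands in for the tree ensembles and neural networks used in the proposed pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000  # hypothetical cohort size

# Hypothetical data-generating process (NOT the paper's Structural Causal Model).
age = rng.normal(0.0, 1.0, n)                 # standardized confounder
p_exercise = 1.0 / (1.0 + np.exp(1.5 * age))  # older patients exercise less often
t = rng.binomial(1, p_exercise)               # 1 = high-intensity exercise
true_cate = 1.0 + 0.5 * age                   # assumed effect on BMD change
y = -2.0 * age + true_cate * t + rng.normal(0.0, 1.0, n)
true_ate = true_cate.mean()

# Direct estimation: difference of sample averages between the two groups.
# It ignores that age influences both exercise and the outcome.
ate_naive = y[t == 1].mean() - y[t == 0].mean()


def fit_linear(x, target):
    """Ordinary least squares fit of target ~ intercept + slope * x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def predict(coef, x):
    return coef[0] + coef[1] * x


# Model-based estimation: one outcome model per treatment arm.
coef_treated = fit_linear(age[t == 1], y[t == 1])
coef_control = fit_linear(age[t == 0], y[t == 0])

# CATE estimate per patient is the difference of the two predicted outcomes;
# the ATE estimate is its average over the whole cohort.
cate_hat = predict(coef_treated, age) - predict(coef_control, age)
ate_model = cate_hat.mean()
cate_rmse = np.sqrt(np.mean((cate_hat - true_cate) ** 2))

print(f"true ATE:        {true_ate:.3f}")
print(f"sample averages: {ate_naive:.3f}")
print(f"model-based:     {ate_model:.3f}")
print(f"CATE RMSE:       {cate_rmse:.3f}")
```

In this toy setting the confounder is observed, so adjusting for it removes most of the bias in the sample-average estimate. The sketch does not address hidden confounders, unobserved mediators, or missing values, which are the harder scenarios evaluated in this work.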