Mutations in the epidermal growth factor receptor (EGFR) are found in approximately 48% of Asian and 19% of Western patients with lung adenocarcinoma (LUAD) and lead to aggressive tumor growth. While tyrosine kinase inhibitors (TKIs) such as gefitinib and osimertinib target these mutations, treatment is often limited by metastasis and resistance. To address this, we developed physiologically based pharmacokinetic (PBPK) models for both drugs, simulating their distribution within the primary tumor and metastases following oral administration. Combined with a mechanistic, knowledge-based disease model of EGFR-mutated LUAD, these models allow us to predict the tumor’s behavior under treatment while accounting for the mutation-driven diversity among tumor cells. The combined model reproduces the drugs’ distribution within the body, as well as the effects of both gefitinib and osimertinib on EGFR-activation-induced signaling pathways. In addition, the disease model captures intratumoral heterogeneity by representing multiple subclones, each characterized by a unique mutation profile. This allows the model to reproduce clinical outcomes, including patient progression as defined by the RECIST criteria (version 1.1). The datasets used for calibration came from the NEJ002 and FLAURA clinical trials. The quality of the fit was assessed with visual predictive checks and statistical tests (comparison metrics computed from bootstrapped, weighted log-rank tests: 98.4% (NEJ002) and 99.9% (FLAURA) similarity). The model was also able to predict outcomes from an independent retrospective study comparing gefitinib and osimertinib that had not been used during model development. This output validation underscores the potential of mechanistic models to guide future clinical trials by comparing treatment efficacies and identifying the patients who would benefit most from specific TKIs.
Our work is a step towards a powerful tool for enhancing personalized treatment in LUAD. It could support the evaluation of treatment strategies and potentially reduce trial sizes, enabling more efficient and targeted therapeutic approaches. Following its consecutive prospective validations with the FLAURA2 and MARIPOSA trials (validation metrics computed from bootstrapped, weighted log-rank tests: 94.0% and 98.1%, respectively), the model could be used to generate a synthetic control arm.
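To make the progression endpoint concrete, the following is a minimal sketch of how simulated tumor sizes could be mapped to a RECIST 1.1 response category. It is not the model's implementation: the function names `recist_target_response` and `first_progression_index` are hypothetical, and the sketch covers target lesions only, ignoring non-target lesions and the lymph node short-axis rule. The thresholds are those of RECIST 1.1: progressive disease requires an increase of at least 20% and at least 5 mm in the sum of diameters relative to the nadir (or a new lesion), and partial response requires a decrease of at least 30% relative to baseline.

```python
from typing import Optional, Sequence


def recist_target_response(
    baseline_sum_mm: float,
    nadir_sum_mm: float,
    current_sum_mm: float,
    new_lesion: bool = False,
) -> str:
    """Classify target-lesion response with RECIST 1.1 thresholds.

    Simplified illustration (assumption): target lesions only.
    Sums are sums of lesion diameters in millimetres.
    """
    increase = current_sum_mm - nadir_sum_mm
    # Progressive disease: new lesion, or >= 20% and >= 5 mm above nadir.
    if new_lesion or (increase >= 5.0 and increase >= 0.2 * nadir_sum_mm):
        return "PD"
    # Complete response: all target lesions have disappeared.
    if current_sum_mm == 0:
        return "CR"
    # Partial response: >= 30% decrease relative to baseline.
    if current_sum_mm <= 0.7 * baseline_sum_mm:
        return "PR"
    return "SD"


def first_progression_index(sums_mm: Sequence[float]) -> Optional[int]:
    """Return the index of the first visit classified as PD, or None.

    sums_mm[0] is the baseline; the nadir is the smallest sum seen so far,
    baseline included.
    """
    baseline = sums_mm[0]
    nadir = baseline
    for i, current in enumerate(sums_mm[1:], start=1):
        if recist_target_response(baseline, nadir, current) == "PD":
            return i
        nadir = min(nadir, current)
    return None
```

Applied to each virtual patient's simulated sum of diameters over time, a rule of this kind yields a time to progression, which is the sort of quantity a log-rank comparison against trial data operates on.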