Aims: Models predicting mortality in heart failure (HF) patients are often limited in performance and applicability. The aim of this study was to develop a reliable algorithm for computing expected in-hospital mortality rates in HF cohorts at the population level from administrative data, comparing regression analysis with different machine learning (ML) models.

Methods and results: We identified inpatient cases with a primary discharge diagnosis of HF, coded according to the International Statistical Classification of Diseases and Related Health Problems (ICD-10), that were non-electively admitted to 86 German Helios hospitals between 1 January 2016 and 31 December 2018. The dataset was randomly split 75%/25% for model development and testing. Highly unbalanced variables were removed. Four ML algorithms were applied, and all algorithms were tuned using a grid search with multiple repetitions. Model performance was evaluated by computing the area under the receiver operating characteristic curve (AUC). In total, 59 125 cases (69.8% aged 75 years or older, 51.9% female) were investigated, and in-hospital mortality was 6.20%. In the testing dataset, all ML algorithms achieved a higher AUC than logistic regression: 0.829 [95% confidence interval (CI) 0.814-0.843] for logistic regression, 0.875 (95% CI 0.863-0.886) for random forest, 0.882 (95% CI 0.871-0.893) for gradient boosting machine, 0.866 (95% CI 0.854-0.878) for single-layer neural networks, and 0.882 (95% CI 0.872-0.893) for extreme gradient boosting. Brier scores indicated good calibration, especially for the latter three models.

Conclusions: We introduced reliable models that use ML algorithms to calculate expected in-hospital mortality based only on routine administrative data. Broad application could supplement quality measurement programmes and thereby improve future HF patient care.
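The evaluation steps named in the abstract (random 75%/25% split, AUC, Brier score, and expected versus observed mortality) can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the fixed seed, and the toy labels and probabilities are assumptions made purely for illustration, and the study's tuned models and confidence-interval method are not reproduced here.

```python
import random


def split_75_25(cases, seed=0):
    """Randomly split cases into 75% development and 25% testing (seed is an assumption)."""
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    cut = round(0.75 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) identity; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of cases that share the same score.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC requires both deaths and survivors")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def observed_and_expected_rate(labels, probs):
    """Observed mortality rate and expected rate (mean predicted probability) of a cohort."""
    n = len(labels)
    return sum(labels) / n, sum(probs) / n


# Toy data, invented for illustration: 1 = in-hospital death, 0 = survived.
toy_labels = [0, 0, 1, 1]
toy_probs = [0.1, 0.4, 0.35, 0.8]

development, testing = split_75_25(range(100))
print(len(development), len(testing))
print(roc_auc(toy_labels, toy_probs))
print(brier_score(toy_labels, toy_probs))
print(observed_and_expected_rate(toy_labels, toy_probs))
```

In this sketch the expected mortality rate of a cohort is simply the mean of the predicted per-case probabilities, which can then be set against the observed rate. That is one common way to use such a model for quality measurement, though the abstract does not specify how the expected rates are aggregated.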