Caffeine is by far the most ubiquitous psychostimulant worldwide, found in tea, coffee, cocoa, energy drinks, and many other beverages. It is almost completely metabolized in the liver by the cytochrome P450 enzyme system, mainly to paraxanthine, with theobromine and theophylline as additional products. Besides its stimulating properties, two important applications of caffeine are metabolic phenotyping of cytochrome P450 1A2 (CYP1A2) and liver function testing. An open challenge in this context is to identify the underlying causes of the large inter-individual variability in caffeine pharmacokinetics. Data are urgently needed to understand and quantify confounding factors such as lifestyle (e.g. smoking), drug-caffeine interactions (e.g. medication metabolized via CYP1A2), and disease. Here we report the first integrative and systematic analysis of data on caffeine pharmacokinetics from 147 publications and provide a comprehensive, high-quality data set on the pharmacokinetics of caffeine, caffeine metabolites, and their metabolic ratios in human adults. The data set is enriched by metadata on the characteristics of the studied patient cohorts and subjects (e.g. age, body weight, smoking status, health status), the applied interventions (e.g. dosing, substance, route of application), measured pharmacokinetic time courses, and pharmacokinetic parameters (e.g. clearance, half-life, area under the curve). We demonstrate via multiple applications how the data set can be used to solidify existing knowledge and gain new insights relevant for metabolic phenotyping and liver function testing based on caffeine.
Specifically, we analyzed (i) the alteration of caffeine pharmacokinetics with smoking and oral contraceptive use; (ii) drug-drug interactions with caffeine as possible confounding factors of caffeine pharmacokinetics or sources of adverse effects; (iii) the alteration of caffeine pharmacokinetics in disease; and (iv) the applicability of caffeine as a salivary test substance by comparison of plasma and saliva data. In conclusion, our data set and analyses provide important resources which could enable more accurate caffeine-based metabolic phenotyping and liver function testing.