Reliable and sensitive testing of physical function is crucial for assessing the effects of treatment or exercise intervention in various patient populations. The present study investigated the test–retest reliability and sensitivity (smallest detectable difference, SDD) of selected physical performance tests commonly used in clinical rehabilitation in adults with severe obesity who met the criteria for bariatric surgery. The tests covered habitual and maximal walking speed, walking endurance capacity, handgrip strength (HGS), and lower limb muscle power (Sit‐to‐Stand [STS] and stair climb). Thirty‐two adults (BMI 43.8 ± 6.6 kg/m²) were enrolled in the study. Participants were assessed in three test sessions performed at the same time of day (±2 h) and separated by 3 to 7 days. All tests demonstrated good‐to‐excellent inter‐session reproducibility (intraclass correlation coefficient [ICC]: 0.84–0.98; coefficient of variation [CV] and standard error of measurement [SEM]: 2.9%–11.3%), with individual sensitivity (SDD) ranging from 11.8% to 31.2%. Systematic learning effects from test session 1 to test session 2 were observed for the STS test and for the 3‐ and 10‐m habitual walk speed tests, manifested by increases of 6%–9%, 7%, and 3%, respectively (p < 0.05). Treating test session 1 as a familiarization session fully prevented these learning effects (test 2 vs. 3). A majority of the physical function tests showed improved reproducibility and sensitivity after the familiarization session. In conclusion, physical function can be assessed reliably in adults with severe obesity. Further, a familiarization session prior to actual testing improves test–retest reliability and increases sensitivity.
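
The abstract does not state how SEM and SDD were calculated. As an illustration only, the sketch below uses one common convention for paired test–retest data: SEM = SD of the between‐session differences / √2, and SDD = 1.96 × √2 × SEM, each also expressed as a percentage of the grand mean. The function name `retest_summary` and the example scores are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev


def retest_summary(session_a, session_b):
    """Summarize agreement between two test sessions of the same participants.

    Assumed convention (not confirmed by the abstract):
      SEM = SD(differences) / sqrt(2)
      SDD = 1.96 * sqrt(2) * SEM
    """
    # Per-participant change from the first to the second session.
    diffs = [b - a for a, b in zip(session_a, session_b)]
    # Grand mean across both sessions, used to express SEM and SDD in percent.
    grand_mean = mean(list(session_a) + list(session_b))

    sem = stdev(diffs) / math.sqrt(2)
    sdd = 1.96 * math.sqrt(2) * sem

    return {
        # Systematic change (e.g. a learning effect), relative to session A.
        "mean_change_pct": 100 * mean(diffs) / mean(session_a),
        "sem": sem,
        "sem_pct": 100 * sem / grand_mean,
        "sdd": sdd,
        "sdd_pct": 100 * sdd / grand_mean,
    }


# Hypothetical scores for four participants (illustrative values only).
session_2 = [10, 12, 14, 16]
session_3 = [11, 12, 15, 15]
summary = retest_summary(session_2, session_3)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under this convention the SDD is always about 2.77 times the SEM, so an individual's change has to exceed the SDD before it can be distinguished from measurement noise with 95% confidence.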