Decoherence, resulting from unwanted interaction between a qubit and its environment, poses a serious challenge to the development of quantum technologies. Recently, researchers have started analysing how real-time Hamiltonian learning approaches, based on estimating the qubit state faster than the environmental fluctuations, can be used to counteract decoherence. In this work, we investigate how the back-action of the quantum measurements used in the learning process can be harnessed to extend qubit coherence. We propose an adaptive protocol that, by learning the qubit environment, narrows down the distribution of possible environment states. While the outcomes of quantum measurements are random, we show that real-time adaptation of the measurement settings (based on previous outcomes) enables a deterministic decrease of the width of the bath distribution, and hence an increase of the qubit coherence. We numerically simulate the performance of the protocol for the electronic spin of a nitrogen-vacancy centre in diamond subject to a dilute bath of ¹³C nuclear spins, finding a considerable improvement over the performance of non-adaptive strategies.
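The narrowing of the bath distribution by adaptive measurements can be illustrated with a minimal numerical sketch. It is not the protocol proposed in this work: it assumes a textbook Ramsey measurement with outcome probability P(m | δ) = [1 + (−1)^m cos(2πδτ + φ)]/2, a static scalar detuning δ standing in for the bath state, perfect readout, and a simple heuristic that sets the sensing time τ and phase φ from the current mean and width of the distribution. All function names (`moments`, `adaptive_settings`, `bayes_update`, `run`) and parameter values are hypothetical.

```python
import numpy as np


def moments(grid, p):
    """Mean and standard deviation of a normalised distribution on a grid."""
    mean = np.sum(grid * p)
    std = np.sqrt(np.sum((grid - mean) ** 2 * p))
    return mean, std


def adaptive_settings(grid, p):
    """Heuristic (assumed) choice of Ramsey settings from the current estimate.

    tau is set so that 2*pi*tau*std = 1, and phi places the current mean
    at the steepest point of the Ramsey fringe.
    """
    mean, std = moments(grid, p)
    tau = 1.0 / (2.0 * np.pi * std)
    phi = np.pi / 2.0 - 2.0 * np.pi * mean * tau
    return tau, phi


def likelihood(grid, outcome, tau, phi):
    """P(outcome | delta) for an idealised Ramsey measurement (assumed model)."""
    sign = 1.0 if outcome == 0 else -1.0
    return 0.5 * (1.0 + sign * np.cos(2.0 * np.pi * grid * tau + phi))


def bayes_update(grid, p, outcome, tau, phi):
    """Bayesian update of the distribution over delta after one outcome."""
    post = p * likelihood(grid, outcome, tau, phi)
    return post / post.sum()


def run(n_steps=10, sigma0=1.0, seed=0):
    """Simulate adaptive measurements; return the width after each step."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-8.0 * sigma0, 8.0 * sigma0, 4001)
    p = np.exp(-grid**2 / (2.0 * sigma0**2))
    p /= p.sum()
    true_delta = rng.normal(0.0, sigma0)  # static bath state (assumption)
    widths = [moments(grid, p)[1]]
    for _ in range(n_steps):
        tau, phi = adaptive_settings(grid, p)
        # Sample a random outcome from the true detuning.
        p0 = 0.5 * (1.0 + np.cos(2.0 * np.pi * true_delta * tau + phi))
        outcome = 0 if rng.random() < p0 else 1
        p = bayes_update(grid, p, outcome, tau, phi)
        widths.append(moments(grid, p)[1])
    return widths


if __name__ == "__main__":
    for step, width in enumerate(run()):
        print(f"step {step:2d}: width = {width:.4f}")
```

Under these assumptions, a Gaussian prior of width σ is narrowed to σ·sqrt(1 − 1/e) ≈ 0.795σ by the first measurement whichever outcome occurs, which gives a simple picture of how the choice of settings, rather than the random outcome, can fix the decrease in width. Later steps act on a non-Gaussian distribution, so the sketch makes no claim of an outcome-independent decrease beyond the first step.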