This paper proposes an approach to developing a localisation system that supports visually impaired people. Instead of using unique visual markers or radio tags, the approach relies on image recognition with local feature descriptors. To provide fast and robust keypoint description, a new binary descriptor is introduced. The descriptor computation pipeline selects four image patches with scale-dependent sizes around the keypoint and then places five square pixel blocks within each patch. The binary string is obtained from pairwise tests between directional gradients computed for the blocks. In contrast to other binary descriptors, the tests take into account gradient values obtained for blocks from all patches. The proposed approach is extensively tested using six demanding image datasets. Some of them contain labelled indoor and outdoor images under different real-world transformations, as well as challenging illumination conditions. Two of the datasets were prepared specifically for this research. Experimental evaluation reveals that the introduced binary descriptor is more robust and faster to compute than state-of-the-art floating-point and binary descriptors. Furthermore, the approach outperforms other techniques in image recognition tasks, making it more suitable for vision-based localisation.
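To make the descriptor pipeline outlined above more concrete, the following is a minimal illustrative sketch in Python with NumPy. It is not the implementation evaluated in this work. The abstract fixes only the counts (four patches, five blocks per patch) and the use of pairwise tests on directional gradients across all patches. Everything else here is an assumption made for illustration: the patch scale factors in `patch_factors`, the block layout (one centre block and four corner blocks), the gradient definition (difference between the mean intensities of the two halves of a block), the exhaustive choice of test pairs, and the function names `block_gradients`, `describe` and `hamming`.

```python
from itertools import combinations

import numpy as np


def block_gradients(img, cx, cy, half):
    """Return (gx, gy) for a square block centred at (cx, cy).

    Assumption: a directional gradient is the difference between the mean
    intensities of the two halves of the block. Blocks are clipped to the
    image; degenerate blocks yield zero gradients.
    """
    h, w = img.shape
    x0, x1 = max(cx - half, 0), min(cx + half, w)
    y0, y1 = max(cy - half, 0), min(cy + half, h)
    block = img[y0:y1, x0:x1].astype(np.float64)
    if block.shape[0] < 2 or block.shape[1] < 2:
        return 0.0, 0.0
    mx, my = block.shape[1] // 2, block.shape[0] // 2
    gx = block[:, mx:].mean() - block[:, :mx].mean()  # right minus left
    gy = block[my:, :].mean() - block[:my, :].mean()  # bottom minus top
    return gx, gy


def describe(img, kx, ky, scale, patch_factors=(1.0, 1.5, 2.0, 2.5)):
    """Compute a packed binary descriptor for the keypoint (kx, ky).

    Four patches (one per entry of the hypothetical patch_factors) are
    sized in proportion to the keypoint scale; each holds five blocks.
    """
    gxs, gys = [], []
    for factor in patch_factors:
        r = max(2, int(round(factor * scale)))  # patch half-size
        b = max(1, r // 2)                      # block half-size
        # Assumed layout: centre block plus one block per patch quadrant.
        offsets = [(0, 0), (-b, -b), (b, -b), (-b, b), (b, b)]
        for ox, oy in offsets:
            gx, gy = block_gradients(img, kx + ox, ky + oy, b)
            gxs.append(gx)
            gys.append(gy)
    # Pairwise tests over blocks from all patches, per gradient direction.
    bits = [a < c for a, c in combinations(gxs, 2)]
    bits += [a < c for a, c in combinations(gys, 2)]
    return np.packbits(np.array(bits, dtype=np.uint8))


def hamming(d1, d2):
    """Hamming distance between two packed binary descriptors."""
    return int(np.unpackbits(np.bitwise_xor(d1, d2)).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
    d = describe(image, 32, 32, scale=4)
    print(d.shape, hamming(d, d))
```

With these assumptions, 20 blocks give 190 pairs per gradient direction, that is 380 bits packed into 48 bytes; the length of the actual descriptor may differ. Matching two such descriptors reduces to a Hamming distance, which is the usual reason binary descriptors are cheaper to compare than floating-point ones.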