Abstract. The detection and recognition of racing bib number/text, which is printed on paper, cardboard tag, or t-shirt in natural images in marathon, race and sports, is challenging due to person movement, non-rigid surface, distortion by non-illumination, severe occlusions, orientation variations etc. In this paper, we present a multi-modal technique that combines both biometric and textual features to achieve good results for bib number/text detection. We explore face and skin features in a new way for identifying text candidate regions from input natural images. For each text candidate region, we propose to use text detection and recognition methods for detecting and recognizing bib numbers/texts, respectively. To validate the usefulness of the proposed multi-modal technique, we conduct text detection and recognition experiments before text candidate region detection and after text candidate region detection in terms of recall, precision and f-measure. Experimental results show that the proposed multi-modal technique outperforms the existing bib number detection method.