We present a fully automatic and scalable framework for query-by-example word-spotting on medieval manuscripts. The system requires no human intervention to produce a large amount of annotated training data, and it offers Computer Vision researchers and Cultural Heritage practitioners a compact and efficient tool for document analysis. We ran the pipeline in both single-manuscript and cross-manuscript setups and tested it on a heterogeneous set of medieval manuscripts covering a variety of writing styles, languages, image resolutions, levels of conservation, noise, and amounts of illumination and ornamentation. We also present a precision/recall analysis that quantitatively assesses the quality of the proposed algorithm.