For scientific, clinical, and machine learning purposes alike, it is desirable to quantify the verbal reports of high-level visual percepts. Methods to do this do not exist at present. Here we propose a novel methodological principle to help fill this gap, and provide empirical evidence designed to serve as the initial “proof” of this principle. In the proposed method, subjects view images of real-world scenes and describe, in their own words, what they saw. The verbal description is independently evaluated by several evaluators. Each evaluator assigns a rank score to the subject’s description of each visual object in each image using a novel ranking principle, which takes advantage of the well-known fact that semantic descriptions of real-life objects and scenes can usually be rank-ordered. For instance, “animal,” “dog,” and “retriever” can be regarded as progressively finer-level, and therefore higher-ranking, descriptions of a given object. These numeric scores preserve the richness of the original verbal description and can subsequently be evaluated using conventional statistical procedures. We describe an exemplar implementation of this method and empirical data that demonstrate its feasibility. With appropriate future standardization and validation, this novel method can serve as an important tool for quantifying the subjective experience of the visual world. In addition to being a novel, potentially powerful testing tool, our method also represents, to our knowledge, the only available method for numerically representing verbal accounts of real-world experience. Given its minimal requirements, i.e., a verbal description and the ground truth that elicited it, our method has a wide variety of potential real-world applications.
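
To make the ranking principle concrete, the following minimal sketch scores single-term descriptions by their depth in a semantic hierarchy, so that “retriever” outranks “dog,” which outranks “animal.” The taxonomy, function name, and scoring convention here are illustrative assumptions for exposition, not the exemplar implementation reported in the paper.

```python
# A minimal sketch of the ranking principle, under an assumed toy taxonomy.
# Each path runs from the coarsest to the finest description of an object;
# a description's rank score is its 1-based depth along a path.
TAXONOMY_PATHS = [
    ["animal", "dog", "retriever"],
    ["animal", "bird", "sparrow"],
    ["vehicle", "car", "sedan"],
]

def rank_score(description: str) -> int:
    """Return the rank of a description: 1 for the coarsest level
    ("animal"), higher for finer levels ("retriever" -> 3).
    Terms absent from the taxonomy score 0 (no credit)."""
    term = description.lower().strip()
    best = 0
    for path in TAXONOMY_PATHS:
        if term in path:
            best = max(best, path.index(term) + 1)
    return best

if __name__ == "__main__":
    # Finer-level descriptions receive higher numeric scores,
    # which can then be analyzed with conventional statistics.
    for term in ["animal", "dog", "retriever", "blur"]:
        print(f"{term!r}: rank {rank_score(term)}")
```

In practice, each evaluator would apply such a scoring rule to every object description in every image, yielding a matrix of numeric scores suitable for standard statistical comparison across subjects, images, and evaluators.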