Summary

Many clinical data are recorded in natural language (diagnoses, therapies, etc.). There is great interest in making these data retrievable so that samples of patients can be formed for scientific investigations (statistical analyses, courses of diseases, etc.). To this end, “medical natural language data” have to be prepared and stored in a retrieval-oriented database. This paper shows the advantages of processing textual data as opposed to coding them. Accordingly, in our system WAREL, medical thesauri (such as ICD 9 or SNOMED) are not used for codification; they serve as a knowledge base during retrieval and for testing the quality of the data during documentation. The fundamental methods (computerized textual analysis and different algorithms for comparing texts) are explained in detail, and their realization within the system WAREL is illustrated (WAREL stands for Wiener Allgemeines Relationenschema).