Background: Health information records in many countries, especially developing countries, still rely on paper-based systems, which are at a disadvantage compared with electronic systems in terms of storage and data extraction. Given the importance of health records as a data resource for epidemiological studies, guidelines for systematic data cleaning and sorting are essential, yet they are largely absent from the literature. This paper describes the process by which an electronic database was generated from emergency department registers in Lebanon and how the data were subsequently cleaned, sorted, and categorized.

Methods: Demographic and health complaint-related information was extracted from the emergency department registers of a convenience sample of seven hospitals in Beirut. Appropriate categories were selected for data categorization. For health complaint-related information, disease categories and codes were selected according to the International Classification of Diseases, 10th Revision (ICD-10).

Results: A total of 16,537 entries were collected. Demographic information was grouped into the categories required for future epidemiological studies. Analysis of the health information allowed for the creation of a sorting algorithm, which was then used to categorize and code the health data. Several counts were then performed to represent the data numerically and graphically and to aid in data interpretation.

Conclusions: The article describes the current state of health information records in Lebanon and the disadvantages of a paper-based system in terms of storage, data extraction, and subsequent analysis. Furthermore, it describes the algorithm by which health information was sorted and categorized to allow for future analysis of data drawn from paper records.
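As a rough illustration of what sorting and coding free-text complaints can look like, the sketch below normalizes each register entry, assigns it to an ICD-10 chapter by keyword matching, and tallies the results. It is not the algorithm developed in this study. The keyword lists, the function names (`clean`, `categorize`, `tally`), and the `UNCODED` fallback are assumptions made for the example; only the ICD-10 chapter ranges and titles are taken from the classification itself.

```python
from collections import Counter

# Hypothetical keyword rules. The ICD-10 chapter ranges and titles are real;
# the keywords are illustrative and far from complete.
RULES = [
    ("S00-T98",
     "Injury, poisoning and certain other consequences of external causes",
     ("fracture", "laceration", "burn", "wound")),
    ("J00-J99",
     "Diseases of the respiratory system",
     ("asthma", "pneumonia", "bronchitis")),
    ("K00-K93",
     "Diseases of the digestive system",
     ("gastritis", "appendicitis", "gastroenteritis")),
]

# Assumed fallback for entries that no rule matches.
FALLBACK = ("UNCODED", "Needs manual review")


def clean(text):
    """Lowercase the entry and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def categorize(complaint):
    """Return (chapter_range, chapter_title) for the first matching rule."""
    text = clean(complaint)
    for code, title, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return code, title
    return FALLBACK


def tally(complaints):
    """Count entries per chapter range."""
    return Counter(categorize(c)[0] for c in complaints)
```

Calling `tally` on a list of transcribed complaints returns per-chapter counts that can feed a table or chart. Entries that fall through to `UNCODED` are kept visible for manual review, so unmatched records are not silently assigned to a category.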