Background: Lower urinary tract symptoms (LUTS), such as urinary urgency, frequency, and incontinence, affect the majority of the population at some point in the lifespan, causing substantial morbidity, yet few patients receive effective care. Accurate diagnosis and treatment are usually dictated by the dominant symptom; however, the sizeable symptomatic overlap between disease categories and the subjectivity of the language used to describe symptoms lead to high rates of misdiagnosis. We hypothesized that more specific and homogeneous LUTS diagnoses are characterized not by specific pathognomonic features, but by patterns of existing symptoms indicative of unique causes of convergent symptomatologies. To improve care and diagnostic accuracy, we sought to employ a data-driven approach to LUTS categorization, using machine learning to generate diagnostic groupings based on patient-reported clinical data and creating a novel diagnostic tool for patients with voiding complaints. Methods and Findings: Questionnaire responses from a development cohort of 514 female subjects were used for model development, identifying 4 major categories and 9 specific phenotypes of LUTS using agglomerative hierarchical clustering. The dominant features of each cluster and phenotype were examined independently by two urologic specialists and assigned a clinical identity consistent with recognized causes of voiding dysfunction. A supervised machine learning model was then trained to assign unseen patients to these phenotypes. The model was applied to a validation population of 571 additional subjects, demonstrating good reproducibility of the phenotypes and their symptomatic patterns in an independent cohort. This data-driven, hierarchical clustering approach captured overlapping symptoms inherent in typical patients, recognizing common uncomplicated diagnoses (e.g., overactive bladder) and several novel diagnostic categories (e.g., myofascial pelvic pain). The assigned clusters and phenotypes were consistent with coded primary diagnoses with 70% accuracy; however, physician primary diagnosis may not reflect the complete range of patient symptoms, as patients often present with multiple urinary symptoms that do not fit pre-established diagnoses. Because this analysis reflects care-seeking populations from a single center, further studies will need to determine whether the model is scalable to the population at large. Conclusions: We describe the generation of a machine learning algorithm relying only on validated patient-reported symptoms for accurate diagnostic classification. Given a growing physician shortage and increasing challenges for patients accessing specialist care, this type of digital technology holds great potential to improve the recognition, diagnosis, and treatment of functional urologic conditions. While future prospective work with larger, multi-institutional cohorts is needed, this approach, with refinement, has the potential to increase both the equity and rapidity of access to effective urologic care.
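
Illustrative sketch: the two-stage approach summarized under Methods and Findings (unsupervised agglomerative hierarchical clustering of questionnaire responses, followed by a supervised model that assigns unseen patients to the resulting groups) can be outlined in a few lines of Python. This is not the authors' implementation. The average-linkage criterion, the Euclidean distance, the nearest-centroid classifier standing in for the supervised model, and the two-item toy scores are all assumptions chosen only to keep the example small and dependency-free.

```python
from itertools import combinations
from math import dist


def agglomerative_clusters(points, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per point."""
    # Start with every subject in its own cluster (lists of point indices).
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the two closest clusters and merge them (a < b always holds).
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] += clusters[b]
        del clusters[b]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def fit_centroids(points, labels):
    """Supervised step (assumed stand-in): mean symptom vector per cluster."""
    centroids = {}
    for label in set(labels):
        members = [p for p, l in zip(points, labels) if l == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, point):
    """Assign an unseen subject to the cluster with the nearest centroid."""
    return min(centroids, key=lambda label: dist(centroids[label], point))


# Hypothetical toy data: (urgency score, pelvic pain score), each 0-10.
development = [(9, 1), (8, 2), (9, 0), (1, 9), (2, 8), (0, 9)]
labels = agglomerative_clusters(development, n_clusters=2)
centroids = fit_centroids(development, labels)

# An unseen subject with high urgency and little pain.
new_patient = (7, 1)
assigned = predict(centroids, new_patient)
```

In this toy setting the urgency-dominant and pain-dominant subjects fall into separate clusters, and the unseen subject is assigned to the urgency-dominant one. In the study itself, the clusters were additionally reviewed by two urologic specialists before being given clinical identities, a step that code alone does not capture.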