“…Performance of specific phenotype extraction algorithms developed as part of the i2b2 project using cTAKES (Apache clinical Text Analysis and Knowledge Extraction System) and HITEx (Health Information Text Extraction) showed that an NLP approach achieved high PPV (precision) and sensitivity (recall) when extracting the following phenotypes, reported as (PPV, sensitivity): Crohn's disease (98%, 64%), ulcerative colitis (97%, 68%), MS (94%, 68%), and rheumatoid arthritis (89%, 56%) [139]. Because we aimed to extract epilepsy-specific information beyond a confirmed diagnosis, one recent study is particularly relevant: working with patients with known MS identified from electronic healthcare records, it used NLP techniques to extract MS-specific attributes with high PPV and sensitivity, namely EDSS (Expanded Disability Status Scale) (97%, 89%), T25FW (Timed 25 Foot Walk) (93%, 87%), MS subtype (92%, 74%) and age of onset (77%, 64%) [140]. That study counted only items attributable to the patient, as opposed to family members, which is an important distinction and an interesting area of study in terms of identifying potential risk factors for disease development.…”
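
For readers less familiar with the two metrics quoted above, the following is a minimal sketch of how PPV (precision) and sensitivity (recall) are computed from counts of true positives, false positives and false negatives. The function name `ppv_and_sensitivity` and the example counts are hypothetical, chosen only for illustration; they are not taken from the cited studies [139, 140].

```python
from typing import Tuple


def ppv_and_sensitivity(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (PPV, sensitivity) from confusion-matrix counts.

    tp: items the NLP system extracted that were correct (true positives)
    fp: items the NLP system extracted that were wrong (false positives)
    fn: items present in the gold standard that the system missed (false negatives)
    """
    if tp + fp == 0:
        raise ValueError("PPV is undefined when nothing was extracted (tp + fp == 0)")
    if tp + fn == 0:
        raise ValueError("Sensitivity is undefined when the gold standard is empty (tp + fn == 0)")

    ppv = tp / (tp + fp)          # proportion of extracted items that are correct
    sensitivity = tp / (tp + fn)  # proportion of true items that were extracted
    return ppv, sensitivity


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not data from any cited study.
    ppv, sens = ppv_and_sensitivity(tp=90, fp=10, fn=30)
    print(f"PPV = {ppv:.0%}, sensitivity = {sens:.0%}")
```

A pattern of high PPV with lower sensitivity, as in the phenotype figures above, corresponds to a system that is rarely wrong when it extracts an item but misses a larger share of the items that are actually present.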