Information recorded in electronic medical records (EMRs), clinical reports, and summaries has the potential to transform health-related research and the healthcare industry. EMR data can be used for epidemiological studies, disease registries, data banks, drug safety surveillance, clinical trials, and healthcare audits. With the rapid adoption of electronic health records (EHRs), it is desirable to harvest information and knowledge from EHRs to support automated systems and to enable secondary use of EHRs for clinical and translational research, thereby increasing efficiency. One critical component that facilitates the secondary use of EHR data is the information extraction (IE) task, which automatically extracts and encodes clinical information from a given text. Natural language processing (NLP) focuses on “developing computational models for understanding the interaction between data science and language”. In the clinical domain, researchers have often used NLP systems to identify clinical syndromes and common biomedical concepts from imaging data, radiology reports, discharge summaries, problem lists, nursing documentation, drug reviews, and medical education documents. These data can help doctors determine a patient's health condition, including diagnostic information, procedures and tests performed, treatment results, and drugs administered.

We therefore hope to gain insights and develop strategies to improve the utilization of these NLP systems in the clinical domain, and to provide a vision for addressing the existing data challenges in this domain. To this end, we will examine the models that have been published over the years and evaluate them on attributes such as effectiveness, accuracy, and precision. We believe that adding a probabilistic graphical model framework for structured output prediction would further improve the performance of our system; this experiment remains future work.