“…Applying natural language processing methods to the EHR is a growing field with many potential applications in clinical decision support and augmented care. Corpora and annotations of EHR data have been created to model semantic features and relations through linguistic cues, for tasks including relation extraction (Mowery et al, 2008), named entity recognition (Wang, 2009; Patel et al, 2018; Lybarger et al, 2021), question answering (Pampari et al, 2018; Raghavan et al, 2021), and natural language inference (Romanov and Shivade, 2018). However, few corpora have been built to model clinical thinking, in particular clinical diagnostic reasoning, a process that involves acquiring clinical evidence, generating hypotheses, integrating and abstracting over medical knowledge, and synthesizing a conclusion in the form of a diagnosis and treatment plan (Bowen, 2006).…”