Decision support tools that rely on supervised learning require large amounts of expert annotations. Using past radiological reports obtained from hospital archiving systems has many advantages as training data above manual single-class labels: they are expert annotations available in large quantities, covering a population-representative variety of pathologies, and they provide additional context to pathology diagnoses, such as anatomical location and severity. Learning to auto-generate such reports from images present many challenges such as the difficulty in representing and generating long, unstructured textual information, accounting for spelling errors and repetition/redundancy, and the inconsistency across different annotators. We therefore propose to first learn visually-informative medical concepts from raw reports, and, using the concept predictions as image annotations, learn to autogenerate structured reports directly from images. We validate our approach on the OpenI [2] chest x-ray dataset, which consists of frontal and lateral views of chest x-ray images, their corresponding raw textual reports and manual medical subject heading (MeSH R ) annotations made by radiologists.