Novel pathogens have the potential to become critical issues of national security, public health and economic welfare. As demonstrated by the response to Severe Acute Respiratory Syndrome (SARS) and influenza, genomic sequencing has become an important method for diagnosing agents of infectious disease. Despite the value of genomic sequences in characterizing novel pathogens, raw data on their own do not provide the information needed by public health officials and researchers. One must integrate knowledge of the genomes of pathogens with host biology and geography to understand the etiology of epidemics. To these ends, we have created an application called Supramap (http://supramap.osu.edu) to put information on the spread of pathogens and key mutations across time, space and various hosts into a geographic information system (GIS). To build this application, we created a web service for integrated sequence alignment and phylogenetic analysis as well as methods to describe the tree, mutations, and host shifts in Keyhole Markup Language (KML). We apply the application to 239 sequences of the polymerase basic 2 (PB2) gene of recent isolates of avian influenza (H5N1). We map a mutation, glutamic acid to lysine at position 627 in the PB2 protein (E627K), in H5N1 influenza that allows for increased replication of the virus in mammals. We use a statistical test to support the hypothesis of a correlation of E627K mutations with avian-mammalian host shifts but reject the hypothesis that lineages with E627K are moving westward. Data, instructions for use, and visualizations are included as supplemental materials at: http://supramap.osu.edu/sm/supramap/publications.

© The Willi Hennig Society 2010.

We have created a web-based workflow application, Supramap (http://supramap.osu.edu). Using a web browser, a user inputs text files containing sequence and/or phenotypic data, latitude and longitude coordinates, and (optionally) a date of isolation for each strain.
Our application then executes a workflow on a computing cluster that entails integrated sequence alignment and phylogenetic analysis, computation of character changes (e.g., mutations and host shifts), and geographical projection of the tree. Once the analyses are complete, the user can download a phylogenetic layer expressed as a KML file and view the file in a geographic information system (GIS). The user can use the phylogenetic layer to visualize several aspects of pathogen evolution, including the spread of lineages, mutations, shifts among hosts, and phenotypic changes over geography and time. We illustrate the use of the system with a case study on H5N1 and discuss the use of visualization in conjunction with statistical validation.
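To make the notion of a phylogenetic layer concrete, the following minimal sketch shows how georeferenced strains and a tree edge could be written as KML placemarks. It is an illustration under stated assumptions, not the Supramap implementation: the function names, the strain names, the coordinates, and the idea of labeling an edge with a character change are all hypothetical, and the sketch connects two strains directly, whereas a real tree would also place inferred ancestral nodes.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def strain_placemark(name, lat, lon, date=None):
    """Return a KML Placemark for one strain (hypothetical helper)."""
    pm = ET.Element("Placemark")
    ET.SubElement(pm, "name").text = name
    if date is not None:
        # An optional isolation date lets a GIS viewer animate over time.
        ts = ET.SubElement(pm, "TimeStamp")
        ET.SubElement(ts, "when").text = date
    point = ET.SubElement(pm, "Point")
    # KML orders coordinates as longitude,latitude,altitude.
    ET.SubElement(point, "coordinates").text = f"{lon},{lat},0"
    return pm


def edge_placemark(label, start, end):
    """Return a Placemark drawing one tree edge between two (lat, lon) pairs."""
    pm = ET.Element("Placemark")
    ET.SubElement(pm, "name").text = label
    line = ET.SubElement(pm, "LineString")
    ET.SubElement(line, "tessellate").text = "1"
    coords = " ".join(f"{lon},{lat},0" for lat, lon in (start, end))
    ET.SubElement(line, "coordinates").text = coords
    return pm


def build_kml(strains, edges):
    """Build a KML document string.

    strains: iterable of (name, lat, lon, date_or_None)
    edges:   iterable of (from_name, to_name, label), e.g. a mutation label
    """
    kml = ET.Element("kml", xmlns=KML_NS)
    doc = ET.SubElement(kml, "Document")
    location = {}
    for name, lat, lon, date in strains:
        location[name] = (lat, lon)
        doc.append(strain_placemark(name, lat, lon, date))
    for src, dst, label in edges:
        doc.append(edge_placemark(label, location[src], location[dst]))
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Illustrative values only; these are not data from the study.
    strains = [
        ("strain_A", 10.0, 20.0, "2005-01-01"),
        ("strain_B", 30.0, 40.0, None),
    ]
    edges = [("strain_A", "strain_B", "E627K")]
    print(build_kml(strains, edges))
```

The resulting text can be saved with a `.kml` extension and opened in a GIS viewer that reads KML. Labeling the edge with the character change is one simple way to show where on the tree, and therefore where on the map, a mutation or host shift is inferred to occur.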
Other tree projection efforts

Supramap is superficially similar to other efforts for projecting phylogenetic trees in GIS, such as