SUMMARY In personalized medicine, SNPs are used to identify specific diseases of a patient. However, for many SNPs, no information about pathogenicity is available. Current programs try to predict the effect of a SNP on the function of a protein, but offer no means of visual interpretation. We have developed SNPViz, a program that first finds 3D structures of the affected proteins and then highlights the affected amino acid in the 3D structure. This can give researchers and doctors more information about the probable pathogenicity of the SNP. In the future, we also plan to add further information, such as whether the position of the SNP lies in a binding domain or is involved in a protein-protein interaction.
KEYWORDS SNP; Visualization; Protein; Personalized medicine
AVAILABILITY AND REQUIREMENTS The git repository containing the program is available at https://lambda.informatik.uni-tuebingen.de/gitlab/seitz/snpviz
CONFLICT OF INTEREST The authors declare no conflict of interest.
BODY Single nucleotide polymorphisms (SNPs) are the most common genetic variations between humans, and many are believed to cause phenotypic differences [1]. Non-synonymous (ns) SNPs, that is, SNPs that result in an amino acid substitution in the corresponding protein, are known to be a possible cause of structural changes. A prominent example is sickle-cell disease [2]. However, even though many nsSNPs are known, for the majority of them the corresponding structural change is still unknown [3]. In personalized medicine, whole exome sequencing can detect several thousand SNPs per sample. Which of them could be responsible for the disease, however, remains unclear [4].
One tool available to predict the impact of a SNP on protein function is SIFT [5]. Our idea is to look at the putative structural changes that could result from a mutation in a protein in order to gain insight into possible disease-related SNPs. In addition, researchers often want to see the protein and the exact position of the amino acid subject to mutation. To our knowledge, no tool combines this with an automated pipeline that first finds the respective protein in the PDB, then identifies the amino acid(s) affected by the nsSNPs, and finally visualizes the result. Here, we present a tool that can highlight affected positions in the 3D structure of the corresponding proteins.
For this purpose, we developed the Java tool SNPViz. It can give insights regarding SNPs for which no pathogenic effect is known. It first identifies the exons that are affected by SNPs of interest. These exons are then mapped to the corresponding proteins using the ID mapping of UniProt [6], a database that links to multiple annotation resources such as ENSEMBL [7], the Protein Data Bank (PDB) [8], and others. Afterwards, the corresponding 3D structures of the identified proteins, if any exist, are downloaded from the PDB.
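The first pipeline step, identifying the exons affected by a SNP of interest, can be illustrated with a minimal Java sketch. It is not taken from the SNPViz code base: the class and method names (`AffectedExons`, `Exon`, `affectedExonIds`), the exon identifiers, and the coordinates are all hypothetical, and exons are assumed to be given as closed, 1-based intervals on a named chromosome.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of SNPViz): finds the exons that contain a SNP position.
class AffectedExons {

    // An exon modeled as a closed, 1-based interval [start, end] on one chromosome.
    static final class Exon {
        final String id;
        final String chromosome;
        final long start;
        final long end;

        Exon(String id, String chromosome, long start, long end) {
            this.id = id;
            this.chromosome = chromosome;
            this.start = start;
            this.end = end;
        }

        // True if the SNP lies on the same chromosome and inside the interval.
        boolean contains(String snpChromosome, long snpPosition) {
            return chromosome.equals(snpChromosome)
                    && start <= snpPosition
                    && snpPosition <= end;
        }
    }

    // Returns the IDs of all exons overlapping the SNP, in input order.
    static List<String> affectedExonIds(List<Exon> exons, String chromosome, long position) {
        List<String> hits = new ArrayList<>();
        for (Exon exon : exons) {
            if (exon.contains(chromosome, position)) {
                hits.add(exon.id);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Illustrative coordinates only; real exon annotations would come from a gene annotation file.
        List<Exon> exons = new ArrayList<>();
        exons.add(new Exon("exonA", "chr1", 100, 200));
        exons.add(new Exon("exonB", "chr1", 150, 300));
        exons.add(new Exon("exonC", "chr2", 100, 200));

        System.out.println(affectedExonIds(exons, "chr1", 180)); // [exonA, exonB]
        System.out.println(affectedExonIds(exons, "chr1", 500)); // []
    }
}
```

The linear scan keeps the sketch short. For several thousand SNPs per sample against a full exon annotation, a sorted exon list with binary search or an interval tree would likely be the more practical choice.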
The affected exons are then translated in all six reading frames, giving six candidate amino acid sequences. Next, the position of the exon within the protein is ide...