The past decade has seen the rise of omics data for the understanding of biological systems in health and disease. This wealth of data includes protein-protein interaction (PPI) data derived from both low- and high-throughput assays, which are curated into multiple databases that capture the extent of available information from the peer-reviewed literature. Although these curation efforts are extremely useful, reliably downloading and integrating PPI data from the variety of available repositories is challenging and time consuming.

We here present a novel user-friendly web resource called PINOT (Protein Interaction Network Online Tool; available at http://www.reading.ac.uk/bioinf/PINOT/PINOT_form.html) to optimise the collection and processing of PPI data from the IMEx consortium associated repositories (members and observers) and from WormBase for constructing, respectively, human and C. elegans PPI networks.

Users submit a query containing a list of proteins of interest for which PINOT will mine PPIs. PPI data are downloaded, merged, quality checked, and confidence scored based on the number of distinct methods and publications in which each interaction has been reported. Examples of PINOT applications are provided to highlight the performance, the ease of use and the potential applications of this tool.

PINOT is a tool that allows users to survey the literature, extracting PPI data for a list of proteins of interest. A comparison with analogous tools showed that PINOT was able to extract similar numbers of PPIs while incorporating a set of innovative features. PINOT processes both small and large queries, downloads PPIs live through PSICQUIC, and applies quality control filters to the downloaded PPI annotations (i.e. removing the need for manual inspection by the user).
PINOT provides the user with information on detection methods and publication history for each downloaded interaction data entry, and provides results in a table format that can be easily customised further and/or directly uploaded into network visualization software.

database

Background

During the past two decades the use of omics data to understand biological systems has become an increasingly valued approach (1). This includes extensive efforts to detect protein-protein interactions (PPIs) on an almost proteome-wide scale (2, 3).
The utility of such data has been greatly supported by primary database curation and the International Molecular Exchange (IMEx) Consortium, which promotes collaborative efforts in standardising and maintaining high quality data curation across the major molecular interaction data repositories (4). The primary databases, such as IntAct (5) and BioGRID (6), are rich data resources providing a comprehensive record of published PPI literature. PPI data are critical to describe connections among proteins, which in turn supports both inference of new functions for proteins (based on the gui...