Over a decade ago, a new discipline called network medicine emerged as an approach to understanding human diseases from a network theory point of view. Disease networks proved to be an intuitive and powerful way to reveal hidden connections among apparently unconnected biomedical entities such as diseases, physiological processes, signaling pathways, and genes. One of the fields that has benefited most from this approach is drug repurposing, the identification of new uses for existing drugs. Drug repurposing matters because traditional drug development is costly and takes a long time to go from target selection to regulatory approval. In this document we analyze the evolution of the disease network concept during the last decade and apply a data science pipeline approach to evaluate the functional units of disease networks. As a result of this analysis, we obtain a list of the most commonly used functional units and of the challenges that remain to be solved. This information can be valuable for building new prediction models based on disease networks.
Data Science Pipeline

improve the classification of diseases [2]. However, the use of these sources raised new problems such as their fragmentation, heterogeneity, availability, and the different conceptualizations of their data [3,4].

Recent developments in network theory provide a way to address this challenge by representing these complex relationships as a collection of linked nodes [5]. Complex network theory is a statistical physics interpretation of classical graph theory, aimed at describing and understanding the structures created by the relationships between the elements of a complex system [6][7][8][9]. Those elements are represented by nodes, which are connected pairwise by links whenever a relationship is observed between the corresponding elements. The resulting structure can then be described by means of a plethora of topological metrics [10], or be used as a basis for modelling the system. The application of this field to biological problems has been named "network biology", while its use in biomedical problems is known as "network medicine" [11]. Examples of biological networks include protein-protein interaction networks, gene regulatory networks (DNA-protein interaction networks), metabolic networks, signaling networks, neuronal networks, and phylogenetic trees [12].

Following this approach, disease networks express the relationships between diseases as nodes and edges in a graph G = (D, W), where D represents the set of diseases (nodes) and W the set of their relationships (edges) based upon their similarity. The meaning of similarity varies depending on the data used to build the network, which may be biological (shared genes or proteins) or phenotypic (comorbidity, similar symptoms) [13], among other approaches. As will be explained throughout this article, the concept of disease network is not limited to disease-disease connections (homogeneous networks), but also extends to relations between the disease and other factors such as its symptoms, its associ...