Microservices are becoming the defacto design choice for software architecture. It involves partitioning the software components into finer modules such that the development can happen independently. It also provides natural benefits when deployed on the cloud since resources can be allocated dynamically to necessary components based on demand. Therefore, enterprises as part of their journey to cloud, are increasingly looking to refactor their monolith application into one or more candidate microservices; wherein each service contains a group of software entities (e.g., classes) that are responsible for a common functionality. Graphs are a natural choice to represent a software system. Each software entity can be represented as nodes and its dependencies with other entities as links. Therefore, this problem of refactoring can be viewed as a graph based clustering task. In this work, we propose a novel method to adapt the recent advancements in graph neural networks in the context of code to better understand the software and apply them in the clustering task. In that process, we also identify the outliers in the graph which can be directly mapped to top refactor candidates in the software. Our solution is able to improve state-of-the-art performance compared to works from both software engineering and existing graph representation based techniques.
Monolithic software encapsulates all functional capabilities into a single deployable unit. But managing it becomes harder as the demand for new functionalities grow. Microservice architecture is seen as an alternative as it advocates building an application through a set of loosely coupled small services wherein each service owns a single functional responsibility. But the challenges associated with the separation of functional modules, slows down the migration of a monolithic code into microservices. In this work, we propose a representation learning based solution to tackle this problem. We use a heterogeneous graph to jointly represent software artifacts (like programs and resources) and the different relationships they share (function calls, inheritance, etc.), and perform a constraint-based clustering through a novel heterogeneous graph neural network. Experimental studies show that our approach is effective on monoliths of different types.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.