Software Architectural Process (SAP) is a core and highly knowledge-intensive phase of the software development life cycle, as it simultaneously consumes and produces knowledge artifacts. SAP is about making design decisions, and changes to these decisions may adversely affect software projects. The performance and properties of software components are fundamentally influenced by design decisions. Implementing immature and abrupt design decisions seriously threatens the development process. Software architectural knowledge management (AKM) approaches offer systematic ways to support SAP through versatile architectural solutions and design decisions. However, the majority of software organizations have limited access to data and still depend on manually created and maintained AKM processes. In this paper, we use one of the most prominent online communities for software development, Stack Overflow, as a source of SAP knowledge to support AKM. We propose a supervised machine learning-based approach to classify architectural knowledge into predefined categories: analysis, synthesis, evaluation, and implementation. We employ different combinations of feature selection techniques to achieve optimal classification results from the classifiers used (Support Vector Machine [SVM], K-Nearest Neighbor, Random Forest, and Naive Bayes [NB]). Among these classifiers, SVM with the Uni-gram feature set provides the best classification results, attaining 85.80% accuracy. To evaluate the effectiveness of the proposed approach, we also compute the suitability of the classifiers, that is, the cost of computation along with accuracy; NB with the Uni-gram feature set proved to be the most suitable.
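As a rough illustration of the classification idea summarized above, the following minimal sketch trains a multinomial Naive Bayes classifier on uni-gram counts and assigns a post to one of the four categories. It is not the implementation used in this work: the training posts are hypothetical toy examples invented for illustration, the whitespace tokenizer and Laplace smoothing are assumptions, and the function names (`unigrams`, `train_nb`, `predict`) are placeholders.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy posts, invented for illustration only (not the paper's dataset).
TRAIN = [
    ("what requirements and constraints should drive the architecture", "analysis"),
    ("which quality attributes and requirements matter for this system", "analysis"),
    ("which pattern should i choose to design the service layer", "synthesis"),
    ("candidate design options layered pattern versus microservices pattern", "synthesis"),
    ("how to evaluate tradeoffs and compare performance of two designs", "evaluation"),
    ("benchmark results compare scalability tradeoffs of both options", "evaluation"),
    ("how to implement the repository class in java code", "implementation"),
    ("code example to implement dependency injection in spring", "implementation"),
]


def unigrams(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def train_nb(samples):
    """Collect per-class document counts, per-class uni-gram counts, and the vocabulary."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        for tok in unigrams(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior under add-one smoothing."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        score = math.log(class_docs[label] / total_docs)  # log prior
        class_total = sum(word_counts[label].values())
        for tok in unigrams(text):
            if tok not in vocab:
                continue  # ignore uni-grams never seen in training
            likelihood = (word_counts[label][tok] + 1) / (class_total + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_nb(TRAIN)
print(predict(model, "implement the repository class in java code"))
```

On real Stack Overflow data, the tokenizer, feature selection, and classifier would need to follow the experimental setup described in the paper; the sketch only shows how uni-gram counts can drive a category decision.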
KEYWORDS
architectural knowledge management, Stack Overflow, crowd-sourced communities, text mining, classification
INTRODUCTION
Software architectural design is of significant importance in the Software Development Life Cycle (SDLC). It manifests as the early design decisions that help determine system development, deployment, and evaluation. To develop a quality project, good architectural decisions are essential, and these decisions pose considerable challenges for software engineers. Architectural Knowledge Management (AKM) is used to capture