Supporting Product Line Adoption by Combining Syntactic and Textual Feature Extraction

Computational Science and Its Applications – ICCSA 2019

2019

Self Cite

Automatic recovery of test-to-code traceability links is an important task in many areas of software engineering, such as quality assurance and code maintenance. The research community has shown great interest in this topic and has developed several techniques that have already made significant advances in the field. These techniques include text-based learning algorithms whose corpus is built from the source code of the software components. Several techniques based on information retrieval have been benchmarked, but the capabilities of many learning algorithms have not yet been tested. In this work we examine the textual similarity measures produced by three different machine learning techniques for the recovery of traceability information while also considering various textual representations of the source code. The obtained results are evaluated on four open-source systems based on naming conventions. We have been able to improve the current state-of-the-art results based on textual similarity for each evaluated system.

Section: Related Work

mentioning

confidence: 99%

Evaluation of Textual Similarity Techniques in Code Level Traceability

Csuvik

Computational Science and Its Applications – ICCSA 2019

2019

Self Cite

“…Conceptual analysis was successfully applied in various software engineering topics in recent years [19]. LSI [3] is often used throughout software engineering, for example in fault localization [18], in detection of bug report duplicates [12], test-prioritization [30], feature analysis [9,10] and in the field of traceability, for example between tests and requirements [15]. Several efforts have been made to improve the application of the LSI technique itself, for example Query-based reconfiguration approach [17] and using part of speech information [1].…”

Section: Related Work

mentioning

confidence: 99%

Exploring the benefits of utilizing conceptual information in test-to-code traceability

Tóth

Proceedings of the 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering

2018

Self Cite

Striving for reliability of software systems often results in immense numbers of tests. Due to the lack of a generally used annotation, finding the parts of code these tests were meant to assess can be a demanding task. This is a valid problem of software engineering called test-to-code traceability. Recent research on the subject has attempted to cope with this problem by applying various approaches and their combinations, achieving profound results. These approaches have involved the use of naming conventions during development processes and have also utilized various information retrieval (IR) methods, often referred to as conceptual information. In this work we investigate the benefits of textual information located in software code and its value for aiding traceability. We evaluated the capabilities of the natural language processing technique called Latent Semantic Indexing (LSI) against the results of the naming conventions technique on five real, medium-sized software systems. Although LSI is already used for this purpose, we extend the viewpoint of the one-to-one traceability approach to the more versatile view of LSI as a recommendation system. We found that considering the top 5 elements in the ranked list increases the results by 30% on average and makes LSI a viable alternative in projects where naming conventions are not followed systematically.

CCS Concepts: Computing methodologies → Natural language processing; Software and its engineering → Traceability
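The "recommendation system" view described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: plain term-count cosine similarity stands in for LSI (there is no SVD step), and the class names, the sample texts, and the helpers `tokenize`, `rank_classes`, and `top_n_hit` are all hypothetical. The sketch also assumes the simplest naming convention, where a test class `FooTest` is expected to test class `Foo`.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words, breaking camelCase identifiers apart."""
    words = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", text)
    return [w.lower() for w in words]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_classes(test_text, class_texts):
    """Return production class names ordered from most to least similar."""
    test_vec = Counter(tokenize(test_text))
    scores = {
        name: cosine(test_vec, Counter(tokenize(text)))
        for name, text in class_texts.items()
    }
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda name: (-scores[name], name))


def top_n_hit(test_name, ranking, n):
    """Assumed ground truth: FooTest should trace to Foo within the top n."""
    if not test_name.endswith("Test"):
        return False
    return test_name[: -len("Test")] in ranking[:n]


# Hypothetical toy corpus: one text per production class.
classes = {
    "Stack": "class Stack push pop peek isEmpty size",
    "Queue": "class Queue enqueue dequeue peek isEmpty size",
    "Parser": "class Parser parse token stream",
}
test_text = (
    "class StackTest testPushIncreasesSize testPopReturnsLastPushed "
    "stack push pop size"
)

ranking = rank_classes(test_text, classes)
print(ranking, top_n_hit("StackTest", ranking, 1))
```

Widening `n` from 1 to 5 turns the one-to-one check into the ranked-list recommendation described in the abstract: a link counts as recovered when the expected class appears anywhere among the first `n` candidates.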

“…This is the area where domain experts and developers need to interact: features provide a logical view of system functionality, while they are implemented by various parts of the program code. In previous work we provided methods for feature extraction based on textual similarity and call graphs [11,12]. Our textual similarity based extraction relies on the Latent Semantic Indexing (LSI) technique.…”

Section: Introduction

mentioning

confidence: 99%

Feature Level Complexity and Coupling Analysis in 4GL Systems

Csuvik et al.

Computational Science and Its Applications – ICCSA 2018

2018

Self Cite

Product metrics are widely used in the maintenance and evolution phase of software development to advise the development team about software quality. Although most of these metrics are defined for mainstream languages, several of them have been adapted to fourth generation languages (4GL) as well. Usual concepts like size, complexity and coupling need to be re-interpreted and adapted to program elements defined by these languages. In this paper we take a further step in this process to address product line development in 4GL. Adopting a product line architecture is a necessary step to handle the challenges of a growing number of similar product variants. The product line adoption process itself is a tedious task in which features of the product variants play a crucial role. Features represent a higher level of abstraction that is cross-cutting to program elements of 4GL applications. We propose a set of metrics related to features by linking existing program elements to metrics and by relating features with each other. The focus of this study is on complexity and coupling metrics. We provide a metrics-based analysis of several variants of a large-scale industrial product line written in the Magic XPA 4GL language.