One of the hardest tasks in the analysis of legacy systems is determining the precise semantics of program components. Investigating the internal data and control structures is difficult because of the huge number of possible implementation variants for the same problem.

To facilitate the task, we propose to use components kept and described in a repository of reusable concepts as reference points. This becomes possible when behavior sampling is used as the classification and retrieval strategy. By matching the results of executing isolated components from a legacy system against the results of already executed components in the repository, one can tackle the problem of classifying legacy components without considering their internal structure. As a side effect, the population of the reuse repository is increased.

In this paper we propose a model for reusing the knowledge contained in a behavior-based reuse repository to analyze, classify, and understand isolated executable components from a legacy system. Components not yet classified will augment the repository.
Copyright ACM 1999 1-58113-101-1/99/05...$5.00

MOTIVATION

One of the main topics in software reengineering is the analysis of parts of a system in order to (re)detect the functionality and meaning of such fragments. A broad range of methods for identifying, understanding, classifying, redocumenting, and reengineering program components in legacy systems has been proposed in the literature. Most of these methods can be applied very successfully if a priori knowledge about program structure and programming style is available. Consequently, if these assumptions do not hold, structure-based analysis techniques must fail.

In contrast, when assets are classified for the purpose of software reuse, extensive documentation is available to support the task. Here, however, the problem of correctly interpreting the descriptive keywords arises. Since the interpretation of keywords depends heavily on the cultural, social, and personal context [3, 1], successful classification depends on many human factors. To overcome such obstacles, researchers work on questions such as how to describe software without relying on human interpretation. Many approaches deal with formalizing the properties of component interfaces [6, 19] or with using the inherent property of executability to determine meaning directly from software attributes.

Part of our motivation for the work described in this paper stems from a project to develop a reengineering tool [18]. There, we were faced with the problem of performing semantics-preserving code transformations. In quite a number of cases, this rejuvenation of code would be better performed by replacing parts of the code (a chunk) with a semantically equivalent component written according to state-of-the-art programming practices. Linking reengineering techniques with reuse experience seems promising in this situation. Thus, we were looking for domain-specific components, either as reference points or even as candidates for substituting pieces of legacy code.
As stat...