This paper describes an investigation into measuring the quality of functional programs. The work reported here is part of a larger, ongoing study into a quantitative analysis of the effect of utilizing different programming paradigms on code quality. Prior to undertaking such a comparative analysis it is necessary to establish a baseline of quality indicators which can then be used as metrics for the remainder of the project. Thus the aim of the research presented here was to evaluate a set of suggested indicators corresponding to internal attributes by investigating the correlation between the suggested indicators and the desired external quality-type attributes of the code. A method for the evaluation of suggested metrics is discussed and the results of performing such an evaluation for functional programs are presented.

Keywords: quantifying quality, functional languages, internal attributes
INTRODUCTION

The research described in this paper was performed as part of a project concerned with investigating the variations in code quality resulting from the use of different programming paradigms. In particular, the initial aim of the project was to investigate whether or not the quality of code produced using a functional language was significantly different from that produced using an object-oriented language [1]. In order to carry out the experiment it was first necessary to determine which metrics should be considered. An earlier paper [2] analysed a variety of metrics and discussed their suitability for measuring functional programs. This paper describes a formal experiment which was set up to evaluate the suggested metrics, and reports on the results of the experiment.
1 The method described here was influenced by work reported in the literature [7] and also by the work of the DESMET project [13].
METHOD

To evaluate different programming languages it is necessary to measure the external quality-type attributes of the code, such as reliability, usability, maintainability, testability, reusability, integrity, efficiency, and portability. However, with the exception of efficiency, such attributes are notoriously difficult to quantify, because they depend on the way in which the software reacts with external factors, such as developers and users. Metrics based on internal attributes are often employed in the belief that there is a strong correlation between internal attributes and the desired external attributes. Internal attributes such as length, modularity, reuse, coupling and cohesion are much easier to measure than external attributes, and some work has already been done on correlating internal and external attributes for programs written in imperative languages [3, 6, 7]. The experiment described here used established statistical techniques to ascertain whether or not there is any correlation between a selection of internal attributes (based on length and modularity) and certain characteristics of the development process, such as the number of errors found during development, which are assumed to be in...