It is now acknowledged that the evaluation of the quality of MT output is inextricably linked to the purpose to which the translation output will be put. It is equally true that the value of the evaluation is inseparably linked to the purpose to which the evaluation results will be put.

For the developer of an MT system, the evaluation of the quality of the MT output must be approached from the viewpoint of increasing knowledge about the MT system. The resulting measures must be analysed so that practical feedback for improving the system is feasible. For the manager, however, evaluation of the MT output is often viewed in terms of comparison: measures are compared against previous measures, or against those obtained by other systems, in order to gauge progress or to assess the system's ability.

Measurement can be viewed as a tool for increasing knowledge of some object or entity. From both viewpoints described above, measurement is required as a means of increasing knowledge about the system, whether that is knowledge of the system's errors or of its performance. However, there is a more fundamental level at which measurement should be applied as a tool for increasing knowledge: namely, to increase knowledge of the properties we are trying to measure (in this case, intelligibility and fidelity). Such measurement is a precursory requirement for the more general uses of evaluation measures described above.

When surveying the many methods currently employed in MT evaluation, it is not immediately obvious that these methods serve to increase knowledge of the properties being measured. This report describes a constructive machine translation evaluation method aimed at addressing this issue.