For many years the expression "curriculum evaluation" has, for most people, tended to mean some sort of use and interpretation of achievement tests. Most assuredly, this is a very real part of the total concept of evaluation; but the thesis of this paper is that the concept must be broadened to include other sorts of ventures besides those of collecting and summarizing the test scores of students who have undergone a particular curricular treatment. The most commonly held idea of the sequence of evaluation endeavors starts with the act of stating the objectives of a set of materials--a full course, a unit of some sort, or a group of several units. This is followed by definition of these objectives in behavioral terms. Next comes the development of items, that is, situations which call for the behavior defined. These items are combined into scorable units, and scores are obtained on appropriate samples of youngsters. Then, finally, the sequence ends in attempts to interpret these scores in terms of the extent to which the new materials have developed the behaviors which satisfy the purposes which the innovators had in mind.
A bit of experience in this area on a real job of evaluation will convince anyone that the steps of this total procedure--as simple as they are to state--are laden with problems of several kinds. We have the usual measurement problems of any test construction together with certain special problems of sampling and of the treatment of gain scores. Furthermore, all of this is imbedded in larger tactical problems of deriving the data from ongoing classroom settings. There is no denying the importance of solving these various problems if we are to move forward in evaluation of educational curricula.
At this point, however, it is important to raise the question "What are the purposes of evaluation?" in order to lead into the theme that we need a considerably broader attack than is implied by the sequence described in the first paragraph.
In curriculum innovation as a real ongoing venture there are two general purposes for evaluation. One of these concerns collection of information to be used as feedback to the innovators for further revision of materials and methods. Without such feedback, either the decision to revise or the decision not to revise--and most certainly the decision of how to revise--must be based upon feeling tones and the arguments of personal preference. The second main purpose of evaluation
This discussion is a plea for educational measurement specialists to broaden the repertoire of methodologies for data collection and analysis by borrowing from elsewhere than psychology and agronomy. It suggests that such disciplines as history, economics, and sociology have techniques highly appropriate to evaluation of educational endeavors.