Hierarchical cluster analysis refers to a collection of methods that seek to construct a hierarchically arranged sequence of partitions for some given object set. Typically, the methods produce a hierarchy based on some proximity measure defined for every pair of objects. The common agglomerative methods for producing partition hierarchies are discussed along with the characterizing notion of an ultrametric. The hierarchical clustering task is then redefined as one of finding a least-squares approximation to the proximity measure by an ultrametric structure. The ultrametric condition, and thus hierarchical clustering, generalizes to an additive tree; again, a least-squares approximation to the proximity measure is sought, but now by a structure that satisfies the more general additive tree inequality. Finally, a discussion is given of how an additive tree explicitly generalizes an ultrametric: it can be decomposed, nonuniquely, into an ultrametric plus what might be called a centroid metric.
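The abstract names the ultrametric and additive tree conditions without stating them. As a minimal sketch, assuming the standard textbook definitions (the three-point condition d(i,j) <= max(d(i,k), d(k,j)) for an ultrametric, and the four-point condition for an additive tree, under which the two largest of the three pairwise sums over any four objects are equal), the checks below illustrate the decomposition mentioned above. The function names, the example matrix `u`, and the centroid weights `g` are hypothetical and chosen only for illustration.

```python
from itertools import combinations, permutations


def is_ultrametric(d, tol=1e-9):
    """Three-point condition: d[i][j] <= max(d[i][k], d[k][j]) for every triple."""
    n = len(d)
    return all(
        d[i][j] <= max(d[i][k], d[k][j]) + tol
        for i, j, k in permutations(range(n), 3)
    )


def is_additive_tree(d, tol=1e-9):
    """Four-point condition: for every four objects, the two largest of the
    three pairwise sums must be equal (within tol)."""
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]])
        if sums[2] - sums[1] > tol:
            return False
    return True


# Illustrative ultrametric: objects {0, 1} join at level 1, {2, 3} at level 2,
# and the two clusters join at level 4.
u = [
    [0, 1, 4, 4],
    [1, 0, 4, 4],
    [4, 4, 0, 2],
    [4, 4, 2, 0],
]

# Centroid metric c[i][j] = g[i] + g[j] for i != j (weights are made up).
g = [0.5, 2.0, 1.0, 3.0]
n = len(u)
a = [[0 if i == j else u[i][j] + g[i] + g[j] for j in range(n)] for i in range(n)]

print(is_ultrametric(u), is_additive_tree(u))  # the ultrametric passes both checks
print(is_ultrametric(a), is_additive_tree(a))  # the sum is additive but not ultrametric
```

Adding the centroid metric increases each of the three pairwise sums by the same amount, g[i] + g[j] + g[k] + g[l], which is why the four-point condition survives while the three-point condition generally does not.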
One-dimensional bin-packing problems require the assignment of a collection of items to bins with the goal of optimizing some criterion related to the number of bins used or the 'weights' of the items assigned to the bins. In many instances, the number of bins is fixed and the goal is to assign the items such that the sums of the item weights for each bin are approximately equal. Among the possible applications of one-dimensional bin-packing in the field of psychology are the assignment of subjects to treatments and the allocation of students to groups. An especially important application in the psychometric literature is the splitting of a set of test items into distinct subtests, each containing the same number of items, such that the maximum sum of item weights across all bins is minimized. In this context, the weights typically correspond to item statistics derived from difficulty and discrimination indices. We present a mixed zero-one integer linear programming (MZOILP) formulation of this one-dimensional minimax bin-packing problem and develop an approximate procedure for its solution that is based on the simulated annealing algorithm. In two comparisons that focused on 34 practically sized test problems (up to 6000 items and 300 bins), the simulated annealing heuristic generally provided better solutions than were obtained when using a commercial mathematical programming software package to solve the MZOILP formulation directly.
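The minimax problem described above (equal-sized bins, minimize the largest bin sum) can be illustrated with a generic swap-based simulated annealing sketch. This is not the authors' procedure, whose details the abstract does not give. The function name, the pairwise-swap neighbourhood, the geometric cooling schedule, and all parameter defaults are assumptions made for illustration.

```python
import math
import random


def anneal_minimax_bins(weights, n_bins, n_iter=20000, t0=1.0, cooling=0.9995, seed=0):
    """Assign items to equal-sized bins, trying to minimize the largest bin sum.

    Generic sketch: the swap move and cooling schedule are illustrative choices.
    Returns (bins, cost), where bins is a list of lists of item indices.
    """
    if n_bins < 2 or len(weights) % n_bins != 0:
        raise ValueError("need n_bins >= 2 and an item count divisible by n_bins")
    rng = random.Random(seed)
    items = list(range(len(weights)))
    rng.shuffle(items)
    size = len(weights) // n_bins
    bins = [items[b * size:(b + 1) * size] for b in range(n_bins)]
    sums = [sum(weights[i] for i in b) for b in bins]
    cost = max(sums)
    best_bins, best_cost = [b[:] for b in bins], cost
    temp = t0
    for _ in range(n_iter):
        # Propose swapping one item between two distinct bins, which keeps
        # every bin at the same size.
        a, b = rng.sample(range(n_bins), 2)
        ia, ib = rng.randrange(size), rng.randrange(size)
        delta = weights[bins[b][ib]] - weights[bins[a][ia]]
        new_sums = sums[:]
        new_sums[a] += delta
        new_sums[b] -= delta
        new_cost = max(new_sums)
        # Metropolis rule: always accept non-worsening moves; accept worsening
        # moves with a probability that shrinks as the temperature falls.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            bins[a][ia], bins[b][ib] = bins[b][ib], bins[a][ia]
            sums, cost = new_sums, new_cost
            if cost < best_cost:
                best_bins, best_cost = [x[:] for x in bins], cost
        temp *= cooling
    return best_bins, best_cost


# Hypothetical item weights split into 3 subtests of 4 items each.
example_weights = [3, 7, 1, 9, 4, 6, 2, 8, 5, 10, 12, 11]
subtests, worst_sum = anneal_minimax_bins(example_weights, 3)
print(subtests, worst_sum)
```

The sketch returns the best assignment seen during the run rather than the final state, since the final state may have accepted a worsening move. It offers no optimality guarantee.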