Induced decision trees are an extensively-researched solution to classification tasks. For many practical tasks, the trees produced by tree-generation algorithms are not comprehensible to users due to their size and complexity. Although many tree induction
algorithms have been shown to produce simpler, more comprehensible trees (or data structures derived from trees) with good classification accuracy, tree simplification has usually been of secondary concern relative to accuracy, and no attempt has been made to survey the literature from the perspective of simplification. We present a framework
that organizes the approaches to tree simplification and summarize and critique the approaches within this framework. The purpose of this survey is to provide researchers and practitioners with a concise overview of tree-simplification approaches and insight into their relative capabilities. In our final discussion, we briefly describe some empirical findings and discuss the application of tree induction algorithms to case retrieval in case-based reasoning systems.
Previous research has shown that multicolored scales are superior to ordered brightness scales for supporting identification tasks on complex visualizations (categorization, absolute numeric value judgments, etc.), whereas ordered brightness scales are superior for relative comparison tasks (greater/less). We examined the processes by which such tasks are performed. By studying eye movements and by comparing performance on scales of different sizes, we argued that (a) people perform identification tasks by conducting a serial visual search of the legend, whose speed is sensitive to the number of scale colors and the discriminability of the colors; and (b) people perform relative comparison tasks using different processes for multicolored versus brightness scales. With multicolored scales, they perform a parallel search of the legend, whose speed is relatively insensitive to the size of the scale, whereas with brightness scales, people usually directly compare the target colors in the visualization, while making little reference to the legend. Performance of comparisons was relatively robust against increases in scale size, whereas performance of identifications deteriorated markedly, especially with brightness scales, once scale sizes reached 10 colors or more.
We interpret these findings in terms of the memory-for-goals model, which suggests that SAR consists of increased scanning in order to compensate for decay, and previously viewed cues act as associative primes that reactivate memory traces of goals and plans.
Trade-offs that must typically be made in the design of color-coded visualizations between speed and accuracy or between identification and comparison tasks may be mitigated.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.