An icicle plot is a method for presenting a hierarchical clustering. Compared with other methods of presentation, it is far easier in an icicle plot to read off which objects belong to which clusters, and which objects join or drop out from a cluster as we move up and down the levels of the hierarchy, though these benefits only appear when enough objects are being clustered. Icicle plots are described, and their benefits are illustrated using a clustering of 48 objects.
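To make the idea concrete, the following is a minimal sketch of a text-based icicle display. It is not taken from the paper: the function name `icicle_rows`, the merge-list input format (leaf ids `0..n-1`, with merge `k` creating cluster id `n + k`), and the use of `X` as the fill character are all assumptions chosen for illustration. Each row corresponds to one level of the hierarchy, each object occupies a column, and the gap between two neighbouring objects is filled once they belong to the same cluster.

```python
def icicle_rows(n_objects, merges):
    """Return one text row per hierarchy level, from 1 cluster down to n clusters.

    merges is a list of (a, b) cluster-id pairs (illustrative format):
    ids 0..n-1 are the objects, and merge k creates cluster id n + k.
    A complete hierarchy (n - 1 merges) is assumed.
    """
    n = n_objects
    members = {i: [i] for i in range(n)}
    for k, (a, b) in enumerate(merges):
        members[n + k] = members[a] + members[b]

    # Column order: the member list of the final (root) cluster.
    order = members[n + len(merges) - 1]
    pos = {leaf: p for p, leaf in enumerate(order)}

    # join_step[p] = merge step at which columns p and p + 1 first share a cluster.
    join_step = [0] * (n - 1)
    for k, (a, b) in enumerate(merges):
        # In column order, cluster a sits immediately to the left of cluster b.
        join_step[pos[members[a][-1]]] = k

    rows = []
    for n_clusters in range(1, n + 1):
        steps_done = n - n_clusters  # merges performed to reach this level
        chars = []
        for p in range(n):
            chars.append("X")  # the object itself
            if p < n - 1:
                chars.append("X" if join_step[p] < steps_done else " ")
        rows.append("".join(chars))
    return rows


# Hypothetical example: four objects; 0 joins 1, 2 joins 3, then the two pairs join.
rows = icicle_rows(4, [(0, 1), (2, 3), (4, 5)])
for n_clusters, row in enumerate(rows, start=1):
    print(n_clusters, row)
```

Reading across a row shows which objects share a cluster at that level; reading down a column gap shows the level at which two neighbouring objects separate.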
Through three examples, this article discusses several new methods to aid in the analysis of large contingency tables. The general goal is to give a better understanding of specific contingency tables, both by comparing how various log-linear/logistic models fit and through clearer interpretations of the resulting fits. For model selection, we show how to focus on a subset of simple, good-fitting models, beginning with a plot of a goodness-of-fit statistic versus residual degrees of freedom for all of the fitted models. To assess whether a particular model is adequate, we demonstrate that certain plots of residuals can reveal interesting effects that are often otherwise hidden. For model summarization and interpretation, we plot odds-ratio factors with confidence intervals to show the effects of explanatory variables in a concise and appealing way. The first example involves the relationship of job satisfaction to demographic variables for craft employees of a large corporation. The data presented consist of a five-way contingency table with about 10,000 counts. Job satisfaction for such employees increased with age and was higher in the Southwest and West than in the Northeast. Of four race-by-sex groups, the most satisfied were nonwhite males; the least satisfied were nonwhite females. Another example gives a six-way table with about 1,200 counts concerning whether or not high-school students think they will need mathematics in their future work. Among other results, for students planning to take a job right after graduation, those from suburban schools had odds about 2.6 times those from urban schools of thinking that mathematics will be useful. Moreover, among urban students, males had odds of finding mathematics useful about 2.1 times those for females, but there was little difference between the odds for males and females among suburban students. The third example, drawn from the literature, relates knowledge about cancer to four dichotomous variables. We compare our analysis with earlier ones.
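As a rough illustration of what an odds ratio with a confidence interval means, the sketch below computes both for a single 2x2 table using the standard Wald interval on the log scale. This is a simplification: the article derives its odds-ratio factors from fitted log-linear/logistic models, not from raw 2x2 tables. The function name `odds_ratio_ci` and the counts in the example are hypothetical and are not data from the article.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    Table layout (illustrative):
                    outcome yes   outcome no
        group 1          a            b
        group 2          c            d
    z = 1.96 gives an approximate 95% interval. All counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio).
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to exercise the function.
estimate, lower, upper = odds_ratio_ci(20, 10, 10, 20)
print(f"odds ratio {estimate:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

Because the interval is symmetric on the log scale, it is asymmetric around the estimate on the original scale, which is one reason odds-ratio plots are often drawn on a logarithmic axis.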
Yehuda Vardi introduced the term network tomography and was the first to propose and study how statistical inverse methods could be adapted to attack important network problems (Vardi, 1996). More recently, in one of his final papers, Vardi proposed notions of metrics on networks to define and measure distances between a network's links, its paths, and also between different networks (Vardi, 2004). In this paper, we apply Vardi's general approach for network metrics to a real data network by using data obtained from special data network tools and testing procedures presented here. We illustrate how the metrics help explicate interesting features of the traffic characteristics on the network. We also adapt the metrics in order to condition on traffic passing through a portion of the network, such as a router or pair of routers, and show further how this approach helps to discover and explain interesting network characteristics.