Hierarchical classification is a challenging problem with broad application in real-world tasks. Item categorization in the e-commerce domain is one such example. In a large-scale industrial setting such as eBay, a vast number of items must be categorized into a large number of leaf categories, on top of which a complex topic hierarchy is defined. Beyond the challenges of scale, item data is extremely sparse, is skewed in its distribution over categories, and exhibits heterogeneous characteristics across categories. A common strategy for hierarchical classification is the "gates-and-experts" method, in which a high-level classification is made first (the gates), followed by a low-level distinction (the experts). In this paper, we propose to leverage domain-specific feature generation and modeling techniques to greatly enhance the classification accuracy of the experts. In particular, we derive features that encode rich domain knowledge and linguistic hints, and then adapt an SVM-based model to distinguish several highly confusable category groups that appear as the performance bottleneck of a live system currently deployed at eBay. We use illustrative examples and empirical results to demonstrate the effectiveness of our approach, particularly the merit of well-designed domain-specific features.
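The two-stage "gates-and-experts" routing described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the token-overlap scorer is a toy stand-in for the SVM-based experts, and the class names (`KeywordScorer`, `GatesAndExperts`), the category labels, and the training titles are all hypothetical, chosen only to show how a gate selects a category group before an expert picks the leaf category.

```python
from collections import Counter


def tokenize(title):
    """Lowercase whitespace tokenization (an assumption, not the paper's features)."""
    return title.lower().split()


class KeywordScorer:
    """Toy classifier: scores a label by how often its training tokens recur."""

    def __init__(self):
        self.counts = {}  # label -> Counter of training tokens

    def fit(self, examples):
        for title, label in examples:
            self.counts.setdefault(label, Counter()).update(tokenize(title))
        return self

    def predict(self, title):
        tokens = tokenize(title)
        # Iterating over sorted labels makes tie-breaking deterministic.
        return max(
            sorted(self.counts),
            key=lambda label: sum(self.counts[label][t] for t in tokens),
        )


class GatesAndExperts:
    """Gate picks a high-level group; that group's expert picks the leaf."""

    def __init__(self):
        self.gate = KeywordScorer()
        self.experts = {}  # group -> KeywordScorer over that group's leaves

    def fit(self, examples):
        # Each example is (title, group, leaf).
        self.gate.fit([(title, group) for title, group, _ in examples])
        by_group = {}
        for title, group, leaf in examples:
            by_group.setdefault(group, []).append((title, leaf))
        for group, rows in by_group.items():
            self.experts[group] = KeywordScorer().fit(rows)
        return self

    def predict(self, title):
        group = self.gate.predict(title)           # stage 1: the gate
        leaf = self.experts[group].predict(title)  # stage 2: the expert
        return group, leaf


# Hypothetical training data for illustration only.
TRAIN = [
    ("apple iphone smartphone unlocked", "electronics", "cell phones"),
    ("samsung galaxy smartphone", "electronics", "cell phones"),
    ("leather case for iphone", "electronics", "phone accessories"),
    ("screen protector for galaxy", "electronics", "phone accessories"),
    ("mens cotton shirt", "clothing", "shirts"),
    ("womens running shoes", "clothing", "shoes"),
]

model = GatesAndExperts().fit(TRAIN)
print(model.predict("leather case for galaxy"))
```

Because each expert sees only the items of its own group, it can use features tailored to the confusions within that group, which is the part of the pipeline the abstract proposes to strengthen.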
Design and simulation of future mobile networks will center around human interests and behavior. We propose a design paradigm for mobile networks driven by realistic models of users' online behavior, based on mining billions of wireless-LAN records. We introduce a systematic method for large-scale, multi-dimensional co-clustering of web activity for thousands of mobile users at 79 locations. Surprisingly, we find that users can be consistently modeled using ten clusters with disjoint profiles. Access patterns from multiple locations show differing user behavior. This is the first study to obtain such detailed results for mobile Internet usage.