We describe an LSTM-based model which we call Byte-to-Span (BTS) that reads text as bytes and outputs span annotations of the form [start, length, label], where start positions, lengths, and labels are separate entries in our vocabulary. Because we operate directly on Unicode bytes rather than language-specific words or characters, we can analyze text in many languages with a single model. Due to the small vocabulary size, these multilingual models are very compact, yet they produce results similar to or better than the state of the art in Part-of-Speech tagging and Named Entity Recognition among systems that use only the provided training datasets (no external data sources). Our models learn "from scratch" in that they do not rely on any elements of the standard Natural Language Processing pipeline (including tokenization), and thus can run in standalone fashion on raw text.
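As a rough illustration of the byte-level input and the [start, length, label] span encoding described above, the short Python sketch below converts a sentence to UTF-8 bytes and pairs it with one annotation over byte offsets; the function name, the label string, and the example sentence are hypothetical and not taken from the paper.

# Illustrative sketch only: byte-level input plus [start, length, label] spans.
# encode_example, the "LOC" label, and the sentence are assumptions for illustration.
def encode_example(text, spans):
    """Convert text to UTF-8 bytes and pair it with (start, length, label) spans."""
    byte_seq = list(text.encode("utf-8"))   # model input: raw bytes, no tokenization
    return byte_seq, list(spans)            # spans are triples over byte offsets

text = "I live in Zürich."
start = len("I live in ".encode("utf-8"))   # byte offset where the entity begins
length = len("Zürich".encode("utf-8"))      # 7 bytes: "ü" occupies two bytes in UTF-8
byte_seq, annotations = encode_example(text, [(start, length, "LOC")])
print(byte_seq[:5], annotations)            # [73, 32, 108, 105, 118] [(10, 7, 'LOC')]

The point of the example is only that byte offsets, not word or character positions, index the spans, which is what lets a single small vocabulary cover many languages.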
Abstract. We describe an experimental study of pruning methods for decision tree classifiers when the goal is minimizing loss rather than error. In addition to two common methods for error minimization, CART's cost-complexity pruning and C4.5's error-based pruning, we study the extension of cost-complexity pruning to loss and one pruning variant based on the Laplace correction. We perform an empirical comparison of these methods and evaluate them with respect to loss. We found that applying the Laplace correction to estimate the probability distributions at the leaves was beneficial to all pruning methods. Unlike in error minimization, and somewhat surprisingly, performing no pruning led to results that were on par with other methods in terms of the evaluation criteria. The main advantage of pruning was in the reduction of the decision tree size, sometimes by a factor of ten. While no method dominated others on all datasets, even for the same domain different pruning mechanisms are better for different loss matrices.

1 Pruning Decision Trees

Decision trees are a widely used symbolic modeling technique for classification tasks in machine learning. The most common approach to constructing decision tree classifiers is to grow a full tree and prune it back. Pruning is desirable because the tree that is grown may overfit the data by inferring more structure than is justified by the training set. Specifically, if there are no conflicting instances, the training-set error of a fully built tree is zero, while the true error is likely to be larger. To combat this overfitting problem, the tree is pruned back with the goal of identifying the tree with the lowest error rate on previously unobserved instances, breaking ties in favor of smaller trees (Breiman, Friedman, Olshen & Stone 1984; Quinlan 1993). Several pruning methods have been introduced in the literature, including cost-complexity pruning, reduced error pruning, pessimistic pruning, error-based pruning, penalty pruning, and MDL pruning. Historically, most pruning algorithms have been developed to minimize the expected error rate of the decision tree, assuming that classification errors have the same unit cost.
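To make the loss-oriented evaluation concrete, here is a minimal Python sketch of the two ingredients the abstract names: Laplace-corrected class probabilities at a leaf and choosing the prediction with minimum expected loss under a loss matrix. The function names, the example counts, and the loss matrix are illustrative assumptions, not the paper's code.

# Minimal sketch, assuming a leaf stores per-class counts and a loss matrix
# with rows indexed by true class and columns by predicted class.
def laplace_probs(counts):
    """Laplace correction: P(c) = (n_c + 1) / (N + k), smoothing zero counts."""
    total, k = sum(counts), len(counts)
    return [(c + 1) / (total + k) for c in counts]

def min_loss_label(counts, loss_matrix):
    """Pick the predicted class with the lowest expected loss at this leaf."""
    probs = laplace_probs(counts)
    expected = [sum(probs[t] * loss_matrix[t][p] for t in range(len(probs)))
                for p in range(len(probs))]
    return min(range(len(expected)), key=expected.__getitem__)

counts = [8, 2]                      # leaf with 8 instances of class 0, 2 of class 1
loss = [[0, 1],                      # misclassifying a true class 0 instance costs 1
        [5, 0]]                      # misclassifying a true class 1 instance costs 5
print(laplace_probs(counts))         # [0.75, 0.25]
print(min_loss_label(counts, loss))  # 1: the minority class wins once costs are unequal

Under plain error minimization the leaf would predict class 0; with the asymmetric loss matrix the expected loss of predicting class 1 (0.75) is lower than that of predicting class 0 (1.25), which is why loss-sensitive probability estimation at the leaves matters.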
The invention of the movable type printing press launched the information age by making the mass distribution of information both feasible and economical. Newspapers, magazines, shopping catalogs, restaurant guides, and classified advertisements can trace their origins to the printing process. Five and a half centuries of technological progress in communications networks, protocols, computers, and user interface design led to the Web, online publishing, and e-commerce. Consumers and businesses have access to vast stores of information. All this information, however, used to be accessible only while users were tethered to a computer at home or in an office. Wireless data and voice access to this vast store allows unprecedented access to information from any location at any time.
We discuss the types of noise that may occur in relational learning systems and describe two approaches to addressing noise in a relational concept learning algorithm. We then evaluate each approach experimentally.