In this paper we propose that the dichotomy between exemplar-based and prototype-based models of concept learning can be regarded as an instance of the tradeoff between complexity and data-fit, often referred to in the statistical learning literature as the bias-variance tradeoff. This continuum reflects differences in models' assumptions about the form of the concepts in their environments: models at one extreme, here exemplified by prototype models, assume a simple conceptual form, entailing high bias; models at the other extreme, exemplified by exemplar models, entertain more complex hypotheses, but tend to overfit the data, with a concomitant loss in generalization performance. To investigate human learners' place on this continuum, we had subjects learn concepts of varying levels of structural complexity. Concepts consisted of mixtures of Gaussian distributions, with the number of mixture components serving as the measure of complexity. We then fit subjects' responses to both a representative exemplar model and a representative prototype model. With moderately complex multimodal categories, the exemplar model generally fit subjects' performance better, due to the prototype model's overly narrow (high-bias) assumption of a unimodal concept. But with highly complex concepts, the exemplar model's overly flexible (high-variance) assumptions led it to overfit the concepts relative to subjects, allowing it to outperform them. We conclude that neither strategy is uniformly optimal as a model of human performance.
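The category structures described above can be sketched as follows. This is a minimal illustration only: the specific means, standard deviations, and mixture weights are assumptions for demonstration, not the paper's actual stimulus parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_category(means, sds, weights, n, rng):
    """Draw n exemplars from a 1-D Gaussian mixture category.

    len(means) is the number of mixture components, i.e. the
    structural complexity of the concept.
    """
    comps = rng.choice(len(means), size=n, p=weights)
    return rng.normal(np.asarray(means)[comps], np.asarray(sds)[comps])

# A bimodal (complexity-2) category: two equally weighted components.
exemplars = sample_category(means=[-2.0, 2.0], sds=[0.5, 0.5],
                            weights=[0.5, 0.5], n=200, rng=rng)
```

A prototype model would summarize such a category with a single central tendency (high bias, missing the bimodality), whereas an exemplar model would retain all 200 stored points (high variance, risking overfitting as complexity grows).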
This paper reports on methods and results of an applied research project by a team consisting of SAIC and four universities to develop, integrate, and evaluate new approaches to detect the weak signals characteristic of insider threats on organizations' information systems. Our system combines structural and semantic information from a real corporate database of monitored activity on its users' computers to detect independently developed red-team inserts of malicious insider activities. We have developed and applied multiple algorithms for anomaly detection based on suspected scenarios of malicious insider behavior, indicators of unusual activities, high-dimensional statistical patterns, temporal sequences, and normal graph evolution. Algorithms and representations for dynamic graph processing provide the ability to scale as needed for enterprise-level deployments on real-time data streams. We have also developed a visual language for specifying combinations of features, baselines, peer groups, time periods, and algorithms to detect anomalies suggestive of instances of insider threat behavior. We defined over 100 data features in seven categories based on approximately 5.5 million actions per day from approximately 5,500 users. We have achieved area under the ROC curve values of up to 0.979 and lift values of 65 on the top 50 user-days identified on two months of real data.
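The lift metric reported above can be computed as in the following minimal sketch. The scores and labels here are a toy example, not the project's data:

```python
import numpy as np

def lift_at_k(scores, labels, k):
    """Lift at k: the fraction of true positives among the k
    highest-scored items, divided by the overall base rate."""
    order = np.argsort(scores)[::-1][:k]   # indices of the top-k scores
    top_rate = labels[order].mean()
    base_rate = labels.mean()
    return top_rate / base_rate

# Toy data: 1,000 user-days, 1% of which are malicious, with a
# detector that happens to rank all positives first.
labels = np.zeros(1000)
labels[:10] = 1
scores = np.arange(1000, 0, -1, dtype=float)
lift = lift_at_k(scores, labels, k=50)
# Top 50 contain all 10 positives: rate 0.2 vs. base rate 0.01, lift 20.
```

Under this definition, the paper's lift of 65 on the top 50 user-days indicates those flagged days were 65 times richer in true insider activity than chance.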
With the increasing reliance on social media as a dominant communication medium for current news and personal communications, communicators can execute deception with relative ease. While prior research has investigated written deception in traditional forms of computer-mediated communication (e.g., email), we are interested in determining whether those same indicators hold in social-media-like communication and whether new, social-media-specific linguistic cues to deception exist. Our contribution is two-fold: 1) we present results of human subjects experimentation confirming existing and new linguistic cues to deception; 2) we present results on classifying deception by training machine learning classifiers on our best features, achieving an average 90% accuracy under cross-fold validation.
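A cross-validated accuracy estimate of the kind reported above can be sketched as follows, using a simple nearest-centroid classifier on synthetic cue features as a stand-in; the paper's actual classifiers and feature set are not reproduced here:

```python
import numpy as np

def kfold_accuracy(X, y, k, rng):
    """Average accuracy of a nearest-centroid classifier over k folds
    (a minimal stand-in for the classifiers used in the paper)."""
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        centroids = np.stack([X[train][y[train] == c].mean(axis=0)
                              for c in (0, 1)])
        dists = np.linalg.norm(X[test, None, :] - centroids[None], axis=2)
        preds = dists.argmin(axis=1)
        accs.append((preds == y[test]).mean())
    return float(np.mean(accs))

# Synthetic "linguistic cue" features: deceptive texts (y=1) shifted
# along each cue dimension relative to truthful ones (y=0).
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 5)) + y[:, None] * 2.0
acc = kfold_accuracy(X, y, k=10, rng=rng)
```

With well-separated classes like these, the cross-fold accuracy lands well above the 50% chance baseline, mirroring the structure (though not the specifics) of the paper's 90% result.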
We present a unified Bayesian approach to shape representation and related problems in perceptual organization, including part decomposition, shape similarity, figure/ground estimation, and 3D shape. The approach is based on the idea of estimating the skeletal structure most likely to have generated the observed shape via a process of stochastic "growth." We survey the approach briefly and show how it can be extended in a principled way to solve a wide array of related problems.

Shape and perceptual organization. The visual representation of shape is a complex problem, requiring the reduction of an essentially infinite-dimensional object (the geometry of the shape) to a few perceptually meaningful dimensions. Human infants can recognize shape from line drawings without any prior experience [17], suggesting that the ability to abstract form from the bounding contour is innate. Much research in the study of shape has involved a quest for a set of shape descriptors that will allow just the right aspects of shape to be extracted: a representation that retains enough information to support recognition, shape similarity, and other key functions. Each of the many proposed approaches ([8], and so forth) has merits. Some have compelling mathematical motivations, while others (unfortunately not usually the same ones) have demonstrable agreement with human data. Still, broadly speaking, a complete computational characterization of human shape representation remains elusive.
Although a large body of work has previously investigated various cues predicting deceptive communications, especially as demonstrated through written and spoken language (e.g., [30]), little has been done to explore predicting kinds of deception. We present novel work evaluating the use of textual cues to discriminate between deception strategies (such as exaggeration or falsification), concentrating on intentionally untruthful statements meant to persuade in a social media context. We conduct human subjects experimentation wherein subjects engaged in a conversational task and then labeled the kind(s) of deception they employed for each deceptive statement made. We then develop discriminative models to understand the difficulty of choosing between one and several strategies. We evaluate the models using precision and recall for strategy prediction among 4 deception strategies based on the most relevant psycholinguistic, structural, and data-driven cues. Our single-strategy model results demonstrate as much as a 58% increase over baseline (random chance) accuracy, and we also find that certain kinds of deception are harder to predict than others.
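Per-strategy precision and recall of the kind used in this evaluation can be computed as in the following sketch. The labels and predictions are toy data, and the strategy inventory beyond exaggeration and falsification is a hypothetical placeholder:

```python
import numpy as np

def precision_recall(y_true, y_pred, n_classes):
    """Per-class precision and recall for single-label predictions.

    Classes with no predictions (or no true instances) score 0 by
    convention, via the max(..., 1) guard against division by zero.
    """
    prec, rec = [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        prec.append(tp / max(np.sum(y_pred == c), 1))
        rec.append(tp / max(np.sum(y_true == c), 1))
    return np.array(prec), np.array(rec)

# Toy labels over 4 strategies, e.g. 0=exaggeration, 1=falsification,
# 2 and 3 = illustrative placeholders for the remaining strategies.
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 1, 1, 1, 2, 3, 3, 3])
p, r = precision_recall(y_true, y_pred, 4)
```

With 4 strategies, random-chance accuracy is 25%, which is the baseline against which the reported 58% improvement is measured.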