Texts of a particular type exhibit a discernible, predictable schema. These schemata can be delineated and, as such, provide models of their respective text types that are useful for automatically structuring texts. We have developed a Text Structurer module that recognizes text-level structure for use within a larger information retrieval system, delineating the discourse-level organization of each document's contents. This allows the document components most likely to contain the type of information suggested by the user's query to be selected for higher weighting. We chose newspaper text as the first text type to implement. Several iterations of manually coding a randomly chosen sample of newspaper articles enabled us to develop a newspaper text model. This process suggested that our intellectual decomposition of texts relied on six types of linguistic information, which were incorporated into the Text Structurer module. Evaluation of the module's results led to a revision of the underlying text model and of the Text Structurer itself.
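The component-weighting idea in the abstract above can be illustrated with a minimal sketch. The component labels (LEAD, MAIN-EVENT, HISTORY), query types, and weight values here are illustrative assumptions, not the published newspaper text model:

```python
# Sketch: boost retrieval scores for discourse components that match the
# kind of information a query asks for. Labels and weights are hypothetical.

QUERY_ROLE_WEIGHTS = {
    # A query about the current event favors the main-event component;
    # a background query favors historical context.
    "event": {"LEAD": 2.0, "MAIN-EVENT": 3.0, "HISTORY": 0.5},
    "background": {"LEAD": 1.0, "MAIN-EVENT": 0.5, "HISTORY": 3.0},
}

def score_document(components, query_type, term_score):
    """components: list of (role_label, text) pairs produced by a
    text structurer; term_score: plain text-matching score function."""
    weights = QUERY_ROLE_WEIGHTS.get(query_type, {})
    return sum(weights.get(role, 1.0) * term_score(text)
               for role, text in components)

doc = [("LEAD", "quake strikes city"),
       ("MAIN-EVENT", "quake damages bridge"),
       ("HISTORY", "region's last quake was 1906")]
score = score_document(doc, "event", lambda t: t.count("quake"))
```

Here the "event" weighting multiplies each component's raw term score, so matches in the main-event component count most.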
Internet of Things (IoT) technology has attracted considerable interest in recent years because of its applicability across various domains. In particular, an IoT-based robot with artificial intelligence can be utilized in many surveillance settings. In this paper, we propose an IoT platform with an intelligent surveillance robot that uses machine learning to overcome the limitations of existing closed-circuit television (CCTV), which is installed in fixed positions. The IoT platform with a surveillance robot provides smart monitoring in the role of an active CCTV. The intelligent surveillance robot, built with its own IoT server, can carry out line tracing and acquire contextual information through its sensors to detect abnormal status in an environment. In addition, photos taken by its camera can be compared with stored images of the normal state. If an abnormal status is detected, the manager receives an alarm via a smartphone. For user convenience, the client is provided with an app to control the robot remotely. For image context processing, it is useful to apply convolutional neural network (CNN)-based machine learning (ML), which enables precise detection and recognition of images or patterns and from which high recognition performance can be expected. We designed a CNN model to support the contextually aware services of the IoT platform and performed experiments on the learning accuracy of the designed CNN model using a dataset of images acquired from the robot. Experimental results showed that the learning accuracy exceeds 0.98, indicating enhanced learning in image context recognition. The contribution of this paper is not only the implementation of an IoT platform with an active CCTV robot but also the construction of a CNN model for image-and-context-aware learning that enhances the intelligence of the proposed IoT platform.
The proposed IoT platform, with an intelligent surveillance robot using machine learning, can be used to detect abnormal status in various industrial settings such as factories, smart farms, logistics warehouses, and public places.
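The CNN described above is built from convolution, nonlinearity, and pooling layers. A minimal NumPy sketch of those building blocks is shown below; the toy image, filter, and layer sizes are illustrative and do not reproduce the paper's actual model:

```python
import numpy as np

# Sketch of CNN building blocks: 2-D convolution, ReLU, and max-pooling.
# Sizes and filter values are toy examples, not the paper's architecture.

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a single-channel image."""
    h, w = kernel.shape
    H, W = image.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

def relu(x):
    """Elementwise nonlinearity: keep positive responses only."""
    return np.maximum(x, 0)

def max_pool(x, size=2):
    """Downsample by taking the max over non-overlapping size x size tiles."""
    H, W = x.shape
    return x[:H - H % size, :W - W % size] \
        .reshape(H // size, size, W // size, size).max(axis=(1, 3))

image = np.arange(16.0).reshape(4, 4)          # toy 4x4 "image"
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])   # toy diagonal-difference filter
feature_map = max_pool(relu(conv2d(image, kernel)))
```

Stacking several such layers, followed by fully connected layers and a softmax, yields the kind of image classifier the abstract evaluates.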
The text categorization module described here provides a front-end filtering function for the larger DR-LINK text retrieval system [Liddy and Myaeng 1993]. The module evaluates a large incoming stream of documents to determine which are sufficiently similar to a profile at the broad subject level to warrant more refined representation and matching. To accomplish this task, each substantive word in a text is first categorized using a feature set based on the semantic Subject Field Codes (SFCs) assigned to individual word senses in a machine-readable dictionary. When tested on 50 user profiles and 550 megabytes of documents, results indicate that the feature set underlying the text categorization module and the algorithm that establishes the boundary of categories of potentially relevant documents accomplish their tasks with a high level of performance. This means that the category of potentially relevant documents for most profiles would contain at least 80% of all documents later determined to be relevant to the profile. The number of documents in this set would be uniquely determined by the system's category-boundary predictor, and this set is likely to contain less than 5% of the incoming stream of documents.
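The SFC-based filtering step can be sketched as mapping words to subject codes, building a code-frequency vector, and comparing it against a profile vector with a similarity threshold. The tiny lexicon, code names, and boundary value below are illustrative stand-ins for the machine-readable dictionary and the system's learned category-boundary predictor:

```python
import math

# Sketch: subject-code feature vectors with a cosine-similarity boundary.
# Lexicon, codes, and threshold are hypothetical, not DR-LINK's actual data.

SFC_LEXICON = {
    "stock": "ECONOMICS", "market": "ECONOMICS",
    "court": "LAW", "verdict": "LAW",
}

def sfc_vector(words):
    """Count subject field codes for the substantive words of a text."""
    vec = {}
    for w in words:
        code = SFC_LEXICON.get(w)
        if code:
            vec[code] = vec.get(code, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

profile = sfc_vector("stock market market".split())
doc = sfc_vector("stock market verdict".split())
BOUNDARY = 0.5  # assumed fixed boundary; the real system predicts it per profile
is_candidate = cosine(doc, profile) >= BOUNDARY
```

Documents passing the boundary would then go on to the more refined representation and matching stages.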
The goal of our 18-month NSDL-funded project is to develop Natural Language Processing and Machine Learning technology that will accomplish automatic metadata generation for individual educational resources in digital collections. The metadata tags that the system will learn to assign automatically are the full complement of Gateway to Educational Materials (GEM) metadata tags, from the nationally recognized consortium of organizations concerned with access to educational resources. The documents that comprise the sample for this research come from the Eisenhower National Clearinghouse on Science and Mathematics. The significance of this project for the Digital Library movement is that high-quality automatic assignment of metadata is essential if we are to break the human metadata-generation bottleneck that has plagued the common goal of providing timely access to textual resources as soon as they are ready for uploading into digital libraries. While new and specialized digital libraries are instituted and those in existence continue to expand, one very serious obstacle to quicker growth is the need for all items in a digital library to be manually meta-tagged to enable their access by digital library users. Therefore, to resolve this bottleneck, this project has set out to develop NLP-based technology that will increase the amount of material available in digital libraries, as well as provide improved access to digital libraries for teachers, parents, and students. The processes we are experimenting with to accomplish automatic assignment of metadata tags consist of both symbolic rule writing for the Natural Language Processing (NLP) approach and exemplar-based training for the Machine Learning (ML) approach. To date, in pursuit of this goal, we have acquired an appropriate collection of lesson plans and instructional strategy reports from the Eisenhower Clearinghouse and the GEM Collection.
Based on the cumulative analysis of these documents, our analysts have produced schemas of rules for recognizing instructional models as revealed in lesson plans and treatises on teaching, as well as predictable frames of the essential elements for metadata tags that occur in this genre of documents. Since our analysts have both information-access and teaching expertise, they have jointly made excellent progress in developing models, symbolic extraction rules, and linguistic feature-based training rules for the domain of math and science education. The resulting rules, which have been incorporated as a specialized rule set in our more generalized feature-recognition and tagging modules, guide the NLP component, while a set of manually tagged training examples guides the ML component, both for the purpose of accomplishing high-quality, consistent, automatic assignment of metadata tags to educational resources.
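The symbolic rule-writing approach described above can be sketched as pattern rules that extract metadata values from lesson-plan text. The tag names and patterns below are hypothetical illustrations, not the project's actual GEM rule set:

```python
import re

# Sketch: symbolic extraction rules mapping surface patterns in a lesson
# plan to metadata tags. Tag names and regexes are hypothetical examples.

RULES = [
    ("GradeLevel", re.compile(r"\bgrades?\s+\d+(?:\s*-\s*\d+)?", re.I)),
    ("Duration", re.compile(r"\b\d+\s*(?:minute|min)s?\b", re.I)),
]

def assign_tags(text):
    """Apply each rule and record the first matching span per tag."""
    tags = {}
    for tag, pattern in RULES:
        m = pattern.search(text)
        if m:
            tags[tag] = m.group(0)
    return tags

lesson = "This 45 minute lesson on fractions is for grades 3-5."
tags = assign_tags(lesson)
```

In the full system such rules would run alongside an exemplar-trained ML component, with the two components' assignments reconciled into the final metadata record.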