The growing abundance of text articles on the internet calls for automated tagging with key phrases. Automatically generated key phrases make resources easier to retrieve. To generate key phrases for texts from any domain, an automated approach is needed that extracts the key ideas directly from the text itself. In this paper, we propose a methodology that uses nouns and noun phrases, together with their occurrences and co-occurrences, to generate keywords. The method, which employs both statistical and linguistic features, succeeded in extracting keywords and phrases that tag a text in a way that best summarizes its content.
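As a rough illustration of scoring nouns by occurrence and co-occurrence, the sketch below ranks nouns by their frequency plus the number of times they share a sentence with another noun. It is not the authors' method: the scoring formula, the function names `score_nouns` and `top_keywords`, and the assumption that the input is already part-of-speech tagged with Penn Treebank noun tags are all illustrative choices.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Assumed noun tags (Penn Treebank style); the abstract does not name a tag set.
NOUN_TAGS = ("NN", "NNS", "NNP", "NNPS")


def score_nouns(tagged_sentences, noun_tags=NOUN_TAGS):
    """Score each noun as: occurrence count + sentence-level co-occurrence count.

    tagged_sentences is a list of sentences, each a list of (token, tag) pairs.
    """
    freq = Counter()
    cooc = defaultdict(Counter)
    for sentence in tagged_sentences:
        nouns = [tok.lower() for tok, tag in sentence if tag in noun_tags]
        freq.update(nouns)
        # Count every unordered pair of distinct nouns once per sentence.
        for a, b in combinations(sorted(set(nouns)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return {noun: count + sum(cooc[noun].values()) for noun, count in freq.items()}


def top_keywords(tagged_sentences, k=3):
    """Return the k highest-scoring nouns, breaking ties alphabetically."""
    scores = score_nouns(tagged_sentences)
    return sorted(scores, key=lambda noun: (-scores[noun], noun))[:k]


sentences = [
    [("Keyword", "NN"), ("extraction", "NN"), ("helps", "VBZ"), ("retrieval", "NN")],
    [("Keyword", "NN"), ("extraction", "NN"), ("uses", "VBZ"), ("nouns", "NNS")],
]
print(top_keywords(sentences, k=2))
```

In practice the tagged input would come from a part-of-speech tagger, and multi-word noun phrases could be scored in the same way by treating each phrase as a single token.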
Image processing is a rapidly evolving field of great significance in science and engineering. One of its latest applications is Intelligent Character Recognition (ICR), the computer translation of handwritten text into machine-readable and machine-editable characters. ICR is an advanced form of Optical Character Recognition (OCR) that allows different fonts and styles of handwriting to be recognized with high accuracy and speed. ICR, in combination with OCR and OMR (Optical Mark Recognition), is used in forms processing, the process of capturing the information entered into the data fields of a filled-in form and converting it to editable text. Forms processing systems range from those handling small application forms to those handling large-scale, multi-page survey forms. The Recognition Engine, designed using image processing and convolutional networks, helps save time, labor and money while also increasing accuracy.
Searching for articles of interest on publication sites can be difficult and time-consuming. Sometimes it takes so much effort to find the most relevant article that the reader loses interest completely. Recommender systems help users find articles of interest through personalized suggestions. In this paper, a Hybrid Recommender System is implemented that is a novel combination of content-based filtering, collaborative filtering, a trending-article algorithm and user persona, and that recommends articles by considering all of these factors. The user's short-term interest is served by suggesting trending articles, while long-term interest is served by observing what kind of content the user prefers to read, finding similar users, and recommending what they are reading. The model makes its recommendations based on the tags assigned to each article and on knowledge of the articles read by each user. The model does not require users to rate articles, since users generally do not rate an article after reading it, unlike movies, which they often rate after watching. The model takes into account many aspects, including currently emerging trends as well as the user's interests, the time period, geographical location, browsing history, etc., and makes recommendations accordingly.
Copyright, IJAR, 2017. All rights reserved.
Introduction:- Articles are published in huge numbers every day across different categories. Information portal sites include articles in categories such as stock markets, finance, banking, insurance, entertainment, social feeds, etc. Web sites are deploying recommendation systems to suggest articles to users according to their taste. A key characteristic of news reading is that a user has a long-term interest in certain categories and a short-term interest in recent happenings.
The user's short-term interest in a recent event can be addressed by recommending the articles trending at that time, based on the view counts of articles within a particular timeframe. For the user's long-term interest, recommendations can be made on the basis of user behavior and preferences. For this purpose, content-based filtering and collaborative filtering techniques are used to generate the recommendations. Content-based filtering finds articles that are similar on the basis of the tags assigned to each article. Each article is assigned weights based on the term frequency and inverse document frequency of each tag, after which the probability of the user reading an article is calculated. Collaborative filtering, on the other hand, uses the correlation between articles based on the ratings given to them by different users. The disadvantage of content-based filtering is that it leads to over-specialization: the recommended article is similar to articles already read and may not be useful to the user. This method also does not use the interaction information between users to generate recommendations. Collaborative f...
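The trending and content-based steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the helper names (`trending`, `tfidf_vectors`, `cosine`, `recommend`), the use of cosine similarity against a summed user profile in place of the paper's reading-probability calculation, and the sample data are all hypothetical.

```python
import math
from collections import Counter


def trending(view_events, now, window, k=1):
    """Rank articles by view count within [now - window, now].

    view_events is an iterable of (article_id, timestamp) pairs.
    """
    counts = Counter(a for a, t in view_events if now - window <= t <= now)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [article_id for article_id, _ in ranked[:k]]


def tfidf_vectors(article_tags):
    """Weight each article's tags by term frequency * inverse document frequency."""
    n_articles = len(article_tags)
    doc_freq = Counter()
    for tags in article_tags.values():
        doc_freq.update(set(tags))
    vectors = {}
    for article_id, tags in article_tags.items():
        counts = Counter(tags)
        total = sum(counts.values())
        vectors[article_id] = {
            tag: (count / total) * math.log(n_articles / doc_freq[tag])
            for tag, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse tag-weight dictionaries."""
    dot = sum(w * v.get(tag, 0.0) for tag, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(read_ids, article_tags, k=2):
    """Suggest unread articles closest to the summed profile of read articles."""
    vectors = tfidf_vectors(article_tags)
    profile = {}
    for article_id in read_ids:
        for tag, weight in vectors[article_id].items():
            profile[tag] = profile.get(tag, 0.0) + weight
    scored = [
        (cosine(profile, vec), article_id)
        for article_id, vec in vectors.items()
        if article_id not in read_ids
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [article_id for score, article_id in scored[:k] if score > 0]


articles = {
    "a1": ["stocks", "finance"],
    "a2": ["stocks", "banking"],
    "a3": ["movies", "music"],
    "a4": ["finance", "insurance"],
}
views = [("a1", 10), ("a2", 95), ("a2", 98), ("a3", 99)]
print(trending(views, now=100, window=10))
print(recommend({"a1"}, articles))
```

The example also shows the over-specialization noted above: a reader of `a1` is only ever offered articles that share its tags, so `a3` never surfaces unless it is trending.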
A large number of insights can be drawn from articles published online. Rather than manually reading every article and assigning tags that fit its content, it is far more efficient to automate the task. In this paper, an unsupervised approach to the automated tagging of Chinese-language articles is implemented. The input is an article and the output is the set of tags for that article. The major challenge is the segmentation of Chinese text, which, unlike English, does not use separators between words. To overcome this, different approaches are combined to obtain accurate results. Efficient tagging of articles is needed for many analysis applications, one of which is a recommendation engine. The tagging process should consider all aspects of the article and assign the most relevant tags accordingly. The proposed algorithm was implemented for a Chinese publishing house, and relevant tags were assigned to its articles across different categories. At the end of the project, the results were manually checked on a corpus of 10,000 Chinese articles, showing an overall accuracy of around 85%, higher than that obtained with traditional methods.
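The abstract does not say which segmentation approaches were combined. As one textbook illustration of combining approaches, the sketch below runs dictionary-based forward and backward maximum matching and keeps the result with fewer tokens, then fewer single-character tokens. The vocabulary, the `max_len` limit, and the tie-breaking rule are assumptions made for the example only.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Greedy left-to-right segmentation, taking the longest known word first."""
    tokens = []
    start = 0
    while start < len(text):
        for size in range(min(max_len, len(text) - start), 0, -1):
            candidate = text[start:start + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                start += size
                break
    return tokens


def backward_max_match(text, vocabulary, max_len=4):
    """Greedy right-to-left segmentation, taking the longest known word first."""
    tokens = []
    end = len(text)
    while end > 0:
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                end -= size
                break
    tokens.reverse()
    return tokens


def segment(text, vocabulary, max_len=4):
    """Pick the segmentation with fewer tokens, then fewer single characters."""
    def cost(tokens):
        return (len(tokens), sum(1 for t in tokens if len(t) == 1))

    forward = forward_max_match(text, vocabulary, max_len)
    backward = backward_max_match(text, vocabulary, max_len)
    return forward if cost(forward) < cost(backward) else backward


vocabulary = {"研究", "研究生", "生命", "起源"}
print(forward_max_match("研究生命起源", vocabulary))
print(segment("研究生命起源", vocabulary))
```

Forward matching alone greedily takes 研究生 and strands 命 as a single character, while the combined rule prefers the backward result 研究 / 生命 / 起源, which shows why a single matching direction is often not enough.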