Background: With digital transformation and growing social media usage, kids spend considerable time on the web, especially watching videos on YouTube. YouTube is a source of education and entertainment media that has a significant impact on the skill improvement, knowledge, and attitudes of children. Simultaneously, harmful and inappropriate video content has a negative impact. Recently, researchers have given much attention to these issues, which are considered important for individuals and society. The proposed methods and approaches are to limit or prevent such threats that may negatively influence kids. These can be categorized into five main directions. They are video rating, parental control applications, analysis meta-data of videos, video or audio content, and analysis of user accounts. Objective: The purpose of this study is to conduct a systematic review of the existing methods, techniques, tools, and approaches that are used to protect kids and prevent them from accessing inappropriate content on YouTube videos. Methods: This study conducts a systematic review of research papers that were published between January 2016 and December 2022 in international journals and international conferences, especially in IEEE Xplore Digital Library, ACM Digital Library, Web of Science, Google Scholar, Springer database, and ScienceDirect database. Results: The total number of collected articles was 435. The selection and filtration process reduced this to 72 research articles that were appropriate and related to the objective. In addition, the outcome answers three main identified research questions. Significance: This can be beneficial to data mining, cybersecurity researchers, and peoples’ concerns about children’s cybersecurity and safety.
<p><span>With today’s digital revolution, many people communicate and collaborate in cyberspace. Users rely on social media platforms, such as Facebook, YouTube and Twitter, all of which exert a considerable impact on human lives. In particular, watching videos has become more preferable than simply browsing the internet because of many reasons. However, difficulties arise when searching for specific videos accurately in the same domains, such as entertainment, politics, education, video and TV shows. This problem can be solved through web video categorization (WVC) approaches that utilize video textual information, visual features, or audio approaches. However, retrieving or obtaining videos with similar content with high accuracy is challenging. Therefore, this paper proposes a novel mode for enhancing WVC that is based on user comments and weighted features from video descriptions. Specifically, this model uses supervised learning, along with machine learning classifiers (MLCs) and deep learning (DL) models. Two experiments are conducted on the proposed balanced dataset on the basis of the two proposed algorithms based on multi-classes, namely, education, politics, health and sports. The model achieves high accuracy rates of 97% and 99% by using MLCs and DL models that are based on artificial neural network (ANN) and long short-term memory (LSTM), respectively.</span></p>
Part of Speech (POS) tagging is one of the most common techniques used in natural language processing (NLP) applications and corpus linguistics. Various POS tagging tools have been developed for Arabic. These taggers differ in several aspects, such as in their modeling techniques, tag sets and training and testing data. In this paper we conduct a comparative study of five Arabic POS taggers, namely: Stanford Arabic, CAMeL Tools, Farasa, MADAMIRA and Arabic Linguistic Pipeline (ALP) which examine their performance using text samples from Saudi novels. The testing data has been extracted from different novels that represent different types of narrations. The main result we have obtained indicates that the ALP tagger performs better than others in this particular case, and that Adjective is the most frequent mistagged POS type as compared to Noun and Verb.
Arabic has recently received significant attention from corpus compilers. This situation has led to the creation of many Arabic corpora that cover various genres, most notably the newswire genre. Yet, Arabic novels, and specifically those authored by Saudi writers, lack the sufficient digital datasets that would enhance corpus linguistic and stylistic studies of these works. Thus, Arabic lags behind English and other European languages in this context. In this paper, we present the Saudi Novels Corpus, built to be a valuable resource for linguistic and stylistic research communities. We specifically present the procedures we followed and the decisions we made in creating the corpus. We describe and clarify the design criteria, data collection methods, process of annotation, and encoding. In addition, we present preliminary results that emerged from the analysis of the corpus content. We consider the work described in this paper as initial steps to bridge the existing gap between corpus linguistics and Arabic literary texts. Further work is planned to improve the quality of the corpus by adding advanced features.
The primary purpose of sequence modeling is to record long-term interdependence across interaction sequences, and since the number of items purchased by users gradually increases over time, this brings challenges to sequence modeling to a certain extent. Relationships between terms are often overlooked, and it is crucial to build sequential models that effectively capture long-term dependencies. Existing methods focus on extracting global sequential information, while ignoring deep representations from subsequences. We argue that limited item transfer is fundamental to sequence modeling, and that partial substructures of sequences can help models learn more efficient long-term dependencies compared to entire sequences. This paper proposes a sequence recommendation model named GAT4Rec (Gated Recurrent Unit And Transformer For Recommendation), which uses a Transformer layer that shares parameters across layers to model the user's historical interaction sequence. The representation learned by the gated recurrent unit is used as a gating signal to filter out better substructures of the user sequence. The experimental results demonstrate that our proposed GAT4Rec model is superior to other models and has a higher recommendation effectiveness.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.