On the Use of Deep Learning for Video Classification

Rehman, Atiq Ur; Belhaouari, Samir Brahim; Kabir, Alamgir; Khan, Adnan

doi:10.3390/app13032007

Cited by 20 publications

(5 citation statements)

References 118 publications


“…Video recognition [29][30][31] is an important direction in computer vision and video processing research. To date, many effective video recognition methods have been developed, which can be grouped into two categories: temporal-spatial-and spatial-based video recognition methods.…”

Section: Discussion (mentioning)

confidence: 99%

Manifolds-Based Low-Rank Dictionary Pair Learning for Efficient Set-Based Video Recognition

et al. 2023


As an important research direction in image and video processing, set-based video recognition requires speed and accuracy. However, the existing static modeling methods focus on computational speed but ignore accuracy, whereas the dynamic modeling methods are higher-accuracy but ignore the computational speed. Combining these two types of methods to obtain fast and accurate recognition results remains a challenging problem. Motivated by this, in this study, a novel Manifolds-based Low-Rank Dictionary Pair Learning (MbLRDPL) method was developed for a set-based video recognition/image set classification task. Specifically, each video or image set was first modeled as a covariance matrix or linear subspace, which can be seen as a point on a Riemannian manifold. Second, the proposed MbLRDPL learned discriminative class-specific synthesis and analysis dictionaries by clearly imposing the nuclear norm on the synthesis dictionaries. The experimental results show that our method achieved the best classification accuracy (100%, 72.16%, 95%) on three datasets with the fastest computing time, reducing the errors of state-of-the-art methods (JMLC, DML, CEBSR) by 0.96–75.69%.
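The first modeling step in the abstract above (representing each image set as a covariance matrix) can be illustrated with a minimal sketch. This is not the MbLRDPL implementation: the function name `set_covariance` and the toy two-dimensional frame features are assumptions made for illustration, and the dictionary-learning stage is not shown.

```python
def set_covariance(frames):
    """Sample covariance of a set of per-frame feature vectors.

    `frames` is a list of equal-length feature vectors, one per frame
    (hypothetical input format). The result is a d x d symmetric,
    positive semi-definite matrix describing the whole set.
    """
    n = len(frames)
    d = len(frames[0])
    # Mean feature vector over all frames in the set.
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    # Center every frame around that mean.
    centered = [[f[j] - mean[j] for j in range(d)] for f in frames]
    # Unbiased estimate: divide by n - 1.
    return [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]


# Toy image set: three frames, each with a two-dimensional feature.
toy_set = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
cov = set_covariance(toy_set)
```

The matrix size depends only on the feature dimension, not on the number of frames, so sets of different lengths become directly comparable.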



“…Yuyan Meng et al [35] proposed a transfer learning and attention mechanism in the ResNet model to classify and identify violent images, achieving an improved network model with an average accuracy rate of 92.20% for quick and accurate identification of violent images, thus reducing manual identification costs and supporting decision-making against rebel organization activities. Atiq ur Rehman et al [36] present a comprehensive survey paper examining the success of DL models in automated video classification. In that paper, they discuss the challenges existing in the field, highlight benchmark-based evaluations, and provide summaries of benchmark datasets and performance evaluation metrics.…”

Section: Related Work (mentioning)

confidence: 99%

Supervised Video Cloth Simulation: Exploring Softness and Stiffness Variations on Fabric Types Using Deep Learning

Mao, Va, Lee et al. 2023

Applied Sciences


Physically based cloth simulation requires a model that represents cloth as a collection of nodes connected by different types of constraints. In this paper, we present a coefficient prediction framework using a Deep Learning (DL) technique to enhance video summarization for such simulations. Our proposed model represents virtual cloth as interconnected nodes that are subject to various constraints. To ensure temporal consistency, we train the video coefficient prediction using Gated Recurrent Unit (GRU), Long-Short Term Memory (LSTM), and Transformer models. Our lightweight video coefficient network combines Convolutional Neural Networks (CNN) and a Transformer to capture both local and global contexts, thus enabling highly efficient prediction of keyframe importance scores for short-length videos. We evaluated our proposed model and found that it achieved an average accuracy of 99.01%. Specifically, the accuracy for the coefficient prediction of GRU was 20%, while LSTM achieved an accuracy of 59%. Our methodology leverages various cloth simulations that utilize a mass-spring model to generate datasets representing cloth movement, thus allowing for the accurate prediction of the coefficients for virtual cloth within physically based simulations. By taking specific material parameters as input, our model successfully outputs a comprehensive set of geometric and physical properties for each cloth instance. This innovative approach seamlessly integrates DL techniques with physically based simulations, and it therefore has a high potential for use in modeling complex systems.


“…In summary, LC sampling queries unlabeled instances where the classifier is least confident in its top predicted label, to select informative samples that can maximize model improvement. [17][18][19][20] Semi-Supervised active learning. Semi-supervised active learning (SSAL) combines active learning with semisupervised learning (SSL).…”

Section: Active Learning Query Strategies (mentioning)

confidence: 99%

An innovative deep active learning approach for improving unlabeled audio classification by selectively querying informative instance

Salama

2023

International Journal of Engineering Business Management


Audio classification tasks like speech recognition and acoustic scene analysis require substantial labeled data, which is expensive. This work explores active learning to reduce annotation costs for a sound classification problem with rare target classes where existing datasets are insufficient. A deep convolutional recurrent neural network extracts spectro-temporal features and makes predictions. An uncertainty sampling strategy queries the most uncertain samples for manual labeling by experts and non-experts. A new alternating confidence sampling strategy and two other certainty-based strategies are proposed and evaluated. Experiments show significantly higher accuracy than passive learning baselines with the same labeling budget. Active learning generalizes well in a qualitative analysis of 20,000 unlabeled recordings. Overall, active learning with a novel sampling strategy minimizes the need for expensive labeled data in audio classification, successfully leveraging unlabeled data to improve accuracy with minimal supervision.
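The least-confidence (LC) uncertainty sampling described in the citation statement for this entry can be sketched as follows. This is a generic illustration rather than the paper's alternating confidence strategy: the function name `least_confidence_query` and the toy probability pool are assumptions.

```python
def least_confidence_query(probabilities, k):
    """Return indices of the k unlabeled samples to send for labeling.

    `probabilities` holds one predicted class distribution per unlabeled
    sample (hypothetical input format). The LC score is 1 minus the
    probability of the top predicted label, so a higher score means the
    classifier is less confident.
    """
    scores = [1.0 - max(p) for p in probabilities]
    # Rank samples from most to least uncertain.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy pool of four unlabeled samples over three classes.
pool = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],
    [0.55, 0.30, 0.15],
    [0.34, 0.33, 0.33],  # nearly uniform, least confident
]
queried = least_confidence_query(pool, 2)
```

With this pool, the two queried samples are the ones at indices 3 and 1, whose top-label probabilities (0.34 and 0.40) are the lowest.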

