The effortless editing, exchange and replication of multimedia data on the Internet is growing exponentially and has created copyright protection uncertainties for content providers. To discourage illegal duplication and attain the required level of protection for digital data, digital watermarking has been found to be a feasible solution. This paper proposes a video watermarking technique that combines the Discrete Wavelet Transform (DWT) and Singular Value Decomposition (SVD) with the Artificial Bee Colony (ABC) optimization algorithm. DWT is applied to every luminance frame of the video 'V', with each frame divided into 8x8 blocks, producing distinct frequency sub-bands. Of these, the LL band is selected for watermark insertion. SVD is then applied to the selected DWT blocks of the LL bands of all frames. The starting indices of the best blocks are obtained adaptively through the ABC algorithm rather than manually. At the receiver, the watermark is retrieved by a scheme similar to the one used during embedding. The proposed optimized DWT-SVD video watermarking method has been evaluated under video processing attacks. Simulation results show that, owing to the cascade of the two transforms DWT and SVD together with the ABC algorithm, the method withstands all the attacks and correctly extracts the concealed watermark without significant degradation in the quality of the watermarked video. When the Peak Signal-to-Noise Ratio (PSNR) and Normalized Correlation (NC) of the proposed algorithm are compared with those of related techniques, the PSNR of the proposed method is above 53 dB for all sets of videos, and its robustness in terms of NC is superior to that of existing schemes on similar sets of videos.
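The two quality measures named in this abstract, PSNR and NC, can be illustrated with a minimal sketch. It assumes 8-bit frames (peak value 255) and one common definition of normalized correlation; the abstract does not give the exact formulas the authors used, and the function names `psnr` and `normalized_correlation` are hypothetical.

```python
import math


def psnr(original, watermarked, peak=255.0):
    """PSNR in dB between two equally sized 2-D frames (lists of rows).

    Assumption: 8-bit luminance values, so the peak signal value is 255.
    """
    squared_error = 0.0
    count = 0
    for row_o, row_w in zip(original, watermarked):
        for a, b in zip(row_o, row_w):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def normalized_correlation(watermark, extracted):
    """NC between the original and the extracted watermark (flat sequences).

    Assumption: NC = sum(w * w') / (||w|| * ||w'||); other definitions exist.
    """
    numerator = sum(a * b for a, b in zip(watermark, extracted))
    norm_w = math.sqrt(sum(a * a for a in watermark))
    norm_e = math.sqrt(sum(b * b for b in extracted))
    denominator = norm_w * norm_e
    return numerator / denominator if denominator else 0.0
```

A higher PSNR means the watermarked frame is closer to the original, and an NC near 1 means the extracted watermark closely matches the embedded one.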
Rapid growth in the transfer of multimedia information over the Internet requires algorithms that can retrieve a queried image from large image database repositories. The proposed content-based image retrieval (CBIR) method uses Gaussian-Hermite moments as low-level features. These features are then compressed with principal component analysis. The compressed feature set is multiplied by a weight array of the same size as the feature vector. Hybrid firefly and grey wolf optimization (FAGWO) is used to prevent the premature convergence of the firefly algorithm. Image retrieval is carried out in an OpenCV Python environment with K-nearest neighbours and random forest classifiers. The fitness function for FAGWO is the accuracy of the classifier. The FAGWO algorithm derives the optimum weights from a randomly generated initial population. When these optimized weights are applied, the proposed algorithm shows better precision/recall and efficiency than other techniques such as exact Legendre moments, region-based image retrieval, K-means clustering and the colour descriptor wavelet-based texture descriptor retrieval technique. In terms of optimization, hybrid FAGWO outperformed several optimization techniques used alone, namely Particle Swarm Optimization, Genetic Algorithm, Grey Wolf Optimization and the Firefly algorithm. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
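The weighting step and the accuracy-based fitness function described in this abstract can be sketched as follows. This is an illustration only: it assumes the weights are applied element-wise, uses a plain K-nearest-neighbours vote, and does not implement FAGWO itself. The names `apply_weights`, `knn_predict` and `fitness` are hypothetical.

```python
import math
from collections import Counter


def apply_weights(features, weights):
    """Scale each feature by its weight (assumed element-wise product)."""
    return [f * w for f, w in zip(features, weights)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    neighbours = sorted(
        ((math.dist(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def fitness(weights, train_x, train_y, test_x, test_y, k=3):
    """Classifier accuracy for one candidate weight vector.

    An optimizer (FAGWO in the abstract) would call this for every candidate
    in its population and keep the weights with the highest accuracy.
    """
    weighted_train = [apply_weights(x, weights) for x in train_x]
    correct = sum(
        knn_predict(weighted_train, train_y, apply_weights(x, weights), k) == y
        for x, y in zip(test_x, test_y)
    )
    return correct / len(test_x)
```

In this sketch a weight of zero removes a feature from the distance computation, so a good weight vector suppresses uninformative features and raises the accuracy returned by `fitness`.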
Manual tumor diagnosis from magnetic resonance images (MRIs) is a time-consuming procedure that is prone to human error and may result in false detection and misclassification of the tumor type. Therefore, to automate this complex medical process, a deep learning framework is proposed for brain tumor classification to ease the task of doctors in medical diagnosis. Publicly available datasets, Kaggle and BraTS, are used for the analysis of brain images. The proposed model is implemented on three pre-trained Deep Convolutional Neural Network (DCNN) architectures: AlexNet, VGG16, and ResNet50. These architectures are used in a transfer learning setting to extract features from the pre-trained DCNN, and the extracted features are classified using a Support Vector Machine (SVM) classifier. Data augmentation methods are applied to the MRIs to prevent the network from overfitting. The proposed methodology achieves an overall accuracy of 98.28% and 97.87% without data augmentation and 99.0% and 98.86% with data augmentation for the Kaggle and BraTS datasets, respectively. The Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) is 0.9978 and 0.9850 for the same datasets. The results show that ResNet50 performs best in the classification of brain tumors when compared with the other two networks.
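The AUC figure reported in this abstract can be illustrated with a minimal sketch of how the area under the ROC curve is computed for a binary classifier. It uses the pairwise-ranking formulation, which equals the area under the ROC curve, and assumes labels coded as 0 and 1. The function name `roc_auc` is hypothetical, and the abstract does not say how the authors computed their values.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (0 or 1).

    Computed as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUC of 1.0 means every tumor sample is scored above every non-tumor sample, while 0.5 corresponds to random ranking.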
Purpose: Road accidents, which are inadvertent mishaps, can be detected automatically and alerts sent instantly by combining image processing techniques with on-road video surveillance systems. However, relying exclusively on visual information leads to uncertainty, especially under adverse conditions such as night time, dark areas and unfavourable weather (snowfall, rain and fog) that result in poor visibility. The main goal of the proposed work is certainty about accident occurrence. Design/methodology/approach: The authors propose a method for detecting road accidents by analyzing audio signals to identify hazardous situations such as tire skidding and car crashes. The aim of this project is to build a simple and complete audio event detection system that uses signal feature extraction methods to improve detection accuracy. The experimental analysis is carried out on a publicly available real-time data set consisting of audio samples such as car crashes and tire skidding. The Temporal features of the recorded audio signal (Energy, Volume and Zero Crossing Rate (ZCR)), the Spectral features (Spectral Centroid, Spectral Spread, Spectral Roll-off factor and Spectral Flux), and the Psychoacoustic features (Energy Sub-Bands ratio and Gammatonegram) are computed. The extracted features are pre-processed, then used to train and test Support Vector Machine (SVM) and K-nearest neighbors (KNN) classifiers that predict accident occurrence over various SNR ranges. The combination of the Gammatonegram with the Temporal and Spectral features of the audio signal proves superior to existing detection techniques. Findings: Temporal, Spectral and Psychoacoustic features and the Gammatonegram of the recorded audio signal are extracted. A high-level vector is generated based on the centroid, and the extracted features are classified with machine learning algorithms such as SVM, KNN and DT. The audio samples collected have varied SNR ranges, and the accuracy of the classification algorithms is thoroughly tested. Practical implications: Denoising the audio samples for accurate feature extraction was a tedious task. Originality/value: The existing literature describes the extraction of Temporal and Spectral features followed by the application of classification algorithms. To improve classification, the authors construct a high-level vector from all four extracted feature groups: Temporal, Spectral, Psychoacoustic and Gammatonegram. The classification algorithms are applied to samples collected at varied SNR ranges.
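Three of the features named in this abstract (Energy, Zero Crossing Rate and Spectral Centroid) can be sketched on a single audio frame as follows. This is a minimal illustration using a naive discrete Fourier transform; the frame length, windowing and exact feature definitions used by the authors are not given in the abstract, so the formulas and function names here are assumptions.

```python
import cmath
import math


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz over the bins 0..N/2.

    Uses a naive O(N^2) DFT to stay dependency-free; a real system would
    use an FFT routine instead.
    """
    n = len(frame)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    for k in range(n // 2 + 1):
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        magnitude = abs(coefficient)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum else 0.0
```

Computing such values per frame and concatenating them is one way to form the kind of feature vector that a classifier such as SVM or KNN can be trained on.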
Monocular depth estimation is an active research topic in autonomous driving. In the proposed work, deep convolutional neural networks (DCNNs) comprising an encoder and a decoder are combined with transfer learning to estimate depth maps from two-dimensional images. CNN features extracted in the initial stages are later upsampled using a sequence of bilinear upsampling and convolution layers to reconstruct the depth map. The encoder forms the feature extraction part, and the decoder forms the image reconstruction part. EfficientNet-B0, a recent architecture, is used with pretrained weights as the encoder. It has fewer model parameters yet achieves higher efficiency than state-of-the-art pretrained networks. EfficientNet-B0 is compared with two other pretrained networks, DenseNet-121 and ResNet-50. Each of these three models is used in the encoding stage for feature extraction, followed by bilinear upsampling in the decoder. Monocular depth estimation is an ill-posed problem and is treated here as a regression problem. The metrics used in the proposed work include the F1-score, the Jaccard score and the Mean Absolute Error (MAE) between the original and the reconstructed image. The results show that EfficientNet-B0 outperforms the DenseNet-121 and ResNet-50 models in validation loss, F1-score and Jaccard score.
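The bilinear upsampling used in the decoder and the MAE metric mentioned in this abstract can be illustrated with a small dependency-free sketch. It assumes a corner-aligned sampling grid, which is only one of several conventions and may differ from the deep learning layer the authors used; the function names are hypothetical.

```python
def bilinear_upsample(image, out_h, out_w):
    """Resize a 2-D map (list of rows) with bilinear interpolation.

    Assumption: corner-aligned grid, so the four corner values are kept.
    """
    in_h, in_w = len(image), len(image[0])
    result = []
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        fy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            fx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        result.append(row)
    return result


def mean_absolute_error(predicted, target):
    """Average absolute per-pixel difference between two depth maps."""
    total = 0.0
    count = 0
    for row_p, row_t in zip(predicted, target):
        for p, t in zip(row_p, row_t):
            total += abs(p - t)
            count += 1
    return total / count
```

In an encoder-decoder network the upsampling step is followed by learned convolutions; the sketch covers only the fixed interpolation and the error measure.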