The global community has recognized an increasing amount of pollutants entering oceans and other water bodies as a severe environmental, economic, and social issue. In addition to prevention, one of the key measures in addressing marine pollution is the cleanup of debris already present in marine environments. Deployment of machine learning (ML) and deep learning (DL) techniques can automate marine waste removal, making the cleanup process more efficient. This study examines the performance of six well-known deep convolutional neural networks (CNNs), namely VGG19, InceptionV3, ResNet50, Inception-ResNetV2, DenseNet121, and MobileNetV2, utilized as feature extractors according to three different extraction schemes for the identification and classification of underwater marine debris. We compare the performance of a neural network (NN) classifier trained on top of deep CNN feature extractors when the feature extractor is (1) fixed; (2) fine-tuned on the given task; (3) fixed during the first phase of training and fine-tuned afterward. In general, fine-tuning resulted in better-performing models but is much more computationally expensive. The overall best NN performance showed the fine-tuned Inception-ResNetV2 feature extractor with an accuracy of 91.40% and F1-score 92.08%, followed by fine-tuned InceptionV3 extractor. Furthermore, we analyze conventional ML classifiers’ performance when trained on features extracted with deep CNNs. Finally, we show that replacing NN with a conventional ML classifier, such as support vector machine (SVM) or logistic regression (LR), can further enhance the classification performance on new data.
The main goal of any classification or regression task is to obtain a model that will generalize well on new, previously unseen data. Due to the recent rise of deep learning and many state-of-the-art results obtained with deep models, deep learning architectures have become one of the most used model architectures nowadays. To generalize well, a deep model needs to learn the training data well without overfitting. The latter implies a correlation of deep model optimization and regularization with generalization performance. In this work, we explore the effect of the used optimization algorithm and regularization techniques on the final generalization performance of the model with convolutional neural network (CNN) architecture widely used in the field of computer vision. We give a detailed overview of optimization and regularization techniques with a comparative analysis of their performance with three CNNs on the CIFAR-10 and Fashion-MNIST image datasets.
In recent years Generative Adversarial Networks (GANs) have achieved remarkable results in the task of realistic image synthesis. Despite their continued success and advances, there still lacks a thorough understanding of how precisely GANs map random latent vectors to realistic-looking images and how the priors set on the latent space affect the learned mapping. In this work, we analyze the effect of the chosen latent dimension on the final quality of synthesized images of human faces and learned data representations. We show that GANs can generate images plausibly even with latent dimensions significantly smaller than the standard dimensions like 100 or 512. Although one might expect that larger latent dimensions encourage the generation of more diverse and enhanced quality images, we show that an increase of latent dimension after some point does not lead to visible improvements in perceptual image quality nor in quantitative estimates of its generalization abilities.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.