While convolutional neural networks (CNNs) have been successfully applied to skin lesion classification, previous studies have generally considered only a single clinical/macroscopic image and output a binary decision. In this work, we present a method that combines multiple imaging modalities with patient metadata to improve the performance of automated skin lesion diagnosis. We evaluated our method on a binary classification task, for comparison with previous studies, as well as on a five-class classification task representative of a real-world clinical scenario. We showed that our multimodal classifier outperforms a baseline classifier that uses only a single macroscopic image, both in binary melanoma detection (AUC 0.866 vs 0.784) and in multiclass classification (mAP 0.729 vs 0.598). In addition, we quantitatively showed that automated diagnosis of skin lesions from dermatoscopic images achieves higher performance than diagnosis from macroscopic images. Our experiments were performed on a new dataset of 2,917 cases, where each case contains a dermatoscopic image, a macroscopic image, and patient metadata.
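As a rough illustration of how such a multimodal classifier could be structured, the sketch below uses late fusion: two CNN branches (one per imaging modality) plus a small MLP for patient metadata, with concatenated features feeding a shared classification head. The backbones, feature dimensions, and metadata size are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical late-fusion multimodal classifier. All architectural
# choices (resnet18 backbones, 64-d metadata embedding, 5 classes)
# are assumptions for illustration only.
import torch
import torch.nn as nn
from torchvision import models

class MultimodalLesionClassifier(nn.Module):
    def __init__(self, num_metadata_features: int = 10, num_classes: int = 5):
        super().__init__()
        # Separate image encoders; the final FC layer of each is replaced
        # with Identity so the branch outputs a feature vector.
        self.derm_encoder = models.resnet18(weights=None)
        self.macro_encoder = models.resnet18(weights=None)
        feat_dim = self.derm_encoder.fc.in_features  # 512 for resnet18
        self.derm_encoder.fc = nn.Identity()
        self.macro_encoder.fc = nn.Identity()
        # Small MLP embeds the patient metadata (age, sex, site, ...).
        self.meta_encoder = nn.Sequential(
            nn.Linear(num_metadata_features, 64), nn.ReLU(), nn.Linear(64, 64)
        )
        # Fused representation -> class logits.
        self.head = nn.Linear(feat_dim * 2 + 64, num_classes)

    def forward(self, derm_img, macro_img, metadata):
        fused = torch.cat([
            self.derm_encoder(derm_img),
            self.macro_encoder(macro_img),
            self.meta_encoder(metadata),
        ], dim=1)
        return self.head(fused)
```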
Presenting visually similar images based on features from a neural network achieves accuracy comparable to the softmax probability-based diagnoses of convolutional neural networks. CBIR may therefore be more helpful than a softmax classifier in improving clinicians' diagnostic accuracy in a routine clinical setting.
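A minimal sketch of the feature-based retrieval step described above: embed images with a pretrained CNN and return the k nearest database images by cosine similarity. The choice of encoder and the in-memory database layout are assumptions for illustration, not the system evaluated in the study.

```python
# Hedged CBIR sketch: CNN embeddings + cosine-similarity top-k retrieval.
import torch
import torch.nn.functional as F
from torchvision import models

encoder = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()  # use penultimate features as the embedding
encoder.eval()

@torch.no_grad()
def embed(images: torch.Tensor) -> torch.Tensor:
    """images: (N, 3, 224, 224) normalized tensors -> (N, 512) unit vectors."""
    return F.normalize(encoder(images), dim=1)

@torch.no_grad()
def retrieve(query: torch.Tensor, database: torch.Tensor, k: int = 5):
    """Return indices of the k database images most similar to the query."""
    sims = embed(database) @ embed(query.unsqueeze(0)).squeeze(0)  # cosine
    return torch.topk(sims, k).indices
```

In practice the database embeddings would be precomputed once and indexed, so only the query image is embedded at lookup time.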
Figure 1: Overview of the proposed method: one-shot learning leveraging other categories. Given one example of an event of interest (Event 1), the method implicitly infers the relevance between it and the other events and places more emphasis on the most relevant ones during multi-task learning. The learned classifier for the event of interest is applied to retrieve instances from a video test set.

This paper proposes a new multi-task learning method with implicit inter-task relevance estimation and applies it to complex Internet video event detection, a problem that is challenging and important in practice yet has seldom been addressed. In this paper, "detection" means detecting the videos corresponding to the event of interest in a (large) video dataset, not localizing the event spatially or temporally within a video. In our problem definition, one positive sample and many negative samples of an event are given as training data, and the goal is to return the videos of the same event from a large video dataset. In addition, we assume samples of other events are available. Fig. 1 shows an overview of the proposed method. The widths of the lines between the one-exemplar event and the others represent the inter-event relevance, which is unknown a priori in our problem setting. However, the proposed method can implicitly infer this relevance and make greater use of the most relevant event(s) in multi-task learning, where information shared from the relevant events helps to build a better model from the single exemplar. The proposed method makes no assumption about the relevance between the other events, as indicated by the red line. Although the learning algorithm outputs models for all input events, only the model for the one-exemplar event is applied to detect videos of the event of interest from the video set.

Our method builds on graph-guided multi-task learning [1], which we describe first. The training set $\{(x_{ti}, y_{ti}) \in \mathbb{R}^D \times \{-1, +1\} : t = 1, 2, \ldots, T,\; i = 1, 2, \ldots, N_t\}$ is grouped into $T$ related tasks, which are further organized as a graph $G = \langle V, E \rangle$. The tasks correspond to the elements of the vertex set $V$, and the pairwise relevance between Tasks $t$ and $k$ is represented by the weight $r_{tk}$ on the edge $e_{tk} \in E$; the more relevant two tasks are, the larger the edge weight. The graph-guided multi-task learning algorithm learns the corresponding $T$ models jointly by solving the optimization problem

$$\min_{W,\, b}\; \sum_{t=1}^{T} \sum_{i=1}^{N_t} \mathrm{Loss}\!\left(w_t^{\top} x_{ti} + b_t,\; y_{ti}\right) + \lambda\, \Omega(W),$$

where $w_t \in \mathbb{R}^D$ and $b_t \in \mathbb{R}$ are the model weight vector and bias term of Task $t$, respectively, $W = (w_1, w_2, \ldots, w_T)$ is the matrix whose columns are the model weight vectors, and $\Omega(W) = \sum_{e_{tk} \in E} r_{tk} \|w_t - w_k\|_2^2$ is the graph-guided penalty term. For significantly relevant tasks, the large edge weights force the model weight vectors to be similar, so information can be transferred between relevant tasks. $\mathrm{Loss}(\cdot, \cdot)$ is the loss function; in our work we use the logistic loss $\mathrm{Loss}(s, y) = \log(1 + \exp(-ys))$, which is smooth and leads to an easier optimization problem than the hinge loss. In the one-shot learning setting for event detection, it is ...
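The sketch below computes this graph-guided objective and minimizes it with plain gradient descent. The quadratic form of the penalty (as reconstructed above) and the choice of optimizer are assumptions for illustration; the paper's actual solver is not specified here.

```python
# NumPy sketch of graph-guided multi-task learning: T logistic-loss tasks
# coupled by a penalty lam * r_tk * ||w_t - w_k||^2 on graph edges.
import numpy as np

def objective(W, b, tasks, edges, lam):
    """W: (D, T) weight matrix; b: (T,) biases;
    tasks: list of (X_t, y_t) with X_t (N_t, D) and y_t in {-1, +1};
    edges: list of (t, k, r_tk) with inter-task relevance weights."""
    loss = sum(
        np.mean(np.log1p(np.exp(-y * (X @ W[:, t] + b[t]))))
        for t, (X, y) in enumerate(tasks)
    )
    penalty = sum(r * np.sum((W[:, t] - W[:, k]) ** 2) for t, k, r in edges)
    return loss + lam * penalty

def fit(tasks, edges, D, lam=0.1, lr=0.1, steps=500):
    T = len(tasks)
    W, b = np.zeros((D, T)), np.zeros(T)
    for _ in range(steps):
        gW, gb = np.zeros_like(W), np.zeros_like(b)
        for t, (X, y) in enumerate(tasks):
            # d/ds log(1 + exp(-y*s)) = -y * sigmoid(-y*s)
            s = 1.0 / (1.0 + np.exp(y * (X @ W[:, t] + b[t])))
            gW[:, t] += X.T @ (-y * s) / len(y)
            gb[t] += np.mean(-y * s)
        for t, k, r in edges:
            # Gradient of the quadratic graph penalty pulls coupled
            # task weights toward each other, sharing information.
            diff = W[:, t] - W[:, k]
            gW[:, t] += 2 * lam * r * diff
            gW[:, k] -= 2 * lam * r * diff
        W -= lr * gW
        b -= lr * gb
    return W, b
```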
Recent advances in artificial intelligence (AI) and computer-aided decision support methods have produced efficient new ways to learn about skin conditions. 1 In particular, advances in machine learning have spurred novel retrieval algorithms and renewed interest in content-based image retrieval (CBIR) techniques, in which computer vision methods search large databases for images similar to a "query" image based on its content and on visual cues such as color, shape, and pattern. 2 In the medical domain, CBIR is designed to assist with finding similar, labeled medical images in a curated database. Within the dermatology context, CBIR can assist with diagnosis or education by presenting visually similar skin lesion images, 3 removing the difficulties that arise when trying to describe images in words. Because the database and the algorithms of these systems are curated for a specific area or problem, users are less likely to encounter irrelevant images, one of the main problems with generic search engines. Despite the proposed benefits of modern CBIR systems, most CBIR-related research to date has focused on improving the accuracy of AI systems for diagnostic decisions 4,5 : we know little about the perceived utility and usability of CBIR systems for end users from a human-computer interaction (HCI) perspective. 6 In this paper, we describe a pilot study on how an interactive dermoscopic ...