This paper proposes a computer-aided detection (CADe) system for early detection of lung nodules from low-dose computed tomography (LDCT) images. The proposed system first preprocesses the raw data to improve the contrast of the low-dose images. Compact deep learning features are then extracted by investigating different deep learning architectures, including AlexNet, VGG16, and VGG19 networks. To optimize the extracted feature set, a genetic algorithm (GA) is trained to select the most relevant features for early detection. Finally, different types of classifiers are tested in order to accurately detect the lung nodules. The system is tested on 320 LDCT images from 50 different subjects, using an online public lung database, the International Early Lung Cancer Action Project (I-ELCAP). The proposed system, using the VGG19 architecture and an SVM classifier, achieves the best detection accuracy of 96.25%, sensitivity of 97.5%, and specificity of 95%. Compared to other state-of-the-art methods, the proposed system shows promising results.
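The three reported figures follow from the standard confusion-matrix definitions. The sketch below is illustrative only: the abstract does not state how the 320 images split into nodule and non-nodule cases, so the even 160/160 split and the resulting counts are assumptions chosen because they reproduce the reported percentages. The function name `detection_metrics` is hypothetical.

```python
def detection_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,     # correct decisions over all cases
        "sensitivity": tp / (tp + fn),     # detected nodules over all nodules
        "specificity": tn / (tn + fp),     # rejected non-nodules over all non-nodules
    }


# ASSUMPTION: 160 nodule and 160 non-nodule images (not stated in the abstract).
metrics = detection_metrics(tp=156, fn=4, tn=152, fp=8)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under that assumed split, 4 missed nodules and 8 false alarms would be consistent with the reported 97.5% sensitivity, 95% specificity, and 96.25% accuracy; other splits would give different counts.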
<p>In this paper, a computer-aided detection system is developed to detect lung nodules at an early stage using computed tomography (CT) scan images, since lung nodules are one of the most important indicators of lung cancer. The developed system consists of four stages. First, the raw CT lung images are preprocessed to enhance image contrast and eliminate noise. Second, an automatic segmentation procedure extracts the lungs and pulmonary nodule candidates (nodules, blood vessels) using a two-level thresholding technique and morphological operations. Third, a feature fusion technique combines four feature extraction techniques to extract the main features: first- and second-order statistical features, value histogram features, histogram of oriented gradients features, and gray-level co-occurrence matrix texture features based on wavelet coefficients. The fourth stage is the classifier. Three classifiers were used and their performance was compared in order to obtain the highest classification accuracy: a multi-layer feed-forward neural network, a radial basis function neural network, and a support vector machine. The performance of the proposed system was assessed using three quantitative parameters: the classification accuracy rate, the sensitivity, and the specificity. Forty standard computed tomography images containing 320 regions of interest, obtained from an early lung cancer action project association, were used to test and evaluate the developed system. The results show that the fused feature vector, with a genetic algorithm as the feature selection technique and the support vector machine classifier, gives the highest classification accuracy rate, sensitivity, and specificity values of 99.6%, 100%, and 99.2%, respectively.</p>
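Genetic-algorithm feature selection, as mentioned in the abstract above, is commonly implemented by evolving bitmasks over the fused feature vector. The following is a minimal sketch under stated assumptions, not the authors' implementation: the population size, mutation rate, single-point crossover, and elitism are illustrative choices, and `toy_fitness` is a hypothetical stand-in for what would normally be a classifier's validation accuracy on the selected features.

```python
import random


def select_features(num_features, fitness, pop_size=20, generations=30,
                    mutation_rate=0.05, seed=0):
    """Evolve a 0/1 mask over features; requires num_features >= 2, pop_size >= 4."""
    rng = random.Random(seed)
    # Each chromosome is a bitmask: 1 keeps the feature, 0 drops it.
    population = [[rng.randint(0, 1) for _ in range(num_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        # Elitism: carry the two best masks over unchanged.
        next_gen = [ranked[0][:], ranked[1][:]]
        parents = ranked[:pop_size // 2]
        while len(next_gen) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, num_features)      # single-point crossover
            child = mother[:cut] + father[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)


# HYPOTHETICAL fitness: pretend features 0, 3, and 7 are the informative ones
# and penalize large masks. A real system would score a trained classifier.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask = select_features(10, toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

Because the two best masks are always kept, the best fitness never decreases from one generation to the next; the fixed seed makes the run repeatable.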
Biological pairwise sequence alignment arranges the characters of two biological sequences to identify regions of similarity. This operation has elicited considerable interest due to its significant influence on various critical aspects of life (e.g., identifying mutations in coronaviruses). Sequence alignment over large databases cannot yield results within reasonable time, power, and cost. Heuristic methods, such as FASTA and the BLAST family, have been demonstrated to perform 40 times faster than dynamic programming (DP)-based techniques (e.g., Needleman-Wunsch), but they cannot guarantee an optimum alignment result. An optimized software platform for a widely used DNA sequence alignment algorithm, the Needleman-Wunsch (NW) algorithm, based on a lookup table is described in this study. This global alignment algorithm is the best approach for identifying similar regions between sequences. This study presents a new application of classical machine learning (ML) to global sequence alignment. Customized ML models are used to implement NW global alignment. An accuracy of 99.7% is achieved when using a multilayer perceptron with the ADAM optimizer, and up to 2912 Giga cell updates per second are realized on two real DNA sequences with a length of 4.1 M nucleotides. Our implementation is valid for RNA/DNA sequences. This study aims to parallelize the computation steps involved in the algorithm to accelerate its performance by using ML algorithms. All datasets used in this study are available from https://ieeedataport.org/documents/dna-sequence-alignment-datasets-based-nw-algorithm.
INDEX TERMS: Bioinformatics, DNA, RNA, Pairwise sequence alignment (PWSA), Needleman-Wunsch (NW) algorithm, Machine learning (ML) algorithms, Multilayer perceptron (MLP), XGBoost algorithm.
CONTRIBUTION: This study presented six DNA/RNA sequence alignment datasets for one of the most common alignment algorithms, namely, the Needleman-Wunsch (NW) algorithm.
It proposed a fast and parallel implementation of the NW algorithm by using machine learning techniques. This research is an extended and improved version of our previous work [1]. The current implementation achieved 99.7% accuracy by using a multilayer perceptron with the ADAM optimizer and up to 2912 Giga cell updates per second on two real DNA sequences with a length of 4.1 M nucleotides. Our implementation is valid for extremely long sequences by using the divide-and-conquer strategy.
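For reference, the classical dynamic-programming form of Needleman-Wunsch and the cell-update throughput metric can be sketched as follows. This is the textbook recurrence, not the lookup-table or ML-based implementation described above; the scoring values (match +1, mismatch -1, gap -1) and the function names `nw_score` and `gcups` are assumptions made for illustration.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of a and b (linear gap penalty)."""
    # Row 0: aligning an empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]                       # column 0: i gaps
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr                            # only two rows are kept in memory
    return prev[-1]


def gcups(len_a, len_b, seconds):
    """Giga cell updates per second for one len_a x len_b score matrix."""
    return (len_a * len_b) / (seconds * 1e9)


print(nw_score("GCATGCG", "GATTACA"))  # prints 0 with the assumed scoring
print(gcups(1000, 1000, 0.001))
```

Each matrix cell depends only on its upper, left, and upper-left neighbours, which is why cells along an anti-diagonal can be computed in parallel. If the reported 2912 Giga cell updates per second refers to the full 4.1 M by 4.1 M matrix (about 1.68 × 10^13 cells), that would correspond to roughly 5.8 seconds per alignment; this is a back-of-the-envelope estimate, not a figure from the study.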