Computational Technique for an Efficient Classification of Protein Sequences With Distance‐Based Sequence Encoding Algorithm

Iqbal, Muhammad; Faye, Ibrahima; Said, Abas Md; Belhaouari, Samir Brahim

doi:10.1111/coin.12069

Cited by 3 publications

(3 citation statements)

References 48 publications


“…The length of the sequences is from few amino acids to thousands of amino acids. There are many methods for protein sequences encoding out there including distance based encoding [4], [16] that captures statistical characteristics of protein sequences. In our work we transformed the labels in to one hot vector representation using LabelBinarizer from sklearn.preprocessing.…”

Section: Methods

mentioning

confidence: 99%
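The citation statement above says the labels were turned into one-hot vectors with LabelBinarizer from sklearn.preprocessing. As a rough illustration of what that transformation produces, here is a dependency-free Python sketch. It is a stand-in written for this page, not the citing authors' code; the function name `one_hot_labels` and the class names in the example are hypothetical.

```python
def one_hot_labels(labels):
    """Return (classes, matrix) for a list of class labels.

    classes is the sorted list of distinct labels; matrix holds one row per
    input label, with a 1 in the column of that label's class and 0 elsewhere.
    """
    classes = sorted(set(labels))
    column_of = {name: i for i, name in enumerate(classes)}
    matrix = []
    for label in labels:
        row = [0] * len(classes)
        row[column_of[label]] = 1
        matrix.append(row)
    return classes, matrix


# Hypothetical protein class names, for illustration only.
example_labels = ["hydrolase", "transferase", "hydrolase", "lyase"]
classes, encoded = one_hot_labels(example_labels)
```

Each row of `encoded` has exactly one non-zero entry, so the rows can serve directly as targets for a multi-class classifier.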

“…However, it is very expensive to characterize functions for biological experiments and also, it is really necessary to find the association between the information of datasets to create and improve medical tools. For classification purpose, several classification techniques were developed [3], [4]. These techniques can be divided in two parts: Sequence alignment and Machine learning algorithms.…”

Section: Introduction

mentioning

confidence: 99%


Classification of Biological Data using Deep Learning Technique

Javed, Iqbal. 2022. NIJEC


A huge amount of newly sequenced proteins is being discovered on daily basis. The main concern is how to extract the useful characteristics of sequences as the input features for the network. These sequences are increasing exponentially over the decades. However, it is very expensive to characterize functions for biological experiments and also, it is really necessary to find the association between the information of datasets to create and improve medical tools. Recently machine learning algorithms got huge attention and are widely used. These algorithms are based on deep learning architecture and data-driven models. Previous work failed to properly address issues related to the classification of biological sequences i.e. protein including efficient encoding of variable length biological sequence data and implementation of deep learning based neural network models to enhance the performance of classification/recognition systems. To overcome these issues, we have proposed a deep learning based neural network architecture so that classification performance of the system can be increased. In our work, we have proposed 1D-convolution neural network which classifies the protein sequences to 10 top common classes. The model extracted features from the protein sequences labels and learned through the dataset. We have trained and evaluate our model on protein sequences downloaded from protein data bank (PDB). The model maximizes the accuracy rate up to 96%.


“…There are 20 essential Amino acids (Figure 1) in nature. Each Amino acid has a unique chemical structure that gives it specific properties [3]. The Amino acid signatures represent encoded instances of network transactions using essential Amino acid labels that encode numerical structural properties using a Vigesimal numbering system to map ASCII codes to Amino acids.…”

mentioning

confidence: 99%

Network Intrusion Detection Based on Amino Acid Sequence Structure Using Machine Learning

Ibaisi, Kuhn, Kaiiali, et al. 2023. Electronics


The detection of intrusions in computer networks, known as Network-Intrusion-Detection Systems (NIDSs), is a critical field in network security. Researchers have explored various methods to design NIDSs with improved accuracy, prevention measures, and faster anomaly identification. Safeguarding computer systems by quickly identifying external intruders is crucial for seamless business continuity and data protection. Recently, bioinformatics techniques have been adopted in NIDSs’ design, enhancing their capabilities and strengthening network security. Moreover, researchers in computer science have found inspiration in molecular biology’s survival mechanisms. These nature-designed mechanisms offer promising solutions for network security challenges, outperforming traditional techniques and leading to better results. Integrating these nature-inspired approaches not only enriches computer science, but also enhances network security by leveraging the wisdom of nature’s evolution. As a result, we have proposed a novel Amino-acid-encoding mechanism that is bio-inspired, utilizing essential Amino acids to encode network transactions and generate structural properties from Amino acid sequences. This mechanism offers advantages over other methods in the literature by preserving the original data relationships, achieving high accuracy of up to 99%, transforming original features into a fixed number of numerical features using bio-inspired mechanisms, and employing deep machine learning methods to generate a trained model capable of efficiently detecting network attack transactions in real-time.
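The citation statement for this paper mentions a vigesimal (base-20) numbering system that maps ASCII codes to amino acids. The Python sketch below shows one way such a mapping could work. It is an assumption-laden illustration, not the authors' scheme: the digit-to-letter table (the 20 standard amino acids in alphabetical one-letter order), the fixed two-letter width, and the function names are all hypothetical choices made here.

```python
# Assumption: digit d (0..19) maps to the d-th one-letter amino acid code
# in alphabetical order. The paper's actual table may differ.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def to_vigesimal(value):
    """Return the base-20 digits of a non-negative integer, most significant first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    digits = []
    while True:
        value, remainder = divmod(value, 20)
        digits.append(remainder)
        if value == 0:
            break
    return digits[::-1]


def encode_text(text):
    """Encode ASCII text as amino acid letters, two letters per character.

    ASCII codes lie in 0..127, which is below 20**2 = 400, so two base-20
    digits are always enough.
    """
    encoded = []
    for character in text:
        code = ord(character)
        if code > 127:
            raise ValueError("only ASCII input is supported in this sketch")
        digits = to_vigesimal(code)
        digits = [0] * (2 - len(digits)) + digits  # left-pad to two digits
        encoded.append("".join(AMINO_ACIDS[d] for d in digits))
    return "".join(encoded)


sequence = encode_text("GET")
```

With this hypothetical table, each input character becomes a fixed-length pair of letters, so the encoded string can be decoded unambiguously by reading it two letters at a time.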
