Quang-Hien Kha scite author profile

2022

J. Chem. Inf. Model.

Background: SNARE proteins play a vital role in membrane fusion, cellular physiology, and pathological processes. Many potential SNARE-based therapeutics for mental diseases and even cancer are also being developed. There is therefore a pressing need for new and efficient approaches to predict SNAREs so that these essential proteins can be further manipulated. Methods: Several computational frameworks have been proposed to overcome the drawbacks of biological methods, which require considerable time and budget to identify SNAREs. However, the performance of existing frameworks was unsatisfactory, as they failed to retain the SNARE sequence order and to capture the many hidden features of SNAREs. This paper proposed a novel model built on a multiscan convolutional neural network (CNN) and position-specific scoring matrix (PSSM) profiles to address these limitations. We trained and applied our model on the benchmark dataset with fivefold cross-validation and on two different independent datasets. Results: Overall, the multiscan CNN was cross-validated on the training set and excelled in SNARE classification, reaching 0.963 in AUC and 0.955 in AUPRC. In addition, with a sensitivity, specificity, accuracy, and MCC of 0.842, 0.968, 0.955, and 0.767, respectively, our proposed framework outperformed previous models in the SNARE recognition task. Conclusions: We believe that our model can contribute to discriminating SNARE proteins from general proteins.
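The sensitivity, specificity, accuracy, and MCC reported above are all derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard formulas; the function name and the counts are hypothetical illustrations and are not taken from the paper.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four cells, so it is less easily inflated by class imbalance than accuracy alone, which is presumably why it is reported alongside the other three metrics.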

Risk Score Generated from CT-Based Radiomics Signatures for Overall Survival Prediction in Non-Small Cell Lung Cancer

Hung

et al. 2021

Cancers

This study aimed to create a risk score generated from CT-based radiomics signatures that could be used to predict overall survival in patients with non-small cell lung cancer (NSCLC). We retrospectively enrolled three sets of NSCLC patients (336, 84, and 157 patients for the training, testing, and validation sets, respectively). A total of 851 radiomics features were extracted from each patient's CT images for further analysis. The most important features (those strongly linked with overall survival) were chosen by pairwise correlation analysis, a Least Absolute Shrinkage and Selection Operator (LASSO) regression model, and univariate Cox proportional hazard regression. Multivariate Cox proportional hazard survival analysis was used to create a risk score for each patient, and Kaplan–Meier analysis was used to separate patients into two groups: high-risk and low-risk. ROC curves were used to assess the predictive ability of the risk score model for overall survival compared with clinical parameters. The risk score, which was developed from a model of ten radiomics signatures, was found to be independent of age, gender, and stage for predicting overall survival in NSCLC patients (HR, 2.99; 95% CI, 2.27–3.93; p < 0.001), and its AUC for overall survival prediction was 0.696 (95% CI, 0.635–0.758), 0.705 (95% CI, 0.649–0.762), and 0.657 (95% CI, 0.589–0.726) at 1, 3, and 5 years, respectively, in the training set. The risk score is more likely to predict survival accurately at 1, 3, and 5 years than clinical parameters such as age (AUC 0.57 (95% CI, 0.499–0.64), 0.552 (95% CI, 0.489–0.616), and 0.621 (95% CI, 0.544–0.689)), gender (AUC 0.554, 0.546, and 0.566), and stage (AUC 0.527, 0.501, and 0.459) at 1, 3, and 5 years, respectively, in the training set. In the training set, the Kaplan–Meier curve revealed that NSCLC patients in the high-risk group had a shorter overall survival time than those in the low-risk group (p < 0.001).
Similar, statistically significant results were obtained in the testing and validation sets. In conclusion, a risk score developed from ten radiomics signatures has great potential to predict overall survival in NSCLC patients compared with clinical parameters. The model was able to stratify NSCLC patients into high-risk and low-risk groups for overall survival prediction.
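A risk score from a multivariate Cox model is conventionally the linear predictor: the sum of each selected feature multiplied by its fitted coefficient. The sketch below illustrates that computation and a two-group split. The feature names and coefficients are hypothetical, and the median cutoff is an assumption, since the abstract does not state how the high-risk and low-risk threshold was chosen.

```python
from statistics import median


def risk_score(features, coefficients):
    """Linear predictor: sum of coefficient * feature value over the selected features."""
    return sum(coefficients[name] * features[name] for name in coefficients)


def stratify(scores, cutoff=None):
    """Label each score 'high' or 'low'; the median cutoff is an assumed default."""
    if cutoff is None:
        cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


# Hypothetical coefficients and radiomics feature values (not from the paper).
coefficients = {"feature_a": 0.5, "feature_b": -0.25}
patients = [
    {"feature_a": 2.0, "feature_b": 1.0},
    {"feature_a": 0.0, "feature_b": 2.0},
    {"feature_a": 1.0, "feature_b": 0.0},
    {"feature_a": 4.0, "feature_b": 4.0},
]

scores = [risk_score(patient, coefficients) for patient in patients]
groups = stratify(scores)
print(list(zip(scores, groups)))
```

In practice the coefficients would come from fitting the Cox model on the training set, and the same coefficients and cutoff would then be applied unchanged to the testing and validation sets.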

Development and Validation of an Efficient MRI Radiomics Signature for Improving the Predictive Performance of 1p/19q Co-Deletion in Lower-Grade Gliomas

Hung

et al. 2021

Cancers

The prognosis and treatment plans for patients diagnosed with low-grade gliomas (LGGs) may be significantly improved if there is evidence of chromosome 1p/19q co-deletion mutation. Many studies have shown that the 1p/19q codeletion status enhances the sensitivity of the tumor to different types of therapeutics. However, the current clinical gold standard for detecting this chromosomal mutation remains invasive and poses implicit risks to patients. Radiomics features derived from medical images have been used as a new approach for non-invasive diagnosis and clinical decisions. This study proposed an eXtreme Gradient Boosting (XGBoost)-based model to predict the 1p/19q codeletion status in a binary classification task. We trained our model on a public database extracted from The Cancer Imaging Archive (TCIA), including 159 LGG patients with 1p/19q co-deletion mutation status. XGBoost was the baseline algorithm, and we combined it with SHapley Additive exPlanations (SHAP) analysis to select the seven optimal radiomics features used to build the final predictive model. Our final model achieved an accuracy of 87% and 82.8% on the training set and external test set, respectively. With seven wavelet radiomics features, our XGBoost-based model can identify the 1p/19q codeletion status in LGG-diagnosed patients for better management and address the drawbacks of invasive gold-standard tests in clinical practice.
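One common way to use SHAP for feature selection is to rank features by their mean absolute SHAP value across samples and keep the top k. The sketch below shows only that ranking step on an already computed SHAP matrix. The abstract does not state the exact ranking criterion, so this is an assumption, and the feature names and values are hypothetical.

```python
def top_k_by_mean_abs_shap(shap_values, feature_names, k):
    """Rank features by mean |SHAP| over samples and return the k highest-ranked names."""
    n_samples = len(shap_values)
    importance = {
        name: sum(abs(row[j]) for row in shap_values) / n_samples
        for j, name in enumerate(feature_names)
    }
    ranked = sorted(feature_names, key=lambda name: importance[name], reverse=True)
    return ranked[:k]


# Hypothetical SHAP matrix: one row per patient, one column per feature.
shap_values = [
    [0.50, -0.10, 0.00],
    [-0.25, 0.20, 0.05],
    [0.75, -0.30, -0.05],
]
feature_names = ["wavelet_a", "wavelet_b", "wavelet_c"]

selected = top_k_by_mean_abs_shap(shap_values, feature_names, k=2)
print(selected)
```

With a real model, the SHAP matrix would be produced by an explainer fitted to the trained classifier, and k would be 7 to match the study.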

An interpretable deep learning model for classifying adaptor protein complexes from sequence information

Tran

Nguyen

et al. 2022

Methods

Development and Validation of an Explainable Machine Learning-Based Prediction Model for Drug–Food Interactions from Chemical Structures

Hung

et al. 2023

Sensors

Possible drug–food constituent interactions (DFIs) could change the intended efficacy of particular therapeutics in medical practice. The increasing number of multiple-drug prescriptions leads to a rise in drug–drug interactions (DDIs) and DFIs. These adverse interactions have further consequences, e.g., a decline in a medication's effect, the withdrawal of various medications, and harmful impacts on patients' health. However, the importance of DFIs remains underestimated, as the number of studies on the topic is limited. Recently, scientists have applied artificial intelligence-based models to study DFIs, but limitations remained in data mining, input, and detailed annotations. This study proposed a novel prediction model to address the limitations of previous studies. In detail, we extracted 70,477 food compounds from the FooDB database and 13,580 drugs from the DrugBank database, and we extracted 3780 features from each drug–food compound pair. The optimal model was eXtreme Gradient Boosting (XGBoost). We also validated the performance of our model on one external test set from a previous study, which contained 1922 DFIs. Finally, we applied our model to recommend whether a drug should or should not be taken with certain food compounds based on their interactions. The model can provide highly accurate and clinically relevant recommendations, especially for DFIs that may cause severe adverse events and even death. Our proposed model can contribute to developing more robust predictive models to help patients, under the supervision and consultation of physicians, avoid adverse DFI effects when combining drugs and foods for therapy.