Two common hemoglobinopathies, sickle cell disease (SCD) and β-thalassemia, arise from genetic mutations within the β-globin gene. In this work, we identified a 500-bp motif (Fetal Chromatin Domain, FCD) upstream of human ϒ-globin locus and showed that the removal of this motif using CRISPR technology reactivates the expression of ϒ-globin. Next, we present two different cell morphology-based machine learning approaches that can be used identify human blood cells (KU-812) that harbor CRISPR-mediated FCD genetic modifications. Three candidate models from the first approach, which uses multilayer perceptron algorithm (MLP 20-26, MLP26-18, and MLP 30-26) and flow cytometry-derived cellular data, yielded 0.83 precision, 0.80 recall, 0.82 accuracy, and 0.90 area under the ROC (receiver operating characteristic) curve when predicting the edited cells. In comparison, the candidate model from the second approach, which uses deep learning (T2D5) and DIC microscopy-derived imaging data, performed with less accuracy (0.80) and ROC AUC (0.87). We envision that equivalent machine learning-based models can complement currently available genotyping protocols for specific genetic modifications which result in morphological changes in human cells.
Two common hemoglobinopathies, sickle cell disease (SCD) and β-thalassemia, arise from genetic mutations within the β-globin gene. A 500-bp motif termed Fetal Chromatin Domain (FCD), upstream of human ϒ-globin locus, may function as a transcriptional regulatory element driving inhibition of the ϒ-globin gene. Here, we hypothesize that the removal of this motif using CRISPR technology may reactivate the expression of ϒ-globin and subsequently restore fetal hemoglobin functionality. In this work we present two different cell morphology-based machine learning approaches that can be used identify cells that harbor FCD genetic modifications. Three candidate models from the first, which uses multilayer perceptron algorithm (MLP 20–26, MLP26-18, and MLP 30 − 26) and flow cytometry-derived cellular data, yielded 0.83 precision, 0.80 recall, 0.82 accuracy, and 0.90 area under the ROC (receiver operating characteristic) curve when predicting the edited cells. In comparison, the candidate model from the second approach, which uses deep learning (T2D5) and DIC microscopy-derived imaging data, performed with less accuracy (0.80) and ROC AUC (0.87). We envision both assays could be valuable and complementary to currently available genotyping protocols for specific genetic modifications which result in morphological changes in human cells.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.