The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak is a public health emergency of international concern. The spike glycoprotein (S protein) of SARS-CoV-2 is a key target of antiviral drugs. Starting from the existing S protein structure, this study used molecular docking to calculate the binding energies and interaction sites between the SARS-CoV-2 S protein and 14 structurally diverse antiviral molecules, and analyzed the potential drug candidates targeting the S protein. Tizoxanide, dolutegravir, bictegravir, and arbidol were found to have high binding energies and to bind key sites of the S1 and S2 subunits effectively, inhibiting the virus by causing conformational changes in S1 and S2 during the fusion of the S protein with host cells. Based on the interactions between the drug molecules and the S protein, and on the amino acid environment around the binding sites, rational structure-based optimization was performed using the molecular connection method and a bioisosterism strategy to obtain Ti-2, BD-2, and Ar-3, which bind the S protein much more strongly than the original molecules. This study provides valuable clues for identifying S protein inhibitor binding sites and the mechanism of the anti-SARS-CoV-2 effect, and it may inform the discovery and optimization of small-molecule S protein inhibitors.
Background: Consumer-generated content, such as postings on social media websites, can serve as an ideal source of information for studying health care from a consumer's perspective. However, consumer-generated content on health care topics often contains spelling errors, which, if not corrected, become obstacles for downstream computer-based text analysis.

Objective: In this study, we proposed a framework with a spelling correction system designed for consumer-generated content and a novel ontology-based evaluation system that was used to efficiently assess the correction quality. Additionally, we emphasized the importance of context sensitivity in the correction process and demonstrated why correction methods designed for electronic medical records (EMRs) failed to perform well with consumer-generated content.

Methods: First, we developed our spelling correction system based on Google Spell Checker. The system processed postings acquired from MedHelp, a biomedical bulletin board system (BBS), and saved misspelled words (eg, sertaline) and the corresponding corrected words (eg, sertraline) into two separate sets. Second, to reduce the number of words needing manual examination in the evaluation process, we matched the words in each of the two sets with terms in two biomedical ontologies: RxNorm and Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT). The ratio of words that could be matched and appropriately corrected was used to evaluate the correction system's overall performance. Third, we categorized the misspelled words according to the types of spelling errors. Finally, we calculated the ratio of abbreviations in the postings, which differed remarkably between EMRs and consumer-generated content and could largely influence the overall performance of spelling checkers.

Results: An uncorrected word and the corresponding corrected word were called a spelling pair, and the two words in the pair were its members. In our study, 271 spelling pairs were detected, among which 58 (21.4%) had one or two members matched in the selected ontologies. The ratio of appropriate correction among all 271 spelling errors was 85.2% (231/271). Among the 58 ontology-matched spelling pairs, the ratio was 86% (50/58), close to the overall ratio. We also found that linguistic errors accounted for 31.4% (85/271) of all errors detected, and that only 0.98% (210/21,358) of words in the postings were abbreviations, which was much lower than the ratio in the EMRs (33.6%).

Conclusions: We conclude that our system can accurately correct spelling errors in consumer-generated content. Context sensitivity is indispensable in the correction process. Additionally, it can be confirmed that consumer-generated content differs from EMRs in that consumers seldom use abbreviations. Also, the evaluation method, which takes advantage of biomedical ontologies, can effectively estimate the accuracy of the correction system and reduce manual examination time.
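
The ontology-based evaluation described in the Methods and Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SpellingPair` class, the `evaluate` function, the toy pairs, and the three-term `ONTOLOGY_TERMS` set (a stand-in for terms that would come from RxNorm and SNOMED CT) are all hypothetical, and the `appropriate` flag stands in for the manual judgement of each correction. The sketch only shows how the share of ontology-matched pairs and the two appropriate-correction ratios relate to each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpellingPair:
    """One uncorrected word and the correction proposed for it."""
    misspelled: str
    corrected: str
    appropriate: bool  # hypothetical manual judgement of the correction


# Hypothetical toy vocabulary standing in for RxNorm / SNOMED CT terms.
ONTOLOGY_TERMS = {"sertraline", "ibuprofen", "zoloft"}

# Hypothetical example pairs; only "sertaline" -> "sertraline" is from the abstract.
PAIRS = [
    SpellingPair("sertaline", "sertraline", True),
    SpellingPair("ibuprophen", "ibuprofen", True),
    SpellingPair("zoloft", "aloft", False),   # valid term wrongly "corrected"
    SpellingPair("teh", "the", True),
    SpellingPair("definately", "definitely", True),
    SpellingPair("wich", "witch", False),
]


def is_matched(pair, terms):
    """A pair is matched if one or both of its members is an ontology term."""
    return pair.misspelled.lower() in terms or pair.corrected.lower() in terms


def ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def evaluate(pairs, terms):
    """Compare accuracy on the ontology-matched subset with overall accuracy."""
    matched = [p for p in pairs if is_matched(p, terms)]
    return {
        "pairs": len(pairs),
        "matched": len(matched),
        "matched_ratio": ratio(len(matched), len(pairs)),
        "overall_accuracy": ratio(sum(p.appropriate for p in pairs), len(pairs)),
        "matched_accuracy": ratio(sum(p.appropriate for p in matched), len(matched)),
    }


if __name__ == "__main__":
    for name, value in evaluate(PAIRS, ONTOLOGY_TERMS).items():
        print(f"{name}: {value}")
```

In the study, the corresponding figures are 58 matched pairs out of 271 (21.4%), an overall appropriate-correction ratio of 231/271 (85.2%), and a ratio of 50/58 (86%) on the matched subset. The closeness of the last two figures is what lets the matched subset serve as an estimate of overall accuracy with less manual examination.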