2019
DOI: 10.3791/59108

Cloud-Based Phrase Mining and Analysis of User-Defined Phrase-Category Association in Biomedical Publications

Abstract: The rapid accumulation of biomedical textual data has far exceeded the human capacity of manual curation and analysis, necessitating novel text-mining tools to extract biological insights from large volumes of scientific reports. The Context-aware Semantic Online Analytical Processing (CaseOLAP) pipeline, developed in 2016, successfully quantifies user-defined phrase-category relationships through the analysis of textual data. CaseOLAP has many biomedical applications. We have developed a protocol for a cloud-…

Cited by 4 publications (3 citation statements)
References 9 publications
“…Data mining refers to the recognition of useful information from considerable, fuzzy, noisy, incomplete, and random datasets [20]. The main purpose of applying data mining technology in gene analysis is to process massive gene expression profile data by strong analytical capacity, find the relationship networks existing among genes, and provide the basis for the study on gene changes [21,22]. In Internet hybrid treatment, clinical treatment tests offered considerable data from various sources.…”
Section: Discussion (mentioning)
confidence: 99%
“…In a pilot study performed in Long Beach, we conducted a text mining algorithm for 8,325 heart related proteins over 33M publications on PubMed to quantify these according to their correlation in CHD and created a knowledge graph. Detailed description and methods are described previously (27,28).…”
Section: Methods (mentioning)
confidence: 99%
“…Hence, a vast amount of unstructured text data in over 33M publications (PubMed) was mined (~400 manuscripts/sec) according to a Text Mining algorithm to calculate protein-disease relationships developed by our collaborator (27,28,64). Out of 8,325 heart related protein, 937 proteins associated to CHD were identified.…”
Section: Bioinformatics Data Mining and ML In The Age Of Information ... (mentioning)
confidence: 99%