A logistic regression model based on tongue image information for prediction precancerous lesions and early stage esophageal cancer in China

Jia, Liqun; Duan, Jun; Deng, Bo; Bai, Weibin; Liu, M.; Li, D.; Jia, Baochang

doi:10.1093/annonc/mdw385.06

Cited by 2 publications

(1 citation statement)

References 0 publications


“…Once done, these models could be deployed in a specialised clinic to help the junior rheumatologist summarise the clinical care plan of a report, or could be used as a chatbot for patients seeking answers to their questions after office hours. For instance, authors in Jia et al [2024], developed OncoGPT, this was done by fine-tuning Llama with oncology-related conversations.…”

Section: Discussion (mentioning)

confidence: 99%

From Web to RheumaLpack: Creating a Linguistic Corpus for Exploitation and Knowledge Discovery in Rheumatology

Madrid-García, Merino-Barbancho, Freites-Núñez et al. 2024

Preprint


This study introduces RheumaLinguisticpack (RheumaLpack), the first specialised linguistic web corpus designed for the field of musculoskeletal disorders. By combining web mining (i.e., web scraping) and natural language processing (NLP) techniques, as well as clinical expertise, RheumaLpack systematically captures and curates data across a spectrum of web sources including clinical trials registers (i.e., ClinicalTrials.gov), bibliographic databases (i.e., PubMed), medical agencies (i.e., EMA), social media (i.e., Reddit), and accredited health websites (i.e., MedlinePlus, Harvard Health Publishing, and Cleveland Clinic). Given the complexity of rheumatic and musculoskeletal diseases (RMDs) and their significant impact on quality of life, this resource can be proposed as a useful tool to train algorithms that could mitigate the diseases' effects. Therefore, the corpus aims to improve the training of artificial intelligence (AI) algorithms and facilitate knowledge discovery in RMDs. The development of RheumaLpack involved a systematic six-step methodology covering data identification, characterisation, selection, collection, processing, and corpus description. The result is a non-annotated, monolingual, and dynamic corpus, featuring almost 3 million records spanning from 2000 to 2023. RheumaLpack represents a pioneering contribution to rheumatology research, providing a useful resource for the development of advanced AI and NLP applications. This corpus highlights the value of web data to address the challenges posed by musculoskeletal diseases, illustrating the corpus's potential to improve research and treatment paradigms in rheumatology. Finally, the methodology shown can be replicated to obtain data from other medical specialities.
