2024
DOI: 10.7717/peerj-cs.1888

exKidneyBERT: a language model for kidney transplant pathology reports and the crucial role of extended vocabularies

Tiancheng Yang,
Ilia Sucholutsky,
Kuang-Yu Jen
et al.

Abstract: Background: Pathology reports contain key information about the patient’s diagnosis as well as important gross and microscopic findings. These information-rich clinical reports offer an invaluable resource for clinical studies, but data extraction and analysis from such unstructured texts is often manual and tedious. While neural information retrieval systems (typically implemented as deep learning methods for natural language processing) are automatic and flexible, they typically require a large…

Cited by 4 publications (1 citation statement)
References 25 publications
“…Zhang et al fine-tuned a BERT model to extract various concepts (e.g., site, degree of differentiation) from breast cancer reports, achieving an overall precision of 0.927 and recall of 0.939 [30]. Yang et al trained a BERT model that can accurately predict the type of rejection and IFTA (interstitial fibrosis and tubular atrophy) in renal pathology reports [33]. Liu et al developed a BERT deidentification pipeline using 2100 pathology reports, achieving a best F1-score of 0.9659 in identifying sensitive health information.…”
Section: Information Extraction (mentioning)
confidence: 99%