Hierarchical graph representations in digital pathology

Pati, Pushpak; Jaume, Guillaume; Foncubierta–Rodríguez, Antonio; Feroce, Florinda; Anniciello, Anna Maria; Scognamiglio, Giosuè; Brancati, Nadia; Fiche, Maryse; Dubruc, Estelle; Riccio, Daniel; Bonito, Maurizio Di; Pietro, Giuseppe De; Botti, Gerardo; Thiran, Jean‐Philippe; Frucci, Maria; Göksel, Orçun; Gabrani, Maria

doi:10.1016/j.media.2021.102264

Cited by 96 publications

(50 citation statements)

References 60 publications

“…CNN's architectures tend to favor texture-based features [12], at the other end of the spectrum are graph neural network-based methods, which model the global dependencies between the local representations and thus rely more on the shape cues. We further compare with graph-based methods with a significant emphasis on HACT-Net [26], which holds the current state-of-art for TRoIs classification on the BRACS. ScoreNet reaches a new state-of-the-art weighted F1-score of 64.4% on the BRACS TRoIs classification task outperforming HACT-…
[Results table fragment; column headers and some row labels are missing]
(row label missing) [22,29]: 32.3 ± 4.6 | 39.0 ± 0.8 | 23.7 ± 1.7 | 18.0 ± 0.8 | 37.7 ± 2.9 | 47.3 ± 2.0 | 70.7 ± 0.5 | 39.4 ± 1.9
CNN (10× + 20×) [22,29]: 48.3 ± 2.0 | 45.7 ± 0.5 | 41.7 ± 5.0 | 32.3 ± 0.9 | 46.3 ± 1.4 | 59.3 ± 2.0 | 85.7 ± 1.9 | 52.3 ± 1.9
CNN (10× + 20× + 40×) [22,29]: (values missing)
(row label missing) [24]: 58.8 ± 6.8 | 40.9 ± 3.0 | 46.8 ± 1.9 | 40.0 ± 3.6 | 63.7 ± 10.5 | 53.8 ± 3.9 | 81.1 ± 3.3 | 55.9 ± 1.0
CG-GNN [24]: 63.6 ± 4.9 | 47.7 ± 2.9 | 39.4 ± 4.…”

Section: TRoIs Classification Results and Discussion (mentioning)

confidence: 99%
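The statement above reports a weighted F1-score. As a minimal sketch of how that metric is commonly computed (per-class F1 averaged with weights proportional to each class's number of true samples), the following pure-Python function may help. The function name `weighted_f1` and the toy labels are illustrative assumptions, not taken from the cited work.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative helper)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight each class by its support
    return score


# Hypothetical toy labels, not BRACS data.
y_true = ["benign", "benign", "atypical", "malignant", "malignant", "malignant"]
y_pred = ["benign", "atypical", "atypical", "malignant", "malignant", "benign"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Classes that never occur in `y_true` carry zero weight, so iterating over the true classes only is sufficient for this averaging scheme.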

“…To evaluate all samples, we perform stratified 5-fold cross-validation. For HACT-Net, we use the available pre-trained weights and follow the code implementation of [26]. As HACT-Net sometimes fails to generate embedding, to have a fair comparison, we only evaluate those samples where HACT-Net could successfully produce embedding (around 95% of the BACH and 80% of CAMELYON16 dataset).…”

Section: TRoIs Classification Results and Discussion (mentioning)

confidence: 99%
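The evaluation protocol quoted above (stratified 5-fold cross-validation restricted to samples for which an embedding could be produced) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: `stratified_folds`, `embedding_ok`, and the toy labels are hypothetical and do not come from the cited code.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a similar share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Hypothetical toy data: 20 samples, 2 of which "fail" to yield an embedding.
labels = ["normal"] * 10 + ["invasive"] * 10
embedding_ok = [i % 10 != 0 for i in range(20)]

# Keep only samples with a usable embedding, then stratify on what remains.
kept = [i for i, ok in enumerate(embedding_ok) if ok]
folds = stratified_folds([labels[i] for i in kept], k=5)

for fold_no, test_fold in enumerate(folds):
    test_ids = [kept[j] for j in test_fold]
    train_ids = [kept[j] for f in folds if f is not test_fold for j in f]
    print(fold_no, len(train_ids), len(test_ids))
```

Filtering before splitting keeps the class balance of each fold computed on the samples that are actually evaluated.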

“…Datasets. The primary dataset is the BReAst Carcinoma Sub-typing (BRACS) dataset [26]. Experimental Setup.…”

Section: Methods (mentioning)

confidence: 99%

“…Therefore their patch extraction is fixed and not data-driven as ours. To cope with that loss of information, graph neural network-based methods [26,39] have been proposed to consider global contextual information and the dependencies between the instances. These approaches build a graph model that operates on the cell-level structure or combines the cell-level and tissue-level context.…”

Section: Related Work (mentioning)

confidence: 99%

“…We extract two auxiliary datasets from the BRACS dataset [25]: a tile dataset at 40× and a low-resolution thumbnail dataset at (40/s)×, where s is the down-scaling ratio. The former dataset is used to pre-train the fine-grained attention module, whereas the latter serves to pre-train the recommendation stage.…”

Section: B3 Datasets (mentioning)

confidence: 99%

ScoreNet: Learning Non-Uniform Attention and Augmentation for Transformer-Based Histopathological Image Classification

Stegmüller, Bozorgtabar, Spahr et al. 2022

Preprint

Self Cite

Progress in digital pathology is hindered by high-resolution images and the prohibitive cost of exhaustive localized annotations. The commonly used paradigm to categorize pathology images is patch-based processing, which often incorporates multiple instance learning (MIL) to aggregate local patch-level representations yielding image-level prediction. Nonetheless, diagnostically relevant regions may only take a small fraction of the whole tissue, and MIL-based aggregation operation assumes that all patch representations are independent and thus mislays the contextual information from adjacent cell and tissue microenvironments. Consequently, the computational resources dedicated to a specific region are independent of its information contribution. This paper proposes a transformer-based architecture specifically tailored for histopathological image classification, which combines fine-grained local attention with a coarse global attention mechanism to learn meaningful representations of high-resolution images at an efficient computational cost. More importantly, based on the observation above, we propose a novel mixing-based data-augmentation strategy, namely ScoreMix, by leveraging the distribution of the semantic regions of images during the training and carefully guiding the data mixing via sampling the locations of discriminative image content. Thorough experiments and ablation studies on three challenging representative cohorts of Haematoxylin & Eosin (H&E) tumour regions-of-interest (TRoIs) datasets have validated the superiority of our approach over existing state-of-the-art methods and effectiveness of our proposed components, e.g., data augmentation in improving classification performance. We also demonstrate our method's interpretability, robustness, and cross-domain generalization capability.
