A sketch semantic segmentation method based on point-segment level interaction

Zhang, Shihui; Wang, Lei; Han, Xueqiang; Wang, Shi

doi:10.1016/j.engappai.2023.105996

Cited by 3 publications

(2 citation statements)

References 28 publications

“…Among them, transformers have attracted considerable attention for computer vision tasks because of their strong representation capabilities and efficiency [16, 17]. Their performance is comparable to that of popular convolutional neural networks (CNNs) and has prompted researchers to attempt to solve vision problems based on transformers [18–24]. In particular, in image segmentation tasks, transformers have been extensively applied to natural images [25], medical images [26], and remote-sensing images [27].…”

Section: Introduction (mentioning)

confidence: 99%

Transformer with difference convolutional network for lightweight universal boundary detection

Li, Liu, Chen et al. 2024

PLoS ONE

Although deep-learning methods can achieve human-level performance in boundary detection, their improvements mostly rely on larger models and specific datasets, leading to significant computational power consumption. As a fundamental low-level vision task, a single model with fewer parameters to achieve cross-dataset boundary detection merits further investigation. In this study, a lightweight universal boundary detection method was developed based on convolution and a transformer. The network is called a “transformer with difference convolutional network” (TDCN), which implies the introduction of a difference convolutional network rather than a pure transformer. The TDCN structure consists of three parts: convolution, transformer, and head function. First, a convolution network fused with edge operators is used to extract multiscale difference features. These pixel difference features are then fed to the hierarchical transformer as tokens. Considering the intrinsic characteristics of the boundary detection task, a new boundary-aware self-attention structure was designed in the transformer to provide inductive bias. By incorporating the proposed attention loss function, it introduces the direction of the boundary as strongly supervised information to improve the detection ability of the model. Finally, several head functions with multiscale feature inputs were trained using a bidirectional additive strategy. In the experiments, the proposed method achieved competitive performance on multiple public datasets with fewer model parameters. A single model was obtained to realize universal prediction even for different datasets without retraining, demonstrating the effectiveness of the method. The code is available at https://github.com/neulmc/TDCN.
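The abstract above mentions pixel difference features extracted by a convolution fused with edge operators. As a rough illustration only, the following sketch shows one common form of difference convolution, in which each kernel weight multiplies the difference between a neighbor and the window's center pixel. This is an assumption about the general idea, not the TDCN operator; the function name `central_difference_conv`, the edge padding, and the odd kernel size are all hypothetical choices made for the example.

```python
import numpy as np


def central_difference_conv(image, kernel):
    """Weighted sum of (neighbor - center) over each window.

    Hypothetical illustration, not the TDCN implementation.
    Assumes a 2-D image and an odd-sized 2-D kernel; borders use edge padding.
    """
    image = np.asarray(image, dtype=float)
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            # Neighbor at offset (dy - ph, dx - pw) for every pixel at once.
            neighbor = padded[dy:dy + h, dx:dx + w]
            out += kernel[dy, dx] * (neighbor - image)
    return out


# A flat region produces no response; a vertical step produces one at the edge.
flat = np.full((4, 4), 3.0)
step = np.tile(np.array([0.0, 0.0, 1.0, 1.0]), (4, 1))
ones = np.ones((3, 3))
flat_response = central_difference_conv(flat, ones)
step_response = central_difference_conv(step, ones)
```

Because only differences enter the sum, constant regions cancel out and the response concentrates where intensity changes, which is why this family of operators is often associated with boundary detection.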

“…In addition, almost all widely used segmentation models, such as UNet [20], FCNs [21], and DeepLab [22], are classification-based in nature and the output probability maps are relatively unstructured, thus lacking the capability of capturing global structures of the target objects. To characterize the long-range data dependency, transformer [23]-[25] has been introduced for semantic image segmentation, such as TransUNet [26], SwinUNet [27], DS-TransUNet [28], and nnFormer [29], which, however, substantially increases the inference cost and memory complexity of the segmentation models. Recent research has demonstrated that, compared to the segmentation CNNs alone, the integration of a graphical model such as conditional random fields (CRFs) into CNNs enhances the robustness of the method to adversarial perturbations [30]-[32].…”

Section: Introduction (mentioning)

confidence: 99%

Deep graph learning of inter-protein contacts

Xie 2021

Preprint

Motivation: Inter-protein (interfacial) contact prediction is very useful for in silico structural characterization of protein-protein interactions. Although deep learning has been applied to this problem, its accuracy is not as good as intra-protein contact prediction. Results: We propose a new deep learning method GLINTER (Graph Learning of INTER-protein contacts) for interfacial contact prediction of dimers, leveraging a rotational invariant representation of protein tertiary structures and a pretrained language model of multiple sequence alignments (MSAs). Tested on the 13th and 14th CASP-CAPRI datasets, the average top L/10 precision achieved by GLINTER is 54.35% on the homodimers and 51.56% on all the dimers, much higher than 30.43% obtained by the latest deep learning method DeepHomo on the homodimers and 14.69% obtained by BIPSPI on all the dimers. Our experiments show that GLINTER-predicted contacts help improve selection of docking decoys.
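The abstract above reports results as top L/10 precision. A minimal sketch of how such a metric is typically computed: rank all predicted residue pairs by score, keep the top k = L/10, and take the fraction that are true contacts. The function name `top_k_precision`, the matrix layout, and the choice of L are assumptions for illustration; the abstract does not say how L is defined for a dimer.

```python
import numpy as np


def top_k_precision(scores, contacts, k):
    """Fraction of the k highest-scoring pairs that are true contacts.

    Hypothetical helper, not the GLINTER evaluation code.
    scores:   2-D array of predicted contact scores (residue i vs residue j).
    contacts: 2-D array of the same shape, nonzero where a true contact exists.
    """
    flat_scores = np.asarray(scores, dtype=float).ravel()
    flat_truth = np.asarray(contacts).ravel().astype(bool)
    k = max(1, min(int(k), flat_scores.size))
    # Stable sort on negated scores gives a deterministic descending ranking.
    top = np.argsort(-flat_scores, kind="stable")[:k]
    return float(flat_truth[top].mean())


scores = np.array([[0.9, 0.1, 0.8],
                   [0.2, 0.7, 0.3]])
contacts = np.array([[1, 0, 0],
                     [0, 1, 0]])
L = 20  # assumed sequence length, chosen only to make k = 2
precision_top_l10 = top_k_precision(scores, contacts, L // 10)
precision_top_3 = top_k_precision(scores, contacts, 3)
```

Here the two highest-scoring pairs are 0.9 (a true contact) and 0.8 (not a contact), so the top L/10 precision for this toy input is 0.5.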
