2023
DOI: 10.3390/electronics12030673
Table Structure Recognition Method Based on Lightweight Network and Channel Attention

Abstract: The table recognition model rows and columns aggregated network (RCANet) uses a semantic segmentation approach to recognize table structure and achieves good performance in table row and column segmentation. However, this model uses ResNet18 as its backbone network, giving it 11.35 million parameters and a volume of 45.5 MB, which makes it inconvenient to deploy on lightweight servers or mobile terminals. Therefore, from the perspective of model compression, this paper proposes the lightweight rows and col…
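The parameter and size figures in the abstract can be sanity-checked with a short sketch. This is a minimal illustration, assuming PyTorch and torchvision are available; torchvision's stock resnet18 (with its ImageNet classifier head) lands near 11.7 million parameters, close to the paper's 11.35 million figure for the RCANet variant, whose exact head is not shown here.

```python
# Minimal sketch: count backbone parameters and estimate fp32 model size.
# Assumption: torchvision's resnet18 approximates the RCANet backbone.
import torchvision.models as models

backbone = models.resnet18(weights=None)
n_params = sum(p.numel() for p in backbone.parameters())
print(f"ResNet18 parameters: {n_params / 1e6:.2f}M")      # ~11.69M
# Rough fp32 size estimate: 4 bytes per parameter.
print(f"Approx. fp32 size: {n_params * 4 / 2**20:.1f} MiB")  # ~44.6 MiB
```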

Cited by 2 publications (3 citation statements)
References 25 publications
“…Each branch has an equal number of feature channels. The left branch performs identity mapping [27]. The right branch undergoes two 1 × 1 ordinary convolutions and one 3 × 3 depthwise convolution (DWConv) while maintaining an equal number of input and output channels.…”
Section: The Improved YOLOv5 Algorithm, 2.3.1 The ShuffleNet Module (mentioning)
confidence: 99%
“…The overall process of the CBAM module can be divided into two parts, as shown in Figure 10. Channel attention is mainly used to capture the correlation between different channels [49]. This module first obtains two 1 × 1 × C channel descriptors through global average pooling and maximum pooling operations.…”
Section: CBAM Module (mentioning)
confidence: 99%
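A short sketch of the channel-attention branch described in the quote may help: both pooled 1 × 1 × C descriptors pass through a shared MLP and the results are merged by a sigmoid. This follows the standard CBAM design (Woo et al., 2018); the reduction ratio of 16 is an assumption.

```python
# Sketch of CBAM-style channel attention: AvgPool and MaxPool each yield a
# 1 x 1 x C descriptor, a shared two-layer MLP (weights W0, W1) processes
# both, and a sigmoid of their sum rescales the input channels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Shared MLP: W0 reduces C -> C/r, W1 restores C/r -> C.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),  # W0
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),  # W1
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(F.adaptive_avg_pool2d(x, 1))  # MLP(AvgPool(F))
        mx = self.mlp(F.adaptive_max_pool2d(x, 1))   # MLP(MaxPool(F))
        scale = torch.sigmoid(avg + mx)              # per-channel weights
        return x * scale

x = torch.randn(2, 64, 32, 32)
print(ChannelAttention(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```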
“…In Equation (3), σ denotes the Sigmoid function, MLP denotes the multilayer perceptron, AvgPool denotes the average pooling layer, MaxPool denotes the maximum pooling layer, W0 and W1 denote the two weight matrices, F_avg and F_max denote the results of the input data after the AvgPool and MaxPool operations, and the superscript c indicates that these are channel-wise descriptors. Channel attention is mainly used to capture the correlation between different channels [49]. This module first obtains two 1 × 1 × C channel descriptors through global average pooling and maximum pooling operations.…”
Section: CBAM Module (mentioning)
confidence: 99%
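Equation (3) itself is not reproduced in the excerpt. A plausible reconstruction consistent with these symbol definitions is the standard CBAM channel-attention formula (Woo et al., 2018):

```latex
M_c(F) = \sigma\bigl(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\bigr)
       = \sigma\bigl(W_1(W_0(F^{c}_{avg})) + W_1(W_0(F^{c}_{max}))\bigr)
```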