A capsule network encodes entity features into capsules and maps the spatial relationships from local features to overall features by dynamic routing. This structure allows the capsule network to capture rich feature information but leads to a lack of spatial relationship guidance, sensitivity to noise features, and susceptibility to local optima. We therefore propose a novel capsule network based on feature and spatial relationship coding (FSc-CapsNet). Feature and spatial relationship extractors are introduced to capture features and spatial relationships, respectively: the feature extractor abstracts feature information from bottom to top while attenuating interference from noise features, and the spatial relationship extractor provides spatial relationship guidance from top to bottom. Then, instead of dynamic routing, a feature and spatial relationship encoder is proposed to find the optimal combination of features and spatial relationships. The encoder abandons iterative optimization and instead folds the optimization process into backpropagation. Experimental results show that, compared with the capsule network and several of its derivatives, the proposed FSc-CapsNet achieves significantly better performance on both the Fashion-MNIST and CIFAR-10 datasets. In addition, compared with some mainstream deep learning frameworks, FSc-CapsNet performs quite competitively on Fashion-MNIST.

© The Authors. Published by SPIE under a Creative Commons Attribution 4.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.

Traditional convolutional neural networks (CNNs)1 have obvious limitations when exploring spatial relationships.
The conventional method for classifying images of the same object taken from different angles is to train multiple neurons to process features and then add a top-level detection neuron to detect the classification result. This approach tends to memorize the dataset rather than generalize a solution, and it requires large amounts of training data to cover the different variants and avoid overfitting. This characteristic also makes CNNs very vulnerable when dealing with tasks involving translated, rotated, or resized samples.

Unlike CNNs, capsule networks (CapsuleNet)2 use capsules3 to capture a series of features and their variants. In a capsule network, higher-layer capsules capture overall features, such as "face" or "car," while lower-layer capsules capture local entity features, such as "nose," "mouth," or "wheels," leading to a completely different approach from that of a convolutional network when abstracting overall features from local features. However, this is not enough: a complete identification process requires both bottom-up feature abstraction and top-down spatial relationship guidance. The capsule network defines a transformation matrix between adjacent capsule layers to implement feature abstraction. Then, dynamic routing...
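As background for the transformation matrices and routing mentioned above, the following is a minimal NumPy sketch of CapsuleNet-style dynamic routing (routing-by-agreement). It illustrates the baseline mechanism that FSc-CapsNet replaces, not the proposed encoder; the capsule counts, vector dimensions, and the `squash` helper shown here are illustrative assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Capsule nonlinearity: scales the vector norm into [0, 1) while keeping direction."""
    sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def dynamic_routing(u, W, iters=3):
    """Illustrative routing-by-agreement between two capsule layers.

    u: (num_lower, in_dim) outputs of lower-layer capsules
    W: (num_lower, num_upper, out_dim, in_dim) transformation matrices
    Returns upper-layer capsule outputs of shape (num_upper, out_dim).
    """
    num_lower, num_upper = W.shape[0], W.shape[1]
    # Prediction vectors: u_hat[i, j] = W[i, j] @ u[i]
    u_hat = np.einsum('ijkl,il->ijk', W, u)
    b = np.zeros((num_lower, num_upper))  # routing logits
    for _ in range(iters):
        # Coupling coefficients: softmax over upper capsules for each lower capsule
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        s = (c[..., None] * u_hat).sum(axis=0)  # weighted sum per upper capsule
        v = squash(s)
        b = b + np.einsum('ijk,jk->ij', u_hat, v)  # agreement update
    return v
```

Note that the routing logits are refined by iterative agreement at inference time; this is exactly the iterative optimization that the proposed feature and spatial relationship encoder moves into backpropagation.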