Yair Kittenplon scite author profile

End-to-end text spotting methods have recently gained attention in the literature due to the benefits of jointly optimizing the text detection and recognition components. Existing methods usually have a distinct separation between the detection and recognition branches, requiring exact annotations for the two tasks. We introduce TextTranSpotter (TTS), a transformer-based approach for text spotting and the first text spotting framework that can be trained in both fully- and weakly-supervised settings. By learning a single latent representation per word detection, and using a novel loss function based on the Hungarian loss, our method alleviates the need for expensive localization annotations. Trained with only text transcription annotations on real data, our weakly-supervised method achieves competitive performance with previous state-of-the-art fully-supervised methods. When trained in a fully-supervised manner, TextTranSpotter shows state-of-the-art results on multiple benchmarks.
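The matching idea behind a Hungarian-style loss can be illustrated with a minimal sketch. This is not the authors' implementation: the edit-distance cost, the function names `edit_distance` and `match_predictions`, and the brute-force search (used here in place of the Hungarian algorithm, which finds the same optimum far more efficiently) are all assumptions made for illustration. It pairs each ground-truth transcription with one predicted word so that the total cost is minimal, using text only and no box annotations.

```python
from itertools import permutations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; a hypothetical text-only matching cost."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def match_predictions(predicted, targets):
    """Find the one-to-one assignment of predictions to targets with minimal cost.

    Returns (total_cost, assignment), where assignment[i] is the index of the
    prediction matched to targets[i]. Brute force over all assignments, so it
    is only suitable for tiny inputs.
    """
    if len(predicted) < len(targets):
        raise ValueError("need at least as many predictions as targets")
    best_cost, best = None, None
    for perm in permutations(range(len(predicted)), len(targets)):
        cost = sum(edit_distance(predicted[p], t) for p, t in zip(perm, targets))
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, perm
    return best_cost, list(best)


# Three predicted words, two transcriptions; the unmatched prediction is ignored.
cost, assignment = match_predictions(["exit", "stpo", "cafe"], ["stop", "exit"])
print(cost, assignment)
```

In a training loop, the matched pairs would determine which prediction each transcription supervises; how the actual loss is built from the matching is described in the paper, not here.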


Towards Models that Can See and Read

Roy, Nuriel, Aberdam et al. 2023

Preprint


Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer

Kittenplon, Lavi, Fogel et al. 2022

Preprint


FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation

Kittenplon, Eldar, Raviv 2020

Preprint


Estimating the 3D motion of points in a scene, known as scene flow, is a core problem in computer vision. Traditional learning-based methods designed to learn end-to-end 3D flow often suffer from poor generalization. Here we present a recurrent architecture that learns a single step of an unrolled iterative alignment procedure for refining scene flow predictions. Inspired by classical algorithms, we demonstrate iterative convergence toward the solution using strong regularization. The proposed method can handle sizeable temporal deformations and suggests a slimmer architecture than competitive all-to-all correlation approaches. Trained on FlyingThings3D synthetic data only, our network successfully generalizes to real scans, outperforming all existing methods by a large margin on the KITTI self-supervised benchmark.
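The notion of unrolling an iterative alignment procedure can be sketched with a toy example. This is not the FlowStep3D network: the hand-written update in the hypothetical `refine_step` (move each warped point a fraction of the way toward its nearest neighbor in the target cloud) stands in for the learned recurrent step, and all names and the step size are assumptions. The point is only that one shared refinement step, applied repeatedly, converges toward the flow.

```python
import math


def nearest(point, cloud):
    """Return the point in `cloud` closest to `point` (Euclidean distance)."""
    return min(cloud, key=lambda q: math.dist(point, q))


def refine_step(source, target, flow, step=0.5):
    """One refinement step: nudge each flow vector toward the nearest target point.

    A hand-written stand-in for a learned update; the same function is reused
    at every iteration, which is what "unrolling" a single step refers to.
    """
    new_flow = []
    for p, f in zip(source, flow):
        warped = tuple(pi + fi for pi, fi in zip(p, f))
        q = nearest(warped, target)
        new_flow.append(tuple(fi + step * (qi - wi)
                              for fi, qi, wi in zip(f, q, warped)))
    return new_flow


def unrolled_flow(source, target, iterations=10):
    """Start from zero flow and apply the shared step a fixed number of times."""
    flow = [(0.0, 0.0, 0.0)] * len(source)
    for _ in range(iterations):
        flow = refine_step(source, target, flow)
    return flow


# Toy scene: the target cloud is the source cloud translated by (1.0, 0.5, 0.0).
source = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
target = [(1.0, 0.5, 0.0), (11.0, 0.5, 0.0), (1.0, 10.5, 0.0)]
flow = unrolled_flow(source, target)
print(flow[0])
```

With a step size of 0.5 the residual halves at every iteration in this toy setting, so ten iterations bring each flow vector to within about 0.1% of the true translation.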


