“…With the commercial success of deep learning in fields such as computer vision [1], natural language processing [2], speech recognition [3], and language translation [4], an increasing number of models are trained on central servers and then deployed to remote devices, often personalized to a specific user's needs. Personalization requires models that can be updated inexpensively, by minimizing the number of parameters that must be stored and/or transmitted, and it frequently calls for few-shot learning methods because the amount of training data from an individual user may be small [5]. At the same time, for privacy, security, and performance reasons, it can be advantageous to use federated learning, in which a model is trained across an array of remote devices, each holding different data, and gradient or parameter updates, rather than the training data itself, are shared with a central server [6].…”
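The federated setup described above, where only parameter updates leave the device, can be illustrated with a minimal federated-averaging sketch in the style of FedAvg. This is a toy illustration, not the paper's method: the two clients, their quadratic losses, and all hyperparameters (`lr`, `steps`, round count) are hypothetical, chosen so the behavior is easy to verify.

```python
import numpy as np

def local_update(weights, grad_fn, lr=0.1, steps=5):
    """One client's local training: a few gradient steps on its private data.
    Only the resulting weights (not the data) are returned to the server."""
    w = weights.copy()
    for _ in range(steps):
        w -= lr * grad_fn(w)
    return w

def federated_average(global_w, client_grad_fns):
    """One server round: broadcast the global weights, collect each client's
    locally updated weights, and average them (FedAvg with equal weighting)."""
    client_ws = [local_update(global_w, g) for g in client_grad_fns]
    return np.mean(client_ws, axis=0)

# Hypothetical example: two clients, each fitting a scalar w to a different
# private target t via the loss (w - t)^2, so grad = 2 * (w - t).
targets = [3.0, 5.0]
grad_fns = [lambda w, t=t: 2 * (w - t) for t in targets]

w = np.zeros(1)
for _ in range(50):
    w = federated_average(w, grad_fns)
print(w)  # converges near the mean of the client targets, 4.0
```

The server never sees either client's target (a stand-in for private data); it only averages the weight vectors, which is the property the privacy argument in the text relies on.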