Gabriele Prato scite author profile

Gabriele Prato

5 Publications

52 Citation Statements Received

85 Citation Statements Given


Affiliations

Université de Montréal, Politecnico di Milano

Publications


Fully Quantized Transformer for Machine Translation

Prato, Charlaix, Rezagholizadeh (2020)


State-of-the-art neural machine translation methods employ massive numbers of parameters. Drastically reducing the computational costs of such methods without affecting performance has so far been unsuccessful. To this end, we propose FullyQT: an all-inclusive quantization strategy for the Transformer. To the best of our knowledge, we are the first to show that it is possible to avoid any loss in translation quality with a fully quantized Transformer. Indeed, our 8-bit models match or exceed the BLEU scores of their full-precision counterparts on most tasks. Comparing ourselves to all previously proposed methods, we achieve state-of-the-art quantization results.
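As a rough illustration of what 8-bit quantization means, the sketch below maps floating-point values to integer codes in the range 0 to 255 and back. It is a generic uniform min-max scheme written for illustration only; the abstract does not describe the FullyQT procedure in detail, so the function names `quantize` and `dequantize` and the example weights are assumptions, not the authors' method.

```python
def quantize(values, bits=8):
    """Uniform min-max quantization (illustrative, not the FullyQT method).

    Maps each float to an integer code in [0, 2**bits - 1].
    """
    lo, hi = min(values), max(values)
    levels = 2 ** bits - 1
    # Guard against a zero range when all values are identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Map integer codes back to approximate float values."""
    return [c * scale + lo for c in codes]


# Hypothetical weights, chosen only for demonstration.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

With this kind of rounding, the reconstruction error of each value is bounded by half of one quantization step (`scale / 2`), which is why finer bit widths lose less information.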


Fully Quantized Transformer for Machine Translation

Prato, Charlaix, Rezagholizadeh (2019, preprint)


Towards Lossless Encoding of Sentences

Prato, Duchesneau, Chandar, et al. (2019)


A lot of work has been done in the field of image compression via machine learning, but not much attention has been given to the compression of natural language. Compressing text into lossless representations while making features easily retrievable is not a trivial task, yet has huge benefits. Most methods designed to produce feature-rich sentence embeddings focus solely on performing well on downstream tasks and are unable to properly reconstruct the original sequence from the learned embedding. In this work, we propose a near-lossless method for encoding long sequences of text, as well as all of their sub-sequences, into feature-rich representations. We test our method on sentiment analysis and show good performance across all sub-sentence and sentence embeddings.


Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers

Prato, Guiroy, Caballero, et al. (2021, preprint)


The empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly in light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-E. Accurately predicting neural network performance with increasing resources such as data, compute and model size provides a more comprehensive evaluation of different approaches across multiple scales, as opposed to traditional point-wise comparisons of fixed-size models on fixed-size benchmarks. Most importantly, it makes it possible to focus on the approaches that scale best and are thus the most promising for the future. In this work, we consider the challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase differs from the source (training) data distribution, in the sense that it includes new image classes not encountered during training. Our current main goal is to investigate how the amount of pre-training data affects the few-shot generalization performance of standard image classifiers. Our key observations are that (1) such performance improvements are well approximated by power laws (linear log-log plots) as the training set size increases, (2) this applies both when the target data comes from the same domain as the training data and when it comes from a different domain (i.e., new classes), and (3) few-shot performance on new classes converges at a faster rate than the standard classification performance on previously seen classes. Our findings shed new light on the relationship between scale and generalization.
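The phrase "linear log-log plots" refers to the fact that a power law error ≈ a · n^(−b) becomes a straight line once both axes are log-transformed: log(error) = log(a) − b · log(n). A minimal sketch of fitting such a law by ordinary least squares in log-log space follows. The function name `fit_power_law` and the synthetic data are assumptions made for illustration; they are not taken from the paper, and the paper's own fitting procedure may differ.

```python
import math


def fit_power_law(sizes, errors):
    """Fit errors ~ a * sizes**(-b) via least squares on log-log data.

    Returns (a, b). Illustrative only; not the paper's fitting code.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Standard closed-form slope and intercept of a simple linear regression.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic data generated from a known power law (a=2.0, b=0.25).
sizes = [1e3, 1e4, 1e5, 1e6]
errors = [2.0 * n ** -0.25 for n in sizes]
a, b = fit_power_law(sizes, errors)
```

Because the synthetic data lies exactly on a power law, the fit recovers the generating coefficients up to floating-point error. On real measurements the points scatter around the line, and the fitted exponent `b` summarizes how quickly performance improves with more data.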


PatchBlender: A Motion Prior for Video Transformers

Prato, Song, Rajendran, et al. (2022, preprint)

