The matching module plays a critical role in display advertising systems. Unlike sponsored search, where user intentions can be captured naturally through queries, display advertising has no explicit information about user intentions. It is therefore challenging for display advertising systems to match user traffic and ads suitably with respect to both user experience and advertising performance. From the advertiser's view, the system groups users with common properties, such as the same gender or similar shopping interests, into a crowd. Here the term crowd can be viewed as a tag over the users in the same crowd. Advertisers then bid for different crowds and deliver their ads to those targeted users. From the advertising system's view, things are a little different. To the best of our knowledge, the matching module in most industrial display advertising systems follows a two-stage paradigm. When receiving a user visit request, the matching system (i) finds the crowds that the user belongs to, and (ii) retrieves all ads that have targeted those crowds. However, in real-world applications such as display advertising at Alibaba, with the number of crowds reaching tens of millions and the number of ads reaching millions, both stages of matching have to truncate the long-tailed user-crowd or crowd-ad pairs for online serving, under strict latency and computation cost requirements. That is to say, not all advertisers that bid for a given user have the chance to participate in the online matching process. This results in sub-optimal advertising performance for advertisers and also brings a loss of revenue for the advertising platform. In this paper, we study the truncation problem carefully and propose a Truncation-Free Matching System (TFMS). The basic idea of TFMS is to decouple the matching computation from the online processing pipeline.
Instead of executing the two-stage matching when a user visits, TFMS utilizes a near-line, truncation-free matching module to pre-calculate and store the most valuable ads for each user. The online pipeline then only needs to fetch the pre-stored candidate ads as the result of matching. In this way, near-line matching escapes the online system's latency and computation cost limitations and can leverage flexible computation resources to finish the user-ad matching process. Moreover, we can employ arbitrarily advanced models to conduct the top-n candidate selection in the near-line matching system over the full candidate ad set, bringing superior performance compared with the original, roughly truncated online matching system. Since 2019, TFMS has been deployed in our production display advertising system, bringing (i) more than 50% improvement in impressions for advertisers who previously encountered truncation, and (ii) a 9.4% RPM (Revenue Per Mille) gain for the advertising system, which is significant for the business.
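The decoupling described above can be illustrated with a minimal sketch. The function names, the toy scoring function, and the in-memory dictionary standing in for the key-value store are all hypothetical, not from the paper; the point is only the division of labor: an offline, truncation-free ranking over all ads, and an online step reduced to a single lookup.

```python
from heapq import nlargest

def score(user, ad):
    # Placeholder value model; deterministic toy score. In the real system an
    # arbitrarily complex model can be used here, because this computation
    # runs near-line, off the latency-critical serving path.
    return sum(map(ord, user + ad)) % 101

def precompute_matches(users, ads, n=3):
    """Near-line step: for each user, rank ALL candidate ads (no truncation
    of user-crowd or crowd-ad pairs) and store only the top-n for serving."""
    store = {}
    for user in users:
        store[user] = nlargest(n, ads, key=lambda ad: score(user, ad))
    return store

def serve(store, user):
    """Online step: a single key-value fetch replaces the two-stage
    user -> crowd -> ad matching at request time."""
    return store.get(user, [])

store = precompute_matches(["u1", "u2"], ["ad%d" % i for i in range(10)])
print(serve(store, "u1"))
```

Because the heavy ranking happens ahead of time, the online path's cost no longer grows with the number of crowds or ads a user matches.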
Continual learning (CL) aims to learn new tasks without forgetting previous ones. However, existing CL methods require a large amount of raw data, which is often unavailable due to copyright considerations and privacy risks. Instead, stakeholders usually release pre-trained machine learning models as a service (MLaaS), which users can access via APIs. This paper considers two practical-yet-novel CL settings: data-efficient CL (DECL-APIs) and data-free CL (DFCL-APIs), in which CL is performed from a stream of APIs with partial or no raw data. CL under these two new settings faces several challenges: the full raw data is unavailable, model parameters are unknown, the models are heterogeneous with arbitrary architectures and scales, and previous APIs are catastrophically forgotten. To overcome these issues, we propose a novel data-free cooperative continual distillation learning framework that distills knowledge from a stream of APIs into a CL model by generating pseudo data, using only API queries. Specifically, our framework includes two cooperative generators and one CL model, and formulates their training as an adversarial game. We first use the CL model and the current API as fixed discriminators to train the generators via a derivative-free method. The generators adversarially generate hard and diverse synthetic data to maximize the response gap between the CL model and the API. Next, we train the CL model by minimizing the gap between the responses of the CL model and the black-box API on the synthetic data, thereby transferring the API's knowledge to the CL model. Furthermore, we propose a new regularization term based on network similarity to prevent catastrophic forgetting of previous APIs. Our method performs comparably to classic CL with full raw data on the MNIST and SVHN datasets in the DFCL-APIs setting. In the DECL-APIs setting, our method achieves 0.97×, 0.75×, and 0.69× the performance of classic CL on the more challenging CIFAR10, CIFAR100, and MiniImageNet datasets, respectively.
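The adversarial game in the abstract can be sketched in miniature. This is an illustrative toy, not the paper's implementation: the models are single tanh layers, the "generator" step is a random-perturbation hill climb standing in for the paper's derivative-free method, and for symmetry the CL-model update is also shown gradient-free (the real framework can backpropagate through the CL model, since only the API is a black box). All names here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W_api = rng.normal(size=(4, 2))   # hidden weights of the black-box teacher

def black_box_api(x):
    # Fixed teacher: reachable only through queries, no gradients exposed.
    return np.tanh(x @ W_api)

def cl_model(x, W):
    # Student being distilled; W is the CL model's trainable parameter.
    return np.tanh(x @ W)

def response_gap(x, W):
    """MSE between student and API responses on a synthetic batch x."""
    return float(np.mean((cl_model(x, W) - black_box_api(x)) ** 2))

W = rng.normal(size=(4, 2))
x = rng.normal(size=(8, 4))       # synthetic "pseudo data" batch

# Generator step: perturb the synthetic batch and keep the variant that
# MAXIMIZES the response gap, i.e. generate hard examples for the student.
for _ in range(20):
    cand = x + 0.1 * rng.normal(size=x.shape)
    if response_gap(cand, W) > response_gap(x, W):
        x = cand

# CL-model step: accept random parameter perturbations only when they
# MINIMIZE the gap on the hard batch, transferring the API's knowledge.
before = response_gap(x, W)
for _ in range(200):
    cand = W + 0.05 * rng.normal(size=W.shape)
    if response_gap(x, cand) < response_gap(x, W):
        W = cand
print(before, response_gap(x, W))
```

Alternating these two steps is the adversarial game: the generators push the gap up on fresh synthetic data, and the CL model pulls it back down, progressively matching the API's behavior without ever seeing raw data or model parameters.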