We consider a social planner faced with a stream of myopic selfish agents. The goal of the social planner is to maximize the social welfare; however, she is limited to using only information asymmetry (regarding previous outcomes) and cannot use any monetary incentives. The planner recommends actions to agents, but her recommendations need to be Bayesian incentive compatible to be followed by the agents. Our main result is an optimal algorithm for the planner in the case that the actions' realizations are deterministic and have limited support, making significant progress on this open problem. Our optimal protocol has two interesting features. First, it always completes the exploration of a priori more beneficial actions before exploring a priori less beneficial actions. Second, the randomization in the protocol is correlated across agents and actions (and not independent at each decision time).

The social planner in our model cannot use monetary incentives. (This can be due to regulatory constraints, business model, social norms, or any other reason.) The main advantage of the planner is the information asymmetry, namely, the fact that the planner has much more information than the agents. As a motivating example for information asymmetry, consider a GPS driving application. The application (social planner) recommends to the drivers (agents) the best route to drive (action), given the changing road delays, and observes the actual road delays when the route is driven. While the application can recommend driving routes, ultimately the driver decides which route to actually drive. The application periodically needs to send drivers on exploratory routes, whose actual delay is uncertain, in order to observe their delay. The driver is aware that the application has updated information regarding the current delays on various roads.
For this reason, the driver would be willing to follow the recommendation even if she knows that there is a small probability that she is asked to explore. At the other extreme, if the driver believed that with high probability a certain recommended route has a higher delay, she might drive an alternate route. This inherent balancing of exploration and exploitation, while satisfying the agents' incentives, is at the core of this work.

The abstract model that we consider is the following. There is a finite set of actions, and for each action there is a prior distribution on its rewards. A social planner is faced with a sequence of myopic selfish agents, each of whom appears only once. The social planner would like to maximize the social welfare, i.e., the sum of the agents' utilities. The social planner recommends an action to each agent, and if the recommendation is Bayesian incentive compatible (henceforth, BIC), the agent follows it. This model was presented in Kremer et al. [10] and studied in [11][12][13]. The work of Kremer et al. [10] presented an optimal algorithm for the social planner in the case of two actions with deterministic outcome. (Deterministic outcome implies that each time the ...
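To make the BIC constraint concrete in the two-action, deterministic-outcome setting, the following sketch illustrates the basic idea (this is an illustrative toy, not the optimal protocol of this paper or of Kremer et al.; the uniform prior, the known reward MU2 of action 2, and all function names are our own assumptions). Agent 1 can be asked to take action 1, since its prior mean is higher. A later agent can be steered to explore action 2 only when the observed reward x1 falls below a threshold t chosen so that, averaged over the prior, hearing "take action 2" still leaves action 2 weakly better, i.e., E[x1 | recommend 2] <= E[x2].

```python
import random

random.seed(0)

# Assumed toy setup: action 2 has a known deterministic reward MU2, while
# action 1's deterministic reward x1 is drawn once from a U[0,1] prior
# (prior mean 0.5 > MU2, so action 1 is a priori more beneficial).
MU2 = 0.4
samples = [random.uniform(0, 1) for _ in range(50_000)]

def bic_threshold(mu2, xs, grid=100):
    """Largest t (on a grid) with E[x1 | x1 < t] <= mu2, estimated on xs.

    Recommending exploration exactly when x1 < t then satisfies the BIC
    condition: conditional on the recommendation, action 2 is weakly better.
    """
    best = 0.0
    for k in range(1, grid + 1):
        t = k / grid
        below = [x for x in xs if x < t]
        if below and sum(below) / len(below) <= mu2:
            best = t
    return best

t = bic_threshold(MU2, samples)
explored = [x for x in samples if x < t]
cond_mean = sum(explored) / len(explored)
print(f"threshold t = {t:.2f}, E[x1 | recommend explore] = {cond_mean:.3f}")
```

For the uniform prior, E[x1 | x1 < t] = t/2, so the threshold lands near t = 2*MU2 = 0.8: the planner can credibly send an agent to explore whenever the observed reward of action 1 is below 0.8, even though action 1 looked better a priori. This is the sense in which information asymmetry alone, without payments, can incentivize exploration.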