This paper considers resource allocation among producers (agents) when the Principal knows nothing about their cost functions, while the agents have Markovian awareness of the Principal's strategies. We model the problem as a stochastic inverse Stackelberg game in a dynamic setting and propose a Q-learning-based algorithm for solving it. The associated Bellman equations involve functions of a single variable, both for the Principal and for the agents. Numerical examples illustrate the new results.
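For readers unfamiliar with Q-learning, the following is a minimal sketch of the generic tabular update rule on which such algorithms are built. It is not the algorithm proposed in the paper: the function name `q_update`, the dictionary layout of the table, and the parameters `alpha` (learning rate) and `gamma` (discount factor) are illustrative assumptions.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One generic tabular Q-learning step (illustrative, not the paper's method).

    q is a nested dict: q[state][action] -> estimated value.
    The entry Q(s, a) is moved toward the sampled Bellman target
    r + gamma * max_a' Q(s', a').
    """
    # Greedy value of the next state under the current estimates.
    best_next = max(q[next_state].values())
    # Sampled Bellman target for the observed transition.
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Hypothetical two-state, two-action table, initialised to zero.
q_table = {
    0: {"low": 0.0, "high": 0.0},
    1: {"low": 0.0, "high": 0.0},
}
updated = q_update(q_table, state=0, action="high", reward=1.0,
                   next_state=1, alpha=0.5, gamma=0.9)
```

In this toy call the table starts at zero, so the target equals the reward and the updated entry is half of it. In a Principal-agent setting each player would maintain its own table, but how the paper couples them is not stated in the abstract.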