2019
DOI: 10.48550/arxiv.1904.10554
Preprint
Deep Q-Learning for Nash Equilibria: Nash-DQN

Cited by 8 publications (14 citation statements) | References 0 publications
“…Pioneered by [12,35,36,45,46,47], various reinforcement learning algorithms have been implemented and perform extremely successfully in portfolio optimization problems with transaction costs. Reinforcement learning can even solve for Nash equilibria; see [16] for details. The key idea is to directly parametrize the optimal trading rate and optimize the discretized version of preference (2.6).…”
Section: Deep Learning-based Numerical Algorithms | mentioning
confidence: 99%
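The direct-parametrization idea in the statement above can be made concrete with a minimal sketch: a small network maps the state to a trading rate, and a discretized running objective is maximized by stochastic gradient ascent. The toy price dynamics, the cost and risk coefficients, and the terminal inventory penalty below are illustrative assumptions, not the construction used in the cited papers.

```python
# Minimal sketch (assumed setup): a neural network maps (time, inventory,
# price deviation) to a trading rate; a discretized reward with quadratic
# transaction costs and an inventory penalty is maximized by gradient ascent.
import torch
import torch.nn as nn

torch.manual_seed(0)
T, dt, batch = 50, 0.02, 256              # illustrative horizon and step size
temp_impact, risk_aversion = 0.01, 0.1    # hypothetical cost/risk parameters

policy = nn.Sequential(nn.Linear(3, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(200):
    q = torch.ones(batch, 1)              # initial inventory
    s = torch.zeros(batch, 1)             # price deviation from initial level
    reward = torch.zeros(batch, 1)
    for k in range(T):
        t = torch.full((batch, 1), k * dt)
        nu = policy(torch.cat([t, q, s], dim=1))        # trading rate nu(t, q, s)
        q = q + nu * dt                                  # inventory update
        s = s + 0.3 * dt**0.5 * torch.randn(batch, 1)    # toy price noise
        # running reward: trading gain minus impact cost and inventory penalty
        reward = reward + (nu * s - temp_impact * nu**2 - risk_aversion * q**2) * dt
    loss = -(reward - 0.5 * q**2).mean()                 # penalize terminal inventory
    opt.zero_grad(); loss.backward(); opt.step()
```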
“…In the meantime, with the development of modern model-free techniques, reinforcement learning algorithms are also widely used in single-agent optimization problems. Indeed, as shown in the groundbreaking papers [12,14,15,16,35,36,45,46,47], we treat the utility functions as targets and directly parametrize and learn the optimal trading policy. Moreover, reinforcement learning frameworks are introduced and analyzed rigorously in [53,54].…”
Section: Introduction | mentioning
confidence: 99%
“…LeCun et al., 2015; Silver et al., 2016; Goodfellow et al., 2016), especially in financial mathematics (see e.g. Al-Aradi et al., 2018; Hu, 2019; Casgrain et al., 2019; Horvath et al., 2021; Campbell et al., 2021; Carmona and Laurière, 2021). The use of compositions of simple functions (usually referred to as propagation and activation functions) through several layers does a good job in modeling complicated functions.…”
Section: Actor-critic Algorithm | mentioning
confidence: 99%
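As a purely illustrative instance of the layered composition mentioned above, the sketch below alternates affine maps with a tanh nonlinearity; the layer sizes and random weights are arbitrary assumptions.

```python
# Illustrative only: a two-hidden-layer network as a composition of affine
# ("propagation") maps and tanh activations, mapping a scalar to a scalar.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((16, 1)), np.zeros(16)
W2, b2 = rng.standard_normal((16, 16)), np.zeros(16)
W3, b3 = rng.standard_normal((1, 16)), np.zeros(1)

def f(x):
    h1 = np.tanh(W1 @ x + b1)   # layer 1: affine map, then activation
    h2 = np.tanh(W2 @ h1 + b2)  # layer 2: same pattern, composed
    return W3 @ h2 + b3         # linear read-out layer

print(f(np.array([0.5])))       # evaluate the composed function at x = 0.5
```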
“…While it has worked well in practice, IQL struggles on more difficult multi-agent coordination and control tasks due to the instability caused by non-stationarity. Strategies such as Nash-Q learning [2], [11], [13], [36], [37], Minimax [11], [19], [36], [37], and Friend-or-Foe Q-Learning [11], [18], [36], [37] have been proposed to solve stochastic games by finding a Nash equilibrium policy. While these methods work in stochastic game settings, the complexity of the task at hand becomes a significant bottleneck, as non-stationarity causes unstable learning.…”
Section: Introduction | mentioning
confidence: 99%
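To make the equilibrium-based value update behind these methods concrete, here is a minimal sketch in the spirit of minimax-Q for a two-player zero-sum stochastic game: the greedy max over actions in standard Q-learning is replaced by the maximin value of the stage game at the next state, computed by a small linear program. The state space, rewards, transitions, and learning rate are hypothetical placeholders; general-sum Nash-Q would replace the linear program with a stage-game Nash solver.

```python
# Minimal minimax-Q sketch (zero-sum, two actions per player): the target uses
# the maximin value of the 2x2 stage game Q[s_next], obtained by an LP.
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(M):
    """Maximin value of the zero-sum matrix game M (row player maximizes)."""
    n_rows, n_cols = M.shape
    # variables x = (v, p_1, ..., p_n_rows); maximize v s.t. (M^T p)_j >= v, sum(p) = 1
    c = np.zeros(n_rows + 1); c[0] = -1.0                 # minimize -v
    A_ub = np.hstack([np.ones((n_cols, 1)), -M.T])        # v - (M^T p)_j <= 0
    b_ub = np.zeros(n_cols)
    A_eq = np.hstack([[[0.0]], np.ones((1, n_rows))])     # probabilities sum to 1
    b_eq = np.array([1.0])
    bounds = [(None, None)] + [(0.0, 1.0)] * n_rows
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[0]

n_states, gamma, alpha = 2, 0.9, 0.1
Q = np.zeros((n_states, 2, 2))                            # Q[s, a1, a2] for player 1
rng = np.random.default_rng(0)
for _ in range(500):
    s = rng.integers(n_states)
    a1, a2 = rng.integers(2), rng.integers(2)
    r = rng.standard_normal()                             # placeholder reward
    s_next = rng.integers(n_states)                       # placeholder transition
    target = r + gamma * matrix_game_value(Q[s_next])     # equilibrium value of next stage game
    Q[s, a1, a2] += alpha * (target - Q[s, a1, a2])
```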