Reinforcement learning for optimized trade execution

Nevmyvaka, Yuriy; Feng, Yi; Kearns, Michael

doi:10.1145/1143844.1143929

Cited by 204 publications

(182 citation statements)

References 9 publications

Supporting

Mentioning

171

Contrasting

Unclassified

“…Developing tractable models that account for such data remains a challenge. One initiative to incorporate limit order book data into the decision process is presented by Nevmyvaka et al (2006).…”

Section: Results (mentioning)

confidence: 99%

Strategic execution in the presence of an uninformed arbitrageur

Moallemi

Park

Roy

2012

Journal of Financial Markets

We consider a trader who aims to liquidate a large position in the presence of an arbitrageur who hopes to profit from the trader's activity. The arbitrageur is uncertain about the trader's position and learns from observed price fluctuations. This is a dynamic game with asymmetric information. We present an algorithm for computing perfect Bayesian equilibrium behavior and conduct numerical experiments. Our results demonstrate that the trader's strategy differs significantly from one that would be optimal in the absence of the arbitrageur. In particular, the trader must balance the conflicting desires of minimizing price impact and minimizing information that is signaled through trading. Accounting for information signaling and the presence of strategic adversaries can greatly reduce execution costs.

“…A model that fits the problem of manipulation under the representation Fig. 1 is that of a Markov Decision Process (MDP) [18], [22]. In general, an MDP is defined by the tuple {S, A, T, R}, where S and A are sets of states and actions, respectively (s ∈ S and a ∈ A), R is the set of rewards (r ∈ R), and T is a set of transition probabilities ({P(s′|s, a)} ∈ T, where P(s′|s, a) represents the probability of transitioning to state s′ from s after action a).…”

Section: A Spoofing As a Markov Decision Process (mentioning)

confidence: 99%

Learning unfair trading: A market manipulation analysis from the reinforcement learning perspective

Martínez-Miranda

McBurney

Howard

2016

2016 IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS)

Abstract: Market manipulation is a strategy used by traders to alter the price of financial assets. One type of manipulation is based on the process of buying or selling assets by using several trading strategies, among them spoofing is a popular strategy and is considered illegal by market regulators. Some promising tools have been developed to detect manipulation, but cases can still be found in the markets. In this paper we model spoofing and pinging trading from a macroscopic perspective of profit maximisation, two strategies that differ in the legal background but share the same elemental concept of market manipulation. We use a reinforcement learning framework within the full and partial observability of Markov decision processes and analyse the underlying behaviour of the manipulators by finding the causes of what encourages the traders to perform fraudulent activities. Procedures can be applied to counter the problem as our model predicts the activity of the manipulators.
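The MDP tuple {S, A, T, R} quoted in the citation statement above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `MDP`, the two toy states and actions, and the choice to store rewards per (state, action) pair are hypothetical and do not come from the cited paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MDP:
    """Finite MDP as the tuple {S, A, T, R} (illustrative layout only)."""
    states: frozenset      # S
    actions: frozenset     # A
    transitions: dict      # T: (s, a) -> {s_next: P(s_next | s, a)}
    rewards: dict          # R: (s, a) -> reward (assumed indexing)

    def validate(self, tol=1e-9):
        """Check that every P(. | s, a) is a distribution over S."""
        for (s, a), dist in self.transitions.items():
            if s not in self.states or a not in self.actions:
                raise ValueError(f"unknown state/action pair: {(s, a)}")
            if any(s_next not in self.states for s_next in dist):
                raise ValueError(f"unknown successor state for {(s, a)}")
            if abs(sum(dist.values()) - 1.0) > tol:
                raise ValueError(f"probabilities for {(s, a)} do not sum to 1")
        return True

    def p(self, s_next, s, a):
        """Return P(s_next | s, a), or 0.0 if the transition is not listed."""
        return self.transitions[(s, a)].get(s_next, 0.0)


# Hypothetical toy instance: a trader who is either flat or long.
toy = MDP(
    states=frozenset({"flat", "long"}),
    actions=frozenset({"hold", "buy"}),
    transitions={
        ("flat", "hold"): {"flat": 1.0},
        ("flat", "buy"): {"long": 0.8, "flat": 0.2},
        ("long", "hold"): {"long": 1.0},
        ("long", "buy"): {"long": 1.0},
    },
    rewards={
        ("flat", "hold"): 0.0,
        ("flat", "buy"): -1.0,
        ("long", "hold"): 0.5,
        ("long", "buy"): -1.0,
    },
)
toy.validate()
```

Storing T as a nested dictionary keeps the lookup for P(s′|s, a) direct; a dense array would be the more usual choice for larger state spaces.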

“…that can for example vary their aggressiveness, are superior to static strategies. Nevmyvaka [30] proposed dynamic price adjustment strategy, where limit order's price is revised every 30 seconds adapting to the changing market state. Wang [35] proposed a dynamic focus strategy, which dynamically adjusts volume according to real-time update of state variables such as inventory and order book imbalance, and showed that dynamic focus strategy can outperforms a static limit order strategy.…”

Section: Types (mentioning)

confidence: 99%

“…The objective function we used here is ratio of the difference between the VWAPs of the 30 orders and the entire executed orders generated from the ASM simulation to the entire executed orders' VWAP, which are VWAP_30 and VWAP_global respectively. For both buy and sell orders, the smaller the VWAP Ratio, the better the strategy is.…”

Section: GA Strategies (mentioning)

confidence: 99%

Evolutionary Computation and Trade Execution

Cui

Brabazon

O’Neill

2010

Natural Computing in Computational Finance

Summary. Although there is a plentiful literature on the use of evolutionary methodologies for the trading of financial assets, little attention has been paid to the issue of efficient trade execution. Trade execution is concerned with the actual mechanics of buying or selling the desired amount of a financial instrument of interest. This chapter introduces the concept of trade execution and outlines the limited prior work applying evolutionary computing methods for this task. Furthermore, we build an Agent-based Artificial Stock Market and apply a Genetic Algorithm to evolve an efficient trade execution strategy. Finally, we suggest a number of opportunities for future research.
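The VWAP Ratio described in the citation statement above can be read as (VWAP_30 − VWAP_global) / VWAP_global. The sketch below computes it under that reading. The function names and the (price, volume) fill format are assumptions, and the quoted text does not say how the sign is handled so that smaller is better for sell orders as well, so this shows the plain buy-side form only.

```python
def vwap(fills):
    """Volume-weighted average price of a list of (price, volume) fills."""
    total_volume = sum(volume for _, volume in fills)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(price * volume for price, volume in fills) / total_volume


def vwap_ratio(strategy_fills, market_fills):
    """(VWAP_30 - VWAP_global) / VWAP_global, as read from the quoted text.

    strategy_fills: fills of the strategy's own orders (the "30 orders").
    market_fills:   all executed orders in the simulation.
    Sign handling for sell orders is not specified in the quote and is
    left out here.
    """
    vwap_30 = vwap(strategy_fills)
    vwap_global = vwap(market_fills)
    return (vwap_30 - vwap_global) / vwap_global


# Hypothetical numbers: the strategy bought at an average of 10.5
# while the market as a whole traded at an average of 10.0.
example_ratio = vwap_ratio(
    strategy_fills=[(10.0, 100), (11.0, 100)],
    market_fills=[(10.0, 300), (10.0, 100)],
)
```

For a buyer, a positive ratio means the strategy paid more than the market-wide average price, so lower values indicate better execution.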
