Multi-agent deep reinforcement learning and GAN-based market simulation for derivatives pricing and dynamic hedging

Advances in computing power have enabled machine learning algorithms to learn directly from large amounts of data. Deep reinforcement learning is a particularly powerful approach in which agents learn by interacting with an environment. While many traders and investment managers rely on traditional statistical and stochastic methods to price assets and develop trading and hedging strategies, deep reinforcement learning has proven to be an effective way to learn optimal pricing and hedging strategies, removing the need for parametric assumptions about the underlying market dynamics by learning directly from data. This study examines machine learning methods for a data-driven approach to derivatives pricing and dynamic hedging. Because methods such as reinforcement learning require large amounts of training data, we explore generative adversarial networks (GANs) for generating realistic market data from historical data. This synthetic data is used to train the reinforcement learning framework and to evaluate its robustness. The results demonstrate the effectiveness of deep reinforcement learning for pricing derivatives and hedging positions within the proposed GAN-based market simulation framework.
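The pipeline the abstract describes, a generative model supplying price paths on which an RL agent learns to hedge, can be sketched minimally as follows. This is an illustrative sketch, not the paper's implementation: the GAN generator is stubbed with geometric Brownian motion, and the environment, reward, and fixed hedge policy are simplified assumptions.

```python
import numpy as np

def stub_generator(n_steps, s0=100.0, mu=0.0, sigma=0.2, dt=1 / 252, rng=None):
    """Placeholder for a trained GAN generator: returns one synthetic price path."""
    rng = rng or np.random.default_rng()
    shocks = rng.normal((mu - 0.5 * sigma**2) * dt, sigma * np.sqrt(dt), n_steps)
    return s0 * np.exp(np.cumsum(np.concatenate([[0.0], shocks])))

class HedgingEnv:
    """One episode = one generated path; the agent re-hedges a short call."""
    def __init__(self, n_steps=30, strike=100.0):
        self.n_steps, self.strike = n_steps, strike

    def reset(self):
        self.path = stub_generator(self.n_steps)
        self.t, self.position, self.cash = 0, 0.0, 0.0
        return np.array([self.path[0], 0.0])  # observation: (price, holding)

    def step(self, hedge_ratio):
        s_now, s_next = self.path[self.t], self.path[self.t + 1]
        self.cash -= (hedge_ratio - self.position) * s_now  # rebalancing cost (tracked, unused in reward)
        self.position = hedge_ratio
        self.t += 1
        done = self.t == self.n_steps
        pnl_step = self.position * (s_next - s_now)
        reward = -abs(pnl_step)  # crude risk penalty; the literature uses variance/CVaR objectives
        if done:
            reward -= max(s_next - self.strike, 0.0)  # option payoff owed at expiry
        return np.array([s_next, self.position]), reward, done

# A fixed 0.5 hedge stands in for a learned policy.
env = HedgingEnv()
obs = env.reset()
total, done = 0.0, False
while not done:
    obs, r, done = env.step(0.5)
    total += r
```

In the actual framework, `stub_generator` would be replaced by sampling from the trained GAN, and the fixed hedge ratio by the policy network's output.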

Improving Generalization in Reinforcement Learning–Based Trading by Using a Generative Adversarial Market Model

With the increasing sophistication of artificial intelligence, reinforcement learning (RL) has been widely applied to portfolio management. However, drawbacks remain. Specifically, because the training environments of RL-based portfolio optimization frameworks in the literature are usually constructed from historical price data, the agent may

1) violate the definition of a Markov decision process (MDP),

2) ignore its own market impact, or

3) fail to account for the causal relationships in the interaction process.

These issues ultimately lead the agent to generalize poorly. To overcome them, and in particular to help RL-based portfolio agents generalize better, we introduce an interactive training environment that uses a generative model, the limit order book generative adversarial model (LOB-GAN), to simulate financial markets. Specifically, LOB-GAN models market ordering behavior, and its generator serves as a market behavior simulator. The simulator is combined with a real securities matching system to construct a simulated financial market, called a virtual market. The virtual market is then used as an interactive training environment for RL-based portfolio agents. Experimental results show that our framework improves out-of-sample portfolio performance by 4%, outperforming other generalization strategies.
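The virtual-market idea, a generator posting orders into a matching system that the agent then trades against, can be illustrated with a minimal sketch. The LOB-GAN generator is replaced here by a random order-flow stub, and the tiny price-priority matching engine is an illustrative assumption, not the paper's securities matching system; what it does show is why the agent's own market impact appears naturally in such an environment.

```python
import heapq
import random

class Book:
    """Minimal limit order book: best-price priority via heaps."""
    def __init__(self):
        self.bids, self.asks = [], []  # heaps of (-price, qty) and (price, qty)

    def add_limit(self, side, price, qty):
        heap = self.bids if side == "buy" else self.asks
        key = -price if side == "buy" else price
        heapq.heappush(heap, (key, qty))

    def market_buy(self, qty):
        """Consume asks from best price up; deeper fills cost more (impact)."""
        filled, paid = 0, 0.0
        while qty > 0 and self.asks:
            price, avail = heapq.heappop(self.asks)
            take = min(qty, avail)
            filled += take
            paid += take * price
            qty -= take
            if avail > take:
                heapq.heappush(self.asks, (price, avail - take))
        return filled, paid

def stub_order_flow(book, n, rng):
    """Placeholder for the LOB-GAN generator: posts random limit orders."""
    for _ in range(n):
        side = rng.choice(["buy", "sell"])
        price = round(100.0 + (rng.random() - 0.5) * 2, 2)
        book.add_limit(side, price, rng.randint(1, 5))

rng = random.Random(0)
book = Book()
stub_order_flow(book, 50, rng)
qty, cost = book.market_buy(10)   # the agent's own market order
avg_px = cost / qty               # walking the book: execution price reflects impact
```

Because the agent's order consumes liquidity posted by the generator, its fills depend on its own actions, which is exactly the feedback missing from environments that replay historical prices.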

Reprinted from: blog.csdn.net/sinat_37574187/article/details/130301525