[Oxford University PhD Thesis] Adaptive Robust Control Combined with Statistical Learning


Source: 专知
This article is a thesis introduction; suggested reading time: 5 minutes. In this thesis, the goal is to study a robust stochastic control problem in which the agent does not know the parameter values of the underlying process.


In stochastic control problems, an agent chooses an optimal policy to maximize or minimize a performance criterion: the expectation of a reward function in standard control problems, or a nonlinear expectation in robust control problems. In parametric stochastic control problems, the agent must know the values of the model parameters of the stochastic system in order to specify the optimal policy correctly. In practice, however, the agent almost never knows these parameter values exactly.

In this thesis, our goal is to study a robust stochastic control problem in which the agent does not know the parameter values of the underlying process. We therefore formulate a stochastic control problem that does not assume knowledge of the model parameters: the agent uses the observable processes to estimate the parameter values while solving the control problem within a robust framework.
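As a concrete illustration of the robust framework (a minimal sketch with made-up numbers, not the construction from the thesis): one natural way to be robust against parameter uncertainty is to act against the worst-case parameter inside a confidence region around the current estimate, a region that shrinks as observations accrue.

```python
import numpy as np

def robust_drift(theta_hat: float, sigma: float, t: float, z: float = 1.96) -> float:
    """Worst-case (pessimistic) drift inside the confidence interval
    [theta_hat - z*sigma/sqrt(t), theta_hat + z*sigma/sqrt(t)].

    theta_hat : current point estimate of the drift (hypothetical)
    sigma     : known diffusion coefficient
    t         : observation horizon so far; the interval narrows as t grows
    """
    return theta_hat - z * sigma / np.sqrt(t)

# The conservatism fades as data accrue: with little data the agent
# plans for a much worse drift than its estimate; later, barely so.
print(robust_drift(0.08, 0.2, t=1.0))    # early: strongly pessimistic
print(robust_drift(0.08, 0.2, t=100.0))  # later: close to the estimate
```

The numbers (estimate 0.08, volatility 0.2) are illustrative only; the point is that the same estimate yields a very different robust decision depending on how much data supports it.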

This new stochastic control problem has two key components. The first is the parameter estimation part, in which the agent uses realizations of the underlying process to estimate the unknown parameters of the stochastic system. We pay special attention to online parameter estimation; online estimators are an essential ingredient because they allow the agent to obtain an optimal policy in feedback form. The second is the stochastic control part, where the question is how to design a time-consistent stochastic control problem in which the agent can estimate parameters and optimize its strategy simultaneously. We address each component in a continuous-time setting and then take a close look at the utility maximization problem within this framework.
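To make the online-estimation component concrete, here is a minimal sketch (not code from the thesis) for the simplest case: estimating the unknown drift theta of dX_t = theta dt + sigma dW_t. The maximum-likelihood estimate at time t is X_t / t, and it can be updated recursively from each new increment, which is what makes it usable inside a feedback policy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: dX_t = theta*dt + sigma*dW_t with unknown drift theta.
theta_true, sigma, dt, n = 0.5, 0.2, 0.01, 10_000

# Simulate one discretized realization of the observable process.
dX = theta_true * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)
X = np.cumsum(dX)

# Online MLE of the drift: theta_hat(t) = X_t / t, maintained recursively
# from each new increment instead of reprocessing the whole path.
theta_hat, t = 0.0, 0.0
for dx in dX:
    t += dt
    theta_hat += (dx - theta_hat * dt) / t  # recursive update

print(theta_hat)  # close to theta_true for large t
```

The recursion reproduces X_t / t exactly: each step folds the new increment into the running estimate in O(1) work, which is the defining property of an online estimator.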

This question is interesting from both a theoretical and a practical point of view. Standard stochastic control problems typically assume that the agent knows the model parameter values, a strong assumption that rarely holds in practice. By relaxing it, the new framework applies to many classical stochastic control problems, such as utility maximization, in which the agent has only partial knowledge of the parameters of the stochastic system. As described above, there are two key components: the parameter values are estimated over time as more information becomes available, with online estimators allowing the agent to obtain (Markovian) policies in feedback form, and the control problem is designed to be time-consistent so that online estimation and the derivation of an optimal policy can proceed simultaneously, all in a continuous-time setting.
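The feedback form of such a policy can be illustrated with the classical Merton problem under CRRA utility (a hypothetical plug-in sketch, not the robust solution developed in the thesis): for geometric Brownian stock dynamics the optimal risky fraction is pi = (mu - r) / (gamma * sigma^2), and substituting an online drift estimate turns this formula into a policy that updates at every observation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical market: dS_t/S_t = mu*dt + sigma*dW_t; r is the risk-free rate.
mu_true, sigma, r, gamma = 0.08, 0.2, 0.02, 2.0   # gamma: CRRA risk aversion
dt, n = 1 / 252, 2520                              # ten years of daily data

log_ret = (mu_true - 0.5 * sigma**2) * dt \
    + sigma * np.sqrt(dt) * rng.standard_normal(n)

# Feedback policy: at each step plug the current online drift estimate
# into the Merton fraction pi = (mu_hat - r) / (gamma * sigma^2).
mu_hat, t, pi_path = r, 0.0, []
for lr in log_ret:
    t += dt
    mu_hat += (lr + 0.5 * sigma**2 * dt - mu_hat * dt) / t  # online drift MLE
    pi_path.append((mu_hat - r) / (gamma * sigma**2))

print(pi_path[-1])  # plug-in estimate of the Merton fraction
```

Because drift is estimated slowly (the error shrinks only like sigma/sqrt(t)), the naive plug-in policy fluctuates considerably, which is one motivation for treating estimation and control jointly in a robust framework rather than plugging estimates in as if they were true values.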


Origin blog.csdn.net/tMb8Z9Vdm66wH68VX1/article/details/131388294