Jack's Car Rental is a classic Markov decision process (MDP) problem. The figure comes from page 98 of the book Reinforcement Learning: An Introduction.
The term project for my C programming course was to solve this problem, and this post records the details.
The Jack's Car Rental problem can be summarized as follows:
- Jack has two rental locations, Location 1 and Location 2;
- Each location can hold at most 20 cars;
- Each car rented out earns a profit of $10;
- The numbers of cars rented out and returned each day follow Poisson distributions:
  - Location 1:
    - Rentals: λ = 3
    - Returns: λ = 3
  - Location 2:
    - Rentals: λ = 4
    - Returns: λ = 2
- Every night Jack can move cars between the two locations, at most 5 cars per night, at a cost of $2 per car moved;
- Discount rate γ = 0.9;
- Question: which car-moving policy maximizes the expected profit?
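The parameters listed above can be written down directly, together with the Poisson probabilities they imply. The following is a minimal Python sketch rather than the course's C solution. All names (`MAX_CARS`, `poisson_pmf`, `expected_rental_income`, and so on) are made up for illustration. It also assumes, as the book states but this summary does not, that rental requests exceeding the cars on hand are simply lost.

```python
import math

# Problem parameters taken from the summary above (names are illustrative).
MAX_CARS = 20            # capacity of each location
MAX_MOVE = 5             # most cars that can be moved in one night
RENT_REWARD = 10         # dollars earned per car rented out
MOVE_COST = 2            # dollars paid per car moved
GAMMA = 0.9              # discount rate
RENT_LAMBDA = (3, 4)     # mean daily rentals at Location 1 and Location 2
RETURN_LAMBDA = (3, 2)   # mean daily returns at Location 1 and Location 2


def poisson_pmf(n, lam):
    """Return P(X = n) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


def expected_rental_income(cars, lam):
    """Expected one-day rental income when `cars` cars are available.

    Assumption: requests beyond the available stock are lost, so the
    number of cars actually rented is min(requests, cars).
    """
    income = 0.0
    tail = 1.0  # probability mass of outcomes not yet counted
    for requests in range(cars):
        p = poisson_pmf(requests, lam)
        income += RENT_REWARD * requests * p
        tail -= p
    # Every outcome with requests >= cars rents out exactly `cars` cars.
    income += RENT_REWARD * cars * tail
    return income
```

With plenty of cars on hand the expected income approaches $10 times λ, while with a single car it is $10 times the probability of at least one request.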
From the information above, we can work out the following:
- Each location holds between 0 and 20 cars, which gives 21 possible counts per location, so the total number of states is 21 * 21 = 441 (zero cars is included);
- At most 5 cars are moved per night, so the action set is A = {(-5, 5), (-4, 4), ..., (0, 0), (1, -1), ..., (5, -5)}, where (a1, a2) stands for (number of cars moved out of Location 1, number of cars moved out of Location 2), and a negative value means cars are moved in.
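The state and action sets described above are small enough to enumerate explicitly. Here is a rough Python sketch. The names `states`, `actions` and `feasible_actions` are hypothetical, and the feasibility rule (a location cannot send out more cars than it currently holds) is an assumption that the text above does not spell out. What happens when a move would push a location past 20 cars is deliberately left out.

```python
MAX_CARS = 20  # capacity of each location
MAX_MOVE = 5   # most cars that can be moved in one night

# A state is (cars at Location 1, cars at Location 2).
states = [(n1, n2)
          for n1 in range(MAX_CARS + 1)
          for n2 in range(MAX_CARS + 1)]

# An action is (a1, a2): cars moved out of Location 1 and out of Location 2.
# The two components always cancel, because a car leaving one location
# arrives at the other.
actions = [(a, -a) for a in range(-MAX_MOVE, MAX_MOVE + 1)]


def feasible_actions(state):
    """Actions that do not send out more cars than a location holds.

    Assumption: this filter is illustrative and ignores the capacity
    limit at the receiving location.
    """
    n1, n2 = state
    return [(a1, a2) for a1, a2 in actions
            if n1 - a1 >= 0 and n2 - a2 >= 0]


print(len(states), len(actions))
```

For example, in state (2, 0) only three actions pass the filter, (0, 0), (1, -1) and (2, -2), since Location 2 has no cars to send.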
Reproduced in: https://www.jianshu.com/p/681bb5a91e82