"Dynamic programming" sounds too scary; it can simply be called a "state cache"

Abstract: When I practice algorithm problems, I often find a solution labeled "dynamic programming" that consists of little more than a complex dp formula. For newcomers, apart from saying "wonderful", all that remains is doubt. How did the author come up with this formula? Could I have thought of it? Is this thing actually useful?

This article is shared from the HUAWEI CLOUD Community post "How did you think of dynamic programming? 【Run it! JAVA】", original author: breakDraw.

When I practice algorithm problems, I often find a solution labeled "dynamic programming" that consists of little more than a complex dp formula. For newcomers, apart from saying "wonderful", all that remains is doubt: how did the author come up with this formula? Could I have thought of it? Is this thing actually useful?
On top of that, the lofty name "dynamic programming" scares off many people who try to understand it.

Dynamic programming sounds too scary, so let's put it another way

Personally, I prefer to call it a "state cache".
If you do service development, you are surely familiar with the term: a cache speeds up responses to repeated requests.
What is special about this kind of cache is that each cached entry is related to other cached entries.

For example, suppose our service needs to calculate the total amount of money over 7 days, and caches the result after calculating it.
Later, a request arrives for the total over 8 days.
Then we only need to take the previously calculated 7-day total and add the amount from the 8th day.
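The 7-day and 8-day example can be sketched in a few lines of Java. This is only an illustration of the idea: the class name RunningTotalCache, the method totalWithin, and the daily amounts are all made up for the example and do not come from any real service.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration: caches the running total of daily amounts.
final class RunningTotalCache {
    private final long[] dailyAmounts;
    // totals.get(d) is the cached sum of the first d days; totals.get(0) is 0.
    private final List<Long> totals = new ArrayList<>();

    RunningTotalCache(long[] dailyAmounts) {
        this.dailyAmounts = dailyAmounts.clone();
        totals.add(0L);
    }

    // Sum of the first `days` days, extending the cache one day at a time.
    long totalWithin(int days) {
        if (days < 0 || days > dailyAmounts.length) {
            throw new IllegalArgumentException("days out of range: " + days);
        }
        while (totals.size() <= days) {
            int d = totals.size(); // next day count that is not cached yet
            totals.add(totals.get(d - 1) + dailyAmounts[d - 1]);
        }
        return totals.get(days);
    }

    public static void main(String[] args) {
        // Made-up amounts for 8 days.
        RunningTotalCache cache = new RunningTotalCache(new long[] {5, 3, 8, 1, 4, 2, 7, 6});
        System.out.println(cache.totalWithin(7)); // prints 30, computed once and cached
        System.out.println(cache.totalWithin(8)); // prints 36, the cached 30 plus day 8
    }
}
```

The second call does a single addition, because the 8-day state is built from the cached 7-day state.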

1+4 thinking routine

I have summed up my own thinking routine for dynamic programming. I call it "1 set of examples and 4 questions", or 1+4 for short. Going through these 5 steps lets us understand, from the perspective of ordinary people (that is, not ACM experts), how a dynamic programming solution is thought of:

① Write out a set of example computations using the brute-force idea that would time out
② Based on those examples, where is the repeated, wasted work?
③ How should the dp array be defined?
④ In which direction does the state change, and how does it change?
⑤ What is the boundary state?

Simple example

Take a simple question as an example:
Climbing stairs: https://leetcode-cn.com/problems/climbing-stairs/

For a problem like this, calm down and look for a situation that is reached repeatedly in the example computations; such a repeatedly reached situation is called a state.
When I work on a dynamic programming problem, I walk through the 1+4 steps above, and that usually gets me to a solution without a hitch.

① Write out a set of example computations using the brute-force idea that would time out

The simplest solution is to start from the starting point, choose to take 1 step or 2 steps each time, and see whether we reach the end point exactly. Each time we do, the number of ways increases by 1.
This method is destined to time out, because the number of paths it enumerates grows exponentially with n (roughly O(2^n)).
Still, I simulate this process and list a few of the paths at random:
1->2->3->4->5
1->2->3->5
1->3->4->5
1->3->5
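As a rough sketch of this brute-force idea, the recursion below tries both moves at every position and counts the paths that land exactly on the end point. The class and method names are made up for illustration; the positions are numbered as in the paths listed above.

```java
// Illustrative brute force: enumerates every path, so it is exponential in n.
final class ClimbStairsBruteForce {
    // Counts the paths from position `current` to position `n`,
    // taking either 1 step or 2 steps each time.
    static int countPaths(int current, int n) {
        if (current == n) {
            return 1; // landed exactly on the end point: one more way
        }
        if (current > n) {
            return 0; // overshot the end point: not a valid path
        }
        return countPaths(current + 1, n) + countPaths(current + 2, n);
    }

    public static void main(String[] args) {
        // All paths from 1 to 5: the four listed above plus 1->2->4->5.
        System.out.println(countPaths(1, 5)); // prints 5
    }
}
```

Every call below position n spawns two more calls, which is where the exponential running time comes from.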

② Based on the brute-force examples, where is the repeated, wasted work?

In the paths above, I found a repeated part: the tail from 3 to 5.

That is to say, there are 2 routes from 3 to 5 (3->4->5 and 3->5), and they were already calculated after 1->2. When I go from 1 to 3 directly, there is no need to calculate them again.
In other words, when I get to 3, I already know how many ways remain from there.
Once you find the repetition, you can start building the dp formula.
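One way to see the "state cache" at work, before any dp formula is written, is to keep the brute-force recursion and simply remember the answer for each position. The sketch below is an assumption-laden illustration (the names ClimbStairsMemo and memo are invented here), not the author's code.

```java
// Illustrative memoized recursion: each position is computed at most once.
final class ClimbStairsMemo {
    // memo[i] caches the number of paths from position i to position n.
    // 0 means "not computed yet"; a real count is always at least 1 for i < n.
    private static long countPaths(int current, int n, long[] memo) {
        if (current == n) {
            return 1;
        }
        if (current > n) {
            return 0;
        }
        if (memo[current] != 0) {
            return memo[current]; // reuse the cached state, e.g. "from 3 there are 2 ways"
        }
        memo[current] = countPaths(current + 1, n, memo) + countPaths(current + 2, n, memo);
        return memo[current];
    }

    static long countPaths(int start, int n) {
        return countPaths(start, n, new long[n + 1]);
    }

    public static void main(String[] args) {
        System.out.println(countPaths(3, 5)); // prints 2
        System.out.println(countPaths(1, 5)); // prints 5
    }
}
```

With the cache in place the work is linear in n, since the answer for each position is computed once and then looked up.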

③ How to define the dp array?

The dp array is defined around the repetition found above. Look at the earlier sentence again:
When I get to 3, I already know how many ways remain from there.
So dp[3] represents the number of ways to go from 3 to the end.

④ What is the direction of the state change, and how does it change?

First, think about the direction in which the state changes.
Look again at this sentence:

When I get to 3, I already know how many ways remain from there.

This shows that the result depends on the states further back (the higher layers), so we have to calculate those states first, that is, go from back to front.

Then think about the relationship between a later state and the current state, and how it changes.

This is generally given in the problem conditions.
According to the problem, each move is either 1 step or 2 steps, so whenever I arrive at a layer, there are two possible next states.
So the 3rd layer has 2 follow-up moves, 1 step or 2 steps, which gives dp[3] = dp[3+1] + dp[3+2].
If the layer number is i, then the change is
dp[i] = dp[i+1] + dp[i+2]

⑤ What is the boundary state?

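The boundary state is a state whose result can be obtained directly, without relying on any subsequent state.
Here it must be the last layer, dp[n]: once we are standing on the last layer, that counts as exactly one way, so dp[n] = 1.
The layer just below it needs the same treatment, because dp[n+1] lies outside the array: from layer n-1 the only move is a single step, so dp[n-1] = 1.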

Implementation

Following the process above, I define the state and its change myself:

Definition: dp[i] represents the number of ways to go from the i-th layer to the end
Direction and change: dp[i] = dp[i+1] + dp[i+2]
Boundary: dp[n] = 1
From this it is easy to write the code.
Code:

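    // Assumes n >= 1, as guaranteed by the problem constraints.
    public int climbStairs(int n) {
        // dp[i]: number of ways to go from layer i to layer n
        int[] dp = new int[n + 1];
        dp[n] = 1;     // boundary: already at the top
        dp[n - 1] = 1; // only a single 1-step move remains
        // Fill from back to front, since dp[i] depends on later layers.
        for (int i = n - 2; i >= 0; i--) {
            dp[i] = dp[i + 1] + dp[i + 2];
        }
        return dp[0];
    }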

Advanced version: two-dimensional dynamic programming

https://leetcode-cn.com/problems/number-of-ways-to-stay-in-the-same-place-after-some-steps/

① Write out a set of example computations using the brute-force idea that would time out

The brute-force idea is, of course, to simulate every possible walk, like a search.
Let's assume steps=5 and arrLen=3,
and list a few walks first, simulating the pointer as it keeps moving. Each number is the current position.
0->1->2->1->0->0
0->1->2->1->1->0
0->1->1->1->1->0
0->1->1->1->0->0
0->0->1->1->1->0
…
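The search described here might look like the following sketch. The names StayInPlaceBruteForce and countWalks are invented for the example, and the modulo that the LeetCode statement asks for is left out because the inputs are tiny.

```java
// Illustrative brute force: tries all three moves at every step, so it is exponential in steps.
final class StayInPlaceBruteForce {
    // Counts the walks that start at `position` with `stepsLeft` moves remaining
    // and finish at index 0, never leaving the range 0..arrLen-1.
    static long countWalks(int stepsLeft, int position, int arrLen) {
        if (position < 0 || position >= arrLen) {
            return 0; // walked off the array
        }
        if (stepsLeft == 0) {
            return position == 0 ? 1 : 0; // only walks that end at index 0 count
        }
        return countWalks(stepsLeft - 1, position - 1, arrLen)  // move left
             + countWalks(stepsLeft - 1, position, arrLen)      // stay
             + countWalks(stepsLeft - 1, position + 1, arrLen); // move right
    }

    public static void main(String[] args) {
        System.out.println(countWalks(5, 0, 3)); // prints 21
    }
}
```

For steps=5 and arrLen=3 there are 21 walks in total; the ones listed above are a small sample of them.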

② Based on the brute-force examples, where is the repeated, wasted work?

0->1->2-> [1->0->0]
0->1->2-> [1->1->0]
0->1->1-> [1->1->0]
0->1->1-> [1->0->0]
0->0->1-> [1->1->0]
0->0->1-> [1->0->0]
I found that the bracketed part is repeated.

In other words:

When I have 2 steps left and the current position is 1, I already know how many ways remain.

③ How to define the dp array?

Reread this sentence:

When I have 2 steps left and the current position is 1, I already know how many ways remain.

Two key factors are involved: the number of remaining steps and the current position, so a two-dimensional array is needed.

Therefore we define

dp[step][index]

It represents how many ways remain when the number of remaining steps is step and the position is index.

④ What is the direction of the state change, and how does it change?

Think about the direction of change

"When I have 2 steps left and the current position is 1, I already know how many ways remain."

What does this state refer to, and what happens after it?

The number of remaining steps keeps getting smaller, while the position changes according to the rules of the problem.

So the "number of remaining steps", which can only decrease, is the core direction of change.

When we calculate, we first compute the states with few remaining steps, and then the states with more remaining steps.

How it changes

According to the problem and the direction above, each move reduces the number of remaining steps by 1, and there are 3 options for the position (minus 1, unchanged, plus 1), so the number of ways is the sum over the 3 options. A position outside the array contributes 0.

dp[step][index] = dp[step-1][index-1] + dp[step-1][index] + dp[step-1][index+1]

⑤ What is the boundary state?

When the number of remaining steps is 0, only position 0 is the final result we want, so its value is set to 1 for later states to use. Every other position with 0 remaining steps is set to 0.

dp[0][0] = 1

dp[0][index] = 0 (index > 0)

Implementation

Finally we arrive at:

Definition: dp[step][index] represents how many ways remain when the number of remaining steps is step and the position is index.
Direction and change: dp[step][index] = dp[step-1][index-1] + dp[step-1][index] + dp[step-1][index+1]
Boundary: dp[0][0] = 1
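A direct translation of this definition, transition, and boundary into Java could look like the sketch below. The class name is invented for the example. The LeetCode statement also asks for the result modulo 10^9 + 7, which the text above does not mention, so the sketch adds it. Note that it allocates all arrLen columns, so it is only suitable for small inputs.

```java
// Illustrative full-table version of the two-dimensional dp.
final class StayInPlaceDp {
    private static final int MOD = 1_000_000_007; // required by the LeetCode statement

    static int numWays(int steps, int arrLen) {
        // dp[step][index]: number of ways to end at index 0,
        // given `step` moves remaining and the current position `index`.
        int[][] dp = new int[steps + 1][arrLen];
        dp[0][0] = 1; // boundary: no moves left and already at index 0
        for (int step = 1; step <= steps; step++) {
            for (int index = 0; index < arrLen; index++) {
                long ways = dp[step - 1][index];      // stay
                if (index > 0) {
                    ways += dp[step - 1][index - 1];  // move left
                }
                if (index + 1 < arrLen) {
                    ways += dp[step - 1][index + 1];  // move right
                }
                dp[step][index] = (int) (ways % MOD);
            }
        }
        return dp[steps][0]; // start at index 0 with all moves remaining
    }

    public static void main(String[] args) {
        System.out.println(numWays(5, 3)); // prints 21
    }
}
```

The outer loop runs from few remaining steps to many, which matches the direction of change chosen in ④.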

Handling memory overflow

However, because this is a hard problem, there is a small trap in the formula above:

The array can be very long, so if we let index range over 0~arrLen-1, the largest case, dp[500][10^6], is destined to exceed the memory limit.

At this point, we need to ask whether index really has to range that far.

In general, we can work through a small example of the situation ourselves, such as

steps=2, arrLen=10

Then check whether index really needs to cover 0~9 by walking a few steps:

0->1->0

0->0->0

Hm? Only a handful of walks exist, and none of them goes beyond index 1. So most of the array is unnecessary?

This reveals the rule:

The remaining steps must be enough to get back to the origin!

That is to say, index never needs to exceed step/2 (and, of course, never exceeds arrLen-1).

So the problem is solved.
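Putting the index bound into the dp gives a table of at most 501 x 251 entries for the largest case. The following is a minimal sketch under that assumption, with an invented class name and the modulo 10^9 + 7 that the LeetCode statement asks for.

```java
// Illustrative bounded version: the table width is capped at steps / 2 + 1 columns.
final class StayInPlaceBoundedDp {
    private static final int MOD = 1_000_000_007; // required by the LeetCode statement

    static int numWays(int steps, int arrLen) {
        // Farthest index worth tracking: we must still be able to walk back to 0.
        int maxIndex = Math.min(arrLen - 1, steps / 2);
        // dp[step][index]: ways to end at index 0 with `step` moves remaining at `index`.
        int[][] dp = new int[steps + 1][maxIndex + 1];
        dp[0][0] = 1;
        for (int step = 1; step <= steps; step++) {
            for (int index = 0; index <= maxIndex; index++) {
                long ways = dp[step - 1][index];      // stay
                if (index > 0) {
                    ways += dp[step - 1][index - 1];  // move left
                }
                if (index < maxIndex) {
                    ways += dp[step - 1][index + 1];  // move right
                }
                dp[step][index] = (int) (ways % MOD);
            }
        }
        return dp[steps][0];
    }

    public static void main(String[] args) {
        System.out.println(numWays(2, 10));        // prints 2
        System.out.println(numWays(6, 1_000_000)); // prints 51
    }
}
```

Dropping the columns beyond maxIndex does not change the answer at index 0, because a walk that goes farther than steps / 2 cannot return to the origin in time.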

Other similar exercises

https://leetcode-cn.com/problems/minimum-cost-for-tickets/
