Label-setting algorithm for the shortest path problem with time windows (SPPTW)

The following article comes from Data Magician, by Deng Faheng and Zhou Hang

Foreword

Hello everyone ~!

Presumably everyone feels a little lost when they first start learning operations research models: so many mysterious terms, so many kinds of constraints... At first I was completely in the dark, too.

So this time we bring you a basic problem, the Shortest Path Problem with Time Windows (SPPTW), and solve it with a basic exact algorithm, the label-setting algorithm. The references are fairly old, and the method has been greatly improved since then. Still, only by starting with the basics and climbing step by step can we go on to solve harder problems. (Honestly, the editor just can't handle the hard ones yet.)

Without further ado, let's get started!

To download the code and examples for this article, follow the official account [program sound] and reply [SPPTWCPP] in its message window (without the brackets).

Table of Contents

1. Introduction to SPPTW
2. Introduction to the label-setting algorithm
3. Dominance pruning: dominate
4. Order of label processing: lexicographic ordering
5. Algorithm flow and examples
6. C++ source code sharing

SPPTW Introduction

Let's first briefly introduce the problem to be dealt with.

Shortest Path Problem (SPP):

In a graph, each edge has a number associated with it, which we call its weight. In the graph G = (N, A) of the following figure, each edge has a single weight representing its cost (it can be understood as the length of the edge). Given a starting point p, the task is to find the minimum-cost path from p to each of the other points.

Take the figure above as an example. Each edge has a weight representing its cost. The classic shortest path problem asks for the minimum-cost path from the starting point (say v_0) to each of the other points. For example, the shortest path from v_0 to v_4 is v_0 → v_2 → v_3 → v_4, with a total cost of 19; the direct path v_0 → v_4 has a total cost of 30, so it is not the shortest path from v_0 to v_4.

Note that in the classic shortest path problem, the weights on the edges are generally positive.

In SPPTW:

Each edge in the graph has two weights: one represents the time needed to traverse the edge, and the other represents the cost of traversing it. Each node i has a time window [a_i, b_i], and a path must respect the time window of every node it visits, namely:

If the path reaches node i before the opening time a_i of the window, it must wait until the window opens before entering; if the arrival time is later than the closing time b_i, the node cannot be visited.

(d_ij represents time, c_ij represents cost, and [xx, yy] represents a time window. See the definitions below for details.)

On this basis, the problem we want to solve is to find the path with the smallest total cost from the starting point p (node v_1) to each of the other nodes.

In the figure, we can see that the cost of v_1 → v_4 is negative. The algorithm in this article handles not only positive costs but also negative ones. We only need to make sure that every travel time is positive.

On this basis, the model of the problem is established:

The path X_1^0 can be represented by the following diagram:

The model of the classic shortest path problem is obtained by simply dropping some of these definitions, so we will not repeat it here. Let us first look at how the labeling method handles the classic shortest path problem.

Introduction to Label-setting algorithm

Labeling algorithms are an important family of methods for solving the shortest path problem, and they form the core of most shortest path algorithms.

Depending on how they treat node labels, labeling algorithms fall into two families: label setting (LS) and label correcting (LC).

The two classic shortest path algorithms, Dijkstra's algorithm and the Bellman-Ford algorithm, belong to LS and LC, respectively.

The LS algorithm revises labels step by step through its iterations. In each iteration, it selects the node with the smallest label in the candidate set, removes it from the candidate set, and turns its label from temporary into permanent. This is a shortest path algorithm based on a greedy strategy: each label that becomes permanent is the length of the shortest path to that node, so the algorithm works with the "current best".

The LC algorithm does not necessarily make any label permanent in an iteration. Each iteration only corrects temporary labels, and all labels remain temporary; only when the iterations terminate do all labels become permanent at once. The LC algorithm works with the "final optimum": a shortest path can only be determined after many iterations, once the whole algorithm has finished.
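To make the LS/LC difference concrete, here is a minimal label-correcting sketch in the style of Bellman-Ford. It is only an illustration and not code from the article: the `Edge` struct and the function name `labelCorrecting` are names assumed here, and the graph is assumed to contain no negative-cost cycle reachable from the source.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical edge type for this sketch: directed edge from -> to with a cost.
struct Edge { int from; int to; double cost; };

// Label-correcting sketch: any label may be corrected in any pass, and the
// labels are only known to be final once a full pass changes nothing.
// Assumption: no negative-cost cycle is reachable from `source`.
std::vector<double> labelCorrecting(int n, const std::vector<Edge>& edges, int source) {
    assert(source >= 0 && source < n);
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<double> C(n, INF);   // temporary labels
    C[source] = 0.0;
    // At most n - 1 passes are needed when there is no negative cycle.
    for (int pass = 0; pass + 1 < n; ++pass) {
        bool changed = false;
        for (const Edge& e : edges) {
            if (C[e.from] < INF && C[e.from] + e.cost < C[e.to]) {
                C[e.to] = C[e.from] + e.cost;   // correct a temporary label
                changed = true;
            }
        }
        if (!changed) break;   // nothing moved: all labels are now final
    }
    return C;
}
```

Note that no node is ever "closed" here, which is why a negative edge found late can still lower a label that was set early.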

We mainly introduce the LS algorithm, starting with Dijkstra's algorithm for the shortest path problem without time window constraints. In this algorithm, the label of node i is (C[i], p[i]), where C[i] is the shortest distance from the starting point to node i, and p[i] is the node that precedes i on the path that attains the distance C[i]. s_0 denotes the starting point, and c_ij denotes the length of edge (i, j). The algorithm runs as follows:

Step0: Initialization. Let S be empty and S* = N (S holds the nodes with permanent labels, S* the candidate nodes). Set C[s_0] = 0 and p[s_0] = -1, and set the initial distance label C[j] = ∞ for every node j (j ≠ s_0) in N.

Step1: Termination check. If S = N, then C[j] is the shortest path length to each node j, and the shortest path itself can be recovered by backtracking through the information recorded in p[j]. Stop. Otherwise, continue with Step2.

Step2: Label update. Find the node i with the minimum total cost in S*, delete it from S*, and add it to S. For every successor j of i, if C[j] > C[i] + c_ij, set C[j] = C[i] + c_ij and p[j] = i. Go to Step1.

The main work of this algorithm is the Step2 loop. It consists of two parts: finding a node (the node i with the smallest cost in S*) and updating total costs (the costs of the nodes adjacent to i).
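Step0 to Step2 can be sketched in C++ roughly as follows. This is a minimal illustration rather than the code shared with the article: `DijkstraResult`, `dijkstra`, and the adjacency-list layout are assumptions made for this sketch, and all edge costs are assumed to be non-negative.

```cpp
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// C[j]: shortest distance to j; p[j]: predecessor of j (-1 if none).
struct DijkstraResult {
    std::vector<double> C;
    std::vector<int> p;
};

// adj[i] lists the pairs (j, c_ij) for every successor j of node i.
DijkstraResult dijkstra(const std::vector<std::vector<std::pair<int, double>>>& adj, int s0) {
    const int n = static_cast<int>(adj.size());
    assert(s0 >= 0 && s0 < n);
    const double INF = std::numeric_limits<double>::infinity();
    // Step0: S is empty (no permanent labels), every node is still in S*.
    DijkstraResult r{std::vector<double>(n, INF), std::vector<int>(n, -1)};
    std::vector<bool> permanent(n, false);
    r.C[s0] = 0.0;
    // Step1: each iteration moves one node into S, so n iterations give S = N.
    for (int iter = 0; iter < n; ++iter) {
        // Step2, part 1: find the node i in S* with the smallest label.
        int i = -1;
        for (int v = 0; v < n; ++v)
            if (!permanent[v] && (i == -1 || r.C[v] < r.C[i])) i = v;
        if (r.C[i] == INF) break;   // the remaining nodes are unreachable
        permanent[i] = true;        // move i from S* to S
        // Step2, part 2: update the labels of the successors of i.
        for (const auto& arc : adj[i]) {
            assert(arc.second >= 0.0);   // plain Dijkstra needs c_ij >= 0
            if (r.C[arc.first] > r.C[i] + arc.second) {
                r.C[arc.first] = r.C[i] + arc.second;
                r.p[arc.first] = i;
            }
        }
    }
    return r;
}
```

The linear scan for the smallest label keeps the sketch close to the step description; a priority queue would be the usual choice for larger graphs.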

However, plain Dijkstra can handle neither time window constraints nor negative-weight edges. Once a node's label has become permanent, the loop never re-examines that node, so some edges are effectively ignored: even if an edge has a negative weight and could still lower a cost, the algorithm does not care.

Below we present an improved version of the LS algorithm that handles both time window constraints and negative-weight edges.

Dominance pruning: dominate

Having seen how the LS algorithm solves the shortest path problem, we return to the shortest path problem with time window constraints. Because edges now also carry a time weight, a label can no longer record a single cost value as in the previous section. Instead, we create a label (T, C) for each state in which a path reaches a node, recording the total time and total cost of the path on arrival at that node.

From these definitions we obtain the rule for extending a label: extending the label (T_i, C_i) of node i along edge (i, j) gives the label (max{a_j, T_i + d_ij}, C_i + c_ij) at node j, provided that T_i + d_ij ≤ b_j; otherwise the extension is infeasible.
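As a small illustration of label extension under the time-window rule described earlier (wait if early, reject if late), here is a hedged C++ sketch. The names `Label`, `Window`, and `extendLabel` are assumptions of this sketch and are not taken from the article's code.

```cpp
#include <algorithm>
#include <cassert>

// A label (T, C): total time and total cost of one path ending at a node.
struct Label { double T; double C; };

// Time window [a, b] of a node.
struct Window { double a; double b; };

// Extend label `from` (at node i) along edge (i, j) with travel time d_ij > 0
// and cost c_ij (which may be negative). Returns false if the window of j is
// already closed on arrival; otherwise writes the new label of j into `out`.
bool extendLabel(const Label& from, double d_ij, double c_ij,
                 const Window& wj, Label& out) {
    assert(d_ij > 0.0);                       // travel times must be positive
    const double arrival = from.T + d_ij;
    if (arrival > wj.b) return false;         // too late: node j cannot be visited
    out.T = std::max(wj.a, arrival);          // too early: wait until a_j
    out.C = from.C + c_ij;
    return true;
}
```

Only the time component is affected by the window; the cost is simply accumulated, which is why negative costs cause no trouble here.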

Of course, we could solve the problem by brute-force enumeration with a Dijkstra-like method. But we would like an effective pruning rule that avoids the high time complexity of enumeration. Fortunately, not every label is useful for finding the shortest path from the starting point to each node. We illustrate this with an example:

The dominate rule lets us filter out these useless labels: of two labels at the same node, one dominates the other if it has no larger time and no larger cost.

We can use a function plot to express this relationship visually:

Clearly, if the slope k between two points in the figure satisfies k >= 0, the later endpoint is dominated (for example, X_i^1 dominates X_i^5). The two labels represent two paths that reach the same node, and the path at the far end of the segment has both a higher time and a higher cost, so it is of course worse. For k = 0, we drew several straight lines in the figure. Each line starts at one point (representing a label) and ends at the next point. This means that, over the time interval covered by the line, the cost of that label is the minimum cost. In all other cases, we cannot tell which path is better.

We do the filtering with a function EFF(). In the first part, on the LS algorithm, we mentioned the concept of permanent labels: a permanent label has been verified to be valid, and its value will not change during later extensions. We now give a method for extending the permanent labels of node j.

Definition:

Q_j is the set of permanent labels of node j. (The minimum cost over all labels in Q_j is the cost of the shortest path from p to j.)

Q_j is extended as follows:

The extension here implicitly assumes that every label that could dominate the new label is already in Q_j. How can we guarantee this? We give a solution in the next section.
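The dominance test and the EFF() filter discussed in this section can be sketched as follows. This is one possible reading, written for illustration only: the function names `dominates` and `eff`, and the choice to collapse identical labels into one copy, are assumptions of this sketch rather than the article's definition.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A label (T, C): total time and total cost of one path ending at a node.
struct Label { double T; double C; };

// x dominates y: x is no later and no more expensive, and the two differ.
// In the plot, this is the case where the slope from x to y is >= 0.
bool dominates(const Label& x, const Label& y) {
    return x.T <= y.T && x.C <= y.C && (x.T < y.T || x.C < y.C);
}

// Hypothetical eff(): keep only the labels of one node that no other label
// dominates. Identical labels are collapsed to their first occurrence.
std::vector<Label> eff(const std::vector<Label>& labels) {
    std::vector<Label> kept;
    for (std::size_t i = 0; i < labels.size(); ++i) {
        bool useless = false;
        for (std::size_t j = 0; j < labels.size() && !useless; ++j) {
            const bool duplicate = j < i && labels[j].T == labels[i].T &&
                                   labels[j].C == labels[i].C;
            useless = duplicate || dominates(labels[j], labels[i]);
        }
        if (!useless) kept.push_back(labels[i]);
    }
    // A non-empty finite set always has at least one non-dominated label.
    assert(labels.empty() || !kept.empty());
    return kept;
}
```

The pairwise comparison is quadratic and is chosen here only because it mirrors the definition directly.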

Label processing order: lexicographic ordering

While processing labels, the LS algorithm extends them node by node, so the multiple labels of a node must be processed in some order. Ideally, this order lets us find all useless labels during the extension itself, that is, perform the EFF filtering while extending.

In the plot, we used the slope k to express the dominance relationship, so it is natural to examine k from left to right and find all segments with k >= 0. In terms of labels, this means ordering them by comparing T first and then C. We therefore store labels in this order as well, T before C, and process them from smallest to largest. This is called lexicographic (dictionary) ordering. Clearly it is a total order, which is what we need.
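The ordering "compare T first, then C" can be written as a comparator. The following is a minimal sketch; `lexLess` and `nextToTreat` are names assumed for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A label (T, C): total time and total cost of one path ending at a node.
struct Label { double T; double C; };

// Lexicographic (dictionary) order: compare T first, then C.
bool lexLess(const Label& x, const Label& y) {
    return x.T < y.T || (x.T == y.T && x.C < y.C);
}

// The label to process next is the lexicographic minimum of the
// unprocessed labels (which must not be empty).
Label nextToTreat(const std::vector<Label>& unprocessed) {
    assert(!unprocessed.empty());
    return *std::min_element(unprocessed.begin(), unprocessed.end(), lexLess);
}
```

If one label dominates another, it has a smaller T, or the same T and a smaller C, so it always comes first in this order. That is the property the processing order relies on.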

We have the following three propositions:

Lexicographic ordering is made to work hand in hand with the dominance test. Both are pruning-like operations that avoid brute-force enumeration. With these two operations added, many infeasible paths are detected during enumeration, and as soon as a path becomes infeasible, its extension stops immediately.

We divide all the labels into three sets:

Q is the set of permanent labels.

P is the set of processed labels.

T is the set of unprocessed labels.

We sort all labels in lexicographic order to ensure that no label in T can dominate a label in P. Because the time d_ij of every edge is positive, a newly extended label is always ordered after the label it was extended from, and therefore cannot dominate it. The dominance relation is transitive, so by induction, no label in T can dominate any label in P.

Using the definitions of P, Q, and T, we can also state a relationship between them:

We can use this formula to compute T in the algorithm.
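Putting the pieces together, the whole procedure might be sketched as below. This is a simplified illustration under stated assumptions and not the code distributed with the article: the names `Arc`, `Window`, `labelSetting`, and `minCost` are invented here, the path is assumed to start at p at time a_p with cost 0, every window is assumed to have a finite closing time, and the bookkeeping is simplified so that a label is processed at the moment it becomes permanent.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <set>
#include <tuple>
#include <vector>

struct Arc { int to; double d; double c; };   // travel time d > 0, cost c of any sign
struct Window { double a; double b; };        // time window [a, b]
struct Label { double T; double C; };         // total time and total cost

// Returns, for every node, its non-dominated labels in increasing order of T.
std::vector<std::vector<Label>> labelSetting(const std::vector<std::vector<Arc>>& adj,
                                             const std::vector<Window>& win, int p) {
    const int n = static_cast<int>(adj.size());
    assert(p >= 0 && p < n && static_cast<int>(win.size()) == n);
    std::vector<std::vector<Label>> Q(n);     // permanent labels of each node
    // Unprocessed labels (T, C, node), kept in lexicographic order by the set.
    std::set<std::tuple<double, double, int>> unprocessed;
    unprocessed.insert(std::make_tuple(win[p].a, 0.0, p));   // assumed start label
    while (!unprocessed.empty()) {
        const auto cur = *unprocessed.begin();   // lexicographic minimum
        unprocessed.erase(unprocessed.begin());
        const double T = std::get<0>(cur), C = std::get<1>(cur);
        const int i = std::get<2>(cur);
        // Any label that could dominate this one comes earlier in the order,
        // so it is enough to check the permanent labels already stored at i.
        bool dominated = false;
        for (const Label& q : Q[i])
            if (q.T <= T && q.C <= C) { dominated = true; break; }
        if (dominated) continue;
        Q[i].push_back(Label{T, C});             // the label becomes permanent
        for (const Arc& arc : adj[i]) {
            assert(arc.d > 0.0);                 // new labels are ordered after (T, C)
            const double arrival = T + arc.d;
            if (arrival > win[arc.to].b) continue;   // window closed: stop extending
            unprocessed.insert(std::make_tuple(std::max(win[arc.to].a, arrival),
                                               C + arc.c, arc.to));
        }
    }
    return Q;
}

// Cheapest permanent label of a node; +infinity if the node was never reached.
double minCost(const std::vector<Label>& labels) {
    double best = std::numeric_limits<double>::infinity();
    for (const Label& l : labels) best = std::min(best, l.C);
    return best;
}
```

Calling `labelSetting(adj, win, p)` returns the permanent labels of every node, and `minCost(Q[j])` then gives the cheapest feasible cost of reaching node j, in line with the earlier remark that the minimum cost in Q_j is the shortest path cost from p to j.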

Algorithm flow and examples

A simple example:

Code sharing

The C++ code is provided below. The example it uses is the simple example above, and the names follow the definitions above. Once you understand the flow of the algorithm, the code itself is not difficult. The code is meant to accompany the explanation and serve as a reference, so I did not use complex data structures or fancy language tricks. Readers who are interested can try writing it themselves as an exercise.

To download the code and examples for this article, follow the official account [program sound] and reply [SPPTWCPP] in its message window (without the brackets).

That's about it for this article ~

This post has been revised many times. Many thanks to senior Deng Faheng and Teacher Qin Hu for their support and for the many corrections they provided!

The editor will work hard to write more exciting posts for everyone!

See you next time ( _ ) / ~~

Origin www.cnblogs.com/dengfaheng/p/12672757.html