Common methods and techniques of search pruning
Keywords: search methods, pruning
Abstract
Search is a commonly used method in computer problem solving; in essence it is an application of enumeration. Because it amounts to enumeration, its efficiency is quite low. To improve it, people have devised many pruning methods, such as branch and bound and heuristic search. In competitions we should not only master these methods thoroughly, but also apply problem-specific techniques to improve the efficiency of search.
Main text
The efficiency of search is very low; even good pruning cannot make up for its inherent time complexity. Therefore, when solving a problem, search should be used only when all other methods fail.
Once search has been chosen, pruning is essential. Even a single threshold or one or two extra checks can have a striking effect on the efficiency of the search. Take the N-queens problem: if conflicts are checked only after all queens have been placed, there is a noticeable pause already at N=7 and the run exceeds 20 seconds at N=8; if conflicts are checked as each queen is placed, there is no pause even at N=10. Therefore, whenever you use search, you must prune.
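The N-queens comparison above can be made concrete by counting search nodes instead of seconds. The following is a minimal Python sketch, not the author's program; the function name `count_nodes` and the flag `check_while_placing` are introduced here only for illustration.

```python
def count_nodes(n, check_while_placing):
    """Return (solutions, nodes visited) for an n-queens search.

    check_while_placing=True  -> test each queen as it is placed (pruning)
    check_while_placing=False -> place all n queens first, then test
    """
    nodes = 0
    solutions = 0
    cols = []  # cols[r] is the column of the queen in row r

    def safe(r, c):
        # A queen at (r, c) must not share a column or a diagonal
        # with any queen in rows 0..r-1.
        return all(c != pc and abs(c - pc) != r - pr
                   for pr, pc in enumerate(cols[:r]))

    def place(r):
        nonlocal nodes, solutions
        nodes += 1
        if r == n:
            # Without pruning, the full placement is validated only here.
            if check_while_placing or all(safe(i, cols[i]) for i in range(n)):
                solutions += 1
            return
        for c in range(n):
            if check_while_placing and not safe(r, c):
                continue  # prune: this branch cannot lead to a solution
            cols.append(c)
            place(r + 1)
            cols.pop()

    place(0)
    return solutions, nodes


if __name__ == "__main__":
    print(count_nodes(6, True))
    print(count_nodes(6, False))
```

Both variants find the same solutions, but the late-checking variant visits every one of the 1 + n + n^2 + ... + n^n nodes of the full tree, which is why it slows down so sharply as N grows.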
Pruning has at least two aspects. One is pruning at the level of method, such as branch and bound or heuristic search, which applies widely. The other is the use of small techniques. These are less widely applicable than the first kind, and sometimes a technique fits only a single problem, but they are also very effective. Almost every problem admits some pruning technique of this kind, though each problem's techniques are different.
Question 1: (shortest number sequence)
Table A and Table B each contain k (k<=20) elements, numbered from 1 to k. Each element in both tables is a string consisting of the characters 0 and 1 (with no spaces), of length <=20. For example, the two tables A and B below each contain 3 elements (k=3).
Table A
element number | string
1 | 1
2 | 10111
3 | 10
Table B
element number | string
1 | 111
2 | 10
3 | 0
For Table A and Table B there is an element number sequence 2113. Replacing each element number with the corresponding string from Table A, or with the corresponding string from Table B, yields the same string 101111110, as shown in the table below.
element number sequence | 2 | 1 | 1 | 3
string from Table A | 10111 | 1 | 1 | 10
string from Table B | 10 | 111 | 111 | 0
For Table A and Table B, the element number sequence with the above properties is called S(AB). For the above example S(AB)=2113.
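The defining property of S(AB) is easy to check mechanically. Here is a minimal Python sketch using the example tables; the helper names `expand` and `is_sab` are hypothetical and appear only in this illustration.

```python
def expand(table, sequence):
    """Replace every element number in the sequence by its string."""
    return "".join(table[i] for i in sequence)


def is_sab(table_a, table_b, sequence):
    """True if a non-empty sequence gives the same string under both tables."""
    return len(sequence) > 0 and expand(table_a, sequence) == expand(table_b, sequence)


# The example tables from the text, keyed by element number.
TABLE_A = {1: "1", 2: "10111", 3: "10"}
TABLE_B = {1: "111", 2: "10", 3: "0"}

if __name__ == "__main__":
    print(expand(TABLE_A, [2, 1, 1, 3]))  # 101111110
    print(expand(TABLE_B, [2, 1, 1, 3]))  # 101111110
    print(is_sab(TABLE_A, TABLE_B, [2, 1, 1, 3]))  # True
```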
Write a program that reads the elements of Table A and Table B from a file and finds a shortest element number sequence S(AB) with the above property. (If no such sequence of length <=100 can be found, output "No Answer".)
For this problem, since the contents of Table A and Table B are arbitrary, no mathematical method is available. Because an optimal solution is required, and depth-first search can easily run into a dead end and waste time, breadth-first search must be used. Breadth-first search has its own drawback, however: when Table A and Table B contain many elements, the number of expanded nodes becomes very large and the search cannot meet the time limit of the test. To solve this, the search algorithm has to be improved. Branch and bound does not seem to work, because no cost can be determined. And since the target state is not known in advance, an evaluation function cannot be set up either. However, because the rules of this problem can be applied both forward and backward, a bidirectional search can be used.
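As a generic illustration of bidirectional breadth-first search, the following Python sketch finds the shortest distance between two nodes of an undirected graph by growing one frontier from each end, one whole level at a time. It is not the author's sab.pas; the adjacency-dictionary input and the name `bidirectional_bfs` are assumptions made for this example. The sketch always expands the smaller frontier, a common way to keep the two sides balanced.

```python
def bidirectional_bfs(graph, start, goal):
    """Shortest number of edges between start and goal, or None.

    graph: dict mapping a node to an iterable of neighbours; the graph is
    assumed undirected, so the same table serves both search directions.
    """
    if start == goal:
        return 0
    dist = [{start: 0}, {goal: 0}]   # distances known from each end
    frontier = [[start], [goal]]     # nodes of the deepest level on each side
    while frontier[0] and frontier[1]:
        # Expand one complete level on the side with fewer frontier nodes.
        side = 0 if len(frontier[0]) <= len(frontier[1]) else 1
        next_level = []
        for node in frontier[side]:
            for nb in graph.get(node, ()):
                if nb in dist[1 - side]:
                    # The two searches meet: join the two partial paths.
                    return dist[side][node] + 1 + dist[1 - side][nb]
                if nb not in dist[side]:
                    dist[side][nb] = dist[side][node] + 1
                    next_level.append(nb)
        frontier[side] = next_level
    return None  # one side ran out of nodes: no path exists


if __name__ == "__main__":
    line = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5], 7: []}
    print(bidirectional_bfs(line, 1, 6))  # 5
    print(bidirectional_bfs(line, 1, 7))  # None
```

Because each side only has to reach about half the depth, the number of expanded nodes is typically far smaller than in a one-directional search of the full depth.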
Once the overall method has been chosen, the framework of the algorithm is basically in place, but there is still room for improvement.
- Storing the full current A string and B string takes a lot of space. Since most of the A string and the B string coincide, it is enough to record only the part in which they differ, together with a flag saying which string it belongs to, and to keep that part in dynamically allocated storage.
- To keep the two directions expanding at a balanced rate, each step can expand the direction that currently has fewer nodes, instead of alternating between the two directions.
In this way, the search efficiency is significantly improved compared to the simple breadth-first search.
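The first improvement, keeping only the part where the two strings differ, can be sketched in Python as follows. For brevity this is a forward-only breadth-first search rather than the bidirectional search of sab.pas, and it additionally skips states it has already seen; the function name `shortest_sab` and the parameter `max_len` are introduced here for illustration.

```python
from collections import deque


def shortest_sab(table_a, table_b, max_len=100):
    """Shortest element number sequence S(AB), or None if none is found.

    A state is (tail_a, tail_b): the unmatched tail of the A string and of
    the B string. At most one of the two is non-empty, so the common
    prefix never has to be stored.
    """
    k = len(table_a)
    start = ("", "")
    parent = {start: None}        # state -> (previous state, element number)
    queue = deque([(start, 0)])
    while queue:
        state, depth = queue.popleft()
        if depth == max_len:
            continue              # sequences longer than max_len are not allowed
        for i in range(1, k + 1):
            sa = state[0] + table_a[i]
            sb = state[1] + table_b[i]
            common = min(len(sa), len(sb))
            if sa[:common] != sb[:common]:
                continue          # the strings disagree: dead branch
            nxt = (sa[common:], sb[common:])
            if nxt == start:
                # Both strings match completely: rebuild the sequence.
                seq = [i]
                while parent[state] is not None:
                    state, j = parent[state]
                    seq.append(j)
                return seq[::-1]
            if nxt not in parent:
                parent[nxt] = (state, i)
                queue.append((nxt, depth + 1))
    return None


if __name__ == "__main__":
    table_a = {1: "1", 2: "10111", 3: "10"}
    table_b = {1: "111", 2: "10", 3: "0"}
    print(shortest_sab(table_a, table_b))  # [2, 1, 1, 3]
```

For the example tables the state after the sequence 2, 1 is just `("1", "")`: the A string is ahead by a single character, and nothing else needs to be remembered.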
(see the attached program sab.pas)
Sometimes a problem can be searched in several different ways (as in the multiprocessor scheduling problem), and these ways differ in efficiency.
Question 2: (task arrangement)
There are N cities, some of which are connected by roads. A car transports goods between cities, always starting from city 1 and returning to city 1. On each trip the car has to complete several tasks, each of which requires it to carry goods from one city to another. For example, to complete task 2->6, the car's route must visit city 2 first and city 6 later.
As shown in the figure below, if the required tasks are 2->3, 2->4, 3->1, 2->5 and 6->4, then one route that completes all tasks is 1->2->3->1->2->5->6->4->1.
[Figure: road map of the example, showing cities 1 to 7]
The program reads the road adjacency matrix and the required tasks from a file and finds a route that completes as many tasks as possible and, subject to that, passes through the fewest cities in total. In the example above, the total number of cities passed through is 8, with cities 1 and 2 each counted twice (the starting point is not counted). N<60.
For this problem it is hard to find any mathematical regularity, so search is the only option.
The first idea that comes to mind is: starting from the current city, try every adjacent city, track which tasks have been completed along the way, and pick the best result. This search is extremely inefficient, mainly because it has no clear target.
According to the problem, only the cities where goods are loaded or unloaded need to be reached; all other cities are merely intermediate stops, not goals. So we first determine which tasks can be completed at all, and then compute the shortest path between every pair of cities. During the search, we only consider moving to a city where goods still have to be picked up, or to a city whose incoming goods are all already on the car; nothing else needs to be considered. Two simple thresholds can also be set: if the current cost + the number of cities that still must be visited >= the current best solution, or the current cost + the cost of returning to city 1 >= the current best solution, there is no need to search further.
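The preprocessing and the two thresholds described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a translation of travell.pas: cities are numbered from 0, every road has length 1, and the names `all_pairs_shortest`, `expand_route` and `can_prune` are hypothetical.

```python
INF = float("inf")


def all_pairs_shortest(road):
    """Floyd-Warshall on a 0/1 adjacency matrix.

    Returns (dist, nxt): dist[i][j] is the number of roads on a shortest
    path from i to j, and nxt[i][j] is the city that follows i on it.
    """
    n = len(road)
    dist = [[0 if i == j else (1 if road[i][j] else INF) for j in range(n)]
            for i in range(n)]
    nxt = [[j if road[i][j] else None for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def expand_route(stops, nxt):
    """Turn a list of goal cities into the full city-by-city route.

    Assumes every consecutive pair of stops is connected.
    """
    route = [stops[0]]
    for a, b in zip(stops, stops[1:]):
        while a != b:
            a = nxt[a][b]
            route.append(a)
    return route


def can_prune(cost, left, dist_home, best):
    """The two thresholds: cities still to visit, and the trip back home."""
    return cost + left >= best or cost + dist_home >= best


if __name__ == "__main__":
    # Four cities on a line: 0 - 1 - 2 - 3
    road = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
    dist, nxt = all_pairs_shortest(road)
    print(dist[0][3])                    # 3
    print(expand_route([0, 3, 0], nxt))  # [0, 1, 2, 3, 2, 1, 0]
```

With the distance table in hand, the search only ever jumps between goal cities and adds `dist` to the cost, while `expand_route` recovers the intermediate cities for output.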
This approach is very different from the naive first idea. (see the attached program travell.pas)
Question 3: (Multiprocessor Scheduling Problem)
Suppose there are n identical processors P1, P2, ..., Pn and m independent jobs J1, J2, ..., Jm. The processors work independently of each other. Any job can run on any processor, but a job may not be interrupted before it is finished, and it may not be split into smaller jobs. Job Ji requires processing time Ti (i=1, 2, ..., m). Write a program that accomplishes the following two tasks:
Task 1: Given n processors and m jobs, with the processing time Ti of each job stored in a file, find an optimal schedule that minimizes the time needed to finish all m jobs, and output that time.
Task 2: Given the job times and a time limit T stored in a file, find the minimum number of processors needed to finish the batch of jobs within time T, together with a schedule.
There are two search methods for this question:
Method 1: Search job by job. For each job, try each processor in turn.
Method 2: Search processor by processor. For each processor, try placing each job on it.
Comparing the two, method 2 turns out to be easier to prune than method 1.
The pruning available to each method is as follows:
For method 1: the only test is to compare the load of the most heavily loaded processor so far with the current best solution.
For method 2: we may require Time[1]>=Time[2]>=Time[3]>=...>=Time[n] (Time[i] is the total processing time of the i-th processor), which allows thresholds to be set: if the processing time of the current processor >= the current best solution, or the number of remaining processors × the processing time of the processor just filled < the total processing time of the remaining jobs, then backtrack, because under the ordering constraint no solution is possible.
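Method 2 and its two thresholds can be sketched in Python as follows. This is a simplified illustration, not a translation of jobs_1.pas: it starts from a trivial upper bound, and it allows a processor to be closed at any point, which searches more branches than the attached program but keeps the completeness argument simple. The names `min_makespan` and `fill` are introduced here.

```python
def min_makespan(times, n):
    """Smallest possible finishing time for the jobs on n identical processors.

    Processors are filled one after another with non-increasing loads, so
    the finishing time of a schedule is the load of the first processor.
    """
    jobs = sorted(times, reverse=True)
    used = [False] * len(jobs)
    best = sum(jobs)  # trivial upper bound: every job on one processor

    def fill(q, start, load, limit, rest, top):
        # q: processor being filled (0-based); load: its current load
        # limit: load of processor q-1; rest: total time of unassigned jobs
        # top: load of processor 0
        nonlocal best
        cap = best - 1 if q == 0 else limit
        # Option 1: put one more job on processor q (ascending job index,
        # so the same set of jobs is never tried in two different orders).
        for j in range(start, len(jobs)):
            if not used[j] and load + jobs[j] <= cap:
                used[j] = True
                fill(q, j + 1, load + jobs[j], limit, rest - jobs[j], top)
                used[j] = False
                cap = best - 1 if q == 0 else limit  # best may have improved
        # Option 2: close processor q.
        if q == 0:
            top = load
        if top >= best:
            return                      # threshold 1: no better than the best
        if rest == 0:
            best = top                  # complete schedule found
            return
        if (n - q - 1) * load < rest:
            return                      # threshold 2: the rest cannot fit
        fill(q + 1, 0, 0, load, rest, top)

    fill(0, 0, 0, best, best, 0)
    return best


if __name__ == "__main__":
    print(min_makespan([4, 3, 3, 1], 2))     # 6
    print(min_makespan([5, 4, 3, 2, 1], 3))  # 5
```

The ordering constraint is what makes the second threshold valid: every later processor carries at most the load of the one just closed, so the remaining processors together can absorb at most (number of remaining processors) × (that load).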
From this comparison, method 2 is clearly better than method 1. Method 2 is now discussed in more depth.
For task 1, an upper bound on Time[1] can first be obtained greedily. A lower bound on Time[1] is also available: UP(total job time / number of processors), where UP(x) denotes the smallest integer greater than or equal to x. The search starts from the upper bound; whenever a solution is found whose time equals the lower bound, the search can stop.
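The two bounds for task 1 can be computed as in the following Python sketch. The greedy rule used here (longest job first, each job to the currently least loaded processor) matches the greedy loop in jobs_1.pas; the function names are introduced only for this illustration.

```python
def lower_bound(times, n):
    """UP(total job time / number of processors), using integer arithmetic."""
    return (sum(times) + n - 1) // n


def greedy_upper_bound(times, n):
    """Finishing time of the greedy schedule: longest job first,
    each job given to the processor with the smallest load so far."""
    loads = [0] * n
    for t in sorted(times, reverse=True):
        k = loads.index(min(loads))
        loads[k] += t
    return max(loads)


if __name__ == "__main__":
    print(lower_bound([4, 3, 3, 1], 2), greedy_upper_bound([4, 3, 3, 1], 2))        # 6 6
    print(lower_bound([3, 3, 2, 2, 2], 2), greedy_upper_bound([3, 3, 2, 2, 2], 2))  # 6 7
```

In the first example the two bounds coincide, so the greedy schedule is already optimal and no search is needed. In the second they differ, and the search has to decide between 6 and 7.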
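(see the attached program jobs_1.pas)
For task 2, a depth-first search with a variable lower bound can be used. The lower bound is UP(total job time / time limit), that is, the least number of processors that could possibly suffice; whenever the search fails with the current number of processors, the bound is increased by one. The upper bound of Time[1] is set to T.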
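The idea of starting from the lower bound and raising it after each failed search can be sketched in Python as follows. The feasibility test here is a plain job-by-job backtracking, chosen for brevity; it is not the processor-by-processor search of jobs_2.pas. The names `fits` and `min_processors` are hypothetical, and the job list is assumed to be non-empty.

```python
def fits(jobs, count, limit):
    """Can all jobs be placed on `count` processors with every load <= limit?"""
    loads = [0] * count

    def place(j):
        if j == len(jobs):
            return True
        tried = set()  # processors with equal load are interchangeable
        for q in range(count):
            if loads[q] + jobs[j] <= limit and loads[q] not in tried:
                tried.add(loads[q])
                loads[q] += jobs[j]
                if place(j + 1):
                    return True
                loads[q] -= jobs[j]
        return False

    return place(0)


def min_processors(times, limit):
    """Fewest processors that finish all jobs within `limit`, or None."""
    if max(times) > limit:
        return None                              # one job alone is too long
    jobs = sorted(times, reverse=True)
    count = (sum(jobs) + limit - 1) // limit     # UP(total time / limit)
    while not fits(jobs, count, limit):
        count += 1                               # raise the lower bound by one
    return count


if __name__ == "__main__":
    print(min_processors([4, 3, 3, 1], 6))  # 2
    print(min_processors([4, 4, 4], 6))     # 3
    print(min_processors([7], 6))           # None
```

The loop always terminates, because with one processor per job every job fits as long as no single job exceeds the limit.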
(see the attached program jobs_2.pas)
Conclusion
Search is very widely applicable; almost every problem can be attacked with it. Even so, search must not be abused: it should be used only when no regularity can be found in the problem. Once you decide to use search, you must find a way to prune it. Neither the general pruning methods nor the small search techniques can reduce the time complexity of search, but they are always beneficial.
Programs
1. Shortest number sequence: sab.pas
program sab;
type aa=string[100];
ltype=record
f:integer; {parent pointer}
k,d,la,lb:shortint;
{k--the remaining string flag, d--the number of elements in the sequence, la, lb--the length of the two strings of A and B}
st:^aa; {remaining string}
end;
const maxn=1300;
var t,h:array[0..1] of integer; {h--queue head pointer, t--queue tail pointer, 0 means forward, 1 means reverse}
p:array[0..1,1..maxn] of ltype; {p[0]--forward search table, p[1]--reverse search table}
strs:array[1..2,1..20] of string[20]; {strs[1]--table A element, strs[2]--table B element}
n:integer; {Number of elements in table A and table B}
procedure readp; {read data}
var f:text;
st:string;
i,j:integer;
begin
write('File name:');
readln(st);
assign(f,st);
reset(f);
readln(f,n);
for i:=1 to n do
readln(f,strs[1,i]);
for i:=1 to n do
readln(f,strs[2,i]);
close(f);
end;
procedure print(q,k:integer); {Starting from k, output the element number searched along the direction of q}
begin
if k<>1 then begin
if q=1 then
writeln(p[q,k].d);
print(q,p[q,k].f);
if q=0 then
writeln(p[q,k].d);
end;
end;
procedure check(q:shortint); {check whether the two directions meet; q is the direction of the newly generated node, which is compared with every node of the opposite direction}
var i:integer;
begin
for i:=1 to t[1-q]-1 do
if (p[q,t[q]].k<>p[1-q,i].k) and (p[q,t[q]].st^=p[1-q,i].st^) and
(p[q,t[q]].la+p[1-q,i].la<=100) and (p[q,t[q]].lb+p[1-q,i].lb<=100)
then begin
if q=0 then
begin
print(0,t[q]);
print(1,i);
end
else begin
print(0,i);
print(1,t[q]);
end;
halt;
end;
end;
procedure find(q:shortint); {expand a layer of nodes along the q direction}
var i:integer;
sa, sb: aa;
begin
for i:=1 to n do
if (p[q,h[q]].la+length(strs[1,i])<=100) and
(p[q,h[q]].lb+length(strs[2,i])<=100) then
begin
sa:='';sb:='';
if p[q,h[q]].k=1
then sa:=p[q,h[q]].st^
else sb:=p[q,h[q]].st^;
if q=0 then {add the element number i to the sequence in different directions}
begin
sa:=sa+strs[1,i];
sb:=sb+strs[2,i];
while (sa<>'') and (sb<>'') and (sa[1]=sb[1]) do
begin
delete(sa,1,1);
delete(sb,1,1);
end
end
else begin
sa:=strs[1,i]+sa;sb:=strs[2,i]+sb;
while (sa<>'') and (sb<>'') and
(sa[length(sa)]=sb[length(sb)]) do
begin
delete(sa,length(sa),1);
delete(sb,length(sb),1);
end;
end;
if (sa='') or (sb='') then {generate a new node}
with p[q,t[q]] do
begin
f:=h[q];d:=i;
la:=p[q,h[q]].la+length(strs[1,i]);
lb:=p[q,h[q]].lb+length(strs[2,i]);
new(st);
if sa='' then
begin
k:=2;st^:=sb
end
else begin
k:=1;st^:=sa;
end;
check(q);
inc(t[q]);
end;
end;
inc(h[q]);
end;
begin
readp;
h[0]:=1;h[1]:=1;
t[0]:=2;t[1]:=2;
new(p[0,1].st);p[0,1].st^:='';
new(p[1,1].st);p[1,1].st^:='';
{queue initialization}
while (h[0]<t[0]) and (h[1]<t[1]) and (t[0]<maxn) and (t[1]<maxn) do
if t[0]<t[1] {Compare the number of nodes in the two directions, and expand to the direction with fewer nodes}
then find(0)
else find(1);
writeln('No answer!');
end.
2. Task arrangement: travell.pas
program travell;
var path, {path[i,j]--the jth transportation destination starting from i}
next, {next[i,j]--the next vertex of vertex i in the shortest path from i to j}
dist, {dist[i,j]--the shortest path length from i to j}
road:array[1..60,1..60] of integer; {road adjacency matrix}
head, {head[i]--the number of tasks starting from i}
tail, {tail[i]--0 means no task or completed with i as the end point}
{1 means that all vertices of the task ending at i are in the path to complete the task}
{k+1 means all tasks with i as the end point, and there are still k vertices not reached}
arrive:array[1..60] of integer; {arrive[i]--the passing times of vertex i}
d, {complete task path}
bestd:array[1..100] of integer; {the current best path to complete the task}
left, {the number of nodes that must be passed through}
cost, {current cost}
mincost, {best cost to complete the task}
s, {number of vertices passed through}
bests, {the number of vertices that best complete the task}
m, {number of tasks}
n:integer; {Number of cities}
procedure findshortest; {Find the shortest path between any two points}
var i,j,k:integer;
begin
for i:=1 to n do
for j:=1 to n do
if road[i,j]=1 then
begin
dist[i,j]:=1;
next[i,j]:=j
end
else dist[i,j]:=100;
for k:=1 to n do
for i:=1 to n do
for j:=1 to n do
if dist[i,k]+dist[k,j]<dist[i,j] then
begin
dist[i,j]:=dist[i,k]+dist[k,j];
next[i,j]:=next[i,k];
end;
end;
procedure init; {read in data and initialize data}
var i,j,k:integer;
st:string;
f:text;
begin
write('File name:');
readln(st);
assign(f,st);
reset(f);
readln(f,n);
for i:=1 to n do
for j:=1 to n do
read(f,road[i,j]);
findshortest;
readln(f,m);
for i:=1 to m do
begin
read(f,j,k);
if (dist[1,j]<100) and (dist[1,k]<100) then
begin
inc(head[j]);
inc(tail[k]);
path[j,head[j]]:=k;
end;
end;
close(f);
for i:=1 to n do {tail is indexed by city, so loop over the n cities}
if tail[i]>0 then
inc(tail[i]);
for i:=1 to head[1] do
dec(tail[path[1,i]]);
head[1]:=0;inc(s);d[s]:=1;left:=0;
cost:=0;mincost:=maxint;
for i:=2 to n do
if (head[i]>0) or (tail[i]>0) then
inc(left);
end;
procedure try; {search procedure}
var i,j,k:integer;
p:boolean;
begin
if (cost+left>=mincost) or (cost+dist[1,d[s]]>=mincost) then exit;
if left=0 then {whether all tasks are completed}
begin
mincost:=cost+dist[1,d[s]];
bestd:=d;
bests:=s;
inc(bests);bestd[bests]:=1;
exit;
end;
for i:=2 to n do
if (head[i]>0) or (tail[i]=1) then {if it is necessary to go to i vertex}
begin
inc(cost,dist[d[s],i]);
inc(arrive[i]);
inc(s);
d[s]:=i;
if arrive[i]=1 then
{If the i vertex arrives for the first time, the end tail value of all tasks starting from i will be reduced by 1}
for j:=1 to head[i] do
dec(tail[path[i,j]]);
k:=head[i];
head[i]:=0;
if tail[i]=1 then
{If all tasks with i as the end point are completed, this point does not need to go through}
begin
p:=true;
dec(tail[i]);
end
else p:=false;
if tail[i]=0 then
dec(left);
try;
{restore data before recursion}
if tail[i]=0 then
inc(left);
if p then
inc(tail[i]);
head[i]:=k;
if arrive[i]=1 then
for j:=1 to head[i] do
inc(tail[path[i,j]]);
dec(s);
dec(arrive[i]);
dec(cost,dist[d[s],i]);
end;
end;
procedure show(i,j:integer); {output the shortest path from i to j}
begin
while i<>j do begin
write('-->',next[i,j]);i:=next[i,j];
end;
end;
procedure print; {output the best task scheduling scheme}
var i:integer;
begin
write(1);
for i:=1 to bests-1 do
show(bestd[i],bestd[i+1]);
writeln;
writeln('Min cost=',mincost);
end;
begin
init;
try;
print;
end.
3. Multiprocessor scheduling problem:
Task 1: jobs_1.pas
program jobs_1;
const maxn=100; {maximum number of processors}
maxm=100; {maximum number of jobs}
var
t:array[1..maxm] of integer; {t[i]--the time required to process job i}
time, {time[i]--the processing time of the i-th processor}
l, {l[i]--the number of jobs processed by the i-th processor}
l1:array[0..maxn] of integer; {l1[i]--the number of jobs processed by processor i in the current optimal solution}
a, {a[i,j]--The time spent on the jth job processed by the i-th processor}
a1:array[1..maxn,1..maxm] of integer;
{a1[i,j]--the time spent on the jth job processed by the i-th processor in the current optimal solution}
done:array[1..maxm] of boolean; {done[i]--true means job i has been completed, false means it has not been completed}
least, {lower bound of processing time}
i,j,k,n,m,
min, {the processing time of the current optimal solution}
rest:integer; {total time of remaining jobs}
procedure print; {output optimal solution}
var i,j:integer;
begin
for i:=1 to n do
begin
write(i,':');
for j:=1 to l1[i] do
write(a1[i,j]:4);
writeln;
end;
writeln('T0=',time[0]+1);
end;
procedure readp; {read data}
var
f:text;
st:string;
i,j,k:integer;
begin
write('File name:');readln(st);
assign(f,st);reset(f);
readln(f,n,m);
for i:=1 to m do
begin
read(f,t[i]);inc(rest,t[i]);
end;
close(f);
least:=(rest-1) div n+1; {set the lower bound}
for i:=1 to m-1 do {sort the jobs in descending order of time}
for j:=i+1 to m do
if t[j]>t[i] then
begin k:=t[i];t[i]:=t[j];t[j]:=k end;
end;
procedure try(p,q:integer); {select jobs from p--m and put them on processor q}
var j:integer;
z:boolean;
begin
z:=true;
for j:=p to m do
if not done[j] and (time[q]+t[j]<=time[q-1]) then {choose a suitable job}
begin
z:=false;done[j]:=true;
inc(l[q]);a[q,l[q]]:=t[j];inc(time[q],t[j]);dec(rest,t[j]);
try(j+1,q);
dec(l[q]);dec(time[q],t[j]);inc(rest,t[j]);done[j]:=false;
if time[1]>time[0] then exit;
{Return to processor 1 after finding the solution, need to update time[1] to reduce it to time[0]}
{2--n processors do not need to search any more}
end;
if z and ((n-q)*time[q]>=rest) then {if processor q is unable to place any jobs}
if rest=0 then {find a set of solutions}
begin
a1:=a;l1:=l;
time[0]:=time[1]-1;
if time[1]=least then
begin print;halt end;
end
else if q<n then {continue searching}
try(1,q+1);
end;
begin
readp;
fillchar(time,sizeof(time),0);
fillchar(a,sizeof(a),0);
fillchar(l1,sizeof(l1),0);
fillchar(l,sizeof(l),0);
for i:=1 to m do {greedy seeking upper bound}
begin
k:=1;
for j:=2 to n do
if time[j]<time[k] then k:=j;
time[k]:=time[k]+t[i];
l1[k]:=l1[k]+1;
a1[k,l1[k]]:=t[i];
end;
min:=time[1];
for i:=2 to n do
if time[i]>min then min:=time[i];
time[0]:=min-1; {print reports time[0]+1, so derive it from the largest greedy load}
if min=least then {if the upper and lower bounds are equal}
begin
print;
halt;
end;
fillchar(time,sizeof(time),0);
time[0]:=min-1; {Reduce the upper bound by 1 to find a better solution}
try(1,1);
print;
end.
Task 2: jobs_2.pas
program jobs_2;
const maxn=100; {maximum number of processors}
maxm=100; {maximum number of jobs}
var
t:array[1..maxm] of integer; {t[i]--the time required to process job i}
time, {time[i]--the processing time of the i-th processor}
l, {l[i]--the number of jobs processed by the i-th processor}
l1:array[0..maxn] of integer; {l1[i]--the number of jobs processed by processor i in the current optimal solution}
a, {a[i,j]--The time spent on the jth job processed by the i-th processor}
a1:array[1..maxn,1..maxm] of integer;
{a1[i,j]--the time spent on the jth job processed by the i-th processor in the current optimal solution}
done:array[1..maxm] of boolean; {done[i]--true means job i has been completed, false means it has not been completed}
least, {lower bound of processing time}
i,j,k,tmax,m,
min, {the processing time of the current optimal solution}
rest:integer; {total time of remaining jobs}
procedure print; {output optimal solution}
var i,j:integer;
begin
for i:=1 to least do
begin
write(i,':');
for j:=1 to l1[i] do
write(a1[i,j]:4);
writeln;
end;
writeln('Min=',least);
end;
procedure readp; {read data}
var
f:text;
st:string;
i,j,k:integer;
begin
write('File name:');readln(st);
assign(f,st);reset(f);
readln(f,tmax,m);
for i:=1 to m do begin
read(f,t[i]);inc(rest,t[i]);
end;
close(f);
least:=(rest-1) div tmax+1; {determine the lower bound}
for i:=1 to m-1 do {sort the jobs in descending order of time}
for j:=i+1 to m do
if t[j]>t[i] then
begin k:=t[i];t[i]:=t[j];t[j]:=k end;
end;
procedure try(p,q:integer); {select jobs from p--m and put them on processor q}
var j:integer;
z:boolean;
begin
z:=true;
for j:=p to m do
if not done[j] and (time[q]+t[j]<=time[q-1]) then {find a suitable job}
begin
z:=false;done[j]:=true;
inc(l[q]);a[q,l[q]]:=t[j];inc(time[q],t[j]);dec(rest,t[j]);
try(j+1,q);
dec(l[q]);dec(time[q],t[j]);inc(rest,t[j]);done[j]:=false;
end;
if z and ((least-q)*time[q]>=rest) then {if processor q is unable to place any jobs}
if rest=0 then {find the optimal solution}
begin
a1:=a;l1:=l;
print;halt
end
else if q<least then {continue searching; least is the number of processors currently allowed}
try(1,q+1);
end;
begin
readp;
for i:=1 to m do {judging no solution, that is, the time required for a certain task exceeds the specified time}
if t[i]>tmax then
begin writeln('No answer!');exit end;
repeat
fillchar(time,sizeof(time),0);
fillchar(l,sizeof(l),0);
fillchar(l1,sizeof(l1),0);
for i:=1 to m do {greedy seeking upper bound}
begin
k:=1;
for j:=2 to least do
if time[j]<time[k] then k:=j;
time[k]:=time[k]+t[i];
l1[k]:=l1[k]+1;
a1[k,l1[k]]:=t[i];
end;
min:=time[1];
for i:=2 to least do
if time[i]>min then min:=time[i];
if min<=tmax then {if the greedy schedule already meets the time limit, it is optimal}
begin
print;halt;
end;
fillchar(time,sizeof(time),0);
time[0]:=tmax;
try(1,1);
inc(least); {lower bound plus one}
until least>m;
print;
end.