Common methods and techniques of c++ search pruning


Keywords: search methods, pruning

Abstract

Search is a commonly used method in computer problem solving; in essence it is an application of enumeration. Because it amounts to enumeration, its efficiency is quite low. To improve it, people have devised many pruning methods, such as branch and bound and heuristic search. In competitions we should not only master these methods thoroughly, but also apply problem-specific tricks to improve search efficiency.

Main text

Search is inherently inefficient. Even good pruning cannot make up for its poor time complexity, so in problem solving search should be used only when all other methods fail.

Once search has been chosen, pruning is essential. Even a simple threshold or one or two extra checks can have a striking effect on search efficiency. Take the N-queens problem: if validity is checked only after all queens have been placed, there is a noticeable pause already at N = 7, and N = 8 takes more than 20 seconds; if validity is checked as each queen is placed, there is no pause even at N = 10. Therefore, always prune when you search.
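The difference between the two ways of checking can be sketched in Python. This is only an illustration, not code from the article: the function names `queens_check_after` and `queens_check_while_placing` are made up here, and the timings quoted above come from the original text, not from this sketch.

```python
from itertools import product


def is_valid(cols):
    """True if no two queens share a column or a diagonal (one queen per row)."""
    return all(
        cols[i] != cols[j] and abs(cols[i] - cols[j]) != j - i
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    )


def queens_check_after(n):
    """Place all n queens first, then test: enumerates all n**n placements."""
    return sum(1 for cols in product(range(n), repeat=n) if is_valid(cols))


def queens_check_while_placing(n):
    """Test each queen as it is placed, so a conflicting branch is cut at once."""
    cols = []  # cols[r] is the column of the queen in row r

    def place(row):
        if row == n:
            return 1
        total = 0
        for col in range(n):
            # Prune: skip this column if it conflicts with any earlier queen.
            if all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols)):
                cols.append(col)
                total += place(row + 1)
                cols.pop()
        return total

    return place(0)
```

Both functions count the same solutions, but the first visits every one of the n**n full placements, while the second abandons a partial placement as soon as two queens attack each other.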

Pruning works on at least two levels. One is to prune through the method itself, for example branch and bound or heuristic search, which applies widely. The other is to use small tricks. These are less general than the first kind, and sometimes a trick fits only one problem, but they are also very effective. Almost every problem admits some such trick, though the trick differs from problem to problem.

Question 1: (shortest number sequence)

Table A and Table B each contain k (k <= 20) elements, numbered from 1 to k. Each element of both tables is a string of the characters 0 and 1 (no spaces) with length <= 20. For example, the tables A and B below each contain 3 elements (k = 3).

Table A

    Element number    String
    1                 1
    2                 10111
    3                 10

Table B

    Element number    String
    1                 111
    2                 10
    3                 0

For Table A and Table B there is an element number sequence 2113: replacing each element number with the corresponding string from Table A, or with the corresponding string from Table B, yields the same string 101111110, as shown in the table below.

    Element number sequence    2        1      1      3
    String from Table A        10111    1      1      10
    String from Table B        10       111    111    0

An element number sequence with this property is called S(AB) for Table A and Table B. In the example above, S(AB) = 2113.
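The defining property of S(AB) can be checked directly. The short Python sketch below uses the example tables; the helper names `substitute` and `is_sab` are introduced here for illustration only.

```python
def substitute(sequence, table):
    """Replace each 1-based element number with its string and concatenate."""
    return "".join(table[number - 1] for number in sequence)


def is_sab(sequence, table_a, table_b):
    """True if the non-empty sequence yields the same string from both tables."""
    return bool(sequence) and (
        substitute(sequence, table_a) == substitute(sequence, table_b)
    )


TABLE_A = ["1", "10111", "10"]
TABLE_B = ["111", "10", "0"]
```

With these tables, `substitute([2, 1, 1, 3], TABLE_A)` and `substitute([2, 1, 1, 3], TABLE_B)` both give `101111110`.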

Write a program that reads the elements of Table A and Table B from a file and finds a shortest element number sequence S(AB) with the above property. (If no such sequence of length <= 100 can be found, output "No Answer".)

For this problem, since the contents of Table A and Table B are arbitrary, no mathematical method is available. Because an optimal solution is required, and depth-first search can easily run down a dead end and waste time, breadth-first search is the method to use. Breadth-first search has its own drawback, however: when Tables A and B contain many elements, a great many nodes are expanded, and the search cannot finish within the time limit. The search algorithm therefore has to be improved. Branch and bound does not seem to work, because no cost can be determined. Nor can an evaluation function be set up, since the target is not known in advance. However, because the rules of this problem can be applied both forward and backward, a bidirectional search can be used.

Once the overall method is fixed, the framework of the algorithm is basically in place, but the algorithm still has room for improvement.

  1. Storing the full current A string and B string takes a lot of space. Since most of the A string and the B string coincide, it is enough to record the part where they differ, plus a flag saying which string it belongs to, and to allocate that storage dynamically.
  2. To keep the two directions expanding at a balanced rate, each step can expand the direction that currently has fewer nodes, instead of alternating between the two directions.

With these changes, the search is significantly more efficient than plain breadth-first search.
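The idea of storing only the differing remainder can be sketched in Python. This is a deliberately simplified, one-directional breadth-first search, not the bidirectional program attached to the article. The function name `shortest_sequence`, the `(side, remainder)` state encoding, and the reading of the limit as a bound on the number of elements in the sequence (`max_len`) are all assumptions made for this sketch.

```python
from collections import deque


def shortest_sequence(table_a, table_b, max_len=100):
    """Return a shortest list of 1-based element numbers, or None if none is found.

    A state is (side, remainder): side 0 means the A string is ahead by
    `remainder`, side 1 means the B string is ahead by `remainder`.
    """
    start = (0, "")
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (side, rest), seq = queue.popleft()
        if len(seq) == max_len:
            continue  # do not extend sequences beyond the length limit
        for number, (a, b) in enumerate(zip(table_a, table_b), start=1):
            # Only the unmatched tail is carried along, not the full strings.
            sa = rest + a if side == 0 else a
            sb = b if side == 0 else rest + b
            common = min(len(sa), len(sb))
            if sa[:common] != sb[:common]:
                continue  # the two strings disagree: dead branch
            if len(sa) == len(sb):
                return seq + [number]  # both strings match completely
            if len(sa) > len(sb):
                state = (0, sa[common:])
            else:
                state = (1, sb[common:])
            if state not in seen:  # an already-seen state was reached no later
                seen.add(state)
                queue.append((state, seq + [number]))
    return None
```

For the example tables, `shortest_sequence(["1", "10111", "10"], ["111", "10", "0"])` returns `[2, 1, 1, 3]`. Because equal states are merged through `seen`, the queue holds far fewer nodes than a search that keeps whole strings.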

(with program sab.pas)

Sometimes the same problem can also be searched in different ways (as in the multiprocessor scheduling problem), and these lead to different efficiencies.

Question 2: (task arrangement)

There are N cities, some of which are connected by roads. A car transports goods between cities, always starting from city 1 and returning to city 1. On each trip the car must complete several tasks, and each task requires the car to carry goods from one city to another. For example, to complete task 2->6, the car's route must pass through city 2 first and reach city 6 at some later point.

As shown in the figure below, if the required tasks are 2->3, 2->4, 3->1, 2->5 and 6->4, then one route that completes all tasks is 1->2->3->1->2->5->6->4->1.

[Figure: road map connecting cities 1 to 7]

Write a program that reads the road adjacency matrix from a file and then finds a route for the required tasks such that, while completing as many tasks as possible, the total number of cities passed through is minimal. In the example above, the total number of cities passed through is 8: cities 1 and 2 are each counted twice (the starting point is not counted). N < 60.

Because it is hard to find any mathematical regularity in this problem, search is the only option.

The first idea that comes to mind is: starting from city i, try every adjacent city, update the state of the tasks according to the current city, and pick out the optimal solution. This search is extremely inefficient, mainly because it has no clear target.

According to the problem statement, the car only needs to reach the cities where goods are loaded or unloaded; all other cities are merely passed through on the way and are never goals in themselves. We therefore first determine which tasks can and cannot be completed, and then compute the shortest path between every pair of cities. During the search, the only cities worth moving to are those where goods are waiting to be picked up, and those whose incoming goods are all already on the car; nothing else needs to be considered. Two simple thresholds can also be set: if the current cost + the number of cities still to be reached >= the current best solution, or the current cost + the cost of returning to city 1 >= the current best solution, there is no need to search any further.
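The preprocessing step and the two thresholds can be sketched in Python. The shortest paths are computed with the Floyd-Warshall algorithm, which is also what the attached program's `findshortest` procedure does. The names `all_pairs_shortest` and `can_prune` are hypothetical, cities are numbered from 0 here for brevity, and every road is assumed to have length 1.

```python
INF = float("inf")


def all_pairs_shortest(road):
    """Floyd-Warshall on a 0/1 adjacency matrix where every road has length 1."""
    n = len(road)
    dist = [
        [0 if i == j else (1 if road[i][j] else INF) for j in range(n)]
        for i in range(n)
    ]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def can_prune(cost, cities_left, dist_home, best):
    """The two thresholds: neither bound can still beat the best solution."""
    return cost + cities_left >= best or cost + dist_home >= best
```

Each city that must still be visited costs at least one more step, which is why `cost + cities_left` is a valid lower bound on the final cost.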

This approach is very different from the first idea. (with program travell.pas)

Question 3: (Multiprocessor Scheduling Problem)

Suppose there are n identical processors P1, P2, ..., Pn and m independent jobs J1, J2, ..., Jm. The processors work independently of one another. Any job can run on any processor, but a job may not be interrupted before it completes, and a job cannot be split into smaller jobs. Job Ji requires processing time Ti (i = 1, 2, ..., m). Write a program that accomplishes the following two tasks:

Task 1: Given n processors, m jobs, and the processing time Ti of each job, all stored in a file, find an optimal schedule that minimizes the total time needed to finish the m jobs, and output that time.

Task 2: Given the job times and a completion time limit T in a file, find the minimum number of processors needed to finish the whole batch of jobs within time T, together with a schedule.

There are two search methods for this question:

Method 1: Search each job sequentially. When searching for a job, it is searched once per processor.

Method 2: Search each processor sequentially. When searching for a processor, place each job on it once.

Comparing the two, Method 2 turns out to be easier to prune than Method 1.

The pruning available to each method compares as follows:

For Method 1: the only test is to compare the largest processor time fixed so far with the current best solution.

For Method 2: we can require Time[1] >= Time[2] >= ... >= Time[n] (Time[i] is the processing time of the i-th processor). This gives two thresholds: if the processing time of the current processor >= the current best solution, or the number of remaining processors × the processing time of the previous processor < the processing time required by the remaining jobs, then backtrack, because under the ordering constraint no solution is possible.
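The two thresholds of Method 2 can be written as a single test. The Python sketch below is only an illustration; the function name `should_backtrack` and its parameter names are hypothetical and do not come from the attached programs.

```python
def should_backtrack(current_time, best_time, processors_left,
                     previous_time, remaining_work):
    """Return True if the current branch of Method 2 cannot lead to a solution.

    current_time    -- processing time of the processor being filled
    best_time       -- processing time of the best solution found so far
    processors_left -- number of processors not yet filled
    previous_time   -- processing time of the processor just completed
    remaining_work  -- total processing time of the jobs not yet placed
    """
    if current_time >= best_time:
        return True  # no better than the best solution already known
    # Every later processor may take at most previous_time, so together
    # they can absorb at most processors_left * previous_time of work.
    return processors_left * previous_time < remaining_work
```

For example, with 2 processors left, a previous processor time of 3 and 7 units of work remaining, the branch is cut because 2 × 3 < 7.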

From this comparison, Method 2 is clearly better than Method 1. Method 2 is discussed in more depth below.

For Task 1, an upper bound on Time[1] can first be obtained greedily. A lower bound on Time[1] is also available: UP(total job time / number of processors), where UP(x) denotes the smallest integer greater than or equal to x. The search starts from the upper bound, and as soon as a solution equal to the lower bound is found, the search can stop.
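Both bounds for Task 1 are cheap to compute. The Python sketch below follows the greedy rule used in the attached jobs_1.pas (sort the jobs in descending order, then give each job to the processor with the least load so far); the function name `makespan_bounds` is made up for this illustration.

```python
import heapq


def makespan_bounds(times, n):
    """Return (lower, upper) bounds on the best finishing time on n processors."""
    # Lower bound: UP(total job time / number of processors), integer ceiling.
    lower = -(-sum(times) // n)
    # Upper bound: greedy schedule, longest job first onto the least-loaded processor.
    loads = [0] * n
    heapq.heapify(loads)
    for t in sorted(times, reverse=True):
        heapq.heapreplace(loads, loads[0] + t)  # loads[0] is the smallest load
    return lower, max(loads)
```

For the jobs 5, 4, 3, 3, 3 on two processors this gives a lower bound of 9 and a greedy upper bound of 10, so the search only has to decide whether a schedule of length 9 exists (it does: 5 + 4 and 3 + 3 + 3).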

(with program jobs_1.pas)

For Task 2, depth-first search with a variable lower bound can be used. The lower bound is UP(total job time / time limit), the minimum number of processors that could possibly suffice; it is raised by one each time the search fails. The upper bound of Time[1] is set to T.
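The raise-the-bound loop for Task 2 can be sketched in Python. For brevity this sketch places jobs one at a time (Method 1 style) rather than filling processor after processor as the attached jobs_2.pas does, so it illustrates the variable lower bound and not the article's exact search. The names `fits` and `min_processors` are hypothetical.

```python
def fits(jobs, k, limit):
    """Depth-first test: can jobs (sorted descending) run on k processors within limit?"""
    loads = [0] * k

    def place(i):
        if i == len(jobs):
            return True
        tried = set()
        for p in range(k):
            # Skip processors with a load already tried: they are interchangeable.
            if loads[p] + jobs[i] <= limit and loads[p] not in tried:
                tried.add(loads[p])
                loads[p] += jobs[i]
                if place(i + 1):
                    return True
                loads[p] -= jobs[i]
        return False

    return place(0)


def min_processors(times, limit):
    """Smallest processor count that finishes all jobs within limit, or None."""
    if not times:
        return 0
    if max(times) > limit:
        return None  # a single job already exceeds the time limit
    jobs = sorted(times, reverse=True)
    k = -(-sum(jobs) // limit)  # lower bound: UP(total job time / time limit)
    while not fits(jobs, k, limit):
        k += 1  # the search failed, so raise the lower bound by one
    return k
```

For the jobs 4, 4, 4, 3, 3 with limit 9 the lower bound is 2, but no subset of the jobs sums to exactly 9, so the search fails once and the answer is 3.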

(attached program jobs_2.pas)

Summary

Search is very widely applicable, and almost every problem can be attacked with it. Even so, search must not be abused: it should be used only when no pattern can be found in the problem. Once you decide to search, you must find a way to prune. Neither the general pruning methods nor the small search tricks can reduce the time complexity of search, but they are always beneficial.

Programs

1. Shortest number sequence: sab.pas

program sab;

type aa=string[100];

     ltype=record

                 f:integer; {parent pointer}

                 k,d,la,lb:shortint;

   {k--the remaining string flag, d--the number of elements in the sequence, la, lb--the length of the two strings of A and B}

                 st:^aa; {remaining string}

     end;

const maxn=1300;

var t,h:array[0..1] of integer; {h--queue head pointer, t--queue tail pointer, 0 means forward, 1 means reverse}

    p:array[0..1,1..maxn] of ltype; {p[0]--forward search table, p[1]--reverse search table}

    strs:array[1..2,1..20] of string[20]; {strs[1]--table A element, strs[2]--table B element}

    n:integer; {Number of elements in table A and table B}

procedure readp; {read data}

var f:text;

    st:string;

    i,j:integer;

begin

     write('File name:');

     readln(st);

     assign(f,st);

     reset(f);

     readln(f,n);

     for i:=1 to n do

         readln(f,strs[1,i]);

     for i:=1 to n do

         readln(f,strs[2,i]);

     close(f);

end;

procedure print(q,k:integer); {Starting from k, output the element number searched along the direction of q}

begin

     if k<>1 then begin

        if q=1 then

           writeln(p[q,k].d);

        print(q,p[q,k].f);

        if q=0 then

           writeln(p[q,k].d);

     end;

end;

procedure check(q:shortint); {judging whether the two directions coincide, q means the opposite direction of the direction of the newly generated node}

var i:integer;

begin

     for i:=1 to t[1-q]-1 do

         if (p[q,t[q]].k<>p[1-q,i].k) and (p[q,t[q]].st^=p[1-q,i].st^) and

            (p[q,t[q]].la+p[1-q,i].la<=100) and (p[q,t[q]].lb+p[1-q,i].lb<=100)

            then begin

                      if q=0 then

                         begin

                              print(0,t[q]);

                              print(1,i);

                         end

                      else begin

                                print(0,i);

                                print(1,t[q]);

                           end;

                      halt;

            end;

end;

procedure find(q:shortint); {expand a layer of nodes along the q direction}

var i:integer;

    sa, sb: aa;

begin

     for i:=1 to n do

         if (p[q,h[q]].la+length(strs[1,i])<=100) and

            (p[q,h[q]].lb+length(strs[2,i])<=100) then

            begin

                 sa:='';sb:='';

                 if p[q,h[q]].k=1

                    then sa:=p[q,h[q]].st^

                    else sb:=p[q,h[q]].st^;

                 if q=0 then {add the element number i to the sequence in different directions}

                    begin

                         sa:=sa+strs[1,i];

                         sb:=sb+strs[2,i];

                         while (sa<>'') and (sb<>'') and (sa[1]=sb[1]) do

                               begin

                                    delete(sa,1,1);

                                    delete(sb,1,1);

                               end

                    end

                 else begin

                           sa:=strs[1,i]+sa;sb:=strs[2,i]+sb;

                           while (sa<>'') and (sb<>'') and

                                 (sa[length(sa)]=sb[length(sb)]) do

                                 begin

                                      delete(sa,length(sa),1);

                                      delete(sb,length(sb),1);

                                 end;

                      end;

                 if (sa='') or (sb='') then {generate a new node}

                    with p[q,t[q]] do

                         begin

                              f:=h[q];d:=i;

                              la:=p[q,h[q]].la+length(strs[1,i]);

                              lb:=p[q,h[q]].lb+length(strs[2,i]);

                              new(st);

                              if sa='' then

                                 begin

                                      k:=2;st^:=sb

                                 end

                              else begin

                                        k:=1;st^:=sa;

                                   end;

                              check(q);

                              inc(t[q]);

                         end;

            end;

     inc(h[q]);

end;

begin

     readp;

     h[0]:=1;h[1]:=1;

     t[0]:=2;t[1]:=2;

     new(p[0,1].st);p[0,1].st^:='';

     new(p[1,1].st);p[1,1].st^:='';

     {queue initialization}

     while (h[0]<t[0]) and (h[1]<t[1]) and (t[0]<maxn) and (t[1]<maxn) do

           if t[0]<t[1] {Compare the number of nodes in the two directions, and expand to the direction with fewer nodes}

              then find(0)

              else find(1);

     writeln('No answer!');

end.

2. Task arrangement: travell.pas

program travell;

var path, {path[i,j]--the jth transportation destination starting from i}

    next, {next[i,j]--the next vertex of vertex i in the shortest path from i to j}

    dist, {dist[i,j]--the shortest path length from i to j}

    road:array[1..60,1..60] of integer; {road adjacency matrix}

    head, {head[i]--the number of tasks starting from i}

tail, {tail[i]--0 means no task or completed with i as the end point}

   {1 means that all vertices of the task ending at i are in the path to complete the task}

   {k+1 means all tasks with i as the end point, and there are still k vertices not reached}

    arrive:array[1..60] of integer; {arrive[i]--the passing times of vertex i}

    d, {complete task path}

    bestd:array[1..100] of integer; {the current best path to complete the task}

    left, {the number of nodes that must be passed through}

    cost, {current cost}

    mincost, {best cost to complete the task}

    s, {number of vertices passed through}

    bests, {the number of vertices that best complete the task}

    m, {number of tasks}

    n:integer; {Number of cities}

procedure findshortest; {Find the shortest path between any two points}

var i,j,k:integer;

begin

     for i:=1 to n do

         for j:=1 to n do

             if road[i,j]=1 then

                begin

                     dist[i,j]:=1;

                     next[i,j]:=j

                end

             else dist[i,j]:=100;

     for k:=1 to n do

         for i:=1 to n do

             for j:=1 to n do

                 if dist[i,k]+dist[k,j]<dist[i,j] then

                    begin

                         dist[i,j]:=dist[i,k]+dist[k,j];

                         next[i,j]:=next[i,k];

                    end;

end;

procedure init; {read in data and initialize data}

var i,j,k:integer;

    st:string;

    f:text;

begin

     write('File name:');

     readln(st);

     assign(f,st);

     reset(f);

     readln(f,n);

     for i:=1 to n do

         for j:=1 to n do

             read(f,road[i,j]);

     findshortest;

     readln(f,m);

     for i:=1 to m do

         begin

              read(f,j,k);

              if (dist[1,j]<100) and (dist[1,k]<100) then

                 begin

                      inc(head[j]);

                      inc(tail[k]);

                      path[j,head[j]]:=k;

                 end;

         end;

     close(f);

     for i:=1 to n do {tail is indexed by city, so loop over the n cities}

         if tail[i]>0 then

            inc(tail[i]);

     for i:=1 to head[1] do

         dec(tail[path[1,i]]);

     head[1]:=0;inc(s);d[s]:=1;left:=0;

     cost:=0;mincost:=maxint;

     for i:=2 to n do

         if (head[i]>0) or (tail[i]>0) then

            inc(left);

end;

procedure try; {search procedure}

var i,j,k:integer;

    p:boolean;

begin

     if (cost+left>=mincost) or (cost+dist[1,d[s]]>=mincost) then exit;

     if left=0 then {whether all tasks are completed}

        begin

             mincost:=cost+dist[1,d[s]];

             bestd:=d;

             bests:=s;

             inc(bests);bestd[bests]:=1;

             exit;

        end;

     for i:=2 to n do

         if (head[i]>0) or (tail[i]=1) then {if it is necessary to go to i vertex}

            begin

                 inc(cost,dist[d[s],i]);

                 inc(arrive[i]);

                 inc(s);

                 d[s]:=i;

                 if arrive[i]=1 then  

{If the i vertex arrives for the first time, the end tail value of all tasks starting from i will be reduced by 1}

                    for j:=1 to head[i] do

                        dec(tail[path[i,j]]);

                 k:=head[i];

                 head[i]:=0;

                 if tail[i]=1 then  

 {If all tasks with i as the end point are completed, this point does not need to go through}

                    begin

                         p:=true;

                         dec(tail[i]);

                    end

                    else p:=false;

                 if tail[i]=0 then  

                    dec(left);

                 try;

   {restore data before recursion}

                 if tail[i]=0 then

                    inc(left);

                 if p then

                    inc(tail[i]);

                 head[i]:=k;

                 if arrive[i]=1 then

                    for j:=1 to head[i] do

                        inc(tail[path[i,j]]);

                 dec(s);

                 dec(arrive[i]);

                 dec(cost,dist[d[s],i]);

         end;

end;

procedure show(i,j:integer); {output the shortest path from i to j}  

begin

     while i<>j do begin

           write('-->',next[i,j]);i:=next[i,j];

     end;

end;

procedure print; {output the best task scheduling scheme}

var i:integer;

begin

     write(1);

     for i:=1 to bests-1 do

         show(bestd[i],bestd[i+1]);

     writeln;

     writeln('Min cost=',mincost);

end;

begin

     init;

     try;

     print;

end.

3. Multiprocessor scheduling problem:

Task 1: jobs_1.pas

program jobs_1;

const maxn=100; {maximum number of processors}

      maxm=100; {maximum number of jobs}

var

   t:array[1..maxm] of integer; {t[i]--the time required to process job i}

   time, {time[i]--the processing time of the i processor}

   l, {l[i]--the number of jobs processed by the i-th processor}

   l1:array[0..maxn] of integer; {l1[i]--the number of jobs processed by processor i in the current optimal solution}

   a, {a[i,j]--The time spent on the jth job processed by the i-th processor}

   a1:array[1..maxn,1..maxm] of integer;

   {a1[i,j]--the time spent on the jth job processed by the i-th processor in the current optimal solution}

   done:array[1..maxm] of boolean; {done[i]--true means job i has been completed, false means it has not been completed}

   least, {lower bound of processing time}

   i,j,k,n,m,

   min, {the processing time of the current optimal solution}

   rest:integer; {total time of remaining jobs}

procedure print; {output optimal solution}

var i,j:integer;

begin

     for i:=1 to n do

         begin

              write(i,':');

              for j:=1 to l1[i] do

                  write(a1[i,j]:4);

              writeln;

         end;

     writeln('T0=',time[0]+1);

end;

procedure readp; {read data}

var

   f:text;

   st:string;

   i,j,k:integer;

begin

     write('File name:');readln(st);

     assign(f,st);reset(f);

     readln(f,n,m);

     for i:=1 to m do

  begin

          read(f,t[i]);inc(rest,t[i]);

         end;

     close(f);

     least:=(rest-1) div n+1; {set the lower bound}

     for i:=1 to m-1 do  {sort the jobs in descending order of time}

         for j:=i+1 to m do

             if t[j]>t[i] then

                begin k:=t[i];t[i]:=t[j];t[j]:=k end;

end;

procedure try(p,q:integer); {select jobs from p--m and put them on processor q}

var j:integer;

    z:boolean;

begin

     

     z:=true;

     for j:=p to m do

         if not done[j] and (time[q]+t[j]<=time[q-1]) then {choose a suitable job}

            begin

                 z:=false;done[j]:=true;

                 inc(l[q]);a[q,l[q]]:=t[j];inc(time[q],t[j]);dec(rest,t[j]);

                 try(j+1,q);

                 dec(l[q]);dec(time[q],t[j]);inc(rest,t[j]);done[j]:=false;

                 if time[1]>time[0] then exit;

                 {once a solution is found, unwind to processor 1 so that time[1] drops to at most time[0];}

                 {processors 2--n need not be searched any further}

            end;

     if z and ((n-q)*time[q]>=rest) then {if processor q is unable to place any jobs}

        if rest=0 then {find a set of solutions}

           begin

                a1:=a;l1:=l;

                time[0]:=time[1]-1;

                if time[1]=least then

                   begin print;halt end;

           end

        else if q<n then {continue searching}

                try(1,q+1);

end;

begin

     readp;

     fillchar(time,sizeof(time),0);

     fillchar(a,sizeof(a),0);

     fillchar(l1,sizeof(l1),0);

     fillchar(l,sizeof(l),0);

     for i:=1 to m do {greedy seeking upper bound}

         begin

              k:=1;

              for j:=2 to n do

                  if time[j]<time[k] then k:=j;

              time[k]:=time[k]+t[i];

              l1[k]:=l1[k]+1;

              a1[k,l1[k]]:=t[i];

         end;

     min:=time[1];time[0]:=min-1;

     for i:=2 to n do

         if time[i]>min then min:=time[i];

     if min=least then {if the upper and lower bounds are equal}

        begin

             print;

             halt;

        end;

     fillchar(time,sizeof(time),0);

     time[0]:=min-1; {Reduce the upper bound by 1 to find a better solution}

     try(1,1);

     print;

end.

 Task 2: jobs_2.pas

program jobs_2;

const maxn=100; {maximum number of processors}

      maxm=100; {maximum number of jobs}

var

   t:array[1..maxm] of integer; {t[i]--the time required to process job i}

   time, {time[i]--the processing time of the i processor}

   l, {l[i]--the number of jobs processed by the i-th processor}

   l1:array[0..maxn] of integer; {l1[i]--the number of jobs processed by processor i in the current optimal solution}

   a, {a[i,j]--The time spent on the jth job processed by the i-th processor}

   a1:array[1..maxn,1..maxm] of integer;

   {a1[i,j]--the time spent on the jth job processed by the i-th processor in the current optimal solution}

   done:array[1..maxm] of boolean; {done[i]--true means job i has been completed, false means it has not been completed}

   least, {lower bound on the number of processors}

   i,j,k,tmax,m,

   min, {the processing time of the current optimal solution}

   rest:integer; {total time of remaining jobs}

procedure print; {output optimal solution}

var i,j:integer;

begin

     for i:=1 to least do

         begin

              write(i,':');

              for j:=1 to l1[i] do

                  write(a1[i,j]:4);

              writeln;

         end;

     writeln('Min=',least);

end;

procedure readp; {read data}

var

   f:text;

   st:string;

   i,j,k:integer;

begin

     write('File name:');readln(st);

     assign(f,st);reset(f);

     readln(f,tmax,m);

     for i:=1 to m do begin

         read(f,t[i]);inc(rest,t[i]);

     end;

     close(f);

     least:=(rest-1) div tmax+1; {determine the lower bound}

     for i:=1 to m-1 do  {sort the jobs in descending order of time}

         for j:=i+1 to m do

             if t[j]>t[i] then

                begin k:=t[i];t[i]:=t[j];t[j]:=k end;

end;

procedure try(p,q:integer); {select jobs from p--m and put them on processor q}

var j:integer;

    z:boolean;

begin

     z:=true;

     for j:=p to m do

         if not done[j] and (time[q]+t[j]<=time[q-1]) then {find a suitable job}

            begin

                 z:=false;done[j]:=true;

                 inc(l[q]);a[q,l[q]]:=t[j];inc(time[q],t[j]);dec(rest,t[j]);

                 try(j+1,q);

                 dec(l[q]);dec(time[q],t[j]);inc(rest,t[j]);done[j]:=false;

            end;

     if z and ((least-q)*time[q]>=rest) then {if processor q is unable to place any jobs}

        if rest=0 then {find the optimal solution}

           begin

                a1:=a;l1:=l;

              print;halt

           end

        else if q<min then {continue searching}

                try(1,q+1);

end;

begin

     readp;

     for i:=1 to m do {judging no solution, that is, the time required for a certain task exceeds the specified time}

         if t[i]>tmax then

            begin writeln('No answer!');exit end;

     repeat

           fillchar(time,sizeof(time),0);

           fillchar(l,sizeof(l),0);

           fillchar(l1,sizeof(l1),0);

           for i:=1 to m do {greedy seeking upper bound}

               begin

                    k:=1;

                    for j:=2 to least do

                        if time[j]<time[k] then k:=j;

                    time[k]:=time[k]+t[i];

                    l1[k]:=l1[k]+1;

                    a1[k,l1[k]]:=t[i];

               end;

           min:=time[1];

           for i:=2 to least do

               if time[i]>min then min:=time[i];

           if min<=tmax then {if the greedy schedule already fits within the time limit}

              begin

                   print;halt;

              end;

           fillchar(time,sizeof(time),0);

           time[0]:=tmax;

           try(1,1);

           inc(least); {lower bound plus one}

     until least>m;

     print;

end.



Origin blog.csdn.net/lljloimjo/article/details/132598772