2020 Software Engineering individual project: counting the intersection points of plane geometry

| Item | Content |
| --- | --- |
| Course link | Spring 2020 Computer Software Engineering Institute (Roger Ren Jian) |
| Work requirements | Individual project work |
| Course goals | Learn the theory and process of developing software systems, and accumulate development experience through practice |
| What I gained from this assignment | Familiarity with C++ syntax and the VS tools, and simple practice of the project development process described in the textbook |
| Class | 005 |
| Project address | https://github.com/eitbar/IntersectProject.git |

First, estimated and actual development time for each stage

| PSP2.1 | Personal Software Process Stages | Estimated time (minutes) | Actual time (minutes) |
| --- | --- | --- | --- |
| Planning | Planning | | |
| · Estimate | · Estimate how much time this task requires | 10 | 10 |
| Development | Development | | |
| · Analysis | · Requirements analysis (including learning new technologies) | 60 | 60 |
| · Design Spec | · Generate design documents | 40 | 100 |
| · Design Review | · Design review (reviewing the design documents with colleagues) | 10 | 50 |
| · Coding Standard | · Code specifications (setting appropriate standards for the current development) | 0 | 0 |
| · Design | · Detailed design | 60 | 60 |
| · Coding | · Detailed coding | 150 | 150 |
| · Code Review | · Code review | 10 | 20 |
| · Test | · Testing (self-test, modify the code, submit modifications) | 60 | 120 |
| Reporting | Reporting | | |
| · Test Report | · Test report | 30 | 30 |
| · Size Measurement | · Workload measurement | 10 | 10 |
| · Postmortem & Process Improvement Plan | · Postmortem and process improvement plan | 60 | 120 |
| | Total | 580 | 730 |

As the table shows, the actual time for the design, testing, and summary stages differed considerably from the estimates.

I counted designing the algorithm and the code structure as writing the design document, and discussing the approach with classmates as design review. In hindsight, these two parts should not have taken so long. I worried too much about performance and was indecisive, which is not a good thing. Judging from the final result, after agonizing for a long time I still simply used brute force, so much of the time spent on these two parts was "wasted".

The testing stage overran because I was unfamiliar with unit testing, so finding errors took time, and researching ways to improve performance also took a long time.

Second, description of problem-solving ideas

Requirements analysis

Functional requirements: given N straight lines, count the points in the plane that lie on at least two of the given lines. The input is guaranteed to have only a finite number of answers.

Test requirements: the submitted program must accept its input parameters in the agreed format.

Additional requirement: support circles as well.

In plain language, the problem is to count the distinct intersection points of N straight lines in the plane. Note the following points:

  • Several lines may intersect at one point.
  • Each line is given by two points with integer coordinates, but the intersection coordinates may be fractional.
  • The project must support command-line parameters, read the data from a file, and write the output to a file.
  • The total number of intersections is less than 5,000,000, which means many of the calculated points may be duplicates.
  • N can be large.

Required knowledge: how to represent lines and circles, and how to compute the intersections of two lines, of two circles, and of a circle and a line.

Problem-solving ideas

The most direct idea is to compute the intersection of every pair of the N lines, store each coordinate only if an equal one has not been stored yet, and finally count the stored coordinates, which is the answer. With hashing used for deduplication and assuming each intersection calculation costs a constant \(O(k)\), the time complexity of this approach is \(O(N^2 \cdot k)\) and the space complexity is \(O(N^2)\). Estimating roughly \(10^8\) operations per second and allowing for the constant factor, this approach finishes within 60 seconds when N is at most 1000. That is enough for the correctness score, but the performance part of the assignment needs something better.

On further reflection, parallel lines certainly have no intersection, so can parallel lines be skipped during the traversal? We can compute the slope of each line and group the lines so that lines with the same slope form one group; lines within the same group need not be checked against each other. If the lines end up divided into \(m\) groups of \(l_i\) lines each, this saves on the order of \(\sum_{i=1}^{m} l_i^2\) calculations. In some cases this may improve performance somewhat, but it still cannot guarantee a result in under 60 s.
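As a rough illustration of the grouping idea, the sketch below counts how many line pairs still need an intersection calculation once pairs inside the same slope group are skipped. It is not the project's code: `LineInput`, `directionKey`, and `pairsToCompute` are hypothetical names, and instead of a floating-point slope it groups lines by a reduced integer direction vector, which is possible here only because the input points have integer coordinates.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <utility>
#include <vector>

// Hypothetical input record: a line through two distinct integer points.
struct LineInput {
    long long x1, y1, x2, y2;
};

// Euclid's algorithm on non-negative values.
static long long gcdll(long long a, long long b) {
    while (b != 0) {
        long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Reduce the direction (dx, dy) to a canonical form so that parallel lines
// share the same key without computing a floating-point slope.
std::pair<long long, long long> directionKey(const LineInput& in) {
    long long dx = in.x2 - in.x1;
    long long dy = in.y2 - in.y1;
    long long g = gcdll(std::llabs(dx), std::llabs(dy));
    assert(g != 0);  // the two defining points must be distinct
    dx /= g;
    dy /= g;
    // Fix the sign so (dx, dy) and (-dx, -dy) map to the same key.
    if (dx < 0 || (dx == 0 && dy < 0)) {
        dx = -dx;
        dy = -dy;
    }
    return std::make_pair(dx, dy);
}

// Number of line pairs left to intersect after skipping pairs that belong
// to the same slope group.
long long pairsToCompute(const std::vector<LineInput>& lines) {
    std::map<std::pair<long long, long long>, long long> groupSize;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        ++groupSize[directionKey(lines[i])];
    }
    long long n = static_cast<long long>(lines.size());
    long long total = n * (n - 1) / 2;  // all unordered pairs
    std::map<std::pair<long long, long long>, long long>::const_iterator it;
    for (it = groupSize.begin(); it != groupSize.end(); ++it) {
        long long l = it->second;
        total -= l * (l - 1) / 2;  // pairs inside one group never intersect
    }
    return total;
}
```

With six lines of which three share one direction and two share another, `pairsToCompute` returns 15 - 3 - 1 = 11, so the saving only becomes significant when the groups are large.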

Thinking again, the main bottleneck of the algorithm is the repeated calculation of the same point. Can this repetition be reduced somehow? One idea is to add the N lines to the plane one at a time and, when computing an intersection, record how many lines pass through it. However, since I have little C++ experience and could not estimate the time complexity of the intersection calculation on top of the C++ Standard Library data structures, I gave up on this idea.

Finally, for the additional circle requirement, the lines can first be processed with the method above, and then the circles handled directly by brute force.

I searched online for related problems but did not find a good algorithm. I hope to learn something new from the teachers' and other students' blogs after the assignment ends.

Third, the design and implementation process

First, the input geometry, i.e. lines and circles, needs to be stored in classes; anything that can be calculated in advance is stored in the class to save time later. The algorithm also needs a class for the intersection points of the geometry.

Three classes are designed:

  • Point class (intersection):
    • Attribute: double: abscissa;
    • Attribute: double: ordinate;
  • Line class:
    • Attribute: double,double,double: the three parameters of the general line equation ( \(Ax + By + C = 0\) );
    • Attribute: double,double: slope (defined as INF when it is infinite) and intercept (defined as INF when it is infinite);
    • Attribute: double,double: a point that the line passes through;
    • Method: Point: find the intersection with another line
  • Circle class (named Cycle in the code):
    • Attribute: double,double: center coordinates;
    • Attribute: double: radius;
    • Attribute: double,double,double: the three parameters of the general circle equation ( \(x^2 + y^2 + Dx + Ey + F = 0\) )
    • Method: std::vector<Point>: find the intersections with a line
    • Method: std::vector<Point>: find the intersections with another circle
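The class design above might be sketched roughly as follows. This is only an illustration under assumptions: the field names `a`, `b`, `c`, `k`, `b1`, `x1`, `y1`, `x`, `y`, `r` and the constant name `inf_k` follow the code shown later in this post, while the constructors, the fields `D`, `E`, `F`, and the choice of IEEE infinity for `inf_k` are guesses made for the example. The intersection methods are left out.

```cpp
#include <cassert>
#include <limits>

// Assumed value for the "slope does not exist" marker.
const double inf_k = std::numeric_limits<double>::infinity();

struct Point {
    double x, y;  // abscissa and ordinate
    Point(double x_ = 0, double y_ = 0) : x(x_), y(y_) {}
};

struct Line {
    double a, b, c;  // general form: a*x + b*y + c = 0
    double k, b1;    // slope and intercept, inf_k for a vertical line
    double x1, y1;   // one point on the line

    // Build the line through (px, py) and (qx, qy).
    Line(double px, double py, double qx, double qy) : x1(px), y1(py) {
        assert(px != qx || py != qy);  // the two points must differ
        a = qy - py;
        b = px - qx;
        c = qx * py - px * qy;
        if (px == qx) {
            k = inf_k;
            b1 = inf_k;
        } else {
            k = (qy - py) / (qx - px);
            b1 = py - k * px;
        }
    }
};

struct Cycle {
    double x, y, r;  // center and radius
    double D, E, F;  // general form: x^2 + y^2 + D*x + E*y + F = 0

    Cycle(double cx, double cy, double cr)
        : x(cx), y(cy), r(cr),
          D(-2 * cx), E(-2 * cy), F(cx * cx + cy * cy - cr * cr) {}
};
```

Precomputing the general-form coefficients in the constructors matches the stated goal of storing whatever can be calculated in advance.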

Intersections are stored in a data structure that supports fast lookup, such as map or set. Lines are stored in an array, set, or another data structure that can keep them in order. Circles can be stored in any data structure that can be traversed.

The main function reads the command-line parameters, does some simple parsing, then reads the data and stores it in arrays or other data structures. The problem is then split into two parts: intersections between lines, and intersections between circles and other geometry. It calls solveLine() to count the intersections between lines, then calls solveCycle() to count the intersections between circles and lines or other circles. Finally, it writes the output to a file.

Main flow of solveLine(): first group the lines by slope; initially the plane contains no geometry. Then add the lines to the plane one by one. For each added line, call the line-line intersection function against the lines already in the plane and check whether each intersection has already been recorded. If not, store the intersection in the data structure and increment the count.

Main flow of solveCycle(): add the circles to the plane one by one. For each added circle, call the intersection functions against the other geometry already in the plane and check whether each intersection has already been recorded. If not, store the intersection in the data structure and increment the count.

Line-line intersection function: first check whether the two lines are parallel. If they are, return a special point; otherwise compute the intersection with the formula. Unit tests are needed here to verify the results, covering parallel lines as well as intersecting lines whose slope is 0 or does not exist.
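A minimal sketch of such a function, using the general form \(Ax + By + C = 0\) and Cramer's rule, could look like the following. It differs from the description in one labeled way: instead of returning a special point for parallel lines it returns `false`, and the names `intersectLines` and `EPS` are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>

struct Point { double x, y; };
struct Line { double a, b, c; };  // a*x + b*y + c = 0

const double EPS = 1e-8;  // same precision as the comparison function

// Writes the intersection to `out` and returns true when the lines meet in
// exactly one point; returns false for parallel or coincident lines.
// The absolute tolerance on the determinant assumes coefficients of moderate
// size, e.g. those built from integer input points.
bool intersectLines(const Line& p, const Line& q, Point& out) {
    assert((p.a != 0 || p.b != 0) && (q.a != 0 || q.b != 0));
    double det = p.a * q.b - q.a * p.b;
    if (std::fabs(det) < EPS) {
        return false;
    }
    out.x = (p.b * q.c - q.b * p.c) / det;
    out.y = (q.a * p.c - p.a * q.c) / det;
    return true;
}
```

Working in the general form means horizontal and vertical lines need no special branch, which covers the "slope is 0" and "slope does not exist" test cases directly.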

Circle-line intersection function: first compute the distance from the center to the line to determine how many intersections there are. If there is none, return an empty vector. If the line is tangent, compute the foot of the perpendicular from the center to the line and return a vector containing that one point. If there are two intersections, compute the foot of the perpendicular from the center to the line, then add and subtract a vector of the appropriate length along the direction of the line to obtain the two points, and return a vector containing both. Unit tests are needed here to verify the results, covering a circle and a line that are tangent, intersecting, and disjoint, including lines whose slope is 0 or does not exist.

Circle-circle intersection function: first use the distance between the two centers to decide whether the circles intersect; if they do not, return an empty vector. If they do, subtract the general equations of the two circles to obtain the line through their intersection points, then compute the intersections of that line with one of the circles. Unit tests are needed here, covering circles that are internally tangent, externally tangent, intersecting, disjoint, and one containing the other.
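The idea of reducing the circle-circle case to a circle-line case can be sketched as below. This is an illustration rather than the project's implementation: the struct layouts, `EPS`, and the function names are assumptions, and the circle-line helper works in the general form of the line instead of the slope-based version shown later in this post.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point { double x, y; };
struct Line { double a, b, c; };    // a*x + b*y + c = 0
struct Circle { double x, y, r; };  // center (x, y), radius r

const double EPS = 1e-8;

// Foot of the perpendicular plus or minus half the chord along the line.
std::vector<Point> intersectCircleLine(const Circle& c, const Line& l) {
    std::vector<Point> ps;
    double norm2 = l.a * l.a + l.b * l.b;
    assert(norm2 > 0);
    double norm = std::sqrt(norm2);
    double s = l.a * c.x + l.b * c.y + l.c;  // signed distance times norm
    double dist = std::fabs(s) / norm;
    if (dist > c.r + EPS) return ps;         // disjoint
    double fx = c.x - l.a * s / norm2;       // foot of the perpendicular
    double fy = c.y - l.b * s / norm2;
    if (std::fabs(dist - c.r) <= EPS) {      // tangent
        Point foot = {fx, fy};
        ps.push_back(foot);
        return ps;
    }
    double h = std::sqrt(c.r * c.r - dist * dist);  // half chord length
    double ux = -l.b / norm, uy = l.a / norm;       // unit direction
    Point p1 = {fx + h * ux, fy + h * uy};
    Point p2 = {fx - h * ux, fy - h * uy};
    ps.push_back(p1);
    ps.push_back(p2);
    return ps;
}

std::vector<Point> intersectCircles(const Circle& p, const Circle& q) {
    double dx = q.x - p.x, dy = q.y - p.y;
    double d = std::sqrt(dx * dx + dy * dy);
    // Concentric, too far apart, or one strictly inside the other.
    if (d < EPS || d > p.r + q.r + EPS || d < std::fabs(p.r - q.r) - EPS) {
        return std::vector<Point>();
    }
    // Subtracting the two general equations cancels x^2 and y^2 and leaves
    // the line (D1 - D2)x + (E1 - E2)y + (F1 - F2) = 0.
    double D1 = -2 * p.x, E1 = -2 * p.y, F1 = p.x * p.x + p.y * p.y - p.r * p.r;
    double D2 = -2 * q.x, E2 = -2 * q.y, F2 = q.x * q.x + q.y * q.y - q.r * q.r;
    Line radical = {D1 - D2, E1 - E2, F1 - F2};
    return intersectCircleLine(p, radical);
}
```

For example, the circles with centers (0, 0) and (6, 0) and radius 5 give the line x = 3, which meets the first circle at (3, 4) and (3, -4).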

Besides the essential functions, a floating-point comparison function is needed, with the precision set to 1e-8. The function takes two parameters and returns a positive number if the first is greater than the second, 0 if they are equal, and a negative number if the first is less. Unit tests are needed here to verify that the floating-point comparison is correct.
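A comparison function with this contract can be sketched in a few lines. The name `doublecompare` matches the calls in the code later in this post; the body and the constant name `EPS` are assumptions, since the original implementation is not shown.

```cpp
#include <cassert>
#include <cmath>

const double EPS = 1e-8;  // comparison precision

// Returns 0 when a and b differ by less than EPS, 1 when a is greater,
// and -1 when a is smaller.
int doublecompare(double a, double b) {
    assert(!std::isnan(a) && !std::isnan(b));  // NaN has no ordering
    if (std::fabs(a - b) < EPS) {
        return 0;
    }
    return a > b ? 1 : -1;
}
```

Note that this kind of tolerance-based equality is not transitive: a may equal b and b may equal c within EPS while a and c differ by more than EPS.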

If the data structures used require operators to be overloaded for the stored type, the corresponding overloads also need to be written.

Fourth, performance improvement

Performance analysis of the first version

After finishing the first version of the code, fixing bugs repeatedly, and passing a series of unit tests, I ran a performance analysis on the program.

The performance analysis used 7000 randomly generated geometric objects, 3500 lines and 3500 circles, and the final count was 22191950 intersections. The random data was generated with Python's random module.

The CPU usage analysis from the VS2019 performance profiler is as follows:

As can be seen, the most time-consuming function is the one that counts the intersections between circles and all other geometry, which is reasonable. Opening the detailed report, the analysis of the solveCycle() function is as follows:

Surprisingly, while finding the intersections for circles, the calculation itself takes less time than modifying and querying the map. Looking it up, I learned that the C++ standard library map is implemented mainly as a red-black tree, so queries and modifications both take \(O(\log n)\). In this problem, however, the ordering is not needed at all, and the Point type is fairly simple. After some searching, I found that C++ has the unordered_map data structure, which is similar to Java's HashMap and is faster than map for queries and modifications. So I decided to replace the data structure that records the intersections with unordered_map and to write a hash function for Point.

Performance analysis after the first optimization

After replacing the data structure that records the intersections with unordered_map, I ran the same performance analysis with the same data and obtained the following results:

Judging from the total run time of the program, replacing the data structure brought a significant improvement. The run time dropped from the original 1 minute 01 seconds to 24.5 seconds, which was worth being happy about for a little while :)

But then I felt something was odd. I had thought the performance bottleneck of the program would be the repeated calculation of intersections, so why was it the data structure that records them?

Further thoughts on performance optimization

Then, by comparing the test data with the performance results, I found that my test data did not match the requirements of the problem, so the performance analysis did not really help me find the bottleneck of the algorithm.

My randomly generated data has very few coinciding intersections, so the total number of intersections is large and most of the time goes into maintaining the data structure that holds them. The problem statement clearly limits the number of intersections. For this problem, performance optimization should really focus on reducing the repeated calculation of the same point, and changing the data structure may account for only a small share of the time in the actual tests.

Unreasonable test data led me to an unreasonable performance bottleneck. The performance analysis did not really help me find the bottleneck of the algorithm, and the change of data structure will probably not help much in the end.

Besides how to generate more reasonable test data, another question puzzles me. In real software development, performance optimization should be an essential step, but before the software has taken shape and before there is any user data, how do you know where the key to performance optimization lies? Suppose the emphasis is wrong: for example, I put great effort into writing an extremely fast insertion data structure (good enough to publish a paper and reach the pinnacle of life), but in actual use nobody uses this feature, or its speed does not really affect the customer experience. Wasn't my effort in vain?

Fifth, key code description

I think the key code has three parts.

  • solveLine(): when a line is added to the plane, the intersection calculation skips the lines parallel to it. This should be the only difference from classmates who find the intersections by direct brute force. Inferring from the small number of intersections, there should not be many parallel lines, so the performance gain is hardly worth mentioning. The annotated code is shown below:

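    int solveLine() {
      // ans counts the intersection points
      int ans = 0;
      // sort the lines by slope, which groups lines with equal slope together
      sort(l, l + ln);
      // iterator used for lookups in the map
      umap::iterator iter;
      // go through the lines in order; line i is the one being added
      for (int i = 0; i < ln; i++) {
          // go through the lines already added to the plane, in order
          for (int j = 0; j < i; j++) {
              // if line j has the same slope as line i, then every line from j to i
              // has that slope too, so stop processing line i
              if (doublecompare(l[i].k, l[j].k) == 0) {
                  break;
              }
              // compute the intersection of lines i and j
              Point tpoint = l[i].intersectWithLine(l[j]);
              // if it is a new intersection, add it to the umap and count it;
              // otherwise do nothing
              iter = vis.find(tpoint);
              if (iter == vis.end()) {
                  ans += 1;
                  vis[tpoint] = 1;
              }
          }
      }
      return ans;
    }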
  • std::vector<Point> Cycle::intersectWithLine(Line t): computes the intersections of the circle with a line and returns them in a vector. The idea is to first find the foot of the perpendicular, then add and subtract a vector along the direction of the line. The annotated code:

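    std::vector<Point> Cycle::intersectWithLine(Line t) {
      // vector of intersections
      std::vector<Point>ps;
      // first compute the distance from the center to the line
      // (fabs avoids accidentally picking the integer abs overload)
      double ld = fabs(t.a * x + t.b * y + t.c) / sqrt(t.a * t.a + t.b * t.b);
      // disjoint (doublecompare returns a positive number when ld > r)
      if (doublecompare(ld, r) > 0) {
          return ps;
      }
      // tangent
      else if (doublecompare(ld, r) == 0) {
          // slope does not exist: x is the line's x, y is the center's y
          if (t.k == inf_k) {
              ps.push_back(Point(t.x1, y));
              return ps;
          }
          // slope is 0: x is the center's x, y is the line's y
          if (t.k == 0) {
              ps.push_back(Point(x, t.y1));
              return ps;
          }
          // ordinary slope
          // the line through the center perpendicular to t is y=-1/k*x+b2
          double b2 = x / t.k + y;
          // b1 is the intercept of t in slope-intercept form
          double b1 = t.b1;
          // intersect the two lines to get the foot of the perpendicular
          double xt = t.k * (b2 - b1) / (1 + t.k * t.k);
          double yt = t.k * xt + b1;
          // when tangent, the foot of the perpendicular is the intersection
          ps.push_back(Point(xt, yt));
          return ps;
      }
      // intersecting: ln is half the chord length
      double ln = sqrt(r * r - ld * ld);
      // same special cases as above
      if (t.k == inf_k) {
          ps.push_back(Point(t.x1, y - ln));
          ps.push_back(Point(t.x1, y + ln));
          return ps;
      }
      if (t.k == 0) {
          ps.push_back(Point(x - ln, t.y1));
          ps.push_back(Point(x + ln, t.y1));
          return ps;
      }
      double b2 = x / t.k + y;
      double b1 = t.b1;
      double xt = t.k * (b2 - b1) / (1 + t.k * t.k);
      double yt = t.k * xt + b1;
      // with the foot of the perpendicular known, compute the unit direction vector of the line
      double s1k2 = sqrt(1 + t.k * t.k);
      double nx = 1 / s1k2;
      double ny = t.k / s1k2;
      // add and subtract the direction vector times half the chord length to get the two intersections
      ps.push_back(Point(xt + ln * nx, yt + ln * ny));
      ps.push_back(Point(xt - ln * nx, yt - ln * ny));
      return ps;
    }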
  • std::size_t operator()(const Point& c) const: the hash function specified for the Point class when using unordered_map, as follows:

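    class PointHash
    {
    public:
      std::size_t operator()(const Point& c) const
      {
          // use the library hash function directly; to avoid differences between 32-bit and
          // 64-bit systems, assume size_t is 32 bits and give x and y 16 bits each
          return hash<double>()(c.x) + (hash<double>()(c.y) << 16);
      }
    };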
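One thing to keep in mind with a hash on raw doubles is that two points that compare equal within 1e-8 can still hash to different buckets. A possible variation, sketched below, snaps each coordinate to a grid of width 1e-8 before hashing and compares the snapped values for equality. This is a swapped-in technique and not the project's code: `quantize`, `PointHashQ`, `PointEqQ`, and `QuantizedPointMap` are hypothetical names, and the mixing step is the widely used Boost-style hash_combine formula.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <unordered_map>

struct Point { double x, y; };

const double EPS = 1e-8;  // grid width, matching the comparison precision

// Snap a coordinate to the nearest grid cell (assumes |v| / EPS fits in a
// long long, which holds for coordinates of moderate size).
inline long long quantize(double v) {
    assert(std::isfinite(v));
    return std::llround(v / EPS);
}

struct PointHashQ {
    std::size_t operator()(const Point& p) const {
        std::size_t hx = std::hash<long long>()(quantize(p.x));
        std::size_t hy = std::hash<long long>()(quantize(p.y));
        // Boost-style hash_combine: mixes hy into hx.
        return hx ^ (hy + 0x9e3779b9 + (hx << 6) + (hx >> 2));
    }
};

struct PointEqQ {
    bool operator()(const Point& a, const Point& b) const {
        return quantize(a.x) == quantize(b.x) &&
               quantize(a.y) == quantize(b.y);
    }
};

typedef std::unordered_map<Point, int, PointHashQ, PointEqQ> QuantizedPointMap;
```

The grid is still only an approximation: two values closer than 1e-8 that fall on opposite sides of a cell boundary are treated as different points, so this reduces but does not eliminate mismatches between hashing and tolerance-based comparison.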

Sixth, eliminating code warnings

Using the VS2019 Native Recommended Rules, the number of errors and warnings is zero!

Origin www.cnblogs.com/eitbar/p/12453042.html