IT Consultant: A Blood-Spitting Project Firefighting Story

  A year later, a partner company launched a subsidiary business system that connects to our company's internal single sign-on system. I then received a technical consultation from them: after the project went live, users would suddenly and unpredictably be unable to log in. A restart fixed it, but some time after logging in again, login would fail once more. Their technicians were baffled, and the backend log contained no error information. I was called in at the critical moment and rushed to the project to put out the fire. In the end, a problem that could have been solved in 2 days took me 5. Let me explain the specific reasons.

1. The essence of logging is not grasped

  The debug, info, and error levels of the log are used at random: info where debug belongs, debug where info belongs, and so on. As a result, a single successful login request produces 300 lines of backend log output, which seriously hurts the efficiency of troubleshooting and tracing problems. The online log level of the project is still debug, because if it is changed to info, a lot of key information is no longer printed.

  The log output format is also missing key information. The logger (class) name, the thread, and the line number in the code are not shown, so it is impossible to tell where a log line was printed.
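
As a rough illustration of the two points above, here is a minimal sketch using only the JDK's built-in java.util.logging. The class names, logger name, and messages are hypothetical and not taken from the project. It shows a formatter that prints level, thread, class, and method on every line, plus key events logged at INFO and detail logged at FINE (the JDK's counterpart to debug). The JDK's LogRecord does not carry a line number; frameworks such as log4j or logback typically offer a pattern setting for that.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Formatter;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class LoggingDemo {
    public static void main(String[] args) {
        Logger log = Logger.getLogger("demo.login");   // hypothetical logger name
        log.setUseParentHandlers(false);               // avoid duplicate output via the root logger
        ConsoleHandler handler = new ConsoleHandler();
        handler.setFormatter(new LocatedFormatter());
        log.addHandler(handler);

        log.info("login succeeded for user alice");    // key event, printed at the default INFO level
        log.fine("raw response body: ...");            // detail only; hidden unless FINE is enabled
    }
}

// Hypothetical formatter; illustrative only.
class LocatedFormatter extends Formatter {
    @Override
    public String format(LogRecord r) {
        // Level, thread, class and method, so each line shows where it was printed.
        return String.format("%-7s [%s] %s.%s - %s%n",
                r.getLevel().getName(),
                Thread.currentThread().getName(),
                r.getSourceClassName(),
                r.getSourceMethodName(),
                formatMessage(r));
    }
}
```

With this split, switching the online level to INFO keeps the key events and drops only the detail lines.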

  What I want to say about logging is this: no matter how many frameworks there are, configuring upper-level frameworks such as spark, flink, hadoop, and message middleware is just technique. Logging is the means to troubleshoot problems quickly.

   Here is his log output:

2. Core parameters are not validated

  The data returned by methods is never checked for null or the empty string "", which leads to null pointer exceptions in all sorts of situations. The project's functions are written for the ideal case: give me exactly the data I expect and I will return the correct result, otherwise I have no idea what went wrong. This caused me great confusion when I was reproducing the problem. I kept running into null pointer errors by accident, and I had to add stronger checks for them before I could simulate the situation where login fails after many logins.
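
A minimal sketch of the kind of check that was missing, assuming plain string parameters. The class, method, and field names are hypothetical. The idea is to reject null and blank values at the boundary, with a message that names the field, instead of letting them surface later as a null pointer exception somewhere else.

```java
import java.util.Optional;

class Params {
    // Returns the trimmed value, or an empty Optional when the input is null or blank.
    static Optional<String> nonBlank(String s) {
        if (s == null || s.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(s.trim());
    }

    // Fails fast with a message that names the missing field.
    static String require(String fieldName, String value) {
        return nonBlank(value).orElseThrow(
                () -> new IllegalArgumentException(fieldName + " is missing or empty"));
    }

    public static void main(String[] args) {
        System.out.println(require("userId", " 42 "));  // prints 42
        System.out.println(nonBlank("").isPresent());   // prints false
    }
}
```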

  In addition, the use of IN in the project's SQL statements is sloppy. Combined with the missing null checks above, you get moments like this: "Hey, logging in with this account works and the SQL is correct, but when I log in with this other person's account, why does it report a SQL syntax error? It is clearly calling the same block of code."

 

 Obviously, if roleid is the empty string "", this SQL statement ends up with a syntax error.
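
The original statement is not shown here, so the following is only a sketch under an assumption: that the role ids were concatenated straight into an IN (...) list. The table and column names are hypothetical. It builds one ? placeholder per id for use with a PreparedStatement, and refuses to build the statement at all when there are no ids, so that an empty value produces a clear error rather than a syntax error from the database.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RoleQuery {
    // Hypothetical table and column names, for illustration only.
    static String buildSql(List<String> roleIds) {
        if (roleIds == null || roleIds.isEmpty()) {
            // An empty IN () is not valid SQL, so stop here with a readable message.
            throw new IllegalArgumentException("no role ids: refusing to build an empty IN ()");
        }
        // One "?" per id; the values are bound later via PreparedStatement.setString.
        String placeholders = String.join(", ", Collections.nCopies(roleIds.size(), "?"));
        return "SELECT menu_id FROM role_menu WHERE role_id IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        System.out.println(buildSql(Arrays.asList("1", "2")));
    }
}
```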

 

3. Local variables promoted to static variables

   This is the cause of the problem mentioned at the beginning of the article. Login has to authenticate the user against the single sign-on system, so the code uses the httpclient framework to send HTTP requests. Here the httpclient object is held in a static variable and reused inside the method, but the method never properly releases or closes the resource after the call. The framework maintains a connection pool by default. If you take a resource and do not release it after use, that resource cannot be used by the next request, and new requests must wait in the queue. So after users had logged in 20 times, all the resources in the pool were exhausted. New requests could not get a resource and kept waiting in the queue, which caused the server to time out and the login to fail with a 504 error.

      When I saw that this class held httpclient in a static variable, I had a bad feeling. This is an error-prone spot. If it were me, and I did not have a full grasp of this framework and this class, I would turn it into a local variable, so that under low concurrency the GC would reclaim it for me.
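
The exhaustion described above can be reproduced with a toy model. The sketch below is NOT the httpclient API: ToyPool is a hypothetical stand-in built on a Semaphore, sized at 20 to match the number of logins mentioned above. It contrasts a call that never gives its connection back with one that releases it in a finally block. Note that this illustrates releasing the resource while keeping a shared pool, which is a different remedy from the local-variable change described above.

```java
import java.util.concurrent.Semaphore;

class LoginCalls {
    // Leaky version: takes a connection and never returns it.
    static boolean leakyLogin(ToyPool pool) {
        return pool.acquire();
    }

    // Releasing version: the finally block returns the connection on every path.
    static boolean safeLogin(ToyPool pool) {
        if (!pool.acquire()) {
            return false;
        }
        try {
            return true; // the authentication request would be sent here
        } finally {
            pool.release();
        }
    }

    public static void main(String[] args) {
        ToyPool pool = new ToyPool(20);
        for (int i = 1; i <= 21; i++) {
            System.out.println("leaky login " + i + " got a connection: " + leakyLogin(pool));
        }
    }
}

// Toy stand-in for a connection pool; hypothetical, for illustration only.
class ToyPool {
    private final Semaphore permits;

    ToyPool(int size) {
        permits = new Semaphore(size);
    }

    // Non-blocking: false means the pool is exhausted (a real client would queue and time out).
    boolean acquire() {
        return permits.tryAcquire();
    }

    void release() {
        permits.release();
    }

    int free() {
        return permits.availablePermits();
    }
}
```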

 

After transformation:

 

4. The interceptor path configuration is confusing

   This problem also got in the way of my troubleshooting. To track down the login problem, I first had to trace the backend call path of a successful login and see at which step the code went wrong, because there was no error message of any kind. As for his interceptors, a single successful login request ran through the interceptor three times, and at least one of those runs did nothing useful. The cause was that his front end sent unrelated requests, which were intercepted as well. This forced me to reconstruct the full path from the log instrumentation points, which at least gave me a chance to review the code and find the problem in the call to httpclient.

5. Abuse of try catch

  This one is also disgusting. The code is wrapped in try catch all over the place. At first I thought, hey, this guy is pretty good, he even handles exceptions. Then I took a look at the code. What the hell is this? The exceptions are simply swallowed. Why not print the exception information or rethrow the exception, instead of swallowing it so greedily? When something goes wrong, nothing is reported, and the code goes on to trigger a new exception that hides the real problem.
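
To make the contrast concrete, here is a minimal sketch with hypothetical names (AuthCalls, callRemote, and the messages are all invented for illustration). The first method swallows the exception and returns null, so the caller fails later with an unrelated error. The second logs the exception with its stack trace and rethrows it with the original cause attached.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class AuthCalls {
    private static final Logger LOG = Logger.getLogger(AuthCalls.class.getName());

    // Anti-pattern: the exception is swallowed, and the null result breaks something else later.
    static String fetchTokenSwallowing(String user) {
        try {
            return callRemote(user);
        } catch (Exception e) {
            return null;
        }
    }

    // Better: record the failure with its stack trace, then rethrow with the cause attached.
    static String fetchToken(String user) {
        try {
            return callRemote(user);
        } catch (Exception e) {
            LOG.log(Level.SEVERE, "authentication call failed for user " + user, e);
            throw new IllegalStateException("authentication call failed for user " + user, e);
        }
    }

    // Hypothetical remote call that always fails, standing in for the real request.
    private static String callRemote(String user) throws IOException {
        throw new IOException("connection timed out");
    }

    public static void main(String[] args) {
        System.out.println(fetchTokenSwallowing("alice")); // prints null, the real cause is lost
        try {
            fetchToken("alice");
        } catch (IllegalStateException e) {
            System.out.println("cause: " + e.getCause().getMessage()); // prints cause: connection timed out
        }
    }
}
```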

 

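  Finally, I want to say: why should programmers make life hard for other programmers? Leave some leeway in your code, so that we can still meet on good terms later. You do not want to dig a pit for yourself, fail to climb out of it, and then come around with: "Bro, are you busy? I have a small problem here, could you take a look? (cracking sunflower seeds)".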
