Notes on several problems encountered in a project

They are all very annoying bugs.

Background

I recently worked on a transaction middleware project (the name sounds grander than it is...). It is a trading platform that sells mobile data plans and handles a large volume of requests. It connects with many customers so that they can place orders automatically, and it automates the processing against the upper-level system. It runs on Alibaba Cloud, on a machine with 4 cores and 8 GB of memory.

 

The issues

1. MariaDB 10.1.9 installed on an Alibaba Cloud server (upgrading to 10.1.10 did not fix it): long transactions lose data.

Every morning, rows are imported from the order table into the historical order table, at least 10K rows each time, and I then found that some of the data was missing. I wrote a program and compared its output log with the binlog and the database: the SQL had been executed completely, but within this long transaction InnoDB randomly lost a few rows when storing them (though on some days nothing was lost)... I tried various approaches over several days, none of them worked, and I finally switched to RDS. The problem has not occurred since.

To be honest, I don't suspect MariaDB itself; a bug this basic seems unlikely there. I suspect a problem with Alibaba Cloud's IO.
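
The comparison described in item 1 (what the job did versus what the database actually holds) could also be built into the archive job itself. The following is only a minimal sketch under assumptions: it uses Python's built-in sqlite3 in place of MariaDB, and the table names `orders` and `order_history`, their columns, and the function `archive_orders` are all hypothetical. It would detect missing rows before the source rows are deleted; it does not explain why rows went missing.

```python
import sqlite3

# Hypothetical schema, used only for this illustration.
SCHEMA = """
CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER, created_at TEXT
);
CREATE TABLE order_history (
    id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER, created_at TEXT
);
"""


def archive_orders(conn, cutoff):
    """Move orders created before `cutoff` into order_history in one transaction.

    The source rows are deleted only after the copied rows have been counted
    in the target table. Returns the number of rows archived; on any mismatch
    or error the whole transaction is rolled back and the error is re-raised.
    """
    cur = conn.cursor()
    try:
        cur.execute("BEGIN")
        expected = cur.execute(
            "SELECT COUNT(*) FROM orders WHERE created_at < ?", (cutoff,)
        ).fetchone()[0]
        cur.execute(
            "INSERT INTO order_history (id, customer_id, amount, created_at) "
            "SELECT id, customer_id, amount, created_at FROM orders "
            "WHERE created_at < ?",
            (cutoff,),
        )
        # Re-count in the target table instead of trusting the statement result.
        stored = cur.execute(
            "SELECT COUNT(*) FROM order_history WHERE id IN "
            "(SELECT id FROM orders WHERE created_at < ?)",
            (cutoff,),
        ).fetchone()[0]
        if stored != expected:
            raise RuntimeError(
                f"archive mismatch: expected {expected}, stored {stored}"
            )
        cur.execute("DELETE FROM orders WHERE created_at < ?", (cutoff,))
        conn.commit()
        return expected
    except Exception:
        conn.rollback()
        raise


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.executescript(SCHEMA)
    demo.execute("INSERT INTO orders VALUES (1, 7, 100, '2016-01-01 08:00:00')")
    demo.commit()
    print("archived rows:", archive_orders(demo, "2016-01-02"))
```

Deleting from the source only after the re-count means a failed check leaves the order table untouched, so the job can simply be retried.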

 

2. After switching to RDS, a data inconsistency appeared. Within a single connection, the code takes a row-level lock on a customer record, reads the customer's balance, decides what logic to run based on that balance, and then modifies the balance (directly, with an UPDATE statement). It turned out that the query had returned the value from an hour earlier... Because the RDS binlog records the row data before and after each modification, I could check the UPDATE statement there, and the binlog contained the correct data. So why did the query return data from an hour ago? It only happened once, though!
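
One common way to reduce exposure to a stale read in the lock, read, decide, update sequence of item 2 is to move the balance check into the UPDATE statement itself, so that the decision is made against the row's current value. This is a different technique from the one described above, not the original code. The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 in place of RDS, and the table `customer`, its columns, and the function `debit` are hypothetical. It does not explain why RDS returned an old value.

```python
import sqlite3


def debit(conn, customer_id, amount):
    """Subtract `amount` from a customer's balance only if the balance covers it.

    The check and the modification happen in a single UPDATE statement, so no
    previously read balance is involved. Returns True if the balance was
    debited and False if the customer does not exist or has too little money.
    """
    cur = conn.execute(
        "UPDATE customer SET balance = balance - ? "
        "WHERE id = ? AND balance >= ?",
        (amount, customer_id, amount),
    )
    debited = cur.rowcount == 1  # exactly one row changed means success
    conn.commit()
    return debited


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, balance INTEGER)")
    demo.execute("INSERT INTO customer VALUES (1, 100)")
    demo.commit()
    print("first debit accepted:", debit(demo, 1, 30))
    print("second debit accepted:", debit(demo, 1, 100))
```

The caller branches on the return value instead of on a balance it read earlier, which removes the separate read from the critical path.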

 

3. Generating and downloading Excel reports of roughly tens of thousands to hundreds of thousands of rows. Every few days the system would occasionally hang: CPU usage would max out, or the connection pool would time out (the actual number of connections was not high, about seventy or eighty). I had no idea what the cause was, so all I could do was restart the system. Later I found that this was related to the 360 browser (why is it the 360 browser again? The history of blood and tears caused by this thing is really...). If a customer uses the 360 browser to download the generated Excel report, the browser may fail to recognize the Content-Length header correctly (why such a silly bug?), so the connection is closed halfway, the download fails on the client, and the server catches the exception and returns normally. Yet for such a simple problem, Tomcat on the server maxes out the CPU. Because there are many online transactions, I have not had time to investigate the specific cause carefully (I suspect a bug in Tomcat's APR), but this should not happen in any case... The workaround is to switch to any other browser; they all work, as long as it is not the 360 pitfall.
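
The server-side handling mentioned in item 3 (the client closes the connection halfway, the server catches the exception and returns normally) can be sketched as follows. This is only an illustration under assumptions, not the project's code and not a fix for the Tomcat CPU problem: it is written in Python rather than for Tomcat, it writes CSV lines instead of a real Excel file, and the function `stream_report` and its parameters are hypothetical.

```python
import csv
import io


def stream_report(rows, out, flush_every=1000):
    """Write `rows` as CSV lines to the binary stream `out`.

    If the client disconnects mid-download (the write raises BrokenPipeError
    or ConnectionResetError), stop writing instead of failing the request.
    Returns a tuple (rows_written, completed). The row source is always
    closed if it has a close() method, so a database cursor or generator
    behind it is released even when the download is aborted.
    """
    buf = io.StringIO()
    writer = csv.writer(buf)
    written = 0
    try:
        for row in rows:
            writer.writerow(row)          # format one row into the text buffer
            out.write(buf.getvalue().encode("utf-8"))
            buf.seek(0)
            buf.truncate(0)               # reuse the buffer for the next row
            written += 1
            if written % flush_every == 0:
                out.flush()
        out.flush()
        return written, True
    except (BrokenPipeError, ConnectionResetError):
        return written, False
    finally:
        close = getattr(rows, "close", None)
        if close is not None:
            close()


if __name__ == "__main__":
    target = io.BytesIO()
    result = stream_report(((i, i * 10) for i in range(5)), target)
    print("rows written, completed:", result)
```

Writing row by row keeps memory use flat for reports with hundreds of thousands of rows, and the `finally` clause is where pooled resources would be returned after an aborted download.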

 

I am writing these down so that anyone who falls into the same pits later can refer to them.
