Lessons learned from data table structure design and cache optimization

    The material management system we recently built for customer XX has passed its trial period and entered stable operation. The functionality is essentially frozen and things are a little more relaxed, so I wanted to improve performance by caching query results, reducing database I/O and speeding up queries. After carefully reading up on Ehcache's application scenarios and caveats, I suddenly realized that very little of our data can actually be cached, so caching would bring almost no benefit. Below I give a general overview of the project and of its data structure design, and analyze how we ended up here. To put it bluntly, this is a negative example of data design, haha. I hope fellow developers will share their thoughts and offer good comments and suggestions.
1. Project overview:
    Material management is a traditional line of business, but the customer could not clearly describe their requirements and goals, so we used a prototype-driven development model for this system. In the early design stage we focused on the customer's needs and on development speed. The main goal was rapid iteration, and the data structure design got less attention as a result. Concretely, we stored frequently changing values such as a material's amount and quantity in the same table as its rarely changing basic information.
   
2. Technology stack: traditional SSH (Struts, Spring, Hibernate); UI: Ext JS; DB: Oracle 10g

3. Specific examples
    Many of our tables are designed along the following lines (only key fields are shown).
Material code table:
ID  Code  Name    Unit  Unit price  Quantity  Amount
1   0001  diesel  ton   6000        10        60000

Advantages: data operations are simple. When adding or modifying records, Hibernate operates directly on a single entity, and a query returns the whole record in one go, with no join required.

Disadvantages: the unit price, quantity, and amount fields are updated every time a material is received into or issued from the warehouse. As the Ehcache documentation describes, frequently modified data has to be synchronized between the database and memory again and again, so the cache hit rate is bound to be low. Caching such data not only fails to improve efficiency but can significantly reduce system performance, and it may also cause data inconsistency. Suppose the monthly automatic accounting job runs: the scheduled task summarizes the month's receipt and issue details, and if the cached data happens to disagree with the database at that moment, that month's material financial data is directly affected. Financial management, however, tolerates not a single error across the millions that flow in and out each month, and the iron law must hold strictly: this month's inventory = last month's inventory + this month's receipts - this month's issues. Inconsistent data is likely to break this equation, which is absolutely unacceptable in financial management.
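
To make the inventory equation above concrete, here is a minimal sketch of the kind of check a monthly accounting job might run. The class name, method names, and figures are illustrative assumptions and are not taken from the actual system.

```java
import java.math.BigDecimal;

public class MonthlyBalanceCheck {

    // this month's inventory = last month's inventory + receipts - issues
    static BigDecimal expectedClosing(BigDecimal opening, BigDecimal received, BigDecimal issued) {
        return opening.add(received).subtract(issued);
    }

    // Compares the recorded closing amount with the expected one.
    // compareTo ignores scale, so 54000 and 54000.00 count as equal (equals would not).
    static boolean balances(BigDecimal opening, BigDecimal received,
                            BigDecimal issued, BigDecimal recordedClosing) {
        return expectedClosing(opening, received, issued).compareTo(recordedClosing) == 0;
    }

    public static void main(String[] args) {
        // Hypothetical figures for one material in one month.
        BigDecimal opening = new BigDecimal("60000");
        BigDecimal received = new BigDecimal("12000");
        BigDecimal issued = new BigDecimal("18000");

        // Closing amount read from the database: the equation holds.
        System.out.println(balances(opening, received, issued, new BigDecimal("54000.00"))); // true

        // Closing amount read from a stale cache entry: the equation breaks.
        System.out.println(balances(opening, received, issued, new BigDecimal("60000"))); // false
    }
}
```

BigDecimal is used rather than double because binary floating point cannot represent most decimal amounts exactly, which matters when not a single error is tolerated.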

My goal: I want to cache the data that is queried frequently every day, namely the basic information of materials (name, unit, model, etc.), the basic information of demand plans, and usage requests, in order to improve system efficiency and reduce the load on the database.

Solution: split the relatively fixed fields from the frequently updated fields in the material code table, material planning table, procurement planning table, and so on, and store low-frequency and high-frequency data separately. A cache component such as Ehcache, Redis, or memcached can then hold the data that is rarely updated but frequently queried, while frequently updated data is always queried in real time. This reduces database I/O and the volume of data queried, and the queries that still reach the database are limited as far as possible to the frequently updated data. Efficiency improves, inconsistency in the important data is avoided, and the purpose of cache optimization is achieved.
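
The split described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the class names (MaterialInfo, MaterialStock, MaterialInfoCache) are hypothetical, a plain ConcurrentHashMap stands in for Ehcache, Redis, or memcached, and a lambda stands in for the real DAO query. It requires Java 8 or later.

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class SplitMaterialSketch {

    /** Rarely changing fields of a material: a good cache candidate. */
    static final class MaterialInfo {
        final long id;
        final String code;
        final String name;
        final String unit;

        MaterialInfo(long id, String code, String name, String unit) {
            this.id = id;
            this.code = code;
            this.name = name;
            this.unit = unit;
        }
    }

    /** Frequently updated fields: kept in their own table and never cached. */
    static final class MaterialStock {
        final long materialId;
        final BigDecimal unitPrice;
        final BigDecimal quantity;
        final BigDecimal amount;

        MaterialStock(long materialId, BigDecimal unitPrice, BigDecimal quantity, BigDecimal amount) {
            this.materialId = materialId;
            this.unitPrice = unitPrice;
            this.quantity = quantity;
            this.amount = amount;
        }
    }

    /** Read-through cache that sits in front of a (hypothetical) database loader. */
    static final class MaterialInfoCache {
        private final Map<Long, MaterialInfo> entries = new ConcurrentHashMap<>();
        private final Function<Long, MaterialInfo> dbLoader;

        MaterialInfoCache(Function<Long, MaterialInfo> dbLoader) {
            this.dbLoader = dbLoader;
        }

        /** Returns the cached entry, calling the loader only on a miss. */
        MaterialInfo get(long id) {
            return entries.computeIfAbsent(id, dbLoader);
        }

        /** Must be called whenever the basic information is edited. */
        void evict(long id) {
            entries.remove(id);
        }
    }

    public static void main(String[] args) {
        AtomicInteger dbReads = new AtomicInteger();
        MaterialInfoCache cache = new MaterialInfoCache(id -> {
            dbReads.incrementAndGet(); // stands in for a real database query
            return new MaterialInfo(id, "0001", "diesel", "ton");
        });

        cache.get(1L);
        cache.get(1L); // served from memory
        System.out.println("database reads: " + dbReads.get()); // database reads: 1

        cache.evict(1L); // basic information was edited
        cache.get(1L);
        System.out.println("database reads: " + dbReads.get()); // database reads: 2
    }
}
```

Because MaterialStock never passes through the cache, the monthly accounting job always reads unit price, quantity, and amount straight from the database, so a stale cache entry cannot affect the financial figures.
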
Then comes the problem. The project is now in stable operation, its performance is within the agreed targets, and it has passed acceptance. Any change to the data structure could in theory destabilize the system. To achieve the goal above while keeping the system stable, we would have to heavily rework the existing DAO layer logic and run extensive regression tests. For a company that builds enterprise applications, the boss will not approve work that consumes company resources when the customer is not paying more for it. Hehe, the only option would be to do the rework voluntarily in overtime, out of technical passion and without affecting my normal work. But I am already tired enough after a full day at the office, and my pursuit of perfection has not yet reached the point where I would give up sleep and meals for the sake of performance.
    The goal above is purely icing on the cake for the system. It is also a self-critique of the whole project from the perspective of data structure design. Looking back, if we had considered later performance tuning when we first designed the data structure, instead of thinking only about rapid iteration and about keeping the code convenient and hassle-free to write, we probably would not be in today's dilemma of seeing the system's shortcomings and being powerless to fix them.
I am writing this down to remind myself not to make similar mistakes in future work.
