Server-side interface optimization schemes

1. Background

On legacy projects, we spent much of last year on cost-reduction and efficiency work. The most time-consuming problems we found were in interfaces, so we focused on interface performance optimization. This article shares a set of general approaches for optimizing interfaces.

2. Summary of interface optimization schemes

1. Batch processing

Batching: batch database operations are easy to understand. For an interface that inserts inside a loop, we can accumulate the records and insert or update them in one database call after the loop completes, avoiding repeated I/O.

// insert records one at a time in a loop
list.stream().forEach(msg -> {
    insert(msg);
});
// insert the whole batch in one database call
batchInsert(list);
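
For large lists it is also worth capping the batch size so a single statement does not grow unbounded. A minimal sketch of the chunking step (the `partition` helper below is illustrative, not from the original code; each chunk would then be handed to `batchInsert`):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchUtil {
    // Split a large list into fixed-size chunks so each chunk can be
    // written with one batch INSERT instead of one INSERT per row.
    public static <T> List<List<T>> partition(List<T> source, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < source.size(); i += batchSize) {
            batches.add(source.subList(i, Math.min(i + batchSize, source.size())));
        }
        return batches;
    }
}
```

With JDBC the per-chunk write would typically use `PreparedStatement.addBatch()` / `executeBatch()`; MyBatis users often use a `<foreach>` insert instead.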

2. Asynchronous processing

Asynchrony: for logic that takes a long time but is not required for the response, consider executing it asynchronously; this reduces the interface's latency.

For example, in a wealth-management purchase interface, the accounting entry and the writing of purchase documents were executed synchronously. Because it is a T+1 transaction, those two steps are not required for the response and we do not need their results in real time, so we changed accounting and document writing to asynchronous processing, as the picture shows:

As for the asynchronous implementation, you can use thread pools, message queues, or a scheduled-task framework.
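
A minimal sketch of the thread-pool variant (all names here, such as `writeAccountingEntry`, are placeholders for the accounting and document-writing steps):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncDemo {
    private static final ExecutorService POOL = Executors.newFixedThreadPool(2);
    static final AtomicInteger sideEffects = new AtomicInteger();

    // The caller gets its result immediately; the slow, non-essential
    // steps run on the pool and are not awaited.
    public static String purchase() {
        CompletableFuture.runAsync(AsyncDemo::writeAccountingEntry, POOL);
        CompletableFuture.runAsync(AsyncDemo::writePurchaseDocument, POOL);
        return "SUCCESS";
    }

    static void writeAccountingEntry() { sideEffects.incrementAndGet(); }
    static void writePurchaseDocument() { sideEffects.incrementAndGet(); }

    // Helper for observing completion in a demo/test: drain the pool.
    static void awaitQuiescence() {
        POOL.shutdown();
        try {
            POOL.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Note that fire-and-forget on a thread pool loses the work if the process dies; when the follow-up steps must not be lost, a message queue or task table is the safer variant.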

3. Trading space for time

A well-understood example of trading space for time is the reasonable use of caching. Data that is read frequently but changes rarely can be cached in advance and read directly from the cache when needed, avoiding frequent database queries or repeated calculation.

Note the word "reasonable": trading space for time is a double-edged sword, and you need to weigh your usage scenario carefully, because the data-consistency problems a cache introduces are quite a headache.

The cache here can be R2M, a local cache, memcached, or even a Map.

Here is an example from a stock-strategy query interface:

Because the rebalancing information of the strategy rotation is only updated once a week, querying the database on every call of the original rebalancing interface was unreasonable. Moreover, after fetching the rebalancing information, complex calculations are needed to obtain the backtest return and the excess return over the Shanghai–Shenzhen index. If we put both the database lookup and the calculation result into the cache, we can save a lot of execution time. As shown in the picture:
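
A minimal local-cache sketch of this idea, using `computeIfAbsent` for memoization (`loadAndCompute` stands in for the expensive DB query plus backtest calculation; the names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class RebalanceCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger dbHits = new AtomicInteger();

    // First call computes and stores; later calls are served from memory.
    public String getRebalanceInfo(String strategyId) {
        return cache.computeIfAbsent(strategyId, this::loadAndCompute);
    }

    // Stand-in for the expensive DB query + backtest calculation.
    private String loadAndCompute(String strategyId) {
        dbHits.incrementAndGet();
        return "rebalance-result-for-" + strategyId;
    }
}
```

A real implementation would add expiry (here, weekly, matching the update cadence) via a cache library or Redis TTL rather than an unbounded Map.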

4. Preprocessing

Preprocessing, i.e., the idea of prefetching: compute the data to be queried in advance and store it in the cache or in a table column, so that reads at query time are fast. This is similar to the example above, but with a different focus.

A simple example: wealth-management products need to display an annualized rate of return calculated from the net value. Instead of applying the annualization formula to the net value on every request, we can precompute the result, so each interface call simply reads the stored field.
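
A sketch of the idea (the annualization formula below assumes an initial net value of 1.0 and a 365-day year, purely for illustration; real products will use their own formula):

```java
public class ProductView {
    private final double netValue;
    private final double annualizedReturn; // precomputed at write time

    public ProductView(double netValue, int holdingDays) {
        this.netValue = netValue;
        // Computed once when the record is written, not on every read.
        // Assumption for illustration: initial net value 1.0, 365-day year.
        this.annualizedReturn = (netValue - 1.0) / holdingDays * 365.0;
    }

    public double getNetValue() { return netValue; }

    // The interface just returns the stored field; no calculation per call.
    public double getAnnualizedReturn() { return annualizedReturn; }
}
```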

5. Pooling

We have all used database connection pools, thread pools, and the like; these embody the pooling idea. The problem they solve is avoiding the repeated creation of objects or connections: instances are reused, avoiding unnecessary overhead, since creation and destruction also take time.

Pooling includes but is not limited to the two examples above. In essence, pooling is **pre-allocation and recycling**. Once you understand this principle, you can apply it in your own business scenarios as well.

For example: an object pool.
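
A minimal generic object pool illustrating pre-allocation and recycling (a sketch, not production code; real systems usually reach for a library such as Apache Commons Pool):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

public class SimplePool<T> {
    private final BlockingQueue<T> idle;

    // Pre-allocate all objects up front; borrow/restore recycles them.
    public SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.offer(factory.get());
        }
    }

    // Blocks until an object is free, so the pool also acts as a
    // natural concurrency limit.
    public T borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a pooled object", e);
        }
    }

    public void restore(T obj) {
        idle.offer(obj);
    }

    public int available() {
        return idle.size();
    }
}
```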

6. Serial to parallel

Serial means the current step must wait for the previous step to finish before it runs; in parallel, the two steps do not interfere with each other, so parallel execution saves time. Of course, this is premised on the steps having no dependence on each other's results or parameters.

For example, the position-display interface of a wealth-management product must query the user's account information, product information, and banner information to render the position page. Executed serially, the interface latency is roughly the sum of the three queries; executed in parallel, it drops to roughly the slowest one.

As shown in the picture:
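
The fan-out described above can be sketched with `CompletableFuture`; the query methods below are placeholders for the account, product, and banner lookups:

```java
import java.util.concurrent.CompletableFuture;

public class PositionPageService {
    // The three queries do not depend on each other's results, so they
    // can run in parallel (here on the JVM's common pool); total latency
    // is roughly that of the slowest query instead of the sum.
    public String renderPositionPage(String userId) {
        CompletableFuture<String> account = CompletableFuture.supplyAsync(() -> queryAccount(userId));
        CompletableFuture<String> product = CompletableFuture.supplyAsync(this::queryProduct);
        CompletableFuture<String> banner  = CompletableFuture.supplyAsync(this::queryBanner);
        // Wait for all three, then assemble the page model.
        CompletableFuture.allOf(account, product, banner).join();
        return account.join() + "|" + product.join() + "|" + banner.join();
    }

    // Placeholders for the real remote/DB queries.
    private String queryAccount(String userId) { return "account:" + userId; }
    private String queryProduct() { return "products"; }
    private String queryBanner()  { return "banners"; }
}
```

In production you would usually pass a dedicated executor to `supplyAsync` rather than share the common pool with other workloads.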

7. Index

Adding an index can greatly improve data-query efficiency and should also be considered during interface design; I won't go into details here. As requirements iterate, we will focus on sorting out the scenarios in which an index does not take effect, which I hope will be helpful.

The specific scenarios in which an index does not take effect are not listed one by one here; if there is time later, I will sort them out in a separate article.

8. Avoid big transactions

The so-called big-transaction problem refers to transactions that run for a long time. If a transaction is not committed for a long time, it holds the database connection, which blocks other requests' access to the database and hurts the performance of other interfaces.

For example:

@Transactional(value = "taskTransactionManager", propagation = Propagation.REQUIRED,
        isolation = Isolation.READ_COMMITTED, rollbackFor = {RuntimeException.class, Exception.class})
public BasicResult purchaseRequest(PurchaseRecord record) {
    BasicResult result = new BasicResult();
    // insert the account task
    taskMapper.insert(ManagerParamUtil.buildTask(record, TaskEnum.Task_type.pension_account.type(), TaskEnum.Account_bizType.purchase_request.type()));
    // insert the sync task
    taskMapper.insert(ManagerParamUtil.buildTask(record, TaskEnum.Task_type.pension_sync.type(), TaskEnum.Sync_bizType.purchase.type()));
    // insert the image-upload task
    taskMapper.insert(ManagerParamUtil.buildTask(record, TaskEnum.Task_type.pension_sync.type(), TaskEnum.Sync_bizType.cert.type()));
    result.setInfo(ResultInfoEnum.SUCCESS);
    return result;
}

The code above performs a series of follow-up operations after the purchase application completes. Now suppose a new requirement arrives: after a purchase completes, push a notification to the user. It is very likely the call will simply be appended, as shown below: an RPC call, a non-DB operation, ends up nested inside the transaction. If such non-DB operations take a long time, a big-transaction problem can arise. The problems big transactions cause mainly include deadlock, interface timeouts, and master–slave replication lag.

@Transactional(value = "taskTransactionManager", propagation = Propagation.REQUIRED,
        isolation = Isolation.READ_COMMITTED, rollbackFor = {RuntimeException.class, Exception.class})
public BasicResult purchaseRequest(PurchaseRecord record) {
    BasicResult result = new BasicResult();
    ...
    pushRpc.doPush(record);
    result.setInfo(ResultInfoEnum.SUCCESS);
    return result;
}

Therefore, to avoid big-transaction problems, we can apply the following rules:

1. Do not place RPC calls inside the transaction

2. Place query operations outside the transaction whenever possible

3. Avoid processing too much data inside one transaction
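
A sketch of rule 1: keep the transactional method limited to DB writes and have the caller perform the RPC push after the transaction has committed (the names below are illustrative; Spring users could also register an `afterCommit` callback via `TransactionSynchronization`):

```java
import java.util.ArrayList;
import java.util.List;

public class PurchaseService {
    final List<String> log = new ArrayList<>();

    // Imagine @Transactional here: DB inserts only, no RPC inside.
    public void purchaseRequestTx(String record) {
        log.add("insert-tasks:" + record);
    }

    // The caller composes the two steps: the transaction commits when
    // purchaseRequestTx returns, then the push runs with no DB
    // connection held.
    public void purchaseRequest(String record) {
        purchaseRequestTx(record);
        doPush(record);
    }

    // Stand-in for the pushRpc.doPush RPC call.
    void doPush(String record) { log.add("push:" + record); }
}
```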

9. Optimize program structure

Program-structure problems generally appear after several rounds of requirement iteration, with code layered on top of code; this causes issues such as repeated queries and repeated object creation, and is especially common when several people maintain one project. The fix is also relatively simple: refactor the interface as a whole, evaluate the purpose of each code block, and adjust the execution order.

10. Deep pagination problem

Deep-paging problems are quite common. The first tool we usually reach for when paging is LIMIT. Why is it slow? Look at this SQL:

select * from purchase_record where productCode = 'PA9044' and status = 4 order by orderTime desc limit 100000, 200

limit 100000, 200 means that 100200 rows are scanned, the first 100000 are discarded, and 200 are returned, which is why execution is slow. This can usually be optimized with the "record the label" (keyset) method, for example:

select * from purchase_record where productCode = 'PA9044' and status = 4 and id > 100000 limit 200

The advantage of this optimization is that it hits the primary-key index, so performance stays good no matter how deep the page. The limitation is that it requires a continuous, auto-incrementing field.
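
The keyset idea can also be expressed in code: each page remembers the last id it saw and the next query filters with `id > lastId`. This sketch iterates an id-ordered list the way the index scan would (names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class KeysetPager {
    // Keyset ("record the label") pagination: instead of OFFSET, remember
    // the last id of the previous page and filter with id > lastId.
    // rowIds is a stand-in for an id-ordered index scan of the table.
    public static List<Integer> nextPage(List<Integer> rowIds, int lastId, int pageSize) {
        List<Integer> page = new ArrayList<>();
        for (int id : rowIds) {
            if (id > lastId) {
                page.add(id);
                if (page.size() == pageSize) break; // stop after one page
            }
        }
        return page;
    }
}
```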

11. SQL optimization

SQL optimization can greatly improve an interface's query performance. Since this article focuses on general interface optimization schemes, specific SQL optimizations are not listed one by one; you can consider them together with the indexing and paging concerns above.

12. Avoid overly coarse lock granularity

Locks are generally used to protect shared resources in high-concurrency scenarios, but if the lock granularity is too coarse, it greatly hurts interface performance.

About lock granularity: it means how large a scope your lock covers. Whether it is synchronized or a Redis distributed lock, you only need to lock the critical resource; code that does not touch the shared resource does not need to be inside the lock. It's like using the bathroom: you only need to lock the bathroom door, not the living-room door.

Wrong locking method:

// not a shared resource
private void notShare() {
}
// shared resource
private void share() {
}
// wrong: the lock also covers work on the non-shared resource
private void wrong() {
    synchronized (this) {
        share();
        notShare();
    }
}

The correct locking method:

// not a shared resource
private void notShare() {
}
// shared resource
private void share() {
}
// right: only the critical section is locked
private void right() {
    notShare();
    synchronized (this) {
        share();
    }
}

3. Finally

I believe the efficiency problems of many interfaces are not formed overnight. During requirement iteration, in order to launch features quickly, code is simply piled on to get things working, and that is what produces the interface performance problems described above.

Changing how we think, looking at the system from a higher level, and developing requirements from the perspective of an interface designer will avoid many of these problems, and is itself an effective way to reduce costs and increase efficiency.

Keep at it; technology knows no bounds!


Origin blog.csdn.net/baidu_38493460/article/details/130503292