What Data Should Go into the Cache? Notes from One Production Caching Assessment

When we introduced Redis as a distributed cache in a project, we faced these questions:

 

  • What data should be placed in the cache, and on what basis?
  • Should cached data be refreshed actively, or expire automatically?
  • If it expires automatically, how should the expiration time be chosen?

 

Over the past two weeks we did exactly this kind of assessment for one project; here is a record of the process, shared as-is. Of course the process involved plenty of crude workarounds; if you have a better way, please share it.

 

01

 

Background of the project

 

Our project is a pure service platform: it only exposes interfaces, with no UI pages. The interfaces are called about 2 million times a day, approaching 10 million on peak days. Because most of the interfaces serve internal systems, requests are concentrated between 9:00 and 21:00 on weekdays, and system QPS at peak is between 300 and 400.

 

The project stores its data in MongoDB, which in theory should handle this level of QPS comfortably, but a few observations made me think otherwise:

  • Although the data in MongoDB is well aggregated, many scenarios are not single-document queries; in extreme cases a single interface call can return hundreds of records, with a response payload of more than twenty thousand lines (don't ask whether it could be paginated; I can tell you plainly that it can't);
  • For 99.95% of requests the interface response time is in the tens to hundreds of milliseconds, which meets the basic business need, but the remaining 0.05% take more than 1s, occasionally even 5s or 10s;
  • Looking at these slow requests, most of the time is spent querying MongoDB; yet when I took the same request message and replayed it manually, it still returned in milliseconds;
  • I believe the occasional slow MongoDB queries have the causes one would expect: heavy write traffic affecting reads, table locks, indexes exceeding memory, and so on; in other words, momentary pressure on MongoDB. I did observe that the slow responses cluster at points in time with particularly high request volume, but I won't analyze that in detail here.

 

Although only four or five requests in every ten thousand had abnormal response times, traffic to the project keeps growing, and quantitative change eventually becomes qualitative change. Better safe than sorry: we decided to strangle the risk in the cradle and adopt Redis as a distributed cache.

 

02

 

Sorting out the interfaces

 

The next step was to gather production statistics and sort out the existing interfaces to decide which ones could be cached. First we needed a rough call count for each interface. Since there was no log analytics platform, I used the dumbest possible method: counting, one interface at a time.

 

  • I pulled down one weekday's logs from our four application servers, about 1 GB in total for the day, which is manageable;
  • Using the [Find in Files] function in EditPlus, I looked up each interface's call count for that day; there are just over 30 interfaces online, so it took only a few minutes. Since this was a one-time job, counting by hand was good enough;
  • Interfaces called only a few times a day were ignored outright; I basically kept only the interfaces with a daily call volume in the tens of thousands for the next step of analysis.

 

03

 

Dictionary tables and configuration data

 

This kind of data is the best fit for a cache, because its update frequency is very low; sometimes it is inserted once and never updated again. If such data also has a high call volume, it definitely belongs in Redis.

 

As for the caching strategy, you can double-write (update the database and Redis together), or use automatic expiration, in which case the expiration time can be set fairly long. For our project I used a uniform strategy of expiring everything at midnight: first, because this data is extracted overnight by ETL from another system and synchronized once a day; and second, because we don't need to fear a cache avalanche, since there is essentially no traffic at night.
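The midnight-expiry strategy only needs one ingredient: the number of seconds remaining until 00:00, used as the key's TTL. Our project is Java, but the idea fits in a few lines of Python; the function name and the commented-out `redis_client` call are hypothetical, just a sketch of the approach:

```python
from datetime import datetime, timedelta

def seconds_until_midnight(now: datetime) -> int:
    """TTL so that a key written now expires at the next midnight."""
    next_midnight = datetime.combine(now.date() + timedelta(days=1),
                                     datetime.min.time())
    return int((next_midnight - now).total_seconds())

# With a real Redis client, the write would look something like:
#   redis_client.set(key, value, ex=seconds_until_midnight(datetime.now()))
print(seconds_until_midnight(datetime(2020, 3, 17, 16, 44, 10)))  # → 26150
```

Every key written during the day then expires together at midnight, just before the nightly ETL refreshes the underlying tables.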

 

04

 

Obviously hot data

 

There is a class of data that is obviously hot:

 

One of our interfaces serves business data with only a few thousand records, yet it is called about 400,000 times a day, and the data is not updated very often. Data like this is a natural fit for Redis. As for the caching strategy: because this data is also synchronized over from another system, we set the expiration time according to the synchronization schedule and settled on a one-hour TTL.
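The access pattern here is the classic cache-aside pattern: read from the cache, and on a miss fall back to the database and repopulate. A minimal Python sketch, with an in-memory class standing in for Redis (the names `TTLCache`, `get_hot_data`, and `load_from_db` are all hypothetical):

```python
import time

class TTLCache:
    """In-memory stand-in for Redis SET ... EX: entries expire after ttl seconds."""
    def __init__(self, ttl_seconds: int, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.store[key]  # lazily evict the expired entry
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, self.clock() + self.ttl)

def get_hot_data(key, cache, load_from_db):
    """Cache-aside: try the cache first, fall back to the DB and repopulate."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.set(key, value)
    return value
```

For the interface above, the cache would be constructed as `TTLCache(3600)`: a one-hour TTL means each record hits the database at most once per hour, no matter how many of the 400,000 daily calls ask for it.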

 

05

 

Assessment of the remaining data

 

The first two kinds of data are easy to evaluate; the hard part is assessing data like this:

 

  • We have interfaces with a daily call volume of 200,000 to 300,000; the volume is not huge, but the queries and processing logic are more complex;
  • The underlying data set is too large to put into Redis in its entirety;
  • The underlying data can't be put into Redis directly anyway, because there are multiple query dimensions (conditions);
  • We couldn't tell how often each piece of data is actually requested; in the most pessimistic case each piece is requested only once a day, in which case there is no point caching it.

 

But we can't just slap our foreheads and declare "the call volume is getting big, throw it into Redis," or "this is hard to assess, forget the cache." A decision still needs evidence, so here is what I did:

 

Step 1.

Find all of one day's calls to the interface in the logs

With dozens of log files, going through them one by one, or writing a program to pick out the needed data, would both work; but since this job only needs to be done once, I again tried to save time.

 

Still using the [Find in Files] function in EditPlus, followed by [Copy All] in the results box, it took two minutes to pull out all 240,000 log lines.

 

(Figure: the 240,000 log lines found for this interface)

 

Step 2.

Import the data into a database for further analysis

Each log line looks something like this:

 

XXXX.log"(64190,95):2020-3-17 16:44:10.092 http-nio-8080-exec-5 INFO package.ClassName : request parameters: args1={"Field1":"XXX","Field2":"YYY"}

 

From each log line I only need three things: Field1 and Field2 from the request message, and the call time. How to pick them out? Write a program? No problem, of course, but I'm too lazy for that: why spend tens of minutes on what a few minutes of editing can do? And this is a one-time job, so:

 

  • Replace all: [2020-3-17] with [\t2020-3-17], i.e. insert a Tab in front of each timestamp;
  • Replace all: [{"Field1":"] with [\t];
  • Replace all: [","Field2":"] with [\t];
  • Replace all: ["}] with nothing, i.e. an empty string;
  • Select all, copy, and paste into Excel, which automatically splits columns on the Tab characters;
  • Delete the unneeded columns, keeping only the Field1 value, the Field2 value, and the timestamp.

 

Just a few steps, and it takes under a minute.
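The same extraction could equally well be scripted. A minimal Python sketch, under the assumption that every line follows the sample format shown in Step 2 (the names `Field1`/`Field2` are placeholders for the real request fields):

```python
import re

# Matches the sample log format from Step 2:
# ... 2020-3-17 16:44:10.092 ... args1={"Field1":"XXX","Field2":"YYY"}
LINE_RE = re.compile(
    r'(\d{4}-\d{1,2}-\d{1,2} \d{2}:\d{2}:\d{2}\.\d+).*'
    r'args1=\{"Field1":"([^"]*)","Field2":"([^"]*)"\}'
)

def extract_fields(line: str):
    """Return (timestamp, field1, field2), or None if the line doesn't match."""
    m = LINE_RE.search(line)
    return m.groups() if m else None

line = ('XXXX.log"(64190,95):2020-3-17 16:44:10.092 http-nio-8080-exec-5 '
        'INFO package.ClassName : request parameters: '
        'args1={"Field1":"XXX","Field2":"YYY"}')
print(extract_fields(line))  # → ('2020-3-17 16:44:10.092', 'XXX', 'YYY')
```

For a one-off job the editor tricks above are faster; the script pays off only if the analysis has to be repeated.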

 

(Figure: the three fields split out of the log)

 

Step 3.

Analyze the call frequency

After importing the data into a database, analyze it according to what we want to know: are the same parameters requested repeatedly? And at what interval? One SQL statement does it:

select Field1, Field2, count(1) as call_count,
       (MIDNIGHT_SECONDS(max(UPDATETIME)) - MIDNIGHT_SECONDS(min(UPDATETIME))) / 60 as interval_minutes
from TABLE
group by Field1, Field2
having count(1) > 2
with ur;

 

Admittedly the call interval computed this way is not precise; I won't spell out why, you can work it out yourself...

 

In short: of the 240,000 calls a day, about 100,000 parameter combinations were requested only once, while 140,000 calls were repeats within a short time; some combinations were even queried dozens of times within a few minutes. So this interface is a good candidate for Redis.
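The SQL in Step 3 could also be mirrored in code, grouping the extracted (Field1, Field2, timestamp) tuples and measuring the first-to-last span in minutes for any combination called more than twice. A hypothetical Python sketch:

```python
from collections import defaultdict
from datetime import datetime

def repeat_call_stats(records):
    """records: iterable of (field1, field2, 'YYYY-MM-DD HH:MM:SS') tuples.
    Returns {(field1, field2): (call_count, span_minutes)} for combinations
    called more than twice, mirroring the GROUP BY / HAVING in the SQL."""
    groups = defaultdict(list)
    for f1, f2, ts in records:
        groups[(f1, f2)].append(datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"))
    return {
        key: (len(times), (max(times) - min(times)).total_seconds() / 60)
        for key, times in groups.items()
        if len(times) > 2
    }

records = [
    ("A", "1", "2020-03-17 09:00:00"),
    ("A", "1", "2020-03-17 09:03:00"),
    ("A", "1", "2020-03-17 09:10:00"),
    ("B", "2", "2020-03-17 10:00:00"),
]
print(repeat_call_stats(records))  # → {('A', '1'): (3, 10.0)}
```

Like the SQL, this measures only the spread between the first and last call, not the gaps between consecutive calls, which is the imprecision mentioned above.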

 

Step 4.

How should the data be stored?

First, the format in which we save data to Redis; a picture is worth a thousand words:

 

(Figure: saving the processed result to Redis)

 

As for the cache update policy, we again simply set an expiration time; based on the data synchronization schedule and the call statistics, 15 minutes turned out to be appropriate.
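Since this interface has multiple query dimensions, what gets cached is the processed result, keyed by the request parameters. One common way to build such a key is to hash the canonicalized parameters; a sketch under that assumption (the prefix, the parameter names, and the commented-out `redis_client` call are all hypothetical):

```python
import hashlib
import json

def cache_key(prefix: str, params: dict) -> str:
    """Build a deterministic Redis key from the query parameters.
    sort_keys makes {"Field1": x, "Field2": y} and the reversed
    insertion order produce the same key."""
    digest = hashlib.md5(
        json.dumps(params, sort_keys=True).encode("utf-8")
    ).hexdigest()
    return f"{prefix}:{digest}"

key = cache_key("iface", {"Field1": "XXX", "Field2": "YYY"})
# With a real client, the processed result would be stored for 15 minutes:
#   redis_client.set(key, json.dumps(result), ex=15 * 60)
print(key)
```

With a 15-minute TTL, the combinations that are queried dozens of times in a few minutes are served from Redis, while the 100,000 once-a-day combinations cost only one extra write each.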

 

Looking back over this assessment, every step kept to the good habit of "be as lazy as you can" while staying productive: use your tools well and don't waste time. The whole process took about two hours, most of it (roughly an hour and a half) spent importing the data; fortunately I could do other work while that ran.

Source: Webmaster News


Origin www.cnblogs.com/1994jinnan/p/12578093.html