Low Latency Java (1): Introduction

This article is the first in a series on low-latency programming in Java. After reading it, you will understand the following:


  • What is latency, and why should you care about it as a developer?

  • How is latency described, and what do the percentiles in the data mean?

  • What causes latency?


Without further ado, let's get started.


1. What is latency and why is it important?


Latency can be simply defined as the time it takes to perform an operation.


"Operation (operation)" is a broad term. Here I mean the behavior that is worth measuring in the software system and the execution of the behavior at a certain point in time.


For example, in a typical web application, an operation might be submitting a query from a browser and viewing the search results; in a trading application, it might be automatically sending buy and sell orders to an exchange after receiving a price update. Generally, the less time an operation takes, the greater the benefit to the user. Users prefer web applications that do not make them wait; in retrospect, Google's biggest advantage over other search engines of its time was its fast search experience. The faster a trading system reacts to market changes, the higher its probability of trading successfully, which is why hundreds of trading firms are obsessed with making their trading engine the lowest-latency system on Wall Street and thereby gaining a competitive edge.


It is no exaggeration to say that in high-stakes domains, reducing latency can determine the success or failure of a company!


2. How is latency described?


Every operation has a latency, so a hundred operations have a hundred latencies. Therefore a single figure like "operations per second" or "seconds per operation" cannot describe a system's latency; it can only describe a single run of a particular operation.


At first glance, latency could be defined as the average across all similar operations. This is not a good idea.


Is there any problem with averaging? Consider the picture below:


[Figure: response times of individual operations plotted against a 60 ms SLA target; outliers above the line are highlighted in red]


Several operations (seven, in fact) exceed the 60-millisecond SLA target, yet the average response time stays within the SLA. Once you rely on the average, every outlier in the red area is ignored. But those ignored outliers are precisely the most important data points for a software engineer: they are the performance problems that need to be noticed and investigated. To make matters worse, the problems hiding behind such data points tend to surface in production.


It is also worth noting that many latency measurements look like the figure above, with occasional random, severe outliers. Latency almost never follows a normal (Gaussian) or Poisson distribution; what you see is more likely a multi-modal distribution. This is exactly why it is not valid to discuss latency in terms of mean or standard deviation.
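To make this concrete, here is a minimal sketch with made-up sample values chosen to mirror the figure above (93 fast operations plus 7 outliers beyond a hypothetical 60 ms SLA). The mean passes the SLA comfortably even though 7% of the operations blow straight through it:

```java
import java.util.Arrays;

public class MeanHidesOutliers {
    /** Hypothetical latency samples in ms: 93 fast operations
     *  plus 7 outliers beyond a 60 ms SLA target. */
    static long[] samples() {
        long[] ms = new long[100];
        Arrays.fill(ms, 0, 93, 20L);                       // typical ops: ~20 ms
        long[] outliers = {75, 80, 90, 120, 150, 200, 313}; // the red area
        System.arraycopy(outliers, 0, ms, 93, outliers.length);
        return ms;
    }

    public static void main(String[] args) {
        long[] ms = samples();
        double mean = Arrays.stream(ms).average().orElse(0);
        long overSla = Arrays.stream(ms).filter(v -> v > 60).count();
        // The mean sits well under 60 ms even though 7 of 100
        // operations violate the SLA.
        System.out.printf("mean = %.2f ms, operations over SLA = %d%n",
                mean, overSla);
    }
}
```

Averaging dilutes the seven slow operations across the 93 fast ones, which is exactly how the outliers disappear from the report.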


Latency is best described with percentiles.


What is a percentile? For a set of numbers, the nth percentile (where `0 < n < 100`) divides it into two parts: the lower part contains n% of the data, and the upper part contains (100 - n)% of the data. Together, the two parts account for 100% of the data.


For example, the 50th percentile is the value below which half the data falls and above which the other half falls. Its better-known name is the median.


Let us look at a few examples of measured latency. A 90th-percentile latency of 75 milliseconds means that 90 out of 100 operations complete within 75 milliseconds, while the remaining 100 - 90 = 10 operations take at least 75 milliseconds.


Going further, a 98th-percentile latency of 170 milliseconds means that 2 out of 100 operations take 170 milliseconds or more.


And a 99th-percentile latency of 313 milliseconds means that 1 out of every 100 operations takes 313 milliseconds or more.
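The examples above can be reproduced with a small nearest-rank percentile sketch. The sample values below are hypothetical, chosen so that p90 and p99 come out to the 75 ms and 313 ms figures quoted above:

```java
import java.util.Arrays;

public class Percentiles {
    /** Nearest-rank percentile: the smallest sample value such that at
     *  least p% of all samples are less than or equal to it.
     *  Assumes 0 < p <= 100 and a non-empty input. */
    static long percentile(long[] latenciesMs, double p) {
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Hypothetical latency samples in milliseconds.
        long[] samples = {12, 15, 18, 20, 22, 25, 30, 45, 75, 313};
        System.out.println("p50 = " + percentile(samples, 50) + " ms"); // median
        System.out.println("p90 = " + percentile(samples, 90) + " ms");
        System.out.println("p99 = " + percentile(samples, 99) + " ms");
    }
}
```

Production systems typically compute these with a histogram rather than by sorting every sample, but the definition is the same.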


In fact, many systems exhibit this characteristic: latency grows dramatically at the higher percentiles.


[Figure: latency plotted by percentile, rising steeply toward the tail]


Why worry about long-tail latency? After all, if only 1 out of every 100 operations is slow, isn't the system performing well enough?


Well, to get an intuitive feel, imagine the following scenario. A popular website has a 90th-percentile latency of 1 second, a 95th-percentile latency of 2 seconds, and a 99th-percentile latency of 25 seconds. If the site's pages receive more than 1 million views per day, then pages will take more than 25 seconds to load over 10,000 times a day. By then the user may well have yawned, closed the browser, and moved on. In the worst case, they will complain about the bad experience to friends and relatives. No online business can afford that kind of long tail.
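The arithmetic behind that scenario is simple enough to spell out (the traffic figure is the hypothetical one from the paragraph above):

```java
public class TailMath {
    public static void main(String[] args) {
        long dailyViews = 1_000_000L;      // hypothetical page views per day
        double beyondP99 = 1 - 0.99;       // fraction of loads slower than p99
        long slowLoads = Math.round(dailyViews * beyondP99);
        // With a p99 of 25 s, every one of these loads takes over 25 seconds.
        System.out.println(slowLoads + " page loads per day exceed 25 s");
    }
}
```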


3. What causes latency?


The short answer is: almost anything!


Delay "jitter" will produce unique shapes and random outliers, which can be attributed to the following things:


  • Hardware interrupts

  • Network/IO latency

  • Hypervisor pauses

  • Operating system activity, such as rebuilding internal structures, flushing buffers, etc.

  • Context switches

  • Garbage collection pauses


These events are usually random and do not follow a normal distribution.
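You can observe this jitter directly: time the same trivial operation many times and the recorded latencies will still vary widely, because interrupts, context switches, JIT compilation, and timer granularity intrude at random. A minimal sketch (the `sink` field is only there to stop the JIT from optimizing the measured work away):

```java
import java.util.Arrays;

public class JitterDemo {
    static double sink; // prevents dead-code elimination of the measured work

    /** Times the same trivial operation n times; returns sorted latencies (ns). */
    static long[] measure(int n) {
        long[] ns = new long[n];
        for (int i = 0; i < n; i++) {
            long start = System.nanoTime();
            sink += Math.sqrt(i);              // the "operation" under test
            ns[i] = System.nanoTime() - start;
        }
        Arrays.sort(ns);
        return ns;
    }

    public static void main(String[] args) {
        long[] ns = measure(100_000);
        // Identical work, yet min and max typically differ by orders of
        // magnitude: the tail comes from the OS and JVM, not the code.
        System.out.printf("min=%d ns  p50=%d ns  p99=%d ns  max=%d ns%n",
                ns[0], ns[ns.length / 2],
                ns[(int) (ns.length * 0.99)], ns[ns.length - 1]);
    }
}
```

For serious measurement you would use a histogram library and account for coordinated omission, but even this crude loop shows a pronounced tail.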


In addition, viewed from a higher level, a Java program runs on the following stack:


[Figure: the layered stack a Java program runs on: hardware, hypervisor/container, operating system, JVM, application]


(On bare-metal hardware, hypervisors and containers are optional layers, but in virtualized or cloud environments they are closely tied to latency)


Reducing latency is closely related to the following factors:


  • CPU/cache/memory architecture

  • JVM architecture and design

  • Application programming: concurrency, data structures, algorithms, and caching

  • Network protocols, etc.


Each layer in the figure above is complex in its own right, which greatly increases the knowledge and expertise that performance optimization demands. It is also why the cost and time spent must always be justified.


But that's why performance engineering is so interesting!


Our challenge is to keep application latency reasonably low for every operation that requires it.


Easier said than done!



Origin blog.51cto.com/15082395/2590374