Splitting and merging in performance optimization: you may never have imagined these two operations could be used this way

This article explores why performance optimization matters and lays out a comprehensive optimization approach. We analyze the problem from three angles (hardware, software, and network) to help you improve overall system performance. By following these recommendations, you can significantly improve system responsiveness, reduce latency, and deliver a better user experience.

1. Divide and conquer (the "split" approach)

Business layering, system tiering, service distribution, database and table sharding, dynamic/static separation, converting synchronous calls to asynchronous, single-threaded to multi-threaded, separating raw data from its cache, splitting one table into many, one database into many, traffic splitting, and so on.

In system performance optimization, "divide and conquer" is a commonly used strategy: decompose the problem into smaller, more manageable sub-problems, solve each separately, and combine the results into an overall optimization. Here are some common "divide and conquer" techniques and approaches:

  1. Task parallelization: decompose a large task into multiple subtasks and process them simultaneously with parallel computing to raise the system's overall throughput. For example, split a large data-processing job into parallel tasks handled by multiple threads or distributed nodes.
  2. Modular design: split a complex system into modules, each responsible for a specific function or task. Each module can be developed, tested, and optimized independently, improving overall maintainability and performance.
  3. Algorithm decomposition: break complex algorithms into simpler, reusable sub-algorithms. Optimizing each sub-algorithm improves the execution efficiency of the whole.
  4. Data partitioning: divide large datasets into subsets and operate on each subset in parallel. This approach is common in big-data processing and distributed systems to speed up data processing and queries.
  5. Resource allocation: allocate and manage system resources, for example by assigning tasks to the most suitable processing units or nodes, to achieve better load balancing and resource utilization.
  6. Problem decomposition: break a complex problem into multiple simple sub-problems, solve each one recursively or iteratively, and combine the sub-solutions into the solution of the overall problem.
  7. Network optimization: protocol tuning, load balancing, caching strategies, and so on.
  8. System architecture optimization: distributed architecture, microservice architecture, containerized deployment, and so on.
  9. Task splitting: divide work into subtasks and execute them separately, in parallel or distributed fashion, to improve efficiency and scalability.
  10. Asynchronous processing: run long-running or externally dependent tasks asynchronously so they do not block the main thread or hurt the user experience. Common mechanisms include message queues, callback functions, and event listeners.
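As a minimal sketch of techniques 1 and 9, the following splits a summation over an array into chunked subtasks on a fixed thread pool and merges the partial results (the class and method names are illustrative, not from any specific framework):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSum {
    // Split a large array into chunks and sum each chunk on its own thread.
    static long parallelSum(long[] data, int chunks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            int chunkSize = (data.length + chunks - 1) / chunks;
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < data.length; i += chunkSize) {
                final int from = i;
                final int to = Math.min(i + chunkSize, data.length);
                futures.add(pool.submit(() -> {
                    long s = 0;
                    for (int j = from; j < to; j++) s += data[j];
                    return s;                       // partial result of this subtask
                }));
            }
            long total = 0;
            for (Future<Long> f : futures) total += f.get(); // merge partial results
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        long[] data = new long[1000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1; // 1..1000
        System.out.println(parallelSum(data, 4)); // sum of 1..1000 = 500500
    }
}
```

The split step (chunking) and the merge step (summing the futures) mirror the divide-and-conquer pattern the list describes.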

2. Combine into one (the "merge" approach)

The granularity of microservices should not be too fine: merge services where merging makes sense, denormalize large database tables to reduce join queries, and consolidate redundant data. Put intuitively: from front-end CDNs and dynamic/static separation, through splitting back-end services into microservices, distribution, load balancing, caching, pooling, multi-threading, IO, database and table sharding, to search engines, the previous section emphasizes the word "split"; this section is about its counterweight, "merge".

In system performance optimization, "combining into one" is a strategy that emphasizes maximizing overall performance rather than just solving local problems. Here are some common techniques and approaches:

  1. System-level optimization: consider the performance bottlenecks and hotspots of the whole system and optimize it globally, accounting for the interactions between components and modules so overall performance is maximized.
  2. Resource management and optimization: manage system resources such as memory, CPU, and network holistically, tuning allocation strategies, balancing load, and maximizing utilization, to lift overall performance.
  3. Holistic concurrency and parallelism: look at concurrency and parallel operation across the entire system, optimizing concurrency control, resource sharing, and synchronization mechanisms together to improve system-wide concurrent performance and efficiency.
  4. Data flow and processing optimization: optimize how data flows through the system, considering transmission, transformation, and processing together to maximize end-to-end data-processing efficiency.
  5. Architecture design optimization: account for performance requirements at the design stage and choose appropriate architectures and technologies, including sensible module boundaries, low coupling between components, and efficient communication protocols and data formats.
  6. System-wide tuning and configuration: tune the whole system's configuration and parameters to fit the specific workload and performance requirements, e.g. buffer sizes, connection counts, and thread-pool parameters.
  7. Comprehensive monitoring and tuning tools: use monitoring tools and profilers that cover the whole system to find global bottlenecks and optimization opportunities, combining the metrics of every component and module into a unified optimization strategy.

By adopting these "combining into one" techniques, overall system performance can be improved: considering how components and modules interact and affect one another allows the system to perform at its best across all levels and dimensions.

I. Introduction

With the development of technology, high-performance computing has become a key requirement in many industries. Whether in gaming, finance, medicine, or other fields, high-performance computing plays a pivotal role. However, achieving high performance is not easy. To meet users' needs for speed and efficiency, we must optimize continuously. In this article, we introduce a set of comprehensive performance optimization approaches to help you improve overall system performance.

1. Definition of performance optimization

Performance optimization means developing a deep understanding of computer hardware, operating systems, and applications, and adjusting the relationships among the three so that the system as a whole (hardware, OS, and applications) performs at its best while continuing to meet business needs. It is the process of adjusting, improving, and tuning a system to increase its execution speed, response time, throughput, resource utilization, or other performance indicators. The aim is to serve users more efficiently and deliver a better experience by reducing latency, increasing throughput, and lowering resource consumption.

Performance optimization can be applied in various fields, including software development, database management, network communication, algorithm design, system architecture, etc.

Performance optimization is a continuous process that requires comprehensive consideration of system requirements, resource constraints, and user experience, and continuously improves system performance through real-time monitoring, analysis, and optimization. It involves careful performance analysis, experimentation, tuning, and verification to ensure that the system will perform at its best in actual operation.

2. The goal of performance optimization

The goal of performance optimization is to improve the performance of a system, application, or algorithm to meet user needs and provide a better user experience. The following are the main goals of performance optimization:

  1. Response Time Optimization: Reducing the response time of the system, enabling users to get results or perform actions faster. Fast response times enhance user satisfaction, increase user engagement, and increase system availability.
  2. Throughput optimization: Improve the processing capacity and concurrency of the system so that the system can handle more requests or transactions. Increasing throughput can improve the scalability of the system, adapt to high concurrent load, and support more users to access at the same time.
  3. Resource utilization optimization: Through reasonable resource management and utilization, the waste and idleness of resources are minimized, and the efficiency and performance of the system are improved. Optimizing resource utilization can reduce hardware resource costs and improve system scalability and economy.
  4. Stability and reliability optimization: By optimizing the stability and reliability of the system, ensure the stability of the system under long-term operation and high load conditions, and reduce the risk of system crashes and errors.
  5. Energy Optimization: Reducing the energy consumption of a system or device to increase energy efficiency, reduce operating costs, and have a smaller impact on the environment.
  6. User experience optimization: Improve user experience, including interface fluency, interactive response speed and smooth operation. Optimizing user experience can increase user satisfaction, reduce user churn, and increase user loyalty.
  7. Maintainability optimization: By optimizing the system design, code quality and architecture, the system is easy to maintain and expand, reducing maintenance costs and improving the productivity of the development team.

The goal of performance optimization is to achieve the best possible performance under the given resources and constraints, balancing user needs, system requirements, and resource limits to reach the best trade-off between performance and availability.

3. The importance of performance optimization


Performance optimization is of great importance in modern application development and system design, the following are several key aspects of performance optimization and their importance:

  1. Improved user experience: Good performance can significantly improve user experience, allowing users to get responses and results faster. Fast response time and high throughput can enhance user satisfaction, reduce user waiting time and bad experience, and improve user engagement and retention.
  2. Enhance system scalability: performance optimization can improve the processing capacity and concurrency of the system, enabling it to handle more requests or transactions. By increasing throughput and improving resource utilization, the system can better adapt to the needs of high concurrent load and large-scale users, maintaining stability and high performance.
  3. Saving resources and costs: Performance optimization can reduce resource consumption and waste, and improve resource utilization efficiency. By optimizing the use of resources such as memory, CPU, and network, the system can handle more work under the same hardware configuration, reducing hardware resource costs and operating costs.
  4. Improve reliability and stability: Performance optimization helps to improve system stability and reliability. By reducing latency and optimizing concurrent operations, you can reduce the risk of system errors and crashes, and improve system stability and availability.
  5. Enhanced Competitive Advantage: In a highly competitive market, performance optimization can be a competitive advantage for businesses and products. Fast response time, high throughput and great user experience can attract more users, increase market share and brand reputation.
  6. Energy saving and environmental protection: Reducing energy consumption of systems or equipment through performance optimization can improve energy efficiency, reduce energy consumption and impact on the environment. This is of great significance for sustainable development and corporate social responsibility.
  7. Improve development efficiency: Performance optimization can prompt the development team to focus on code quality, design optimization, and system maintainability. This helps improve development efficiency, reduce maintenance costs, and empower developers with better code and system design capabilities.

Performance optimization is not only a basic requirement to meet user expectations, but also a key factor to remain competitive, improve efficiency and save resources. By devoting appropriate time and resources to performance optimization, excellent system performance and user experience can be achieved, thereby bringing great benefits to individual users, businesses and organizations.

4. The principle of performance optimization

Performance optimization needs to follow some basic principles:

  • Based on data instead of guessing: use testing, logging, monitoring and other tools to analyze the bottlenecks and problems of the system, and optimize them in a targeted manner.
  • Avoid premature optimization: In the early stages of product development, focus not on performance but on functionality and quality. Only after the product has stabilized, do the necessary optimizations.
  • Avoid over-optimization: Performance optimization is a continuous process, and reasonable goals and indicators need to be formulated based on business needs and cost-effectiveness. Don't sacrifice code readability, maintainability, and stability for extreme performance.
  • Understand the business deeply: code serves the business; without understanding business requirements and scenarios, it is hard to spot deficiencies in system design and implementation. Keep communicating with product, operations, and other teams to understand user behavior and feedback.

5. The method of performance optimization

1. Performance optimization at the hardware level

  • Upgrade CPU and GPU: High-performance CPU and GPU are the key factors to improve system performance. Choose the right processor model for your needs and make sure they have a sufficient number of cores and threads.
  • Increase memory (RAM): RAM holds the working data of running programs. Adding more memory can improve system speed and stability.
  • Use a solid-state drive (SSD): Compared with traditional mechanical hard drives, solid-state drives have faster read and write speeds. Installing the operating system and commonly used programs on the SSD can significantly improve the system's boot speed and application response speed.

2. Performance optimization at the software level

  • Caching: trade space for time by keeping frequently accessed or frequently computed data in memory or local files, reducing repeated IO or computation.
  • Concurrency: use multi-core CPUs or multiple servers to share the workload and improve system throughput and concurrency.
  • Laziness: defer computation until it is actually needed, avoiding redundant or useless work.
  • Batching: when IO is involved (network IO, disk IO), merge operations and process them in batches to reduce the number and cost of IO calls.
  • Efficient implementation: choose more appropriate or faster algorithms, data structures, and languages to implement each function.
  • Optimized traversal: compute over a smaller data range instead of scanning all the data, e.g. using indexes, filters, and paging to speed up retrieval.
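The "batch" idea can be sketched like this; `BatchWriter` is a hypothetical class that simulates IO by counting flushes rather than touching a real disk or network:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of batching: buffer individual writes and flush them in groups,
// so N logical writes cost roughly N / batchSize IO calls.
public class BatchWriter {
    private final List<String> buffer = new ArrayList<>();
    private final int batchSize;
    int flushes = 0; // counts simulated IO operations

    BatchWriter(int batchSize) { this.batchSize = batchSize; }

    void write(String record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) flush();
    }

    void flush() {
        if (buffer.isEmpty()) return;
        flushes++;      // one real IO call would happen here
        buffer.clear();
    }

    public static void main(String[] args) {
        BatchWriter w = new BatchWriter(100);
        for (int i = 0; i < 1000; i++) w.write("record-" + i);
        w.flush(); // flush any remainder
        System.out.println(w.flushes); // 10 batched IO calls instead of 1000
    }
}
```

The same shape applies to real sinks: batched JDBC statements, buffered file writers, or message-queue producers that send in bulk.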

3. Performance optimization at the network level

  • Optimize network settings: adjusting TCP/IP parameters, enabling QoS (Quality of Service), and similar measures can improve network transmission efficiency and stability.
  • Use a CDN (Content Delivery Network): a CDN caches website content on servers around the world so users can fetch it from a nearby node more quickly.

2. Memory management optimization

1. Understand the basics of Java memory

1.1. Java memory model

JMM (the Java Memory Model) is defined in the Java Virtual Machine Specification and describes how Java programs access shared memory in a multi-threaded environment. It exists mainly to hide the differences in memory access across hardware and operating systems, and it defines an abstract memory model consisting of main memory and per-thread working memory.

1.2. Runtime data area

The JVM runtime data area is the memory area used by the Java virtual machine when executing Java programs. These areas include the following sections:

  1. Program Counter (Program Counter Register): The program counter is a small memory area that can be seen as a line number indicator of the bytecode executed by the current thread. Each thread has its own independent program counter, which is used to record the address of the instruction that the thread needs to execute.
  2. Java virtual machine stack: The Java virtual machine stack is composed of stack frames (Stack Frame), and each stack frame corresponds to the call of a Java method. When a method is called, a corresponding stack frame is generated in the Java virtual machine stack and pushed onto the stack. When the method execution is complete, the stack frame is popped from the stack. The Java virtual machine stack is used to store information such as local variables, method parameters, return values, and operands.
  3. Native method stack: The native method stack is similar to the Java virtual machine stack, but it serves native methods (Native Method). Native methods are implemented in languages such as C and C++; unlike Java code, they may access operating system resources directly.
  4. Java heap: The Java heap is the largest memory area in the Java virtual machine and is shared by all threads. It stores object instances, arrays, and similar data structures, can grow and shrink dynamically, and its size can be controlled through command-line parameters.
  5. Method Area: The method area is used to store data such as class information, constants, static variables, and code compiled by a just-in-time compiler. It is one of the permanent storage areas in the Java virtual machine. The method area was called the permanent generation (PermGen) before Java 8, and it was gradually replaced by Metaspace starting from Java 8.
  6. Runtime Constant Pool: The runtime constant pool is the runtime representation of the constant pool table for each class or interface. It contains literals and symbol references generated at compile time, as well as string literals generated at runtime, etc. The runtime constant pool is part of the method area.

The above are the main components of the Java virtual machine's runtime data area. The areas differ in size and usage, but all are essential to the normal execution of Java programs, and understanding them is very important for writing efficient, stable Java code.
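A small illustration of the split between the per-thread stack and the shared heap: unbounded recursion exhausts the thread's stack and throws StackOverflowError, while object data such as the array below lives on the heap (the exact frame count varies by JVM and stack size, so it is not asserted):

```java
// Each call to recurse() pushes a new stack frame; with no base case,
// the thread's JVM stack is eventually exhausted. The array, by contrast,
// is an object and is allocated on the shared Java heap.
public class StackVsHeap {
    static int depth = 0;

    static void recurse() {
        depth++;
        recurse(); // no base case: stack frames pile up until overflow
    }

    public static void main(String[] args) {
        long[] onTheHeap = new long[1_000_000]; // allocated on the Java heap
        try {
            recurse();
        } catch (StackOverflowError e) {
            System.out.println("stack exhausted after " + depth + " frames");
        }
        System.out.println("heap array length: " + onTheHeap.length);
    }
}
```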

1.3. Java Garbage Collection Mechanism

Java's garbage collection (Garbage Collection, GC) mechanism is a mechanism in which the Java Virtual Machine (JVM) is responsible for automatically reclaiming the memory space occupied by objects that are no longer used. The garbage collection mechanism greatly reduces the burden on developers to manually manage memory, and helps prevent memory leaks and improve application performance.

1. Recycling process

1) First judge whether the object is alive (whether it is garbage)

Liveness can be judged by either the reference counting algorithm or the reachability analysis algorithm. Because reference counting cannot handle circular references, reachability analysis is what is used in practice.

2) Traverse and recycle objects (recycle garbage)

Garbage is reclaimed by a garbage collector (Serial/Parallel/CMS/G1). The algorithms collectors use are mark-sweep, mark-compact, copying, and generational collection.

2. Types of GC

GC is commonly divided into Minor GC (young generation), Major GC (old generation), Full GC (the whole heap plus the method area), and Mixed GC (G1: the young generation plus part of the old generation).

3. The principle of GC

  • The core of a GC collector is its collection algorithm.
  • A collection algorithm first determines whether an object is alive, using either reference counting or reachability analysis.
  • Reference counting cannot handle circular references, so reachability analysis is what is used in practice.
  • GC comes in four types acting on different memory areas (young generation Eden/S0/S1, old generation); collectors are combined with one another to perform the different GC types and together implement JVM GC.

2. Choose the right garbage collector

|  | CMS | G1 |
| --- | --- | --- |
| Collection algorithm | mark-sweep | mark-compact (region-based) |
| Collection area | old generation | young + old generation |
| Memory layout | traditional young/old generations | heap divided into equal-sized Regions |
| Memory fragmentation | produces fragmentation | little fragmentation |
| Concurrency | concurrent | concurrent |
| JDK usage | usable on JDK 8 (whose default is Parallel) | default since JDK 9 |
| Pause time | minimal pause time | predictable pause time |

  1. Java 8: The default garbage collector is the Parallel collector. It uses parallel threads for garbage collection, suitable for multi-core processors, and the goal is to obtain the highest throughput in the shortest time.
  2. Java 11: The default garbage collector is the G1 (Garbage-First) collector. It is a low-latency garbage collector designed for low pause times and high throughput.
  3. Java 14: The default garbage collector is still the G1 collector.
  4. Java 15: The default garbage collector is still the G1 collector.
  5. Java 16: The default garbage collector is still the G1 collector.

The Java virtual machine also provides many parameters for tuning the behavior and performance of the garbage collector. These parameters can be used to specify a specific garbage collector, tune the heap size, pause time, throughput, etc. Depending on the needs and characteristics of the application, these parameters can be used to optimize the performance of the garbage collector.
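For illustration, a few of the standard HotSpot flags this refers to might be combined like this (the heap sizes and pause goal are placeholders, not recommendations, and `app.jar` is hypothetical):

```shell
# Illustrative GC tuning flags for HotSpot:
#   -XX:+UseG1GC            select the G1 collector explicitly
#   -Xms2g -Xmx2g           fixed heap size to avoid resize pauses
#   -XX:MaxGCPauseMillis    G1 pause-time goal in milliseconds
#   -Xlog:gc*               unified GC logging (JDK 9+)
java \
  -XX:+UseG1GC \
  -Xms2g -Xmx2g \
  -XX:MaxGCPauseMillis=200 \
  -Xlog:gc* \
  -jar app.jar
```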

3. Memory analysis tool

1) jstat: used to monitor JVM memory usage and garbage collection information.

2) jmap: Used to generate JVM heap dump files for analyzing memory usage.

3) jconsole: used to monitor JVM performance indicators, number of threads and other information.

4) VisualVM: A powerful performance analysis tool that can count various indicators such as CPU, memory, and GC, and provide a graphical interface.

5) Alibaba Arthas: supports application performance analysis, memory leak detection, thread troubleshooting, method call tracing, and more.

6) Apache JMeter: Used for stress testing and performance testing. The performance inflection point of the system can be tested.

7) Eclipse MAT: MAT (Memory Analyzer Tool) runs as an Eclipse plug-in or standalone, so even IDEA users can use it independently. Its main job is analyzing heap dump files.

8) XRebel: A lightweight Java performance analysis tool that provides real-time code-level performance analysis and optimization suggestions.

After interacting with the page, the XRebel console shows the time spent on each HTTP request and breaks the cost down to the methods of every class invoked, with colors indicating relative time consumed.
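The JDK command-line tools above are typically used along these lines (`<pid>` is a placeholder for the target JVM's process id):

```shell
jps -l                                  # list local JVM processes and their ids
jstat -gcutil <pid> 1000                # GC utilization, sampled every 1000 ms
jmap -dump:live,format=b,file=heap.hprof <pid>   # heap dump for MAT/VisualVM
```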

3. Multi-threaded concurrent optimization

1. Thread pool (pooling technology)

  • corePoolSize: the number of core threads (the "permanent employees")
  • workQueue: the waiting queue (tasks waiting like "contractors" to be taken on)
  • maximumPoolSize: the maximum number of threads (the entire workforce)

Oracle gives no official reference value for a thread pool's corePoolSize, because the right size depends on the actual business scenario and available system resources; different scenarios call for different settings.
In Java Concurrency in Practice, Brian Goetz and his co-authors suggest sizing the pool by task type and compute intensity: for CPU-bound tasks, set the number of core threads to the number of processor cores plus 1 or 2; for I/O-bound tasks, the core count can be raised further to use CPU time that would otherwise sit idle.
The reasoning: CPU-bound threads spend their time computing, so a thread count near the core count (plus 1 or 2) keeps every core busy while avoiding contention and blocking between threads; I/O-bound threads spend most of their time waiting, so extra threads can fill the idle CPU time and raise overall efficiency.
This advice is not an official standard, but it is widely followed in practice and works well.
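The sizing rule above might be sketched as follows; the queue capacity and keep-alive time are arbitrary placeholders:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Pool sized for CPU-bound work: cores + 1 threads.
public class PoolSizing {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        ThreadPoolExecutor cpuBound = new ThreadPoolExecutor(
                cores + 1,                        // corePoolSize
                cores + 1,                        // maximumPoolSize
                60, TimeUnit.SECONDS,             // keep-alive for non-core threads
                new LinkedBlockingQueue<>(1000)); // bounded waiting queue
        System.out.println(cpuBound.getCorePoolSize() == cores + 1);
        cpuBound.shutdown();
    }
}
```

For I/O-bound work the same constructor would be used with a larger core count, reflecting how much of each task's time is spent waiting rather than computing.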

2. Lock competition and granularity (lock optimization technology)

In performance optimization, reducing lock competition and using fine-grained locks are common strategies, which can effectively improve the performance of concurrent programs. Here are some strategies for reducing lock contention and using fine-grained locks:

2.1. Reduce lock granularity

By decomposing a large lock into multiple small locks, the granularity of the lock is reduced, thereby reducing the degree of lock competition. For example, splitting a large block of synchronized code into multiple smaller blocks of synchronized code reduces contention among concurrently executing threads.
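One sketch of this idea is lock striping: instead of one global lock, each key hashes to one of several smaller locks, so threads working on different stripes do not contend (the class and field names are illustrative):

```java
// Striped counter: 16 stripes, each guarded by its own lock, instead of
// one coarse lock over the whole structure.
public class StripedCounter {
    private final Object[] locks = new Object[16];
    private final long[] counts = new long[16];

    StripedCounter() {
        for (int i = 0; i < locks.length; i++) locks[i] = new Object();
    }

    void increment(int key) {
        int stripe = (key & 0x7fffffff) % locks.length;
        synchronized (locks[stripe]) {   // lock only this stripe
            counts[stripe]++;
        }
    }

    long total() {
        long sum = 0;
        for (int i = 0; i < locks.length; i++) {
            synchronized (locks[i]) { sum += counts[i]; }
        }
        return sum;
    }

    public static void main(String[] args) {
        StripedCounter c = new StripedCounter();
        for (int i = 0; i < 1000; i++) c.increment(i);
        System.out.println(c.total()); // 1000
    }
}
```

This is the same design idea ConcurrentHashMap historically used internally.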

2.2. Using read-write locks

For scenarios with more reads and fewer writes, you can use a read-write lock (ReentrantReadWriteLock) to improve concurrency performance. Read-write locks allow multiple threads to acquire read locks at the same time, but only one thread can acquire write locks, thus providing higher concurrency.
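A minimal sketch of a read-mostly cache guarded by `ReentrantReadWriteLock` (the cache class itself is illustrative):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Many readers may hold the read lock at once; a writer takes the
// exclusive write lock and blocks both readers and other writers.
public class RwCache {
    private final Map<String, String> map = new HashMap<>();
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

    String get(String key) {
        rw.readLock().lock();            // shared: concurrent reads allowed
        try { return map.get(key); }
        finally { rw.readLock().unlock(); }
    }

    void put(String key, String value) {
        rw.writeLock().lock();           // exclusive
        try { map.put(key, value); }
        finally { rw.writeLock().unlock(); }
    }

    public static void main(String[] args) {
        RwCache cache = new RwCache();
        cache.put("k", "v");
        System.out.println(cache.get("k")); // v
    }
}
```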

2.3. Using concurrent collections

Java provides some concurrent collection classes, such as ConcurrentHashMap, ConcurrentLinkedQueue, etc. These concurrent collection classes use an internal fine-grained locking mechanism to provide efficient concurrent operations in a multi-threaded environment.
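For example, `ConcurrentHashMap.merge` performs an atomic per-key read-modify-write without any external locking:

```java
import java.util.concurrent.ConcurrentHashMap;

// Word counting with ConcurrentHashMap: merge() atomically inserts 1 for
// a new key or combines the old value with 1 via Integer::sum.
public class WordCount {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        String[] words = {"a", "b", "a", "c", "a"};
        for (String w : words) {
            counts.merge(w, 1, Integer::sum); // atomic per-key increment
        }
        System.out.println(counts.get("a")); // 3
    }
}
```

The same loop is safe when run from many threads at once, which is the point of the fine-grained internal locking the text describes.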

2.4. Use CAS operations

CAS (Compare-And-Swap) is an optimistic locking mechanism that performs atomic updates by comparing and swapping, avoiding the performance overhead and thread blocking of traditional mutual-exclusion locks. The atomic classes in java.util.concurrent.atomic, such as AtomicInteger and AtomicReference, provide CAS support.

Specifically, the multithreaded CAS operation includes the following steps:

  1. Read the current value of the shared variable; this becomes the expected value.
  2. Compare the shared variable's current value with the expected value; if they are equal, atomically update the variable to the new value.
  3. If they are not equal, another thread has modified the variable in the meantime; re-read the latest value and repeat from step 1.

In a concurrent environment, multi-threaded CAS operations can guarantee the atomic operation of shared variables, and also avoid the overhead of thread blocking and context switching caused by traditional lock mechanisms. Therefore, multi-threaded CAS operations are widely used in various high-concurrency scenarios, such as database transactions, distributed systems, and so on.
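The three steps above correspond to the classic CAS retry loop, shown here with `AtomicInteger.compareAndSet` (the increment-then-double transformation is just an arbitrary example of a non-atomic update):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    int incrementAndDouble() {
        while (true) {
            int current = value.get();          // step 1: read current value
            int next = (current + 1) * 2;       // compute the new value
            if (value.compareAndSet(current, next)) { // step 2: atomic swap
                return next;
            }
            // step 3: CAS failed (another thread won); loop and re-read
        }
    }

    public static void main(String[] args) {
        CasCounter c = new CasCounter();
        System.out.println(c.incrementAndDouble()); // (0 + 1) * 2 = 2
        System.out.println(c.incrementAndDouble()); // (2 + 1) * 2 = 6
    }
}
```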

2.5. Read-write separation lock

Separate read-only operations and write operations in the data structure, using read locks and write locks respectively. This can allow multiple threads to read data at the same time, and only need to lock when writing, reducing the competition between reading operations.

2.6. Use lock-free algorithm

A lock-free algorithm (Lock-Free) is a concurrent algorithm that does not use a mutex. It is usually based on CAS operations and other atomic operations, and achieves concurrent access in a non-blocking manner, avoiding performance loss caused by lock competition.

2.7. Avoid excessive synchronization

Reasonably evaluate the scope of synchronization code blocks to avoid unnecessary synchronization, reduce the scope of lock competition, and improve concurrency performance.

A suitable strategy needs to be selected according to specific application scenarios and requirements. At the same time, performance optimization is a process of comprehensive consideration, and other factors, such as resource utilization and algorithm optimization, need to be integrated to obtain the best performance improvement effect.

4. Code optimization

Code optimization is the basis of performance optimization, which can improve the execution efficiency of the program by reducing redundant and repeated operations in the code. Here are some common code optimization techniques:

1. Data structure

Choosing the right data structure can significantly improve the performance of your code. By analyzing the characteristics of the problem, choose a more efficient algorithm to solve the problem, such as using quick sort instead of bubble sort.

2. Avoid repeated computation

Store the results of repeated calculations to avoid calculating the same result multiple times in the program. For example, you can store the square of a number in a variable and then use that variable directly when needed.
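A minimal sketch of this caching idea; `Memo` and its counter are illustrative, and `computeIfAbsent` handles the store-once bookkeeping:

```java
import java.util.HashMap;
import java.util.Map;

// Space-for-time: cache results of an expensive pure function so the
// same input is never computed twice.
public class Memo {
    private final Map<Integer, Long> cache = new HashMap<>();
    int computations = 0; // counts actual (non-cached) computations

    long square(int n) {
        return cache.computeIfAbsent(n, k -> {
            computations++;            // pretend this is the expensive part
            return (long) k * k;
        });
    }

    public static void main(String[] args) {
        Memo m = new Memo();
        System.out.println(m.square(12)); // 144, computed
        System.out.println(m.square(12)); // 144, served from the cache
        System.out.println(m.computations); // 1: only one real computation
    }
}
```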

3. Memory management

Reasonable management of memory resources can reduce the overhead of memory allocation and release, such as avoiding frequent object creation and destruction, and using object pools, caches and other technologies for optimization.

4. Use bitwise operations

Bitwise operations are faster than arithmetic operations because they can directly manipulate binary bits. For example, you can use the bitwise AND operator (&) to check whether an integer is even.
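For example, the lowest bit of a two's-complement integer determines parity, and a right shift divides by two:

```java
public class BitTricks {
    // (n & 1) is 0 exactly when n is even; works for negatives too,
    // since two's complement preserves bit 0.
    static boolean isEven(int n) { return (n & 1) == 0; }

    public static void main(String[] args) {
        System.out.println(isEven(10));  // true
        System.out.println(isEven(7));   // false
        System.out.println(isEven(-4));  // true
        System.out.println(8 >> 1);      // 4: shift right by one = divide by 2
    }
}
```

Modern JIT compilers often perform such strength reduction automatically, so treat this as a readability/micro-optimization trade-off rather than a guaranteed win.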

5. Reduce function calls

Function calls incur additional overhead, so the number of function calls should be minimized. For example, you can store some commonly used calculation results in global variables, and then use these variables directly when needed.

Regarding code optimization, you can choose a coding standard or form a certain code base. Official download address: " Java Development Manual (Huangshan Edition).pdf "

6. Stream technology

Stream-based optimization here means using the Stream API introduced in Java 8 to traverse and operate on collections efficiently. The Stream API offers a declarative programming style: with lambda expressions you can express aggregate and bulk operations on collections, such as sorting, filtering, and mapping. Streams also support parallel processing, exploiting multi-core CPUs to speed up data handling.

The implementation principle of the Stream API mainly involves the following aspects:

  • Stream operations fall into intermediate and terminal operations, with subcategories such as stateless, stateful, short-circuiting, and non-short-circuiting.
  • The Stream source code is structured around interfaces and classes such as BaseStream, Stream, ReferencePipeline, and Sink.
  • Operations are stacked by ReferencePipeline into a call chain, with the Sink interface defining how each stage hands data to the next.
  • For parallel processing, parallelStream() or parallel() converts a serial stream into a parallel one, and the ForkJoinPool framework splits and merges the data.
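A short sketch of this declarative style using only the standard Stream API; the sample data is invented for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamDemo {
    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(5, 3, 8, 1, 9, 4);

        // Declarative pipeline: the intermediate operations (filter, map,
        // sorted) are lazy and only run when the terminal collect() executes.
        List<Integer> evenSquares = nums.stream()
                .filter(n -> n % 2 == 0)       // keep 8, 4
                .map(n -> n * n)               // 64, 16
                .sorted()                      // 16, 64
                .collect(Collectors.toList());
        System.out.println(evenSquares);       // [16, 64]

        // The same source can be processed in parallel on the common
        // ForkJoinPool; the associative sum is deterministic either way.
        int sum = nums.parallelStream().mapToInt(Integer::intValue).sum();
        System.out.println(sum);               // 30
    }
}
```

Parallel streams pay a splitting and merging cost, so they help mainly for large data sets or CPU-heavy per-element work.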

7. Reactive technology

The reactive technology for performance optimization refers to an asynchronous programming paradigm oriented to data streams and events, which can improve program response speed and resource utilization. Reactive technology can be divided into two aspects: reactive programming and reactive architecture.

Reactive programming refers to using some libraries or frameworks, such as RxJava, Reactor, Redux, etc., to implement responsive processing of data streams and events, which can simplify the complexity of asynchronous programming and improve the readability and maintainability of code.

Reactive architecture refers to applying design principles and patterns, such as those of the Reactive Manifesto (responsive, resilient, elastic, message-driven) together with microservices, to build highly available, scalable, and elastic distributed systems that can cope with changing demand and load.

5. Database access optimization

1. MySQL

1.1 MySQL tuning dimensions

1) SQL and index optimization: SQL query optimization is the key to improving MySQL performance. Various methods can be used to optimize query statements, such as using appropriate indexes, avoiding the use of function operators in WHERE clauses, reducing subqueries, etc.

2) Table structure optimization: The design and structure of the table will also affect the performance of MySQL. Proper table design can improve query performance and data processing speed. For example, using partitioned tables can speed up queries, and splitting tables vertically can reduce the load on the database.

3) System configuration optimization: MySQL and server parameter settings are also very important for performance. Adjusting the configuration can allow MySQL to make better use of hardware resources. For example, increasing the buffer size, adjusting the connection timeout, or optimizing the sort cache can improve system performance.

4) Hardware optimization: In addition to software optimization, MySQL performance can also be optimized through hardware. For example, using faster disks, increasing memory, and upgrading CPUs can all increase MySQL's load capacity.

1.2 MySQL tuning decomposition

2. Cache technology

Caching technology is a commonly used strategy in performance optimization. It reduces the number of visits to the original data source by storing copies of calculation results, data or resources, thereby improving the speed and performance of data access.

2.1. Common caching technologies:

  1. Data caching: Store frequently accessed data in memory or other high-speed storage media to avoid frequent disk or network access. Common data caches include memory caches, distributed caches (such as Redis, Memcached), etc.
  2. Query cache: For database query results, query statements and their results can be cached, and the next query can be directly obtained from the cache, reducing the number of database queries. Query caching can be implemented by the database's own query cache or by using an external caching tool (such as Ehcache).
  3. Object caching: Store frequently used objects in memory to avoid frequent object creation and initialization. Object caching can improve object access speed and reusability.
  4. Page caching: cache the dynamically generated page content, and return the cached page directly when the next request is made, avoiding repeated page rendering and database query operations. Common page caching technologies include browser caching, CDN caching, and so on.
  5. Resource cache: For time-consuming resource loading operations, such as loading of images, CSS, JavaScript and other files, they can be cached locally or in CDN to reduce network requests and improve loading speed.
  6. Cache Warming: Load and initialize cached data ahead of time, before the application starts or requests begin, to avoid cold-start delays on first access.

It is necessary to select the appropriate caching technology according to the specific application scenarios and requirements, and comprehensively consider factors such as cache consistency, capacity, and update mechanism. At the same time, the design and management of the cache also need to be cautious to avoid cache expansion, data consistency issues, and the impact of expired caches.
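As one concrete answer to the capacity and eviction concerns above, here is a minimal LRU cache sketch built on the JDK's LinkedHashMap. It is illustrative only; production systems typically reach for Caffeine, Guava Cache, or a distributed cache instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Capacity-bounded cache: with accessOrder = true, LinkedHashMap keeps
// entries in least-recently-used order, and removeEldestEntry evicts the
// LRU entry once the cap is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true -> LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a", so "b" becomes least recently used
        cache.put("c", 3); // exceeds capacity 2 -> evicts "b"
        System.out.println(cache.keySet()); // [a, c]
    }
}
```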

2.2 Cache classification

1) Local cache:

Store cached data in the memory of a single application process, typically implemented with Java collection classes such as HashMap or ConcurrentHashMap. Local caching is fast, easy to implement, and needs no network transfer, but the data cannot be shared across application processes.
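A minimal thread-safe local cache along these lines can be sketched with ConcurrentHashMap. The class and method names are invented for illustration, and a real cache would add eviction and expiry:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class LocalCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();

    // computeIfAbsent runs the loader at most once per key,
    // even when many threads request the same key concurrently.
    public V get(K key, Function<K, V> loader) {
        return store.computeIfAbsent(key, loader);
    }

    public static void main(String[] args) {
        LocalCache<String, String> cache = new LocalCache<>();
        String v = cache.get("user:42", k -> "loaded from DB for " + k);
        System.out.println(v);
        // A second get("user:42", ...) returns the cached value
        // without invoking the loader again.
    }
}
```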

2) Distributed cache:

Store the cache data on multiple servers, and transmit the data through the network to realize cache sharing. Common distributed cache frameworks include Redis, Memcached, Ehcache, etc. The advantages of distributed cache are good scalability, high concurrency support, large capacity, and the ability to improve application reliability and availability.

3) Multi-level cache (local + distributed):

Store cached data in both local cache and distributed cache to speed up access and improve reliability. Common multi-level cache solutions include EHCache+Redis, Guava Cache+Redis, etc. The advantage of multi-level caching is that it takes into account the advantages of local caching and distributed caching, making the caching system more flexible and more powerful.

6. Communication and IO optimization

1. Non-blocking IO and asynchronous IO models

  • The non-blocking IO (Non-blocking IO) model is implemented by using non-blocking sockets, allowing multiple connections to be processed simultaneously in a single thread, reducing thread switching and resource occupation.
  • The asynchronous IO (Asynchronous IO) model can continue to process other tasks after initiating an IO operation by using a callback or event-driven method. When the IO operation is completed, the callback function is triggered to improve the efficiency and throughput of IO.
  • Selecting a suitable IO model requires comprehensive consideration of factors such as the number of concurrent connections, system resources, and response time based on specific application scenarios and requirements.

2. NIO and multiplexing technology

  • The NIO (New IO) framework provides support for non-blocking IO, including components such as Channel, Buffer, and Selector, which can achieve efficient IO operations.
  • Multiplexing technology monitors multiple IO events through one thread. For example, using a Selector selector can simultaneously process read and write operations of multiple connections, reducing the number of threads and resource usage.

Selector

The selector is a key component of Java NIO. It monitors read and write events on multiple channels at once and reacts as soon as one occurs, letting a single thread service many channels and thereby improving system throughput and efficiency.

Channel

A channel is an object for reading and writing data, similar to a stream in Java IO. Unlike streams, channels can be read and written in non-blocking mode, and reads and writes can proceed concurrently. Common channel types include FileChannel and SocketChannel, used for file and network communication respectively (others include ServerSocketChannel and DatagramChannel).

Buffer

In Java NIO, all data is transferred through buffer objects. A buffer is a contiguous block of memory that holds data that needs to be read or written. The buffer object contains some state variables, such as capacity, limit, position, etc., which are used to control the reading and writing of data.
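The state variables mentioned above can be observed directly; a small sketch using a standard ByteBuffer:

```java
import java.nio.ByteBuffer;

public class BufferDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8); // capacity = 8

        buf.put((byte) 1).put((byte) 2).put((byte) 3);
        // After writing: position = 3, limit = capacity = 8.
        System.out.println(buf.position() + " " + buf.limit()); // 3 8

        buf.flip(); // switch to reading: limit = 3, position = 0
        System.out.println(buf.position() + " " + buf.limit()); // 0 3

        int first = buf.get(); // reads the first byte, position -> 1
        System.out.println(first); // 1
    }
}
```

The write-then-flip-then-read cycle is the basic rhythm of all NIO buffer use; forgetting flip() is a classic source of empty reads.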

3. Protocol and data format

  • Optimizing protocols and data formats can reduce the amount of data transmitted over the network and improve transmission efficiency. For example, use binary protocol instead of text protocol, compress data, reduce transmission of useless data, etc.
  • Using serialization technologies (such as Protocol Buffers, MessagePack) can achieve efficient object serialization and deserialization, reducing data size and transmission time.
| Aspect | HTTP protocol | RPC protocol |
| --- | --- | --- |
| Transport layer | Based on TCP; fixed transmission format with heavy header information, so data transfer is less efficient | Based on TCP or UDP; custom data format, so data transfer is more efficient |
| Versatility | Hides implementation details; cross-language and cross-platform; suited to calls between departments or to external services | Requires API-level encapsulation, which constrains the development language and environment; suited to internal service calls |
| Development difficulty | Relatively simple: follow REST conventions, though details such as requests and responses must be implemented yourself | Relatively complex: server selection, serialization, communication, fault tolerance, and other concerns must be handled |
| Speed | Slower, due to HTTP header overhead and the TCP handshake | Faster, thanks to a compact data format and efficient communication |

4. Http connection pool and connection multiplexing

An HTTP connection pool manages and reuses HTTP connections, improving the performance and efficiency of HTTP requests in two main ways:

  • Pooling manages and reuses connections, avoiding the cost of repeatedly creating and closing them.
  • A pool raises the connection reuse rate, skips repeated connection setup and authentication, and improves the concurrency and throughput of the system.

The following are some open source http connection pool implementation technologies:

  1. OkHttp: OkHttp is a modern HTTP client library that supports connection pool management. It has the characteristics of concise API and high performance, and can automatically manage connection reuse and connection timeout.
  2. Apache HttpClient: Apache HttpClient is a mature Java HTTP client library that provides connection pool management functions. It is highly customizable and flexible, and supports functions such as connection multiplexing, connection timeout control, and concurrent request management.

5. Turning synchronous into asynchronous

Converting synchronous calls into asynchronous ones is a common pattern that improves responsiveness and throughput. A synchronous operation blocks the request until it succeeds or fails; an asynchronous design supports horizontal scaling, absorbs instantaneous request spikes, and smooths the load.

If an interface performs multiple business steps that are independent of each other, executing them sequentially in code order is slow and leaves little room for traditional optimization. In that case, consider multi-threading: turn the synchronous flow asynchronous and process the steps in parallel, which can greatly improve the interface's performance.
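A hedged sketch of this pattern with CompletableFuture; loadProfile and loadOrders are invented placeholders for two independent business steps:

```java
import java.util.concurrent.CompletableFuture;

public class AsyncDemo {
    // Stand-ins for two independent operations that each take ~100 ms.
    static String loadProfile() { sleep(100); return "profile"; }
    static String loadOrders()  { sleep(100); return "orders"; }

    static void sleep(long ms) {
        try { Thread.sleep(ms); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();

        // Launch both steps concurrently instead of one after the other.
        CompletableFuture<String> p = CompletableFuture.supplyAsync(AsyncDemo::loadProfile);
        CompletableFuture<String> o = CompletableFuture.supplyAsync(AsyncDemo::loadOrders);

        // join() waits for both; total time is roughly max(100, 100) ms
        // rather than 100 + 100 ms.
        String result = p.join() + "+" + o.join();
        System.out.println(result + " in ~" + (System.currentTimeMillis() - start) + "ms");
    }
}
```

The same composition can be extended with thenCombine or allOf when more steps, or a merge of their results, are involved.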

7. Performance monitoring

In a microservice architecture, system performance monitoring typically uses the following tools and techniques:

  1. Distributed tracing tools: Distributed tracing tools trace and monitor request links between microservices, helping uncover performance bottlenecks and failure points. Common tools include Zipkin, Jaeger, and SkyWalking.

  2. Metrics monitoring and time-series databases: Metrics tools collect, store, and visualize the system's key indicators and performance data, giving a real-time view of system status. Common tools include Prometheus, InfluxDB, and Grafana.

  3. Log management and analysis tools: These tools collect, store, and analyze the log data of microservices to help diagnose and solve problems. Common choices include the ELK Stack (Elasticsearch, Logstash, Kibana), Splunk, and Graylog.

  4. Container monitoring and management tools: If microservices run on a containerized platform such as Docker or Kubernetes, these tools monitor container resource usage, network communication, and scheduling performance. Common tools include cAdvisor, Prometheus Operator, and Kubernetes Dashboard.

  5. Application Performance Management (APM) tools: APM tools monitor and analyze the performance and health of microservice applications in real time, including service response time, database query performance, and CPU and memory usage. Common APM tools include New Relic, AppDynamics, and Dynatrace.

These tools and technologies can provide real-time system performance monitoring, troubleshooting and performance optimization capabilities, helping developers and operation and maintenance teams monitor and manage performance and availability in the microservice architecture. Choosing the right tools and technologies requires consideration of specific needs, technology stacks, and scalability requirements.

8. Architecture optimization

The architecture optimization of system performance optimization is a key part, which includes the following technologies and strategies:

  1. Load balancing: Load balancing is used to distribute requests to multiple servers to balance the load of the system and improve the throughput and availability of the system. Common load balancing technologies include hardware load balancers (such as F5) and software load balancers (such as Nginx, Gateway).
  2. Traffic funnel: The traffic funnel is used to limit the request rate of the system to prevent the impact of sudden traffic on the system. It smooths the arrival of requests and prevents system overload by setting request rate limits and queuing mechanisms. The traffic funnel technology is often used in scenarios such as API traffic limit and request limit.
  3. Cluster: A cluster is a logical unit that combines multiple servers to process requests together, improving the scalability and fault tolerance of the system. By adding server nodes through horizontal expansion, the processing capacity and concurrent performance of the system can be improved. Common cluster technologies include distributed cache (such as Redis cluster), distributed database (such as MySQL cluster), and distributed file system (such as Hadoop).
  4. Circuit breaking: The circuit-breaker mechanism automatically cuts off calls to a failing or abnormal service to protect the stability and availability of the system. Configure thresholds and a time window; when the error rate or response time exceeds the threshold, the breaker trips, stops requests to that service, and prevents the error from spreading.
  5. Degradation: Degradation is to temporarily block certain functions or services when system resources are limited or abnormal conditions occur, so as to ensure the availability of core functions. By setting priority, when the system load is too high or abnormal, some non-core functions can be actively closed or simplified to improve system stability and responsiveness.
  6. Rate limiting: The rate-limiting mechanism caps the number of concurrent requests the system accepts, preventing it from being overwhelmed. By limiting request rates and the number of concurrent connections, you control the system's load and keep it stable. Common rate-limiting techniques include the token bucket and leaky bucket algorithms.
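As an illustration of the token bucket algorithm mentioned in the list above, here is a minimal single-JVM sketch; it is not production code, and libraries such as Guava's RateLimiter or Resilience4j cover real use cases:

```java
// Token bucket: tokens refill at a fixed rate up to a burst capacity;
// a request is admitted only if a whole token is available.
public class TokenBucket {
    private final long capacity;
    private final double refillPerNano;
    private double tokens;
    private long lastRefill;

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1_000_000_000.0;
        this.tokens = capacity; // start with a full bucket (allows a burst)
        this.lastRefill = System.nanoTime();
    }

    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        // Add tokens for the elapsed time, capped at capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;  // request admitted
        }
        return false;     // request rejected (caller may queue or degrade)
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(2, 1.0); // burst of 2, then 1 req/s
        System.out.println(bucket.tryAcquire()); // true
        System.out.println(bucket.tryAcquire()); // true
        System.out.println(bucket.tryAcquire()); // false until tokens refill
    }
}
```

A leaky bucket differs in that it drains requests at a constant rate with no burst allowance; the token bucket above tolerates short bursts up to its capacity.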

These architecture optimization techniques and strategies can be selected and applied according to specific application requirements and system bottlenecks. Through reasonable architecture design and optimization, the performance, scalability and availability of the system can be improved to ensure that the system can withstand high load and high concurrent requests.

9. System optimization

System optimization is the optimization of the entire system, which can improve the performance of the system by adjusting the configuration and parameters of the system. Here are some common system optimization tips:

  1. Trade space for time: use memory, caches, and other fast storage to cut disk or network reads and writes and speed up data access.
  2. Trade time for space: when space is the bottleneck, use batching, compression, and partitioning to reduce the storage footprint and data-transfer overhead.
  3. Close unnecessary services: The more services running in the system, the heavier the system's burden, so unnecessary services should be closed to reduce the system's burden. For example, you can use Task Manager to shut down unnecessary processes and services.
  4. Adjust process priority: Process priority determines the order in which the operating system allocates resources, so system performance can be optimized by adjusting process priority. For example, important processes can be given high priority to ensure they get enough resources.
  5. Use performance monitoring tools: Performance monitoring tools can help users monitor system performance in real time and provide corresponding optimization suggestions. For example, you can use the built-in task manager of Windows or a third-party performance monitoring tool to monitor system performance indicators such as CPU, memory, and disk.
  6. Regular system maintenance: Regular system maintenance can clean up system junk files, fix system errors, update system patches, etc., thereby improving system stability and performance. For example, operations such as disk defragmentation, registry cleaning, and virus scanning can be performed regularly.


If this article helped you, a follow and a like are welcome, and I'll be sure to follow back!


Origin blog.csdn.net/citywu123/article/details/131822634