A 13-year testing veteran's summary: common performance testing problems, with solutions and analysis


1. Memory overflow

1) Heap memory overflow

Phenomenon:
After the stress test has run for a while, the system's processing capacity drops. Connect to the server with JConsole, JVisualVM, or a similar tool and check GC activity: each GC fails to reclaim everything, and the available heap memory keeps shrinking.

As the stress test continues, the log eventually reports: java.lang.OutOfMemoryError: Java heap space

Troubleshooting method:
Run jmap -histo pid > test.txt to save a histogram of heap usage to test.txt, then open the file and look at the top 50 classes. Are there any familiar class names, or class names that carry your company's package prefix? If so, that class is the prime suspect for the memory leak.

If not, run jmap -dump:live,format=b,file=test.dump pid to generate the heap dump file test.dump, then analyze it with MAT.

If you suspect a memory leak, you can also connect JProfiler to the server and start the stress test. After it has run for a while, click "Mark Current Values"; from then on the view shows the increment since the mark. Now trigger a GC and observe which class's instances are not fully reclaimed. You can be fairly sure the memory leak is caused by that class.

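Solution: Optimize the code. Once an object is no longer needed, release every reference to it (for example, remove it from long-lived collections or set the field that holds it to null) so the GC can reclaim it.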
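
As a minimal sketch of the kind of leak described above (not taken from any real project), the Java class below contrasts a static list that retains every payload with a size-capped map that evicts old entries. The class name LeakSketch, the 1 KB payload, and the cap of 100 entries are assumptions chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LeakSketch {
    // Leaky pattern: a static collection keeps every payload reachable forever.
    static final List<byte[]> LEAKY = new ArrayList<>();

    // Hypothetical cap, chosen only for this illustration.
    static final int MAX_ENTRIES = 100;

    // Bounded alternative: an access-ordered map that evicts its eldest entry.
    static final Map<Integer, byte[]> BOUNDED =
        new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                return size() > MAX_ENTRIES;
            }
        };

    // Simulates request handling that stores one payload per request.
    static void handleRequests(int count) {
        for (int i = 0; i < count; i++) {
            byte[] payload = new byte[1024];
            LEAKY.add(payload);      // never removed: retained heap grows with every request
            BOUNDED.put(i, payload); // eldest entries are dropped once the cap is exceeded
        }
    }

    public static void main(String[] args) {
        handleRequests(1_000);
        System.out.println("leaky=" + LEAKY.size() + " bounded=" + BOUNDED.size());
    }
}
```

In a jmap -histo listing, a pattern like LEAKY would show up as an ever-growing count of byte[] instances held by an application class, which is the signal the troubleshooting steps look for.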

2) Permanent generation/method area overflow

Phenomenon: After the stress test has run for a while, the log reports: java.lang.OutOfMemoryError: PermGen space

Cause: Too much static data, such as class definitions, method descriptors, field descriptors, constant pools, and access modifiers, has been loaded. The permanent generation fills up and overflows.

Solution: Modify the JVM parameters to raise -XX:MaxPermSize, and minimize static variables. Note that the permanent generation exists only up to Java 7. From Java 8 on it is replaced by Metaspace, where the error reads java.lang.OutOfMemoryError: Metaspace and the corresponding limit is -XX:MaxMetaspaceSize.
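
To see how full the method-area pools are before touching any JVM flag, the standard java.lang.management API can list them from inside the application. The sketch below is one possible way to do that; the class name NonHeapPools is an assumption, and the pool names it prints (for example a permanent generation pool on old JVMs or Metaspace on newer ones) depend on the JVM in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class NonHeapPools {
    // Returns one line per non-heap memory pool with its used and max bytes.
    static List<String> describeNonHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue; // heap pools (Eden, Survivor, Old) are not of interest here
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 means no maximum is defined
            lines.add(pool.getName() + ": used=" + usage.getUsed()
                + " max=" + (max < 0 ? "undefined" : String.valueOf(max)));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeNonHeapPools().forEach(System.out::println);
    }
}
```

If the used value of the class-metadata pool climbs steadily toward max during the test, raising the limit only delays the error; the growth itself is worth investigating.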

3) Stack memory overflow

Phenomenon: After the stress test has run for a while, the log reports: java.lang.StackOverflowError

Cause: The stack depth a thread requests exceeds the maximum depth the virtual machine allows, typically because a recursion never returns or because methods call each other in a cycle.

Solution: Modify the JVM parameters to raise -Xss, which increases the stack size of each thread. In stress tests, stack overflow is usually triggered by batch operations, so also reduce the amount of data in each batch.
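
The link between batch size and stack depth can be illustrated with a small Java sketch. It assumes a hypothetical batch routine written recursively, so that each element costs one stack frame, and compares it with an iterative version whose stack depth stays constant. The class and method names are made up for this example.

```java
final class StackDepthSketch {
    // Recursive version: one stack frame per element, so depth grows with n.
    static long sumRecursive(long n) {
        return n == 0 ? 0 : n + sumRecursive(n - 1);
    }

    // Iterative version: constant stack depth regardless of n.
    static long sumIterative(long n) {
        long total = 0;
        for (long i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Reports whether the recursive version exhausts the thread's stack for this n.
    static boolean overflows(long n) {
        try {
            sumRecursive(n);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("small batch, recursive: " + sumRecursive(100));
        System.out.println("large batch overflows: " + overflows(10_000_000L));
        System.out.println("large batch, iterative: " + sumIterative(10_000_000L));
    }
}
```

Raising -Xss only moves the depth at which the recursive version fails; shrinking the batch or rewriting the recursion as a loop removes the dependency on stack size altogether.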

4) System memory overflow

Phenomenon: After the stress test has run for a while, the log reports: java.lang.OutOfMemoryError: unable to create new native thread

Cause: The operating system does not have enough resources to create the thread. Creating a thread requires not only memory in the Java heap but also resources allocated by the operating system itself. Once the number of threads reaches a certain level, the heap may still have free space, yet the operating system can no longer allocate the resources, and this exception occurs.

Solution:
Reduce heap memory, which leaves more memory for the operating system to allocate to threads;
Reduce the number of threads;
If the number of threads cannot be reduced, reduce the stack size of each thread with -Xss so that more threads can be created;
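
One common way to reduce the number of threads is to run tasks on a fixed-size pool instead of starting a thread per task. The sketch below assumes this is applicable to the code under test; the class name BoundedThreads and the pool size in the usage example are illustrative choices, not recommendations.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class BoundedThreads {
    // Runs `tasks` trivial jobs on a pool of `poolSize` threads and
    // returns how many distinct threads actually executed them.
    static int distinctThreadsUsed(int tasks, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(tasks);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    threadNames.add(Thread.currentThread().getName());
                    done.countDown();
                });
            }
            done.await(); // wait until every task has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } finally {
            pool.shutdown();
        }
        return threadNames.size();
    }

    public static void main(String[] args) {
        // 1000 tasks never use more than 8 threads.
        System.out.println("threads used: " + distinctThreadsUsed(1_000, 8));
    }
}
```

With a pool, the number of native threads is capped by poolSize no matter how many requests arrive, so the operating system's thread limit is never approached by this component.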

2. CPU is too high

1) us CPU high

Phenomenon: During the stress test, use the top command to check system resource usage. The us CPU is too high, exceeding 50%.

Troubleshooting methods:
Use the top command to identify which process consumes the most CPU;
Find the thread in that process that consumes the most CPU: top -H -p pid;
Convert the thread ID to hexadecimal: printf "%x\n" tid;
Then use jstack to see what this thread is doing: jstack pid | grep hexadecimal-tid;
Alternatively, drill down layer by layer through JProfiler's CPU Views to find the cause of the high CPU usage;
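
The conversion and lookup steps above can be sketched in Java as follows. The helper relies on the fact that jstack prints each thread's OS thread ID as a hexadecimal nid=0x... field. The class name HotThreadLookup and the sample dump line in main are made up for illustration and do not come from a real system.

```java
import java.util.Optional;

final class HotThreadLookup {
    // Same conversion as printf "%x\n": decimal OS thread ID to lowercase hex.
    static String toNid(long osThreadId) {
        return "0x" + Long.toHexString(osThreadId);
    }

    // Returns the first thread-dump line whose nid field matches the thread ID.
    static Optional<String> findThreadHeader(String threadDump, long osThreadId) {
        // The trailing space avoids matching a longer nid with the same prefix.
        String needle = "nid=" + toNid(osThreadId) + " ";
        for (String line : threadDump.split("\n")) {
            if (line.contains(needle)) {
                return Optional.of(line);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Illustrative dump fragment, not real jstack output.
        String dump =
            "\"main\" #1 prio=5 os_prio=0 tid=0x01 nid=0x1a2a waiting on condition\n"
          + "\"worker-1\" #12 prio=5 os_prio=0 tid=0x02 nid=0x1a2b runnable\n";
        System.out.println(toNid(6699));
        System.out.println(findThreadHeader(dump, 6699).orElse("not found"));
    }
}
```

Once the matching header is found, the stack frames printed directly beneath it in the real dump show which code the hot thread is executing.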

2) sy CPU high

Phenomenon: During the stress test, use the top command to check system resource usage. The sy CPU is too high, exceeding 50%.

Troubleshooting methods:
First check how busy the disk is and how long its queue is (iostat, nmon);
If the disk is fine, use strace to inspect the system calls being made;

3. TPS will not go up

1) Network bandwidth

Stress testing sometimes has to simulate a large number of user requests. If the amount of data transmitted per unit time exceeds what the bandwidth can carry, requests compete for network resources, and the number of requests that actually reach the server falls below the upper limit of what the server could process.
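
A quick back-of-the-envelope check shows whether bandwidth alone could be the ceiling. The sketch below simply divides the link capacity in bytes per second by the bytes transferred per transaction. It ignores protocol overhead and traffic from other sources, and the figures in the usage example are hypothetical.

```java
final class BandwidthCeiling {
    // Upper bound on transactions per second imposed by link bandwidth alone.
    static long maxTps(long bandwidthBitsPerSecond, long bytesPerTransaction) {
        if (bytesPerTransaction <= 0) {
            throw new IllegalArgumentException("bytesPerTransaction must be positive");
        }
        long bytesPerSecond = bandwidthBitsPerSecond / 8; // 8 bits per byte
        return bytesPerSecond / bytesPerTransaction;
    }

    public static void main(String[] args) {
        // Hypothetical case: 100 Mbit/s link, 50,000 bytes per transaction.
        System.out.println(maxTps(100_000_000L, 50_000L)); // prints 250
    }
}
```

If the measured TPS plateaus near this figure while server CPU and memory are still idle, the network is the likely bottleneck.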

2) Connection pool

If the maximum number of connections is set too low, requests have to wait. Connection pools generally fall into two kinds: server middleware connection pools (such as Tomcat's) and database connection pools (which can also be understood as the maximum number of connections the database allows).
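
A rough way to judge whether a pool is too small is Little's Law: the average number of connections in use equals throughput multiplied by the time each request holds a connection. The sketch below applies it as a lower-bound estimate. This is an assumption-based rule of thumb rather than a tuning formula from any particular pool implementation, and the numbers are hypothetical.

```java
final class PoolSizing {
    // Minimum connections needed to sustain targetTps when each request
    // holds a connection for holdMillis: ceil(targetTps * holdMillis / 1000).
    static long minConnections(long targetTps, long holdMillis) {
        if (targetTps < 0 || holdMillis < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        return (targetTps * holdMillis + 999) / 1000; // integer ceiling division
    }

    public static void main(String[] args) {
        // Hypothetical case: 500 TPS target, 40 ms per request on a connection.
        System.out.println(minConnections(500, 40)); // prints 20
    }
}
```

A pool configured below this estimate forces requests to queue for a connection even when the database itself has spare capacity.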

3) Garbage collection mechanism

For common application servers such as Tomcat, if the heap is configured relatively small, the Eden area of the new generation triggers Young GC frequently, and Full GC of the old generation also runs more often. Both affect TPS, because garbage collection usually pauses all application threads.
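
To get a rough figure for how much time the JVM spends collecting, the standard GarbageCollectorMXBean API exposes accumulated collection counts and times. The sketch below sums them and relates the total to JVM uptime. The class name GcOverhead is an assumption, and with concurrent collectors the reported time is not purely pause time, so treat the result as an approximation.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {
    // Sums the accumulated collection time (ms) reported by all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Approximate share of JVM uptime spent in garbage collection.
    static double gcTimeFraction() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                + " timeMs=" + gc.getCollectionTime());
        }
        System.out.println("gc fraction of uptime: " + gcTimeFraction());
    }
}
```

Sampling these values before and after a test run gives the GC time attributable to the run; a fraction that rises sharply as load increases suggests the heap is undersized for the workload.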

4) Database

Under high concurrency, if each request has to write to the database, possibly to several tables, TPS suffers when the database's maximum number of connections is too low, when the SQL that writes the data lacks indexes or bind variables, or when there is no master-slave separation (read/write splitting). Any of these makes database transaction processing too slow.

5) Hardware resources

Including CPU (configuration, usage, etc.), memory (occupancy, etc.), disk (I/O, page exchange, etc.).

6) Load generator

Tools such as JMeter and LoadRunner have a limited load capacity on a single machine. If the number of user requests to simulate exceeds that limit, TPS is indirectly capped as well (in that case, distributed stress testing is needed to get past the single-machine limit).

7) Business logic

When the business logic is complex and poorly decoupled, the processing path of each transaction becomes long, and TPS cannot increase.

8) System architecture

For example, whether a cache service is used, how the cache servers are configured, the cache hit rate, cache penetration, and cache expiration all affect the test results.

4. Performance problem analysis process

Check the CPU, memory, load, and so on of the servers, including the application server and the database server;
Check the health of the database: deadlocks, connections that are never released back to the pool;
Check the project logs for errors;
Check the JVM's GC activity and related metrics;

Origin blog.csdn.net/shuang_waiwai/article/details/134854146