Performance Monitoring: A True Story of Java Application Performance

Original link: http://www.cnblogs.com/kapok/archive/2005/10/15/255401.html
http://dev2dev.bea.com.cn/techdoc/20030778.html

For any organization, ensuring that enterprise applications meet high standards of performance imposes two basic requirements: the ability to monitor the application as load approaches a critical state, and the ability to quickly identify the root cause when a failure occurs, whether the application is in QA, staging, or production.

This story is true; the names used here are not the real ones.

This article is a real-life screenplay about Java performance problems. It begins with an overview of the Wily Enterprise Java Application Management solution and the role it plays in performance analysis before and after deployment. I then walk through several examples of performance degradation, and you will see why neglecting Java application management is a game that is not worth the candle.


The Wily 4 solution
As a healthy organization matures, it spontaneously develops a set of policies and procedures to eliminate the "chaos" it has experienced, the chaos that arises in the fight against performance problems in mission-critical Java applications. These organizations also know that poor application performance carries enormous cost: low returns on IT investment, and risk to cash flow and end-user satisfaction.

Wily Technology's Wily 4 solution lets enterprises monitor, improve, and manage enterprise Java applications at any stage of the application life cycle.

It provides a common language that any department within an IT organization can use to quickly identify and fix a Java application performance problem when one occurs.

With Wily 4, teams can monitor business application functionality 24/7, without interruption, to find performance bottlenecks in funds transfers, fee payments, merchandise purchases, and other core use cases. Without modifying the source code, every use case can be mapped to the underlying servlets, JSPs, EJBs, and custom code that implement it. Customizable templates and a variety of integration options make Wily 4 a natural complement to mainstream systems-management solutions. Wily 4 lets organizations feed Introscope alerts into their existing operational troubleshooting processes; the alerts carry information about the availability of customer-facing application functions (see Figure 1).
Figure 1 (image001.gif)
The key components of the Wily 4 solution are Introscope, the Java application monitor; the Introscope memory-leak monitor, which identifies potential leaks; and Introscope Transaction Tracker, which resolves transaction-level performance issues. The Wily 4 solution also includes Introscope SQL Agent and Introscope PowerPacks, which identify performance problems between Java applications and the back-end systems they connect to.

Combined, these solutions provide a complete "whole application" view of the Java environment.

Monitoring performance: setting and achieving performance goals
As a best practice, some organizations use Wily Introscope to establish baselines that record the response times of key components and use cases, with the aim of meeting performance goals. Successful enterprise Java development and deployment should include continuous baseline measurement of the core use cases, and it should begin during development. As developers add to and modify the application, these baselines provide hard statistical evidence for evaluating each subsequent version. If you do not monitor each step of the development process carefully, you will probably run into many bugs and, worse, architectural bottlenecks. Fixing those bottlenecks in staging or production costs far more.

Like a good detective, a good baseline or benchmark keeps asking the same question until it gets a satisfactory answer. This process encourages a "look before you leap" approach, which has repeatedly proven to be the most effective way to improve software performance.

To load-test an application's key use cases successfully, the application must be functionally complete; a well-behaved load-generator script that exercises the application's core functions also does double duty as an excellent consistency check.

Once the application is running under load, the organization needs to monitor its performance continuously at the functional level to identify bottlenecks, especially during the first few months in production.

Wily Introscope can extract detailed metrics from a Java application and report them as a view that describes the application's real performance. In the ubiquitous pet store example, the metrics cover purchasing products, returning receipts, browsing the catalog, searching inventory, and so on. With this approach, the motivation for tuning a component is no longer simply that it is slow, but that customers find the time needed to return a receipt has grown to an unacceptable level. Without such a view, you may waste valuable development time on performance tuning that barely matters to the overall success of the application.

Wily Introscope can report the number of invocations and the average, minimum, and maximum response times of all key components in a customer's application. Introscope also monitors concurrency (the number of threads in each method or logical business component), memory statistics, file and socket I/O statistics, pooled-resource utilization statistics (for example, MQ connections), environment statistics, and more.

In a typical J2EE application, the servlets acting as controllers, the EJBs acting as the model, the back-end connector APIs (JDBC, MQSeries, and the like), and the JSPs acting as views are all monitored. Additional instrumentation is usually just an easy Introscope customization step away, so specific business logic or back-end connectors can be monitored without changing the source code. For example, Introscope has monitored callbacks to legacy mainframes at major airlines and telecommunications companies, SABRE transactions, and countless similar pieces of custom business logic and back-end systems.

Data from the application can be recorded to CSV flat files or to mainstream databases, or integrated directly into load-generation tools.

Each customer can choose the recording and analysis approach that best suits their needs. Data written to files can be rolled automatically according to configured file limits, which makes scripted reporting and cleanup easier. Sample scripts for loading the data into several commonly used databases are also available.

The rest of this article walks through several real cases in which Introscope was used.

"It is important Sleep"
Philadelphia, Monday, cloudy and cold. I was on the day shift. The customer was an insurance company whose application was being deployed to staging.

The application we were monitoring was called AUDIT, a Web front end to a legacy green-screen application.
The customer's insurance agents use it to create new insurance policies. AUDIT relies mainly on CICS for its back-end data, and the new process reuses the same stored procedures the legacy system has run for 30 years, then parses the results into JSPs. The pages were designed to load in 1-2 seconds, but each request was taking 7-8 seconds. Data from CICS showed that all requests were being handled quickly. So where was the problem? We needed to find out fast.

Wily Introscope, running a custom tracer on the application's data objects, showed that each transaction made 50-60 calls to the data-object constructor, and each call took almost exactly 100 milliseconds. A look at the data-object constructor revealed a developer's placeholder logic.

Each object required a unique ID, so the constructor had been written with a 100 ms sleep whose only purpose was to guarantee a unique timestamp when it called the system clock.
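A minimal sketch of the kind of placeholder constructor described above (the class and field names are hypothetical; the article does not show the actual code):

```java
// Hypothetical reconstruction of the placeholder logic described above.
public class AuditRecord {
    private final long id;

    public AuditRecord() {
        try {
            // Sleep 100 ms so that the next System.currentTimeMillis()
            // is guaranteed to differ from the previous object's ID.
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        this.id = System.currentTimeMillis(); // "unique" ID taken from the clock
    }

    public long getId() {
        return id;
    }
}
```

With 50-60 such constructions per transaction, the sleeps alone account for 5-6 seconds of the 7-8 second response times that were observed.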

Outcome: the developers no longer put their constructors to sleep.

"Defect Bean adventure"
Texas, Tuesday afternoon. I was with a customer who had been struggling for months to solve a performance problem in the application under test. When I consulted Dr. Watson, he answered with a core dump.

In production, the application in this case would probably have lost 30% of its purchasing users because response times were so slow. Incredibly, many requests ran for more than two minutes and then timed out. Introscope soon traced the performance bottleneck to a single method used to calculate the sales tax for each purchase-order line item. One line of Introscope configuration was enough to trace all of the business logic implemented behind that interface.

On closer inspection, we found that this method used Runtime.exec to launch a new JVM. The child process calculated the sales tax for a single line item and wrote the result to standard output. The application server waited patiently for that output, then repeated the whole exercise for the next line item. Under normal load, as many as 50-60 JVMs were being started and shut down simultaneously, and each caller had to wait roughly 20-40 seconds.

Now we knew why transactions with multiple purchase entries were being lost. But why was it written this way? A developer had been handed a third-party tax-calculation package with an EJB interface to implement the sales-tax calculation, but he could not get the package loaded into the server's context. With no other option, he fell back on the package's sample code and rewrote one of its examples as a script that takes the appropriate parameters and prints the result on standard output as a command-line program.
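A minimal sketch of the pattern described above (the command line, class names, and tax library are hypothetical): each call forks an entire child JVM and blocks until it prints its answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.math.BigDecimal;

// Hypothetical reconstruction of the per-line-item tax calculation:
// each call launches a whole new JVM and waits for its output.
public class SalesTaxCalculator {

    public BigDecimal calculateTax(String lineItemId, BigDecimal amount)
            throws IOException, InterruptedException {
        Process p = Runtime.getRuntime().exec(new String[] {
                "java", "-cp", "taxlib.jar", "com.example.TaxSample",
                lineItemId, amount.toPlainString()
        });
        try (BufferedReader out = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String result = out.readLine(); // wait for the child JVM's answer
            p.waitFor();                    // ... and for it to exit
            return new BigDecimal(result.trim());
        }
    }
}
```

Under load, 50-60 of these child JVMs starting and stopping at once is exactly the behavior Introscope surfaced.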

Outcome: the findings cleared the Perl scripting language of all suspicion. The suspect was fired. Case closed.

"Then, nothing (JDBC cursors)"
It was Wednesday, and I was not far from Boston's historic Common. During a long load test, my client kept exhausting all of its JDBC cursors and MQSeries connections. Wily Introscope monitored the statements that allocated and released these resources. With Introscope's Blame feature, the team quickly identified the offending EJB: under high load the application threw an exception and, as a result, skipped the important close() calls that should have been placed in a finally block to guarantee the resources were released.
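A minimal sketch of the corrected pattern, assuming a plain JDBC lookup (the DAO and the query are illustrative): the close() calls sit in a finally block so they run even when the query throws under load.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class PolicyDao {
    private final DataSource dataSource;

    public PolicyDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public int countPolicies() throws SQLException {
        Connection con = null;
        PreparedStatement stmt = null;
        ResultSet rs = null;
        try {
            con = dataSource.getConnection();
            stmt = con.prepareStatement("SELECT COUNT(*) FROM POLICY");
            rs = stmt.executeQuery();
            rs.next();
            return rs.getInt(1);
        } finally {
            // Always release cursors and connections, even when an exception is thrown.
            if (rs != null) try { rs.close(); } catch (SQLException ignored) {}
            if (stmt != null) try { stmt.close(); } catch (SQLException ignored) {}
            if (con != null) try { con.close(); } catch (SQLException ignored) {}
        }
    }
}
```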

Outcome: the missing close() calls have now been safely placed inside finally blocks.


"On the issue of synchronization for help,"
Thursday; my dog was barking. I finished my fifth cup of coffee and headed out.

Wily Introscope measured the average response time of transactions to a dedicated back-end system at 10 seconds.

But in a load test with 10 concurrent users, the average response time of transactions to that back-end system was 90-100 seconds. Interesting: clearly the back-end system was not processing requests concurrently. Using the concurrent-invocation metric on the call, we quickly pinpointed the defect: nine concurrent user threads were waiting on the call to the back-end system while only one thread was actually doing the work.

Wily's concurrency metric catches problems like these: a synchronized keyword mistakenly placed at the class level rather than on an instance method, a resource pool that should hold 10 instances but has been whittled down to one, and a remote server that stops responding. In this case, the application could not handle an unexpected failure of the back-end system, so a call that should have returned in a few seconds ended up burning minutes or even hours before timing out. Concurrency metrics are a quick and easy way to find such defects in an application, and they let us handle similar failures much more gracefully in production.
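A minimal sketch of the first defect named above, using a hypothetical back-end client: a static synchronized method locks on the Class object, so every caller in the JVM queues behind a single lock, which matches the nine-waiting-one-working picture.

```java
// Hypothetical illustration: one shared lock serializes all callers.
public class BackendClient {

    // BUG: static + synchronized means the lock is the BackendClient.class
    // object, so only one thread in the whole JVM can be inside this
    // method at a time, no matter how many client instances exist.
    public static synchronized String callBackend(String request) {
        return doRemoteCall(request); // each call takes ~10 seconds
    }

    // With one instance per pooled connection, an instance-level lock
    // lets 10 threads proceed in parallel against 10 pooled instances.
    public synchronized String callBackendPooled(String request) {
        return doRemoteCall(request);
    }

    private static String doRemoteCall(String request) {
        // placeholder for the real remote invocation
        return "response for " + request;
    }
}
```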

Outcome: the application and its tooling carried the owner through production's low ebb.

"Maltese result sets"
Friday, and I was in Atlanta. There are so many tree-lined streets here that you might as well just say they are all peach trees.
The customer had run into trouble with large database result sets.

Because no strategy was in place to ensure that queries were bounded and paged in reasonable increments, my client was hit by a "pig in the python" memory failure inside the application server. An errant search or lookup sent results running rampant across the JVM's garbage-collected heap. The application server dutifully read in the result set exactly as the database handed it over, and the most serious side effect was the collateral damage to the response times of every other transaction in the application.
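A minimal sketch of the kind of bounding strategy that was missing, with hypothetical table and column names: cap what the driver returns and hold at most one page of rows in memory at a time.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class CustomerSearch {

    private static final int PAGE_SIZE = 100; // hard cap per request

    // Returns one page of matching customer names instead of the whole result set.
    public List<String> findByCity(Connection con, String city, int pageIndex)
            throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement stmt = con.prepareStatement(
                "SELECT NAME FROM CUSTOMER WHERE CITY = ? ORDER BY NAME")) {
            stmt.setString(1, city);
            stmt.setMaxRows((pageIndex + 1) * PAGE_SIZE); // bound what the driver returns
            stmt.setFetchSize(PAGE_SIZE);                 // stream rows in chunks
            try (ResultSet rs = stmt.executeQuery()) {
                int row = 0;
                while (rs.next()) {
                    if (row++ >= pageIndex * PAGE_SIZE) {
                        names.add(rs.getString("NAME"));
                    }
                    if (names.size() == PAGE_SIZE) {
                        break; // never hold more than one page in memory
                    }
                }
            }
        }
        return names;
    }
}
```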

Wily's Introscope SQL Agent diagnosed the fault immediately by pointing out that the database's query response time and the "round-trip" response time did not agree. The latter metric includes all of the wall-clock time spent before the result set is closed, while the former captures only the time the database takes to respond. A large gap between the round-trip response time and the corresponding query metric is the telltale sign, and of course this measurement can only be captured from inside the application server.

Outcome: the development team made the necessary changes. The application has been fully rehabilitated and is now a productive member of society.

"Murder JDBC Express"
I volunteered to work Saturday in San Francisco. After my last customer visit, in Vegas, I needed the overtime.

I had no idea how far I was from the end of this JDBC disaster. The customer had apparently already learned how to tune the key database transactions: their JDBC statements measured about 10 milliseconds on average, most of them within 1 millisecond of that.

So where were the runaway transactions coming from? I ran Introscope Transaction Tracker against the application and recorded a 30-second transaction occurring under standard load.

In the past, no off-the-shelf profiling tool had been able to help the customer find the "guilty" method. Now the Transaction Tracker call graph, with Introscope automatically stripping out extraneous data, let the customer see every tree in the forest: a single transaction was issuing nearly 4,000 JDBC statements. Even with a responsive database, that many queries is more than enough to cause trouble.

Transaction Tracker's analysis showed that a single checking method issued six JDBC statements per call, and calls to that method accounted for almost 40% of the JDBC statements in each transaction. The development team made a minor modification, less than a day of work, to implement these checks with a caching strategy. CPU utilization and processing time both came down; throughput went up.
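A minimal sketch of the kind of caching fix described, with hypothetical names: the result of the repeated check is cached in memory, so the six JDBC statements run only on a cache miss.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of the caching strategy: look the answer up in
// memory first and fall back to the expensive JDBC check only for new keys.
public class EligibilityChecker {

    private final Map<String, Boolean> cache = new ConcurrentHashMap<>();

    public boolean isEligible(String itemCode) {
        return cache.computeIfAbsent(itemCode, this::checkViaJdbc);
    }

    // The original expensive path: roughly six JDBC statements per call.
    private boolean checkViaJdbc(String itemCode) {
        // ... run the six queries against the database ...
        return true; // placeholder result
    }
}
```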

Outcome: the database made a full recovery, but the development team has since been reassigned to Swing development.

"I know too much of Vector"
Saturday night, at the end of a long week. The city lights cut through the fog. I was pouring a glass of rye whiskey when the phone rang.

The customer runs a J2EE-based trading system driven by direct calls to EJB remote interfaces. A few months earlier they had swapped in a new JMS implementation, and now they were fighting CPU overload and falling throughput. "GC Heap Bytes in Use" climbed with every transaction and eventually ended in an out-of-memory error.

The development team had tried all the usual profiling tools, but none of them could find the source of the leak (see Figure 3).

Figure 2 (image002.gif)
Figure 3 (image003.gif)
Wily Introscope's leak monitor quickly found abnormal growth in a single Vector, one of roughly 45,000 data structures used by the application. The problematic Vector held transaction identifiers that were buffered temporarily to allow fast rollback. As the application's transaction count rose, so did the Vector. The new JMS implementation caused an unexpected result when "contains" was called on the Vector: with this defect, each transaction's "contains" call ended up adding an entry, so every new transaction made the Vector grow a little more.
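A minimal sketch of the leak pattern described, with hypothetical names, assuming the new JMS provider's transaction identifiers no longer compare equal to the copies already buffered: contains() then always returns false, every transaction adds an entry, and remove() never finds anything to take out.

```java
import java.util.Vector;

// Hypothetical illustration of the unbounded rollback buffer.
public class RollbackBuffer {

    private final Vector<Object> pendingTxIds = new Vector<>();

    public void remember(Object txId) {
        // If txId's equals()/hashCode() changed with the new JMS provider,
        // contains() never matches, so every transaction adds a new entry
        // and the Vector grows for the lifetime of the server.
        if (!pendingTxIds.contains(txId)) {
            pendingTxIds.add(txId);
        }
    }

    public void forget(Object txId) {
        // Never removes anything if txId no longer equals the stored copy.
        pendingTxIds.remove(txId);
    }
}
```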

Outcome: the hateful Vector has been renamed Victor and is quietly living out a witness-protection program in Florida.
At the end of a long goodbye, I walked off into the night, not knowing where the next case would take me.

About the Author
Dave Martin is a systems engineer at Wily Technology. He has worked with many enterprise customers, mostly identifying and fixing performance problems in Java applications. A former software developer, Dave has extensive knowledge of the architecture, design, and implementation of J2EE software systems.

