Published August 2002
Sitraka PerformaSure is a transaction-centric diagnosis tool that helps companies to measure, analyze and maximize performance of distributed J2EE applications. PerformaSure's exclusive Tag and Follow technology traces and reconstructs the execution path of end-user transactions across the web servers, clustered application servers, and JDBC database calls of an entire J2EE system. PerformaSure captures method-level timing, application server, OS, and network traffic metrics every step of the way.
This article provides an introduction to PerformaSure, showing how it can be used to diagnose performance bottlenecks through an entire J2EE system and to shed light on common J2EE deployment issues.
PerformaSure is made up of three major components: Agents, Nexus, and Workstation. These components collect data about your J2EE application, store that data, and visualize it. Because this design is distributed, like your application, PerformaSure scales well in any enterprise setting.
Lightweight, low-overhead Agents are placed on your web servers, application servers, and other systems that comprise your distributed application. Agents monitor the activities of a software system like BEA WebLogic or IBM WebSphere, or of hardware components like the CPU, hard drives, and main memory.
Agents report to the PerformaSure Nexus. The Nexus is responsible for collecting performance data from each Agent and storing it in a Session file. The Session is a record of how your application performed during a period of time. Data collected from the Agents is stored in time slices. Time slices allow PerformaSure to analyze how the performance characteristics of your application change over time.
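One way to picture time slices is as fixed-width buckets keyed by the time at which a sample was taken. The sketch below is illustrative only: the class name, the 10-second slice width, and the sample values are assumptions, not PerformaSure's actual Session format.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: groups response-time samples into fixed-width time slices.
class TimeSliceSketch {
    // Assumed slice width; the slice length PerformaSure really uses is not stated here.
    static final long SLICE_MILLIS = 10_000;

    // slice index -> {sample count, total response time in ms}
    private final Map<Long, long[]> slices = new TreeMap<>();

    static long sliceIndex(long timestampMillis) {
        return timestampMillis / SLICE_MILLIS;
    }

    void record(long timestampMillis, long responseMillis) {
        long[] bucket = slices.computeIfAbsent(sliceIndex(timestampMillis), k -> new long[2]);
        bucket[0]++;
        bucket[1] += responseMillis;
    }

    // Average response time for one slice, or -1 if the slice holds no samples.
    double averageMillis(long slice) {
        long[] bucket = slices.get(slice);
        return bucket == null ? -1 : (double) bucket[1] / bucket[0];
    }

    public static void main(String[] args) {
        TimeSliceSketch session = new TimeSliceSketch();
        session.record(1_000, 200);
        session.record(9_000, 400);
        session.record(12_000, 900);
        System.out.println("slice 0 avg ms: " + session.averageMillis(0));
        System.out.println("slice 1 avg ms: " + session.averageMillis(1));
    }
}
```

Keeping a separate aggregate per slice is what makes it possible to compare one period of a run with another instead of seeing only a single average for the whole session.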
The PerformaSure Workstation provides a rich interface for viewing the performance data stored in the Session. There are five viewers, each enabling you to browse and examine a different aspect of the application's performance. The Threshold View helps you rapidly identify time periods where performance appears slow to the end-user. The Transaction Time View identifies slow-running transactions and the servers responsible for them. The Transaction Tree View shows the execution path of transactions, instantly highlighting performance hotspots. The Metrics View provides a correlated view of application server and operating system metrics across clustered servers. Finally, the Network Traffic View shows IP-to-IP network traffic across the entire system.
At the heart of PerformaSure is its unique Tag and Follow technology. This technology allows Agents to track an end-user transaction request through different parts of your distributed application. When an end-user transaction enters the system, PerformaSure tags it so that Agents can follow and record its every action. The result is a seamless view of the behavior and performance of each transaction from web server to application server to database.
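The general idea of tagging a request and following it can be sketched in a few lines. This is a conceptual illustration only, not PerformaSure's implementation, which this article does not describe. Every name in it (TagAndFollowSketch, Observation, the tier methods) is hypothetical, and the three tiers are simulated as plain method calls in a single JVM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical illustration of tagging a request and following it across tiers.
class TagAndFollowSketch {
    // One observation reported by an "agent": which tag it saw, and where.
    static final class Observation {
        final String tag;
        final String component;
        Observation(String tag, String component) {
            this.tag = tag;
            this.component = component;
        }
    }

    // Stand-in for the central collector that agents report to.
    static final List<Observation> collected = new ArrayList<>();

    static String webServer(String url) {
        String tag = UUID.randomUUID().toString(); // tag assigned at the entry point
        collected.add(new Observation(tag, "web:" + url));
        appServer(tag);                             // the tag travels with the request
        return tag;
    }

    static void appServer(String tag) {
        collected.add(new Observation(tag, "app:servlet"));
        database(tag);
    }

    static void database(String tag) {
        collected.add(new Observation(tag, "db:jdbc"));
    }

    // Rebuild the execution path of one transaction from everything collected.
    static List<String> pathOf(String tag) {
        List<String> path = new ArrayList<>();
        for (Observation o : collected) {
            if (o.tag.equals(tag)) path.add(o.component);
        }
        return path;
    }

    public static void main(String[] args) {
        String first = webServer("/product");
        webServer("/cart");
        System.out.println(pathOf(first));
    }
}
```

Because each observation carries the tag, the collector can separate interleaved transactions and reassemble each one's path independently.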
To generate a load in a development or QA setting, tools like Mercury Interactive's LoadRunner are used with PerformaSure. LoadRunner is a robust tool for generating end-user transaction requests on your system. PerformaSure complements it by showing you how those transactions are handled. Whatever load generation tool you use, the key to good J2EE performance tuning is being able to reproduce the same load on your application over and over again. This allows you to accurately gauge how much performance has improved as you tune the application, server and database configuration.
Installing PerformaSure is quite straightforward. The first step is to set up the Nexus. As the central data aggregation and storage component, it's generally the best place to start. The Nexus is pre-configured when installed, but you can adjust some of its runtime characteristics: the communications protocol, thresholds, and filtering patterns. Filtering patterns allow you to configure PerformaSure to ignore certain transaction requests (such as all requests for GIF and JPEG files).
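The effect of a filtering pattern can be illustrated with an ordinary regular expression. PerformaSure's own pattern syntax is not shown in this article, so the pattern and the shouldRecord helper below are assumptions made purely for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of a request filter that skips static image requests.
class RequestFilterSketch {
    // Assumed pattern: ignore URLs ending in .gif, .jpg, or .jpeg (case-insensitive),
    // optionally followed by a query string.
    static final Pattern IGNORED =
        Pattern.compile(".*\\.(gif|jpe?g)(\\?.*)?$", Pattern.CASE_INSENSITIVE);

    // Returns true when the request should be recorded as a transaction.
    static boolean shouldRecord(String url) {
        return !IGNORED.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(shouldRecord("/images/dog.GIF"));
        System.out.println(shouldRecord("/product?id=42"));
    }
}
```

Filtering out image requests keeps the recorded session focused on the dynamic requests that actually exercise the application server and database.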
The next step is to place an Agent on each subsystem of your distributed application. Agents specialize in the type of data they collect. For example, a WebLogic or WebSphere Agent must be installed to collect information about those application servers and to record data about the transactions that they handle. PerformaSure also provides specialized Agents for web servers and operating systems. Agents only need the location of the Nexus.
The PerformaSure Workstation requires almost no configuration. After running the installer, simply launch the Workstation and enter the location of the Nexus.
With installation and setup complete, it's time to put PerformaSure to work.
To keep this example short, we'll use a prerecorded session. The application we'll be analyzing is Sun's Pet Store application. We have already set up the Pet Store application, installed PerformaSure, and applied a load to the system. Pet Store is deployed on two load-balanced web servers, and two clustered application servers.
In this scenario, we will assume the role of a performance analyst. Our goal is to locate performance problems and forward that information to the appropriate development teams. Problems that can be solved by changing the application server configuration are preferred because they don't involve code changes.
We will focus the discussion on just one of PerformaSure's five viewers. While PerformaSure's other viewers would also help isolate the performance problem, discussing them in detail is beyond the scope of this article.
QA has reported that the product details transaction performs particularly poorly. The Transaction Tree View is well suited to this type of problem, so it's a good starting point.
The Transaction Tree View is shown in Figure 2. Just below the menu and toolbar is the Time Control. Remember how PerformaSure uses time slices to store performance data? The Time Control allows you to select the time slices you want to use. For example, you could select only the time slices from 1:00pm to 1:15pm.
In the center of the view is the graph area. This view really shows the power of Tag and Follow technology. Transactions are graphed with the entry point on the left, typically a URL node, and the endpoints on the right, typically nodes representing method calls or JDBC access. Transactions are followed across machine boundaries, between web and application servers, and through multiple JVM instances. Tag and Follow records performance data on requested URLs, Servlets, EJBs, method calls, RMI, JNDI and the JDBC layer.
At the bottom of the view is the transaction data in table form. When doing very detailed analysis, this is often a better way to look at the data.
Two significant problems are immediately visible in the product details transaction. A portion of the transaction tree with the locations of the performance problems is shown in Figure 3. The tree is colored using the time spent in each node itself, not counting its child nodes, which is called the exclusive time of the node. Red represents a hot spot, and blue represents a node that executes quickly. You can see the two problem nodes colored red.
Performance problem #1: Figure 4 shows a close-up of the first two nodes in the transaction, and the tooltips that pop up as you mouse over each node. The Workstation uses tooltips to disclose performance data as you point at nodes in the tree. It takes 83.784 seconds (A) of cumulative time for the web server to handle 103 (B) transaction requests. Cumulative time includes the time to execute a node and all of its child nodes. The average transaction executes in about 0.813s.
The web server then forwards the transaction to the application server. It takes 60.256s (C) of exclusive time for the application server to receive the 103 transaction requests, forward them to the Servlet, and send responses. Exclusive time is the amount of time spent in a single node. In this case, it accounts for about 72% of the total time to service a product details request. Something is wrong.
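The relationship between the two measures, and the arithmetic behind the figures above, can be sketched as a small tree. The node class is hypothetical. The article gives only the web server's cumulative time (83.784s) and the application server's exclusive time (60.256s), so the remaining 23.528s is lumped into a single placeholder node and the web server's own exclusive time is set to zero; both are assumptions made so the totals add up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical transaction tree node: cumulative time = exclusive time + children's cumulative time.
class TransactionNodeSketch {
    final String name;
    final double exclusiveSeconds;
    final List<TransactionNodeSketch> children = new ArrayList<>();

    TransactionNodeSketch(String name, double exclusiveSeconds) {
        this.name = name;
        this.exclusiveSeconds = exclusiveSeconds;
    }

    TransactionNodeSketch add(TransactionNodeSketch child) {
        children.add(child);
        return this;
    }

    double cumulativeSeconds() {
        double total = exclusiveSeconds;
        for (TransactionNodeSketch child : children) {
            total += child.cumulativeSeconds();
        }
        return total;
    }

    // Builds the two nodes discussed in the text plus one placeholder for everything else.
    static TransactionNodeSketch productDetails() {
        TransactionNodeSketch rest = new TransactionNodeSketch("rest (placeholder)", 23.528);
        TransactionNodeSketch app = new TransactionNodeSketch("application server", 60.256).add(rest);
        return new TransactionNodeSketch("web server (exclusive time assumed 0)", 0.0).add(app);
    }

    public static void main(String[] args) {
        TransactionNodeSketch web = productDetails();
        TransactionNodeSketch app = web.children.get(0);
        int requests = 103;
        double cumulative = web.cumulativeSeconds();
        System.out.printf("cumulative: %.3fs%n", cumulative);
        System.out.printf("average per request: %.3fs%n", cumulative / requests);
        System.out.printf("application server share: %.0f%%%n",
                100 * app.exclusiveSeconds / cumulative);
    }
}
```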
We've identified the first performance bottleneck, and it's a big one. Because the bottleneck occurs before the transaction reaches the Servlet, it's likely that this is a configuration issue. Perhaps too few execute threads have been set up on the application server. Whatever the case, with 72% of the time being wasted here, it's a good place to concentrate the performance tuning effort.
Performance problem #2: The ShoppingClientControllerEJB is a stateful session bean. Figure 5 shows the portion of the tree that gets the bean, and the tooltip for the create() method on the home class. Pet Store takes 6.013s (D) to create a ShoppingClientControllerEJB. That's 7% of the total time to service the request. What's going on here?
A closer inspection of the EJB shows that the ejbCreate() and ejbPassivate() methods are being called a lot. There are 103 calls (E) to ejbCreate(). That matches the number of transactions. However, it's interesting that there are 80 calls (F) to ejbPassivate(). That seems excessive.
Because little time is spent inside the ejbCreate() and ejbPassivate() methods themselves, yet passivation is excessive, it's likely that the EJB isn't properly configured. The problem could be solved by reducing the idle-timeout-seconds limit, or by increasing the limit for max-beans-in-cache. Another choice may be to modify the application code to use the HTTP session to store web information rather than the ShoppingClientController stateful session bean.
As an aside, we can confirm this diagnosis by examining other views, such as the Metrics View. We can examine application server passivation and cached beans metrics and see that the entire cache is filled quickly, causing passivation to increase. An operating system metric would show increased disk activity.
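To see why a full cache drives passivation, consider a simplified simulation of a least-recently-used bean cache. This is not how any particular application server implements its cache, and the capacity of 23 is an assumption chosen only because it reproduces the 80 passivations observed for 103 created beans; the actual Pet Store cache setting is not given in this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical simulation: an LRU cache that "passivates" the eldest bean when it overflows.
class PassivationSketch {
    // Counts evictions when `beans` distinct stateful beans are created and none is removed.
    static int passivations(int beans, int maxBeansInCache) {
        final int[] evicted = {0};
        Map<Integer, String> cache = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                boolean overCapacity = size() > maxBeansInCache;
                if (overCapacity) {
                    evicted[0]++; // in a real server the bean's state would be written to disk here
                }
                return overCapacity;
            }
        };
        for (int id = 0; id < beans; id++) {
            cache.put(id, "bean-" + id);
        }
        return evicted[0];
    }

    public static void main(String[] args) {
        System.out.println("cache of 23:  " + passivations(103, 23) + " passivations");
        System.out.println("cache of 200: " + passivations(103, 200) + " passivations");
    }
}
```

Once the cache is large enough to hold every live bean, the simulated passivation count drops to zero, which is the effect that raising max-beans-in-cache is meant to achieve.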
This problem and the previous one account for 79% of the time to execute the product details transaction. It's time to send an email to the development team and report the performance bottlenecks.
Correcting both problems has the potential to increase performance of the product details transaction by up to 4.8 times. That's a dramatic improvement. Other transaction requests should also perform better. Less time spent passivating beans means the server is free to perform other tasks. The improved ShoppingClientControllerEJB configuration will increase the performance of all transactions that use the bean.
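The "up to 4.8 times" figure is an upper bound that assumes all of the time spent in the two problem nodes can be eliminated. A minimal sketch of that arithmetic, using the timings quoted earlier (the class and method names are hypothetical):

```java
// Hypothetical helper: best-case speedup if the given portions of the total time are removed.
class SpeedupSketch {
    static double speedup(double totalSeconds, double... removableSeconds) {
        double remaining = totalSeconds;
        for (double r : removableSeconds) {
            remaining -= r;
        }
        return totalSeconds / remaining;
    }

    public static void main(String[] args) {
        double total = 83.784;        // cumulative time for 103 product details requests
        double appServerWait = 60.256; // problem #1: exclusive time on the application server
        double ejbCreate = 6.013;      // problem #2: creating ShoppingClientControllerEJB
        System.out.printf("combined share: %.0f%%%n", 100 * (appServerWait + ejbCreate) / total);
        System.out.printf("best-case speedup: %.1fx%n", speedup(total, appServerWait, ejbCreate));
    }
}
```

In practice the gain will be smaller, since tuning rarely removes a cost completely.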
In just one View, PerformaSure provides enough information to locate two significant and non-obvious performance bottlenecks. The power of PerformaSure is that it brings together information that is distributed across your application and visualizes the parts that need performance tuning. It also allows you to recognize the parts that are performing well, saving you time and money.
A proven leader in J2EE Performance Assurance, Sitraka delivers advanced diagnostic solutions that help companies to pinpoint and eliminate performance hazards in mission-critical J2EE applications. Sitraka products include PerformaSure, a transaction-centric J2EE diagnosis solution, JProbe performance tuning tools, JClass Java components, and DeployDirector, a Java application deployment solution. Sitraka products are sold and supported directly through our North American headquarters in Toronto, European headquarters in Amsterdam and global network of resellers. Visit Sitraka on the Web at www.sitraka.com.