Java Performance Tuning
Newsletter no. 15, February 22nd, 2002
First off, if you feel this site has been useful to you, I encourage
you to vote for my book "Java Performance Tuning" in the
JDJ readers choice awards. The JavaPerformanceTuning.com website is a
companion to, and a direct consequence of my book.
Now, on to this month's news. I've added 3 new category pages to the website:
tips about load balancing;
tips about JDBC connection pools; and
tips on using
The articles keep coming so thick and fast that I can barely keep up.
And what an eclectic mixture. We have object tracking and recycling,
multicasting, compression, cache management, JDBC, JMS, RMI, Proxy
objects, signal handling, and much more. The Dr. Dobbs "Priority Queue"
article stands out as one providing predictive capability through
mathematical analysis using queueing theory, and throws in priority
queues to boot. We have a clutch of webservice articles
focusing on performance. I know web services don't necessarily mean
Java, but these articles provide useful tips for any coarse grained
distributed system. And articles for J2ME have made a strong
comeback after having been absent for a while.
Kirk discusses ECperf, covers some interesting discussions, and
updates us on airport security.
We also have a long list of performance related tools. Normally these
come in dribs and drabs, but for some reason I have a sudden flood
to add to the resource page this month.
If you feel this newsletter could be useful to a friend,
why not forward it to them? Note also that Yukio Andoh's
Japanese translation will hopefully be available at
http://www.hatena.org/JavaPerformanceTuning/ in a while.
Java performance tuning related news.
All the following page references have their tips extracted below.
ECperf is a benchmark used for measuring the performance and scalability
of J2EE servers. The benchmark was created under the Java Community Process.
ECperf uses a new scaling measurement known as Benchmark Business operations
per minute (BBops/min). BBops/min is a count of business transactions and orders
that can be carried out in the customer's domain. With so many benchmarks in
existence, why do we need another? One reason is that ECperf is intended to run
in any vendor's environment without the need to change any code. This consistency
should lend real meaning to the measurement of BBops. Ed Roman, CEO of The
Middleware Company, makes the claim that
This brand new benchmarking service offers vendors an easy and cost-efficient
way to verify claims by working through a trusted and credible third party.
Because it uses the Sun-endorsed ECperf benchmarking standard, this also makes
performance claims both fair and comparable.
Wow! Of course it's expected that vendors will place the best light on
their products, and now we have some means of independently verifying that
their technology stands up. So, where are all the results? Of the eleven
original members of the ECperf JCP committee, only two studies have been posted
(and one of those has been withdrawn). Quoting Sun Microsystems' Shanti Subramanyam,
ECperf Spec lead, "Well I certainly hope that vendors will be publishing ECperf results."
Certainly IONA, BEA, and HP have conducted in house tests. Certainly Sybase, Oracle
and the others know how well their technology performs. So where are the results?
To find out, I did a quick search on BEA's website using the keyword ecperf. The
search turned up two hits. It looks as though BEA's answer is to train customers
to perform their own ECperf benchmark. Out of interest, I searched HP's website
for ECperf results for Bluestone. This search netted a single hit which
turns out to be a link to the ECperf home page at javasoft. So, if Sun's position
is that they'd like to see results published, one can only ask, why are vendors
Ask any salesman and he'll tell you that unless the benchmark shows that you're
the best, the results can only hinder his efforts to sell product. Everyone's
application is different so, should this be the case? Well, more often than not,
people inappropriately apply results from standard benchmarks to their application
and/or environment. To top it all off, when things don't work out, they blame
the benchmark. The question is, where's the fault? Is it the benchmark or is it
in a poor extrapolation of results? No matter the answer, benchmarks now suffer
from a credibility gap. Combine this with the pressure on sales to sell and
boom, there goes any hope of seeing many results being published.
On the brighter side, I see ECperf as being a better benchmark. Why? Well
one of the knocks against benchmarks is that they do not match real life.
Though one could still make this claim with ECperf, its emphasis on performing
constant quantities of work and its measurement of BBops do offer hope that
results from these benchmarks will allow for customers to better extrapolate
results into their environment. Now, we will only have to worry about estimating
the effects of different hardware configurations.
Now let's wander into the saloon at the JavaRanch to begin this month's roundup.
The first question of interest was on how one might monitor the amount
of memory that your Java process is using. Though the answer was,
unanimously, to use Runtime.getRuntime().totalMemory(), it did come with
the following warning: "totalMemory() returns the amount of memory used
by the JVM." The total amount of memory used by the JVM is that which
the OS has currently allocated. Depending upon high water marks, you may
actually be using far less than that amount at any particular time.
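The gap between what the JVM has reserved and what is actually in use can be seen with a minimal sketch like the following. The class and method names are hypothetical; it relies only on the standard Runtime methods totalMemory() and freeMemory(), and it reports the Java heap rather than everything the operating system has given the process.

```java
// Illustrative sketch; MemoryReport and its method names are hypothetical.
final class MemoryReport {

    /** Heap currently reserved by the JVM, in bytes. */
    static long reservedBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Part of the reserved heap occupied by objects (live or not yet collected). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("reserved: " + reservedBytes() + " bytes");
        System.out.println("in use:   " + usedBytes() + " bytes");
    }
}
```

The two readings are taken at slightly different moments, so the figures are approximate on a busy JVM.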
Want to know how to scare your DBA? Try closing your DB connections
between each JDBC call. Every DBA knows that setting up then tearing
down a DB connection is a heavy weight process and so do the authors of
the JDBC 2.0 specification. If you look at the JDBC 2.0 specification,
you'll find that it includes a connection caching scheme in JDBCDataSource.
One last note, there was a question regarding variable initialization.
Though this topic has been covered before, it keeps surfacing often enough
that it is worth the time to review how variables are initialized. All
fields (instance and static variables) are initialized to their default
values: null for references, false for booleans, and 0 for numerics. Local
variables are the exception and must be assigned before use. Any
initialization code found in a static or instance initializer and any
constructor will be executed in addition to the default initializations.
Thus, there is no need to initialize fields to their default value.
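A small sketch makes the point concrete. The class name is hypothetical, and the example covers fields only, since local variables receive no default and must be assigned before they are read.

```java
// Illustrative sketch; the Defaults class is hypothetical.
final class Defaults {
    // No explicit initializers: the JVM assigns defaults before any constructor runs.
    Object reference;
    boolean flag;
    int count;
    double ratio;

    // Redundant: these assign the same values the JVM has already supplied.
    Object redundantReference = null;
    int redundantCount = 0;

    public static void main(String[] args) {
        Defaults d = new Defaults();
        // prints "null false 0 0.0"
        System.out.println(d.reference + " " + d.flag + " " + d.count + " " + d.ratio);
    }
}
```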
Last month there was a little ripple of noise regarding JDK 1.4 GC. That
ripple continues at the performance discussion group hosted at www.JavaGaming.org.
The gamer is looking for even more control over GC. In particular, he wanted to
be able to schedule GC on a regular interval. There seemed to be quite a bit of
support for this suggestion. Both Jack and I have tuned a Smalltalk
application server that ran what was known as an epoch GC. This GC was
scheduled to run at a regular interval. The result of being able to adjust
the run interval was often better performance. So there is some evidence
that these guys may be onto a good thing.
There was a long thread discussing possible solutions to untangling the
CPU needs of Flash banners and applet-based games running in the same
browser. In the end, there seemed to be no way to throttle back Flash.
The only possible solution left was to slow the frame rate down to 25 fps.
In the middle of the discussion, there is a very nice tutorial on how to use
the JNI to control process priorities in Windows. As further evidence on how
long the discussion thread ran, there was also a detailed discussion as to why
applet writers still rely on JDK 1.1. From this perspective, it's clear that
Microsoft's decision not to ship a JVM with Windows is a setback to this group.
Finally, a posting alerted all that there was an article posted at
[one of the articles in the tips section too] regarding native compilation.
In it, there are pros and cons of using native compilation. In the world of
JIT and other types of runtime optimizations, you have to wonder how long
it will take before we get over the need to see a file of static machine code.
Maybe it begs the question: how long was it before application programmers realized
the advantage of writing in a 3GL over assembler? [There's still one or two who
haven't seen the light, Kirk -ed.]
There was more discussion of benchmarking at The Server Side.
Is it possible to bind a process to a single CPU? Though this is not
really a Java question, it does have something to do with tuning.
The answer on Linux, Solaris, AIX and Windows is yes. For Windows,
this comes with a few third party products. In Solaris, processes can
be bound to processor sets. This technique seems mostly useful when one
is trying to guarantee a minimum level of service or when one is trying
to limit the activities of a CPU hog. I'll wait for reader response before
posting any other platforms that support this sometimes-useful technique.
Here's a question: "Is it a good idea (in terms of scalability) to store a
serialized string of 1MB size in the HTTPSession?" It turns out that
the author wants to store a large ResultSet in a stateful session bean (SFSB). One response
addresses the question by suggesting that the call be made directly from a Servlet.
Though this would shorten the distance between the client and the source, I'm not
sure if it would really be enough to overcome the potential performance problems.
This does seem to lend itself to a cursor-like Entity bean solution. In the solution,
the query is conducted in a stateless session bean (SLSB). The result set is
wrapped in an Entity bean. The SLSB passes a handle to the Entity bean back to
the client (Servlet) after which successive calls to the Servlet will result in
the result set being scrolled through.
Finally, since I did two successive columns on efficiencies in airports,
I thought you might be interested in these two conversations I've had with
airport security guards.
As I approached the metal detectors at Ft. Lauderdale International Airport,
I was stopped by a security guard and asked for my boarding card and a
picture id. I handed her my passport and boarding card. After flipping through
my passport several times, she handed it back to me. This is the conversation that
followed:
Security guard: "I need an id with a picture in it"
Me: "But that's my passport"
Security guard: "It needs to have a picture in it."
Me: "But, all passports have pictures in them"
On an even scarier note, I was entering the C halle of Charles de Gaulle
airport while in transit when I encountered their metal detection systems.
After I passed through the machine, the lady behind me followed and set off all
of the alarms. Now, you'd think that the security people would have reacted.
You might even think that the guy standing there with the M-16 would have said
something but no, she continued on as if nothing happened. The security people
didn't even look at her. So, I asked the security why they didn't stop the metal
laden lady. I was told by the screener (who was looking at a monitor at the time)
that she knew how to do her job and that I should go away!
It's funny, but I never experienced anything like this pre-September 11th.
Balancing Network Load with Priority Queues (Page last updated December 2001, Added 2002-02-22, Author Frank Fabian). Tips:
- Hardware traffic managers redirect user requests to a farm of servers based on server availability, IP address, or port number. All traffic is routed to the load balancer, then requests are fanned out to servers based on the balancing algorithm.
- Popular load-balancing algorithms include: server availability (find a server with available processing capability); IP address management (route to the nearest server by IP address); port number (locate different types of servers on different machines, and route by port number); HTTP header checking (route by URI or cookie, etc).
- Web hits should cater for handling peak hit rate, not the average rate.
- You can model hit rates using a Gaussian distribution to determine the average hit rate per time unit (e.g. per second) at peak usage, then a Poisson probability gives the probability of a given number of users simultaneously hitting the server within that time unit. [Article gives an example with a Gaussian fitted to peak traffic of 4000 users with a standard deviation of 20 minutes resulting in an average of 1.33 users per second at the peak, which in turn gives the probabilities of 0, 1, 2, 3, 4, 5, 6 users hitting the server within one second as 26%, 35%, 23%, 10%, 3%, 1%, 0.2%. Service time was 53 milliseconds, which means that the server can service about 19 hits per second before requests need to be queued.]
- System throughput is the arrival rate divided by the service rate. If the ratio becomes greater than one, requests exceed the system capability and will be lost or need to be queued.
- If requests are queued because capacity is exceeded, the throughput must drop sufficiently to handle the queued requests or the system will fail (the service rate must increase or arrival rate decrease). If the average throughput exceeds 1, then the system will fail.
- Sort incoming requests into different priority queues, and service the requests according to the priorities assigned to each queue. [Article gives the example where combining user and automatic requests in one queue can result in a worst case user wait of 3.5 minutes, as opposed to less than 0.1 seconds if priority queues are used].
- [Note that Java application servers often do not show a constant service time. Instead the service time often decreases with higher concurrency due to non-linear effects of garbage collection].
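The arithmetic in the tips above can be checked with a short sketch. The class and method names are hypothetical, and the figures (1.33 arrivals per second, 53 ms service time) are the ones quoted from the article. The ratio the tips call throughput is usually called utilization in queueing theory.

```java
// Illustrative sketch; PeakLoad and its method names are hypothetical.
final class PeakLoad {

    /** Poisson probability of exactly k arrivals in an interval whose mean arrival count is lambda. */
    static double poisson(double lambda, int k) {
        double p = Math.exp(-lambda);
        for (int i = 1; i <= k; i++) {
            p *= lambda / i;
        }
        return p;
    }

    /** Arrival rate divided by service rate; values at or above 1 mean requests pile up. */
    static double load(double arrivalsPerSecond, double serviceTimeSeconds) {
        double serviceRate = 1.0 / serviceTimeSeconds;
        return arrivalsPerSecond / serviceRate;
    }

    public static void main(String[] args) {
        double lambda = 1.33; // average hits per second at the peak
        for (int k = 0; k <= 6; k++) {
            System.out.printf("P(%d hits in one second) = %.1f%%%n", k, 100 * poisson(lambda, k));
        }
        System.out.printf("load at peak = %.3f%n", load(lambda, 0.053));
    }
}
```

With these inputs the probabilities round to the percentages quoted in the tip, and the load stays far below 1, so a single server with a constant 53 ms service time would not need to queue at this peak.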
Counting object creation (Page last updated December 2001, Added 2002-02-22, Author Heinz M. Kabutz). Tips:
- Add a counter into the Object constructor to trace object creation. Doesn't trace arrays [nor objects created from deserialization].
Object recycling part 2 (Page last updated February 2002, Added 2002-02-22, Author Angus Muir and Roman Bialach). Tips:
- The efficiency of pooling objects compared to creating and disposing of objects is highly dependent on the size and complexity of the objects.
- Object pools have deterministic access and reclamation costs for both CPU and memory, whereas object creation and garbage collection can be less deterministic.
Multicasting efficiency (Page last updated January 2002, Added 2002-02-22, Author Paul Timberlake). Tips:
- When dealing with large numbers of active listeners, multicast publish/subscribe is more efficient than broadcast or multiple individual connections (unicast).
- When dealing with large numbers of listeners with only a few active, or if dealing with only a few listeners, multicasting is inefficient. This scenario is common in enterprise application integration (EAI) systems. Inactive listeners require all missed messages to be resent to them in order when the listener becomes active.
- A unicast-based message transport, such as message queuing organized into a hub-and-spoke model, is more efficient than multicast for most application integration (EAI) scenarios.
NIO (Page last updated February 2002, Added 2002-02-22, Author Daniel F. Savarese). Tips:
- GatheringByteChannel lets you write a sequence of bytes from multiple buffers, and ScatteringByteChannel allows you to read a sequence of bytes into multiple buffers. Both let you minimize the number of system calls made by combining operations that might otherwise require multiple system calls.
- Selector allows you to multiplex I/O channels, reducing the number of threads required for efficient concurrent I/O operations.
- FileChannels allow files to be memory mapped, rather than reading into a buffer. This can be more efficient. [But note that both operations bring the file into memory in different ways, so which is faster will be system and data dependent].
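A gathering write can be sketched as follows. FileChannel implements GatheringByteChannel, so a header and a body held in separate buffers can be handed to one write call. The class and method names are hypothetical, and the sketch uses file-handling conveniences (Files, Path, try-with-resources) that arrived in later Java releases than the one the article covers.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative sketch; GatheringWriteDemo and writeGathered are hypothetical names.
final class GatheringWriteDemo {

    /** Writes header and body with gathering writes, then returns the file contents. */
    static String writeGathered(Path file, String header, String body) throws IOException {
        ByteBuffer[] parts = {
            ByteBuffer.wrap(header.getBytes(StandardCharsets.UTF_8)),
            ByteBuffer.wrap(body.getBytes(StandardCharsets.UTF_8))
        };
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            // One call may not drain every buffer, so repeat until both are empty.
            while (parts[0].hasRemaining() || parts[1].hasRemaining()) {
                channel.write(parts);
            }
        }
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("gather", ".txt");
        try {
            System.out.print(writeGathered(file, "HEADER\n", "body\n"));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```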
Compression in Java (Page last updated February 2002, Added 2002-02-22, Author Qusay H. Mahmoud and Konstantin Kladko). Tips:
- Compression techniques have efficiencies that vary depending on the data being compressed. It's possible a proprietary compression technique could be the most efficient for a particular application. For example, instead of transmitting a compressed picture, the component objects that describe how to draw the picture may be a much smaller amount of data to transfer.
- ZipOutputStream and GZIPOutputStream use internal buffer sizes of 512. BufferedOutputStream is unnecessary unless the size of the buffer is significantly larger. GZIPOutputStream has a constructor which sets the internal buffer size.
- Zip entries are not cached when a file is read using ZipInputStream and FileInputStream, but using ZipFile does cache data, so creating more than one ZipFile object on the same file only opens the file once.
- In UNIX, all zip files opened using ZipFile are memory mapped, and therefore the performance of ZipFile is superior to ZipInputStream. If the contents of the same zip file are frequently changed, then using ZipInputStream is preferable.
- Compressing data on the fly only improves performance when the data being compressed are more than a couple of hundred bytes.
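Two of the tips above can be tried out with a small sketch: setting the GZIPOutputStream buffer size through its two-argument constructor, and seeing that very small payloads grow rather than shrink. The class and method names are hypothetical, and the 4096-byte buffer is an arbitrary choice for illustration, not a recommendation from the article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative sketch; GzipBufferDemo and its method names are hypothetical.
final class GzipBufferDemo {

    static byte[] compress(byte[] data, int bufferSize) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // The second constructor argument sets the stream's internal buffer size.
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink, bufferSize)) {
            gzip.write(data);
        }
        return sink.toByteArray();
    }

    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] large = new byte[10000];   // highly repetitive, compresses well
        byte[] tiny = {1, 2, 3};          // smaller than the gzip header and trailer
        System.out.println("large: " + large.length + " -> " + compress(large, 4096).length);
        System.out.println("tiny:  " + tiny.length + " -> " + compress(tiny, 512).length);
    }
}
```

The tiny payload comes out larger than it went in because the gzip header and trailer alone take 18 bytes, which is one reason compressing very small messages on the fly does not pay.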
Porting to KVM (Page last updated February 2002, Added 2002-02-22, Author Shiuh-Lin Lee). Tips:
- Minimize program runtime size. Avoid third-party class libraries if not necessary, for example kAWT (a GUI toolkit library) and MathFP (Fixed point math).
- Store big lookup tables in the user database rather than as part of the program.
- Call GC functions manually.
- Dispose of Objects; close the database and the network connections as soon as they are no longer needed.
- Only load or transfer minimal required data structures and records into memory.
- Avoid float and double calculations.
- Avoid data conversions: store and use the data in the final required format, or execute conversions on the server.
- Use client caching.
- Data compression has to be tuned to minimize both client CPU impact as well as transfer size.
- Use tabbed panels to hold different groups of information. A scrollable panel can have higher memory requirements than a tabbed panel.
- Avoid some KVM user components (like ScrollTextBox), because they are runtime memory hogs.
- Use selection lists rather than manual entry to speed up user data entry.
Atomic File Transactions, Part 2 (Page last updated February 2002, Added 2002-02-22, Author Jonathan Amsterdam). Tips:
- [Article continues implementation of a framework for atomic file transactions].
- If a transaction creates a file and then performs several other actions on it, there is no need to undo the actions -- it is enough to delete the file.
- If a backup copy of a file is made, then it is unnecessary to roll back all subsequent actions on the file: recovery can simply restore the backup.
Quality of service for web services (Page last updated January 2002, Added 2002-02-22, Author Anbazhagan Mani, Arun Nagarajan). Tips:
- Quality of service requirements for web services are: availability (is it running); accessibility (can I run it now); integrity/reliability (will it crash while I run/how often); throughput (how many simultaneous requests can I run); latency (response time); regulatory (conformance to standards); security (confidentiality, authentication).
- HTTP is a best-effort delivery service. This means any request could simply be dropped. Web services have to handle this and retry.
- Web service latencies are measured in the tens to thousands of milliseconds.
- Asynchronous messaging can improve throughput, at the cost of latency.
- SOAP overheads include: extracting the SOAP envelope; parsing the contained XML information; XML data cannot be optimized very much; SOAP requires typing information in every SOAP message; binary data gets expanded (by an average of 5-fold) when included in XML, and also requires encoding/decoding.
- Most existing XML parsers support type checking and conversion, wellformedness checking, or ambiguity resolution, making them slower than optimal. Consider using a stripped-down XML parser which only performs essential parsing.
- DOM based parsers are slower than SAX based ones.
- Compress the XML when the CPU overhead required for compression is less than the network latency.
- Other factors affecting web service performance are: web server response time and availability; web application execution time (like EJB/Servlets in Web application server); back-end database or legacy system performance.
- Request results should be cached where possible.
- Requests should be load balanced and prioritized according to the business value they represent.
- Carry out capacity planning to enable the performance to be maintained in the future.
- Extreme care should be taken to make sure that resources are not locked for long periods of time, to avoid serious scalability problems.
- Measure the performance of your web services by adding code measuring elapsed time to the generated service proxy (and recompiling). [Article gives an example].
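The elapsed-time measurement in the last tip can be sketched as a small wrapper around any call. This is not the article's example: the Timed class, its Result type, and the call method are hypothetical, and the sketch uses System.nanoTime() and lambdas from later Java releases where code of the period would have used System.currentTimeMillis().

```java
import java.util.function.Supplier;

// Illustrative sketch; Timed, Result and call are hypothetical names.
final class Timed {

    /** A result together with the elapsed wall-clock time of the call that produced it. */
    static final class Result<T> {
        final T value;
        final long elapsedNanos;

        Result(T value, long elapsedNanos) {
            this.value = value;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /** Runs the operation once and records how long it took. */
    static <T> Result<T> call(Supplier<T> operation) {
        long start = System.nanoTime();
        T value = operation.get();
        return new Result<>(value, System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // In a generated service proxy, the lambda body would be the remote invocation.
        Result<String> r = call(() -> "response");
        System.out.println(r.value + " took " + r.elapsedNanos / 1_000_000.0 + " ms");
    }
}
```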
Data expiration in caches (Page last updated January 2002, Added 2002-02-22, Author William Grosso). Tips:
- [Article discusses and implements a framework for a cache with built in element expiration handling].
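As a rough idea of what element expiration involves, here is a minimal time-to-live cache. It is a sketch under stated assumptions, not the article's framework: the ExpiringCache class and its methods are hypothetical, entries are checked lazily on read rather than by a background thread, and the clock is injected so that expiry can be exercised without waiting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Illustrative sketch; ExpiringCache is a hypothetical class.
final class ExpiringCache<K, V> {

    private static final class Entry<V> {
        final V value;
        final long expiresAt;

        Entry(V value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long timeToLiveMillis;
    private final LongSupplier clock;

    ExpiringCache(long timeToLiveMillis, LongSupplier clock) {
        this.timeToLiveMillis = timeToLiveMillis;
        this.clock = clock;
    }

    synchronized void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong() + timeToLiveMillis));
    }

    /** Returns the cached value, or null if it is absent or has expired. */
    synchronized V get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAt) {
            entries.remove(key); // expired entries are dropped lazily on access
            return null;
        }
        return entry.value;
    }

    public static void main(String[] args) {
        ExpiringCache<String, String> cache = new ExpiringCache<>(1000, System::currentTimeMillis);
        cache.put("greeting", "hello");
        System.out.println(cache.get("greeting"));
    }
}
```

Lazy expiry keeps the sketch short, but entries that are never read again are never removed. A fuller implementation would also sweep periodically.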
Wrapping PreparedStatement (Page last updated January 2002, Added 2002-02-22, Author Bob Byron and Troy Thompson). Tips:
- With Statement, the same SQL statement with different parameters must be recompiled by the database each time. But PreparedStatements can be parametrized, and these do not need to be recompiled by the database for use with different parameters.
- [Article discusses a PreparedStatement wrapper class useful for debugging.]
Webservices SOAP communications overheads (Page last updated January 2002, Added 2002-02-22, Author Leigh Dodds). Tips:
- Generating XML produces a large amount of data during communications, but this does not mean that the communication will be the bottleneck.
- Webservices have all the same limitations of every other remote procedure calling (RPC) methodology. Requiring synchronous communications across a WAN is a heavy overhead regardless of the protocol.
- If "Web services" tend to be chatty, with lots of little round trips and a subtle statefulness between individual communications, they will be slow. That's a function of failing to realize that the API call model isn't well-suited to building communicating applications where caller and callee are separated by a medium (networks!) with variable and unconstrained performance characteristics/latency.
- Asynchronous messaging may be required for efficient webservices.
Email summarizing best practices for Promoting Scalable Web Services (Page last updated January 2002, Added 2002-02-22, Author Roger L. Costello). Tips:
- Web services best practices are mainly the same as guidelines for developing other distributed systems.
- Stay away from using XML messaging to do fine-grained RPC, e.g. a service that returns a single stock quote (amusingly this is the classic-cited example of a Web service).
- Do use coarse-grained RPC, that is, use Web services that "do a lot of work, and return a lot of information".
- When the transport may be slow and/or unreliable, or the processing is complex and/or long-running, consider an asynchronous messaging model.
- Always take the overall system performance into account. Don't optimize until you know where the bottlenecks are, i.e., don't assume that XML's "bloat" or HTTP's limitations are a problem until they are demonstrated in your application.
- Take the frequency of the messaging into account. Replicate data as necessary.
- For aggregation services, try to retrieve data during off-hours in large, coarse-grained transactions.
Report of how Ace's Hardware made their SPECmine tool blazingly fast (Page last updated December 2001, Added 2002-02-22, Author Chris Rijk). Tips:
- Transform your data to minimize the costs of searching it.
- If your dataset is small enough, read it all into memory or use an in-memory database (keeping the primary copy on disk for recovery).
- An in-memory database avoids the following overheads: no need to pass data in from a separate process; less memory allocation by avoiding all the data copies as it's passed between processes and layers; no need for data conversion; fine-tuned sorting and filtering possible; other optimizations become simpler.
- Pre-calculation makes some results faster by making the database data more efficient to access (by ordering it in advance for example), or by setting up extra data in advance, generated from the main data, to make calculating the results for a query simpler.
- Pre-determine possible data values in queries, and use boolean arrays to access the chosen values.
- Pre-calculate all formatting that is invariant for generated HTML pages. Cache all reused HTML fragments.
- Caching many strings may consume too much memory. If memory is limited, it may be more effective to generate strings as needed.
- Write out strings individually, rather than concatenating them and writing the result.
- Extract common strings into an identical string object.
- Compress generated html pages to send to the user, if their browser supports compressed html. This is a heavier load on the server, but produces a significantly faster transfer for limited bandwidth clients.
- Some pages are temporarily static. Cache these pages, and only re-generate them when they change.
- Caching can significantly improve the responsiveness of a website.
JMS vs RMI (Page last updated February 2002, Added 2002-02-22, Author Kevin Jones). Tips:
- RMI calls marshal and unmarshal parameters, adding major overhead.
- Every network communication has several overheads: the distance between the sender and the receiver adds a minimum latency (limited by the speed the signal can travel along the wire, about two-thirds of the speed of light: London to New York, roughly 5,570 km in a straight line, would take about 28 milliseconds one way); each network router and switch adds time to respond to data, on the order of 0.1 milliseconds per device per packet.
- Part of most network communications consists of small control packets, adding significant overhead.
- One RMI call does not generally cause a noticeable delay, but even tens of RMI calls can be noticeable to the users.
- Beans written with many getXXX() and setXXX() methods can incur an RMI round trip for every data attribute.
- Messaging is naturally asynchronous, and allows an application to decouple network communications from ongoing processing, potentially preventing threads from being blocked on communications.
Proxy code generation (Page last updated February 2002, Added 2002-02-22, Author Paul McLachlan). Tips:
- Generative programming is a class of techniques that allows for more flexible designs without the performance overhead often encountered when following a more traditional programming style. JSP engines are one example. java.lang.reflect.Proxy is another.
- More advanced code obfuscations (such as control-flow obfuscation) can produce slower programs as the obfuscated bytecode is more difficult to optimize by the JIT or HotSpot compiler.
- A reflective lookup [obtaining the method reference from its name] is much slower than a reflective invoke [invoking the method from the reference] once you have a method reference.
- [Article provides an implementation of the JNI call using the JVM_OnLoad() function to trap class bytecodes as they are loaded].
- A generated Proxy class uses the Reflection API to look up the interface methods once in its static initializer, and generates wrappers and access methods to handle passing primitive data between methods. [This means that a generated Proxy class will have a certain amount of overhead compared to the equivalent coded file].
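The difference between a reflective lookup and a reflective invoke suggests doing the lookup once and reusing the Method reference, which is also what the tips above say a generated Proxy class does in its static initializer. A minimal sketch of that pattern, with a hypothetical class name and String.length() standing in for an interface method:

```java
import java.lang.reflect.Method;

// Illustrative sketch; CachedReflection is a hypothetical class.
final class CachedReflection {

    // Looked up once in the static initializer; the name-based lookup is not repeated per call.
    private static final Method LENGTH;

    static {
        try {
            LENGTH = String.class.getMethod("length");
        } catch (NoSuchMethodException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Invokes String.length() through the cached Method reference. */
    static int lengthOf(String s) throws ReflectiveOperationException {
        return (Integer) LENGTH.invoke(s);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(lengthOf("hello")); // prints 5
    }
}
```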
Generating code dynamically (Page last updated February 2002, Added 2002-02-22, Author Norman Richards). Tips:
- Compiling code into classes at runtime, such as for JSP pages, provides excellent flexibility with almost no performance overhead.
- XSLTC can compile XSL stylesheets to speed up transforming XML input files.
- If a complex interpreted procedure is expected to be used more than once, it can be more efficient to convert the procedure into an expression tree which will apply the procedure optimally.
- Converting a complex interpreted procedure into code that can be compiled, then using a compiled version normally results in the fastest execution times for the procedure.
- Sun's javac is not a very efficient compiler. Faster compilers are available, such as jikes.
- Compiling code at runtime can take a significant amount of time. If the compile time needs to be minimized, it is important to use the fastest compiler available.
- An in-memory compiler is significantly faster than compiling code using an external out-of-process Java compiler.
- Generating bytecode directly in-process is significantly faster than compiling code using an external out-of-process Java compiler, and is also faster than using an in-memory compiler. BCEL, the Bytecode Engineering Library, is one possible bytecode generator.
JMS & JCACHE (Page last updated February 2002, Added 2002-02-22, Author Steve Ross-Talbot). Tips:
- Asynchronous messaged communications allows subsystems to decouple and work more efficiently in parallel, more closely reflecting actual workflows.
- Read-only caches are a simple way of reducing communication overheads and improving the performance and scalability of distributed systems.
- Event-driven systems tend to be more scalable.
- Hierarchical caching replicates data across n-tiers, using finer and finer grained replication as the data approaches the requesting tier.
- Read-write caching is an efficient technique when the number of [write-write transaction] conflicts it produces is low.
Notated keys to access elements of nested Maps. (Page last updated January 2002, Added 2002-02-22, Author Matt Liotta). Tips:
- Use dot separated, concatenated strings to optimize access to elements of nested Maps by caching elements in the top level Map.
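One way to read this tip is sketched below: values are stored in nested Maps as usual, and the full dotted path is also stored as a key in the top-level Map, so a read is a single lookup rather than one lookup per nesting level. The DottedKeyMap class is hypothetical and is not taken from the article.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; DottedKeyMap is a hypothetical class.
final class DottedKeyMap {

    private final Map<String, Object> root = new HashMap<>();

    /** Stores the value in the nested Maps and caches it in the top-level Map under the dotted key. */
    @SuppressWarnings("unchecked")
    void put(String dottedKey, Object value) {
        String[] parts = dottedKey.split("\\.");
        Map<String, Object> current = root;
        for (int i = 0; i < parts.length - 1; i++) {
            Object child = current.get(parts[i]);
            if (!(child instanceof Map)) {
                // Creates the intermediate Map, replacing any non-Map value stored at this level.
                child = new HashMap<String, Object>();
                current.put(parts[i], child);
            }
            current = (Map<String, Object>) child;
        }
        current.put(parts[parts.length - 1], value);
        root.put(dottedKey, value); // top-level cache entry, e.g. "user.address.city"
    }

    /** Single lookup in the top-level Map, regardless of nesting depth. */
    Object get(String dottedKey) {
        return root.get(dottedKey);
    }

    /** The underlying nested structure, which also holds the dotted-key cache entries. */
    Map<String, Object> nested() {
        return root;
    }

    public static void main(String[] args) {
        DottedKeyMap map = new DottedKeyMap();
        map.put("user.address.city", "Paris");
        System.out.println(map.get("user.address.city")); // prints Paris
    }
}
```

The cached entry goes stale if a nested Map is modified directly rather than through put, so this suits data that is written through one path and read many times.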
Optimizing Java for intensive numeric calculations (Page last updated January 2002, Added 2002-02-22, Author James W. Cooper). Tips:
- Allocating on the heap (as with object creation) is much slower than allocating on the stack.
- Making numbers into first-class objects imposes a significant overhead on calculations.
- Hand applied optimizations may be superseded by future compiler optimizations.
- Use specialized subtypes to reduce dynamic dispatching.
- Replace objects with their data held and passed as local variables.
OS Signal handling in Java (Page last updated January 2002, Added 2002-02-22, Author Chris White). Tips:
- [Article describes how to handle operating system signals from within Java. Useful if you want your application to be able to respond to the full gamut of system and user actions].
Natively compiled code from Java source (Page last updated January 2002, Added 2002-02-22, Author Martyn Honeyford). Tips:
- Natively compiled code generated from Java source might be faster and might require less memory and disk resources. [But this article shows some JVMs can be faster].
- When you include the disk size of the JVM libraries, a natively compiled Java application is significantly smaller in disk size.
- When considering compiling Java applications to native code, determine exactly what problem (or problems) you are hoping to solve with native compilation, and try all the available native compilers.
RMI arguments (Page last updated December 2001, Added 2002-02-22, Author Scott Oaks). Tips:
- Some application servers can automatically pass parameters by reference if the communicating EJBs are in the same JVM. To ensure that this does not break the application, write EJB methods so that they don't modify the parameters passed to them.
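A small sketch of the defensive style this tip asks for. It uses plain static methods rather than real EJBs, and the ParameterSafety class is hypothetical. If the container passes the list by reference, the first method changes the caller's data while the second does not, so only the second behaves the same whether arguments arrive as copies or as references.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration: methods that do and do not modify their parameters. */
final class ParameterSafety {

    /** Risky: sorts the argument itself, so a by-reference caller sees its list reordered. */
    static List<String> sortedInPlace(List<String> names) {
        Collections.sort(names);
        return names;
    }

    /** Safe: works on a private copy and leaves the argument untouched. */
    static List<String> sortedCopy(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("b", "a"));
        System.out.println(sortedCopy(names)); // prints "[a, b]"
        System.out.println(names);             // prints "[b, a]"
    }
}
```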
Choosing an application server (Page last updated January 2002, Added 2002-02-22, Author Sue Spielman). Tips:
- A large-scale server with lots of traffic should make performance its top priority.
- Performance factors to consider include: connection pooling; types of JDBC drivers; caching features, and their configurability; CMP support.
- Inability to scale with reliable performance means lost customers.
- Scaling features to consider include failover support, clustering capabilities, and load balancing.
Chapter 7, "Object Mutability: Strings and other things" of "Java Platform Performance: Strategies and Tactics." (Page last updated 2000, Added 2002-02-22, Authors Steve Wilson and Jeff Kesselman). Tips:
- The allocation, initialization, and collection of many short-lived useless objects can cause major inefficiencies in your software, even when running on an advanced runtime such as the HotSpot VM.
- Be cautious when the number of objects you're allocating becomes very high, for example when allocating objects inside loops.
- For heavy-duty text processing, some uses of the String class can become major performance bottlenecks.
- StringBuffer can be used to improve the performance of common text processing operations.
- Avoid creating new strings in compute intensive parts of code. Be careful of the concatenation operators '+' and '+=' when used with strings.
- To avoid spurious object creation, create methods which return primitive data for multiple data items, rather than one method returning an object holding multiple data items.
- Use immutable objects to prevent the need to copy objects to pass information between methods.
- Pooling small objects is often counterproductive. The overhead of managing the object pool is often greater than the small object penalty. Pooling can also increase a program's memory footprint.
- Pooling large objects (e.g. large bitmaps or arrays) or objects that work with native resources (e.g. Threads or Graphics) can be efficient.
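The String and StringBuffer tips above can be sketched as follows. The Joiner class is a hypothetical example, not code from the book, and the initial buffer capacity is only a rough guess. Both methods return the same text, but the first creates a new String on every pass through the loop while the second appends into one buffer.

```java
/** Hypothetical comparison of string concatenation and StringBuffer appends. */
final class Joiner {

    /** Builds the result with '+=': each iteration creates a new String. */
    static String joinWithConcat(String[] words) {
        String result = "";
        for (String word : words) {
            result += word + ",";
        }
        return result;
    }

    /** Builds the result in a single StringBuffer and converts once at the end. */
    static String joinWithBuffer(String[] words) {
        StringBuffer buffer = new StringBuffer(words.length * 8); // assumed average word size
        for (String word : words) {
            buffer.append(word).append(',');
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String[] words = { "a", "b", "c" };
        System.out.println(joinWithConcat(words)); // prints "a,b,c,"
        System.out.println(joinWithBuffer(words)); // prints "a,b,c,"
    }
}
```

On Java 5 and later, StringBuilder offers the same append methods without synchronization and is the usual choice when the buffer is not shared between threads.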
Chapter 8, "Algorithms and data structures" of "Java Platform Performance: Strategies and Tactics." (Page last updated 2000, Added 2002-02-22, Authors Steve Wilson and Jeff Kesselman). Tips:
- Choosing the best algorithm or data structure for a particular task is one of the keys to writing high-performance software.
- The optimal algorithm for a task is highly dependent on the data and data size.
- Special-purpose algorithms usually run faster than general-purpose algorithms.
- Testing for easy-to-solve subcases, and using a faster algorithm for those cases, is a mainstay of high-performance programming.
- Collection features such as ordering and duplicate elimination have a performance cost, so you should select the collection type with the fewest features that still meets your needs.
- Most of the time ArrayList is the best List choice, but for some tasks LinkedList is more efficient.
- HashSet is much faster than TreeSet.
- Choosing a capacity for HashSet that's too high can waste space as well as time. Set the initial capacity to about twice the size that you expect the Set to grow to.
- The default hash load factor (.75) offers a good trade-off between time and space costs. Higher values decrease the space overhead, but increase the time it takes to look up an entry. (When the number of entries exceeds the product of the load factor and the current capacity, the capacity is doubled).
- Programs pay the costs associated with thread synchronization in synchronized classes even when those classes are used in a single-threaded environment.
- The Collections.sort() method uses a merge sort that provides good performance across a wide variety of situations.
- When dealing with collections of primitives, the overhead of allocating a wrapper for each primitive and then extracting the primitive value from the wrapper each time it's used is quite high. In performance-critical situations, a better solution is to work with plain array structures when you're dealing with collections of primitive types.
- Random number generation can take time. Where possible, pre-generate the random number sequence into an array and use the elements as required.
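Two of the tips above, sizing a HashSet and pre-generating random numbers, lend themselves to a short sketch. The CollectionSizing class, its helper methods and the nested RandomPool class are hypothetical names chosen for illustration. The resize threshold follows the rule quoted in the load factor tip (load factor times capacity), and the pool trades randomness for speed by replaying a fixed array.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

/** Hypothetical helpers illustrating HashSet sizing and pre-generated random numbers. */
final class CollectionSizing {

    /** Entry count above which a hash table of this capacity grows. */
    static int resizeThreshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    /** Creates a set with about twice the expected size as initial capacity, per the tip. */
    static <T> Set<T> newSizedSet(int expectedSize) {
        return new HashSet<>(expectedSize * 2);
    }

    /** Pre-generated random ints stored in a plain int[] and handed out in order. */
    static final class RandomPool {
        private final int[] values;
        private int next;

        RandomPool(int size, long seed) {
            Random random = new Random(seed);
            values = new int[size];
            for (int i = 0; i < size; i++) {
                values[i] = random.nextInt();
            }
        }

        /** Returns the next stored value, wrapping around at the end of the array. */
        int nextInt() {
            int value = values[next];
            next = (next + 1) % values.length;
            return value;
        }
    }

    public static void main(String[] args) {
        System.out.println(resizeThreshold(16, 0.75f)); // prints "12"

        Set<String> names = newSizedSet(1000);
        names.add("example");

        RandomPool pool = new RandomPool(1024, 42L);
        int first = pool.nextInt();
        System.out.println(first == new Random(42L).nextInt()); // prints "true"
    }
}
```

The pool repeats its sequence after as many draws as it holds values and is not thread-safe, so it only fits uses where that is acceptable, such as test data or simple simulations.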
Last Updated: 2019-08-29
Copyright © 2000-2019 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
RSS Feed: http://www.JavaPerformanceTuning.com/newsletters.rss