Back to newsletter 018 contents
As I'm sitting here at my keyboard generating these words, I'm thinking what an interesting month this has been for me. Now, as much as I like cranking out code, sometimes you've just got to stop and take a step back. That's exactly what I did this month. I took a step back to look at how the software was built in my current project. It was an eye-opening experience, to say the least. After taking a look, I quickly realized that I couldn't just go to management and say, "the code base is a bit unusual". I had anecdotal evidence to support such a statement, but I needed something stronger, something indisputable. For that level of assurance, I turned to software metrics.
There are a number of tools that will perform some obscure calculation such as counting violations of the Law of Demeter. The tool I happened to have at hand was TogetherJ. So, I grabbed the latest version of a project from the source code control system, pointed TJ at it and fired. After the gears turned a bit and the CPU got fairly hot, TJ presented me with a pretty large matrix filled with acronyms and numbers. What did all of these numbers mean? Were the numbers bad and, if so, how bad? Could these numbers validate the anecdotal evidence that I had amassed? To find out, I went out into the open source community and downloaded the source code from a number of projects. Again, I pointed TJ at the source and fired. As before, TJ dutifully generated that extra large matrix full of obscurity. But this time, I was able to take the numbers and create some population statistics. With these statistics in hand, I was now able to evaluate the numbers from the first project.
I must admit that my motivation for running the metrics was quite selfish. I knew that I was going to have to deal with the source code and I wanted to know what I was up against. In this case I was looking for three measures: volume, complexity, and coupling. Understanding these numbers provided me with a clear sense of how difficult and time-consuming it was going to be to modify the code. Since my goal was to performance tune that piece of the system, these measurements helped me devise a strategy to not only deal with the code base, but to set expectations as to how much could be achieved within the given time budget. After all, a major part of performance is about expectations.
Now let's review the hot topics of the last month.
From the Java Ranch, we have an interesting thread that discusses the question "why does my GUI take up so much memory?" Generally, the responses that one receives from the ranch hands to this type of question are thoughtful and quite helpful. In this instance, the response referenced a paper on the alphaWorks website that suggests that one should implement a destroy interface in which one would null out references to the fields of the object. Now, I haven't actually read the paper so I can't really comment on that, but the rule for Java is that once you release your reference to an object, that object becomes subject to garbage collection. In addition, any references held by that object will also be subject to GC, assuming they are not reachable from an object that is not itself subject to GC. This is different from C++, where there is no GC. Consequently, C++ programmers are obliged to release memory using (strangely enough) a destructor. That destructor is responsible for possibly calling the destructor of any objects that it is referencing.
Now the question is, how do you know if another C++ object is holding onto your object? The general answer is, you don't. That is why people like GC: it handles this for you. Hence, it alleviates most (but not all) of the problems associated with memory management. The point is that if you maintain a reference to an object, it will never be eligible for GC. So, in these systems, the most common symptom that you're not releasing references is memory bloat. Now, granted, the destroy technique as it's been applied in the discussion thread may be a nice hack to get around the problem, but it doesn't solve the problem. For that, I'd recommend a memory profiler such as JProbe or OptimizeIt. With one of these tools, you can find the memory leak and fix it at the source, rather than throwing more code at the (already bloated) problem.
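To make the reachability rule concrete, here is a small sketch of my own (the Node class and field names are hypothetical, not from the thread). It uses a WeakReference to watch an object become collectible once the only path to it is dropped; note that there is no need to null out the object's fields first, the way a C++-style destructor must release its members:

```java
import java.lang.ref.WeakReference;

public class GcDemo {
    static class Node {
        byte[] payload = new byte[1024]; // some data, so the object isn't trivial
        Node next;
    }

    public static void main(String[] args) throws Exception {
        Node head = new Node();
        head.next = new Node();
        // Track the second node without keeping it alive.
        WeakReference ref = new WeakReference(head.next);

        // Drop the only strong reference to the head. The entire chain is now
        // unreachable, including head.next, even though we never nulled it out.
        head = null;

        // System.gc() is only a hint, so retry a few times rather than
        // assuming a single call collects the garbage.
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc();
            Thread.sleep(10);
        }
        System.out.println("second node collected: " + (ref.get() == null));
    }
}
```

In practice the weak reference is cleared almost immediately, which is the whole point: reachability, not field-by-field cleanup, determines what the collector reclaims.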
Here's a question that produced a few interesting tidbits of information. Which is faster:

if (myarray == Boolean.TRUE)

or

if (((Boolean)myarray).booleanValue())

Though the question may seem trivial, what it does reveal is that the first statement should always return false. Why? Because == checks identity. A better check would be to use equals(). The interesting tidbit is that this appears to change in JDK 1.4. In that version of the JDK, it is recommended that you use Boolean.valueOf(boolean), as this does not require that you create many identical objects. Plus, you can get away with using ==, which is a slightly faster operation. One final note: new Boolean(true).equals(new Boolean(true)) always returns true.
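These points are easy to demonstrate. The class below is my own illustration, not code from the thread:

```java
public class BooleanDemo {
    public static void main(String[] args) {
        Object flag = new Boolean(true); // a freshly constructed instance

        // == compares object identity, so a new Boolean is never the same
        // object as the canonical Boolean.TRUE constant.
        System.out.println(flag == Boolean.TRUE);          // false

        // equals() compares the wrapped value, so it behaves as expected.
        System.out.println(Boolean.TRUE.equals(flag));     // true

        // Boolean.valueOf(boolean), added in JDK 1.4, hands back the cached
        // constants, so no duplicate objects are created and the cheaper
        // == comparison becomes safe.
        System.out.println(Boolean.valueOf(true) == Boolean.TRUE); // true
    }
}
```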
The gaming developers were up to their usual task of trying to push the envelope even further. There was a very interesting discussion between two developers concerning some problem in the AWT that was causing a simple animation to consume 100% of the CPU. HProf was reporting that the code was spending 96% of its time in sun.awt.windows.WToolkit.eventLoop on the AWT-Windows thread. After much speculation and banter back and forth, a test was devised in which a JNI function named go() was created. Its sole function was to call Sleep(INFINITE). The test started three threads: the main thread, an AWT thread, and one to execute go(). HProf reported that all three threads consumed equal amounts of CPU time. Conclusion: -Xrunhprof cannot report statistics on native code.
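The experiment is easy to approximate in pure Java, since Thread.sleep() is itself a native method: a thread blocked in it sits entirely inside native code while consuming no CPU. The sketch below is my own analogue of the test, not the code from the thread; running something like it under -Xrunhprof with CPU sampling is a simple way to see for yourself how the profiler accounts for a thread that is parked in a native call:

```java
public class NativeSleepDemo {
    public static void main(String[] args) throws Exception {
        // Stand-in for the JNI go() function that called Sleep(INFINITE):
        // Thread.sleep() is also a native call, and a thread blocked in it
        // consumes no CPU at all.
        Thread sleeper = new Thread(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(Long.MAX_VALUE); // effectively "sleep forever"
                } catch (InterruptedException e) {
                    // interrupted on shutdown; nothing to clean up
                }
            }
        }, "sleeper");
        sleeper.start();

        Thread.sleep(200); // give the sleeper time to block in native code
        System.out.println(sleeper.getName() + " is " + sleeper.getState());

        sleeper.interrupt(); // wake it so the VM can exit cleanly
        sleeper.join();
    }
}
```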
Will we see Java running on vector processors? As it turns out, there is a strategy that could see a specialized JIT replace the bytecodes in your class file with a vectorized set of instructions. Just don't expect to see it in the near future. In the same thread, there is a nice explanation of an AOT compiler. For those of us who have not run into this acronym, it's an ahead-of-time compiler. From my experience with vector processors, I understand the need for code to behave the same when running in either scalar or vector mode. Java's exception handling presents a problem for vector processors. The participants felt that an AOT compiler might help solve this problem. This seems like a reasonable conclusion, as Cray used this strategy (as part of a static compiler) to vectorize C, C++, and Fortran code.
The Server Side continues to provide a wealth of J2EE performance tuning information. This is evidenced by a discussion that centered on the question, "should one use CMP 2.0 or BMP with a DAO?" Though the group as a whole came down on the side of CMP 2.0, no one could produce any numbers to help answer the original question. As a side conversation, one participant explained why these numbers might not be of any use anyway, as a system will perform differently on different hardware configurations. For example, it is unlikely that your system matches those used for the ECPerf tests. This is pure speculation, but I bet that if a vendor did publish numbers for the optimal hardware configuration for their software, customers would choose to follow suit.
In a thread that was seeded by the Middleware Company's own Floyd Marinescu, participants were asked to make a performance comparison between stateless session beans and servlets. The discussion brought out a number of points and, in the end, exposed a number of myths. One point that many agreed upon is that performance and scalability need to be looked at as separate concerns. In this light, anyone who was using EJB to improve performance would be disappointed. It is true that every time you separate a tier with a communication layer such as RMI, you take a performance hit, which may not be recoverable. Even so, it was agreed that EJB leads to a more scalable design, allows greater flexibility in the physical topology of the application, and yields better designs due to a greater separation of concerns.
It's hard to believe that it's only been a little more than a month since the ECPerf wars heated up. Since then, we've seen the new ECPerf 1.1 specification released. Both IBM and Pramati announced new results. The IBM results demonstrate near linear scalability as they moved from a seven to a nine node cluster. The resulting BBops/$ figure remained just about the same. Pramati's submission is followed by a Q&A session with Ramesh Loganathan of Pramati, which can be found at http://www.theserverside.com/home/thread.jsp?thread_id=13262. Though Pramati's results lag behind both BEA's and IBM's, one must applaud their efforts to do the work and publish the results.