Back to newsletter 030 contents
I had a driver for a couple of years. It sounds more glamorous than it really was, as Leonard, my driver, was actually a taxi driver that I had made a deal with. Leonard was courteous, reliable and often knew my schedule better than I did. He was also curious as to where I was off to now and what I was up to. The question is, how does one explain to a layperson what performance tuning is all about? But then, I quickly realized that I was talking to a master of optimization. Here was a person that understood the quickest way to get from point A to point B. He also knew how to adjust this path based on expected traffic. So, I came up with an explanation I was able to base on his ability to optimize. The world is full of opportunities for learning. And now let's see what we can learn from this month's roundup.
One poster wondered why GC kicked in repeatedly once the heap reached 32M when he had specified -Xms32m -Xmx64m. Was the mx parameter being ignored? The sole answer suggested that the JVM was tuned to believe that GC is cheaper than growing the heap. So once 32M is reached, it will always try reclaiming space before asking the OS for more memory. In this particular application, the reclaims were always successful, allowing the JVM to stay at 32M. Neither the JVM vendor nor version was mentioned.
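One way to watch this behavior from inside the application is to log the heap figures the JVM reports about itself. The following is a minimal sketch, not something from the original thread; the class name HeapReport is made up for illustration, and it uses only the standard java.lang.Runtime calls.

```java
// Hypothetical helper: run with, for example,  java -Xms32m -Xmx64m HeapReport
final class HeapReport {

    /** Formats the heap figures the JVM currently reports, in megabytes. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        long committed = rt.totalMemory();       // heap the JVM currently holds
        long max = rt.maxMemory();               // ceiling the JVM will try to use
        long used = committed - rt.freeMemory(); // part of the committed heap in use
        return String.format("used=%dM committed=%dM max=%dM",
                used / mb, committed / mb, max / mb);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the committed figure stays near the -Xms value while the application keeps running, the collector is reclaiming enough space that the heap never needs to grow, which matches what the poster saw.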
Another poster asked how to configure the JVM for production. The answer: use -server, and set the -Xms and -Xmx heap parameters. There was no answer to the followup question of how to choose the heap parameters. My answer is to check http://www.JavaPerformanceTuning.com/tips/ (or the 2nd edition of Jack's book which covers the heap tuning methodology).
The "Java vs C++ speed" troll post came up yet again. This time, not much heat was generated, as all the answers sensibly said Java was faster for some things and C++ faster for others. One poster suggested Java wasn't fast enough to write games or real-time systems, which will come as a surprise to the gamers over at JavaGaming.org, and lots of embedded systems writers. (They should actually be pleased; it's always nice to be told you are succeeding at achieving the impossible.) The best answer: 'The real question should be "Is Java fast enough for the job at hand".'
A fascinating discussion cropped up about replacing multiple threads with NIO Select for a multiplayer networked game server. This is a live game, with many (over 100) players. The programmer found that despite the OS having free resources, the JVM could not exceed about 1,000 threads on his Linux server (different JVMs had different limits). He had correctly reset ulimit to allow unlimited threads for the user, and had recompiled the kernel to allow 4096 open files per process (up from the default of 1024). None of this seemed to help. The other posters suggested switching to NIO, which he did, and then the thread limitation was no longer an issue, as the NIO based server used only a few threads.
However, a different issue now cropped up. The NIO Select call was taking 100% CPU, although the server seemed able to handle as many users as required. Instead of making a proper blocking select call, it seemed to be polling continuously. Although the discussion never solved this problem, the code was posted. Reviewing the code, I could see that whenever a new socket was accepted, it was registered with the selector in both READ and WRITE modes. However, a new or unwritten socket is always ready to be written to, so naturally the selector returned immediately each time it was called because there were ready sockets to service. The problem was a subtle bug in the code, difficult to understand if you haven't played with NIO selectors before. Because the select call always returned immediately, the I/O service thread spun in its loop whenever there was spare time in the system, hence the 100% CPU utilization. But the only load was that one erroneously looping thread, so the CPU could handle all the other game engine threads without a noticeable decrease in performance. The solution is to avoid registering the socket for WRITE mode except when it actually needs to write.
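The effect is easy to reproduce on a loopback connection. The sketch below is my own illustration, not the poster's code; the class and method names are hypothetical, and it uses only standard java.nio.channels calls. It registers an idle accepted socket with a given interest set and reports how many keys a non-blocking select finds ready. It also shows one way to switch WRITE interest on only while output is pending.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class, not taken from the discussion.
final class WriteInterestDemo {

    /** Registers an idle accepted connection with interestOps and returns the ready-key count. */
    static int readyKeysForIdleConnection(int interestOps) {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                accepted.register(selector, interestOps);
                // Nobody has sent anything, so only writability can make the key ready.
                return selector.selectNow();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Turns WRITE interest on while output is queued, and off again once it is flushed. */
    static void setWriteInterest(SelectionKey key, boolean hasPendingOutput) {
        int ops = key.interestOps();
        key.interestOps(hasPendingOutput
                ? ops | SelectionKey.OP_WRITE
                : ops & ~SelectionKey.OP_WRITE);
    }

    public static void main(String[] args) {
        System.out.println("READ only:      "
                + readyKeysForIdleConnection(SelectionKey.OP_READ));
        System.out.println("READ and WRITE: "
                + readyKeysForIdleConnection(SelectionKey.OP_READ | SelectionKey.OP_WRITE));
    }
}
```

With READ interest alone the idle connection should leave the selector with nothing to report, so a blocking select() would genuinely block. With WRITE interest added, the same idle connection is reported as ready straight away, which is the busy loop described above. A server would call something like setWriteInterest(key, true) when it queues data for a client and setWriteInterest(key, false) once the queue drains.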
Another poster found that using a custom ColorModel slowed painting down enormously on one platform (MacOS X). The problem turned out to be that ColorModels are optimized on each platform, and you should use the default ColorModel, from GraphicsConfiguration.getColorModel() or Toolkit.getColorModel(), to gain the optimal performance (or actually to optimize your chances that you'll get an optimal ColorModel for optimal performance).
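As a rough illustration of that advice, the sketch below builds an off-screen image around whatever ColorModel the default screen configuration reports. The class name PlatformColorModel is made up, and the headless fallback to ColorModel.getRGBdefault() is my own assumption so that the code also runs on a machine without a display; it was not part of the discussion.

```java
import java.awt.GraphicsConfiguration;
import java.awt.GraphicsEnvironment;
import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;

// Hypothetical helper class for illustration.
final class PlatformColorModel {

    /** Returns the screen's own ColorModel, or the default RGB model when there is no display. */
    static ColorModel preferred() {
        if (GraphicsEnvironment.isHeadless()) {
            return ColorModel.getRGBdefault(); // assumption: fallback for headless machines
        }
        GraphicsConfiguration gc = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getDefaultScreenDevice()
                .getDefaultConfiguration();
        return gc.getColorModel();
    }

    /** Creates an image whose pixel layout matches the preferred ColorModel. */
    static BufferedImage newImage(int width, int height) {
        ColorModel cm = preferred();
        return new BufferedImage(cm,
                cm.createCompatibleWritableRaster(width, height),
                cm.isAlphaPremultiplied(),
                null); // no extra image properties
    }
}
```

On a machine with a display, GraphicsConfiguration.createCompatibleImage(width, height) is a shorter route to the same kind of image.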
Finally, there was an inconclusive discussion on whether boolean comparison or integer comparison to 0 is faster. My guess, and that of the respondents, was that it depends entirely on the JVM/OS/compiler combination and the optimizations applied by them.
One poster was seeing strange behavior from his clustered Weblogic BMP entity beans. It looked like synchronization (of data between servers) wasn't working. The beans were strongly optimized, with database stores and accesses only happening when the bean had changed. But Weblogic uses ejbLoad() to synchronize by re-loading beans. So it seems like the optimization may interact with the synchronization to cause problems.
Another poster asked about running batch EJB jobs on millions of records. The only answer pointed out that running a batch job simplistically, as if each record manipulation was like one user call, would be like simulating millions of user requests, causing millions of very short transactions and would likely bring the system (especially database) to its knees. This poster suggested throttling the requests, combining transactions and controlling when commits occur.
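The idea of combining transactions can be sketched independently of any EJB container. In the example below, the class ChunkedBatch and its callback are hypothetical. The callback stands for whatever processes one chunk of records and commits once at the end of it, and a throttling pause could be added between chunks.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: one unit of work (and one commit) per chunk, not per record.
final class ChunkedBatch {

    /**
     * Hands the records to processAndCommit in chunks of at most chunkSize.
     * Returns how many chunks, and therefore commits, were needed.
     */
    static <T> int run(List<T> records, int chunkSize, Consumer<List<T>> processAndCommit) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int commits = 0;
        for (int from = 0; from < records.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, records.size());
            // subList is a view, so no records are copied here.
            processAndCommit.accept(records.subList(from, to));
            commits++;
        }
        return commits;
    }
}
```

With a chunk size of 1,000, a million records would need 1,000 commits where the simplistic one-transaction-per-record approach needs a million.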
Last week I had a conversation with a sales guy. He commented that the real technical people would always search for what they are looking for. But the most successful businesses were those that also catered to the hobbyist. The interesting point is that he was talking about a ballet store. It's funny how some observations transcend specialities.