Back to newsletter 042 contents
Every now and then you get the fortunate opportunity to meet someone who instantly has a profound effect on your thinking or views on a subject. This week, such a thing happened to me as I engaged in conversation with an interpreter. Though it may seem difficult to understand for most monolingual people (including yours truly), being multilingual doesn't mean that one can function as a translator. An interpreter requires a number of different skill sets from those required by a translator.
If we follow the process required to interpret, we can begin to understand why this is so. The first thing to note is that an interpreter may interpret from many languages but will only translate into a single language. For example, the person that I met translates German and English into French. If a French person is speaking, then he stops. There are some obvious service-related practicalities to this strategy (for example, a French-speaking person can listen to a single interpreter whose output is always French), and there are some performance advantages as well.
The process of interpretation requires the person to listen, comprehend and then speak in the target language. All three activities must happen at the same time, as the interpreter has no control over the rate at which speech arrives and cannot get too far behind without risking loss of information. In this job, the room for error is almost non-existent. For all of this activity to take place, the interpreter must divide his focus between these activities. If anything were to cause his focus to shift, too much attention would be paid to the offending activity and the other two would suffer. In this system, proper performance is obtained by maintaining the correct balance in the allocation of the scarce resource: in this case, the brain.
In this interpreting example, the resource being stressed is easily identified. This is not always the case when dealing with an application. When dealing with an application, we must first identify the resource that is in scarce supply. To make that identification, we need to take measurements. The next step is to understand the capacities of the system resources that we are measuring. This will tell us conclusively which resource is being stressed. It will also help us to get to the next step in the process, to "learn" how to balance the utilization of the scarce resource (if that is at all possible). This "learning" process can sometimes happen very quickly. Sometimes it takes quite a while before a balance can be reached. In either case, it will be faster than the typical five years that it takes an interpreter to find the sweet spot.
Now let's review what is currently being discussed in the discussion groups.
During the effort to tune blog-city, we found that we needed a caching strategy. Our first instinct was to use an open-source product, and we followed through on it. Inserting that cache was a disaster on a number of counts, the most devastating of which was a memory leak that it introduced. After spending some time on the problem, we decided to remove the product and build our own cache. Had we looked at the gaming performance discussion groups, we would have found a perfect post to help guide the implementation. The post itself is simple: "Does anyone know how to build a least recently used (LRU) cache?" The response advised looking in the javadocs for java.util.LinkedHashMap. Included in that documentation is a set of instructions on how to turn it into an LRU cache. Nice to know the next time you need a simple cache.
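As a rough illustration of the approach described in the LinkedHashMap javadocs, an LRU cache can be sketched by switching the map to access order and overriding removeEldestEntry. The class name LruCache and the maxEntries parameter are made up for this example, and the generics assume a Java 5 or later compiler; this is not the cache that was built for blog-city.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache sketch built on LinkedHashMap's access-order mode. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    // Hypothetical capacity limit; chosen by the caller.
    private final int maxEntries;

    LruCache(int maxEntries) {
        // Initial capacity 16, load factor 0.75, accessOrder = true:
        // iteration order runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    /** Called by put(); returning true evicts the least recently used entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<String, Integer>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", 3);   // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

Like LinkedHashMap itself, this sketch is not thread safe. In a server it would need external synchronization, for example by wrapping it with Collections.synchronizedMap.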
From the serverside, we have an interesting discussion on soft references and weak references. Objects can be held strongly, softly, weakly or with a phantom reference. In the normal case, objects are bound in memory by a reachable strong reference. As long as an object is held by a strong reference and the chain of references is reachable from a root object, it is not eligible for garbage collection. But, just as the world is much more interesting when painted in color, so is an environment that supports more than one type of reference. Enter soft and weak references to help make our development life more interesting. Like strong references, soft and weak references let an object be reached from some root. Unlike strong references, they do not protect it: an object held only by soft or weak references is eligible for garbage collection.
The difference between soft and weak references is in how they are treated during garbage collection. While soft references try to keep the referenced object in memory for as long as possible, this is not the case with weak references. In other words, an object held only by weak references is collected the next time the garbage collector finds it. An object held only by a soft reference should survive until memory runs short; the VM is guaranteed to clear all such soft references before it throws an OutOfMemoryError. This is how things are supposed to work. Unfortunately, as was discovered during the course of a discussion on the subject, weak references do not function to specification in the Windows version of J2SE 1.4.2_02. The end result of the thread was that a bug was filed.
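The distinction can be seen in a small experiment. The sketch below is only an illustration: the class name ReferenceDemo and the describe helper are invented here, the generics assume Java 5 or later, and System.gc() is merely a hint, so the state printed after the collection request can vary between VMs and runs.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceDemo {
    /** Reports whether each reference still holds its referent. */
    static String describe(SoftReference<byte[]> soft, WeakReference<byte[]> weak) {
        return "soft=" + (soft.get() != null ? "alive" : "cleared")
             + ", weak=" + (weak.get() != null ? "alive" : "cleared");
    }

    public static void main(String[] args) {
        byte[] softTarget = new byte[1024];
        byte[] weakTarget = new byte[1024];
        SoftReference<byte[]> soft = new SoftReference<byte[]>(softTarget);
        WeakReference<byte[]> weak = new WeakReference<byte[]>(weakTarget);

        // Both targets are still strongly referenced by the local variables.
        System.out.println("before: " + describe(soft, weak));

        // Drop the strong references; only the soft and weak ones remain.
        softTarget = null;
        weakTarget = null;

        // Request a collection. The VM is free to ignore this hint.
        System.gc();
        System.out.println("after:  " + describe(soft, weak));
    }
}
```

On a VM that follows the specification and has plenty of free heap, the second line would typically report the weak reference as cleared and the soft reference as still alive, but neither outcome is guaranteed by a single System.gc() call.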
Our visit from the Java Ranch produced a request for help to diagnose an excessive CPU utilization problem. The request came with a quick overview of the system. There are several reasons for an application to over-utilize a CPU. First, one has to understand that there are a number of different consumers of CPU. Aside from the many applications that may be running on the machine, the operating system (OS) itself is a consumer of CPU cycles. The best way to determine whether the problem lies in the operating system or in one of the applications that may be running is to monitor CPU utilization and see how much of it is consumed by the OS and how much by each application.
OS utilization of the CPU is typically very low. If it is taking a significant proportion of the CPU, then one needs to determine whether one (or a combination) of the applications is causing stress to the OS. The OS protects globally available resources by forcing applications to access them through OS services, and every such request causes the OS to do work on the application's behalf. These services include access to the file systems and scheduling.
The other, and most obvious, cause of an over-utilized CPU is the application itself. The best strategy in this instance is to profile the application to determine where the bottleneck is. The discussion mentioned a commercial product, OptimizeIt. A free alternative is hprof, the profiler that ships with the VM itself; simply have it dump its output and examine that. Once the bottleneck has been identified, it is much easier to diagnose the problem and then eliminate it.
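From inside a Java application, one rough way to see how much CPU time is spent in the application's own code versus in OS services is to compare user time with total CPU time for a thread. The sketch below assumes a Java 5 or later VM, since it relies on java.lang.management.ThreadMXBean, which is not available in 1.4. The class name CpuSplit and the measure helper are invented for illustration, and the timers are coarse on some platforms, so small differences should not be trusted.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class CpuSplit {
    /**
     * Runs the task on the current thread and returns
     * { userNanos, systemNanos }, or null if the VM cannot
     * measure CPU time for the current thread.
     */
    static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isCurrentThreadCpuTimeSupported()
                || !threads.isThreadCpuTimeEnabled()) {
            return null;
        }
        long totalBefore = threads.getCurrentThreadCpuTime(); // user + system
        long userBefore = threads.getCurrentThreadUserTime(); // user only
        task.run();
        long total = threads.getCurrentThreadCpuTime() - totalBefore;
        long user = threads.getCurrentThreadUserTime() - userBefore;
        // The two timers may have different resolutions, so clamp at zero.
        long system = Math.max(0L, total - user);
        return new long[] { user, system };
    }

    public static void main(String[] args) {
        long[] split = measure(new Runnable() {
            public void run() {
                // Pure computation: expected to be mostly user time.
                long sum = 0;
                for (int i = 0; i < 50000000; i++) {
                    sum += i;
                }
                System.out.println("sum = " + sum);
            }
        });
        if (split == null) {
            System.out.println("Thread CPU time is not available on this VM");
        } else {
            System.out.println("user ns = " + split[0] + ", system ns = " + split[1]);
        }
    }
}
```

A task dominated by file access or heavy thread contention would be expected to show a larger system share than the pure computation used here, which is the pattern to look for when the OS appears to be under stress.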