|
|
|
One interesting discussion was about the fastest way to extract the tokens from a comma-delimited string, comparing three techniques: StringTokenizer, String.split() and a custom search-and-extraction method. A test showed the custom method to be fastest, with StringTokenizer 40% slower and String.split() 140% slower. String.split() is, of course, using the new 1.4 regular expression classes, though it doesn't do so in the most efficient way.
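As a rough illustration of the custom approach, here is a minimal sketch of an indexOf-based extractor (the class name, method name and reusable token array are my own assumptions, not the original poster's code):

    public class CommaTokens {
        // A hypothetical indexOf-based extractor along the lines discussed:
        // it scans for each comma directly, avoiding the object overhead of
        // StringTokenizer and the regex machinery behind String.split().
        // Fills the caller-supplied array and returns the token count.
        public static int extract(String csv, String[] tokens) {
            int count = 0;
            int start = 0;
            int comma;
            while ((comma = csv.indexOf(',', start)) != -1) {
                tokens[count++] = csv.substring(start, comma);
                start = comma + 1;
            }
            tokens[count++] = csv.substring(start); // token after the last comma
            return count;
        }

        public static void main(String[] args) {
            String[] tokens = new String[16];
            int n = extract("a,b,c", tokens);
            for (int i = 0; i < n; i++) {
                System.out.println(tokens[i]); // prints a, b, c on separate lines
            }
        }
    }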
An interesting discussion was generated on how best to first determine the number of rows in a database query when you are also going to subsequently process those rows. The original poster used the sequence rs.last(); int numOfRowsRetrieved = rs.getRow(); rs.beforeFirst(), moving the scrollable ResultSet to the last row, getting the count, then scrolling back to the beginning row. However, another poster pointed out that many implementations of ResultSet.last() simply iterate through ResultSet.next() until the last row is reached. In fact, that's probably okay if the number of rows is small, since essentially they will be cached for the subsequent processing. But for large result sets, this could be a memory hog. This poster recommended issuing a "SELECT count(*)" query first to get the size of the result set. The original poster suspected that the overhead of this query would be large [though I believe it would be processed in a fairly optimal way by most databases]. Another poster suggested using a CachedRowSet. The final poster suggested building the app first and then tuning it, and if the JDBC was a problem, trying each of the options available to find the quickest. Sounds just about right.
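For concreteness, here is a minimal sketch of the two approaches under discussion (the table name and query are invented for illustration):

    import java.sql.*;

    public class RowCountDemo {
        static void scrollApproach(Connection con) throws SQLException {
            // Approach 1: scroll to the end of a scrollable ResultSet, read
            // the row number, then scroll back. Beware: some drivers
            // implement last() by fetching every row into memory.
            Statement st = con.createStatement(
                    ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
            ResultSet rs = st.executeQuery("SELECT name FROM customers");
            rs.last();
            int numOfRowsRetrieved = rs.getRow();
            rs.beforeFirst();
            while (rs.next()) {
                // ... process each of the numOfRowsRetrieved rows ...
            }
            rs.close();
            st.close();
        }

        static int countFirstApproach(Connection con) throws SQLException {
            // Approach 2: ask the database for the count up front.
            Statement st = con.createStatement();
            ResultSet rs = st.executeQuery("SELECT count(*) FROM customers");
            rs.next();
            int count = rs.getInt(1);
            rs.close();
            st.close();
            return count;
        }
    }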
A fascinating discussion on whether OutputStream.write() blocks unearthed some excellent advice. Ultimately, it was pointed out that the behavior of OutputStream.write() is not sufficiently specified. And this probably wasn't an oversight, but actually a consequence of the multiple different specifications on different operating systems and different configurations. The upshot was that OutputStream.write() could block, though the blocking would be unlikely to last for a significant length of time except where a slow connection was flooded with data, such as a dialup line being written to with lots of data [mind you, that is typical for server writes to slow clients]. Basically, there is no way of guaranteeing non-blocking writes with OutputStream.write(); you need to use NIO writes for non-blocking behavior. And more worryingly, with OutputStream.write() you could get an exception thrown rather than having the call block, and you need to be able to handle this in high-volume writes. If you are handling this situation, bear in mind that a couple of the discussion participants reported that the exception thrown when the OS buffer was overrun was a NullPointerException, and not a type of IOException.
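For reference, a minimal sketch of a non-blocking NIO write (the host, port and payload are placeholders; a real server would drive this from a Selector loop rather than spinning):

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;

    public class NonBlockingWrite {
        public static void main(String[] args) throws IOException {
            SocketChannel channel = SocketChannel.open();
            channel.connect(new InetSocketAddress("example.com", 80));
            channel.configureBlocking(false); // writes now never block

            ByteBuffer buf = ByteBuffer.wrap("lots of data...".getBytes());
            while (buf.hasRemaining()) {
                // write() returns immediately; 0 means the OS buffer is
                // full, so a real application would register interest in
                // OP_WRITE with a Selector instead of spinning here.
                int written = channel.write(buf);
                if (written == 0) {
                    Thread.yield(); // placeholder for a Selector-based wait
                }
            }
            channel.close();
        }
    }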
One long rambling discussion started off about the efficiency of the Java collection classes. Generally, these classes can easily produce and discard lots of new objects. Consequently, in tight loops (as you often get in animations), the Java collections can be a liability. Type-specific collection classes were recommended (collection classes which directly hold primitive data types rather than having to wrap those data types). The discussion then went on to cover various data structures, and considered the pros and cons of various Java language features (and lack of features). Some of the interesting structures raised were skip lists, which provide a probabilistic alternative to balanced trees; skip-ahead lists, which basically have shortcuts to specific elements in order to make traversal faster by "skipping ahead" to elements; and parallel arrays, which can let you convert arrays of objects to arrays of primitives. It's clear from discussions like these that getting the data structure right goes a huge way to solving any particular problem, including performance problems.
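As a rough sketch of the parallel-array idea (the field names and sizes here are invented for illustration):

    // Instead of an array of point objects (one allocation per element),
    // parallel arrays hold each field in its own primitive array, avoiding
    // per-element wrapper objects and improving memory locality.
    public class ParallelArrays {
        public static void main(String[] args) {
            int n = 3;
            int[] xs = new int[n]; // was point[i].x
            int[] ys = new int[n]; // was point[i].y
            for (int i = 0; i < n; i++) {
                xs[i] = i;
                ys[i] = i * 2;
            }
            // Traversal touches contiguous primitive data, no unwrapping.
            long sum = 0;
            for (int i = 0; i < n; i++) {
                sum += xs[i] + ys[i];
            }
            System.out.println(sum); // prints 9
        }
    }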
One load tester found client response times growing as he increased the number of simulated clients, but the server was not showing a heavier load. One responder suggested that the clients were serializing their requests, possibly by reusing the same connection for all clients, which would account for the observations.
A really good discussion on PreparedStatements vs. Statements threw up several useful pointers. (Statements are parsed for each call; PreparedStatements are parsed once only.) Firstly, if you are not reusing statements, there is no benefit to PreparedStatements. Secondly, for best effect you need to be able to reuse PreparedStatements across connections, and only some drivers support this. Thirdly, PreparedStatements seem to manage arguments better, since quoting need not be considered: parameters are passed to the reused statement as is (except that some special consideration may be needed for null parameters).
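A minimal sketch of the reuse pattern (the table, column and values are invented for illustration):

    import java.sql.*;

    public class PreparedReuse {
        static void insertAll(Connection con, String[] names) throws SQLException {
            // Parsed once by the driver/database, then reused for every row.
            PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO customers (name) VALUES (?)");
            try {
                for (int i = 0; i < names.length; i++) {
                    // No quoting to worry about: the value is passed as is,
                    // even if it contains quote characters.
                    ps.setString(1, names[i]);
                    ps.executeUpdate();
                }
            } finally {
                ps.close();
            }
        }
    }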