"Testing gotchas: testing once is jittery and un-representative, interpreted code behaves very differently from compiled JIT optimized code, System.currentTimeMillis() can go backwards (eg from NTP), JVM code and branch elimination can give unrepresentative performance for unrepresentative data and workloads, testing on different platforms can give very different (unrepresentative) results, power saving modes can give unrepresentative results, having other processes running at the same time can give unrepresentative results."