Java Performance Tuning
Tips November 2015
Back to newsletter 180 contents
Messaging for IoT (Page last updated November 2015, Added 2015-11-30, Author Martyn Taylor, Publisher Virtual:JBUG). Tips:
- The challenges of IoT are interoperability, availability, scalability and security.
- The choice of protocol for IoT depends on how the devices need to communicate and in what pattern. Device processing capability, interoperability, message sizes, communication encryption, reliability of message delivery (at most once, at least once, exactly once), and communication flow control (importantly, whether you need to guarantee an open control channel, which means you can't allow congested channels) all need to factor into the protocol decision.
- Because IoT devices are usually highly constrained, you usually want to offload any severe processing loads to the server.
- Reactive (multiplexed) server implementations can achieve higher vertical scale: i.e. resource usage on a single machine is lower for a given number of connections; alternatively, the machine can handle more connections before reaching limits.
- A pluggable architecture allows the server to use the minimum resources required for specific deployments.
- Append-only journals are efficient for persistence.
- Splitting the routing of messages from the message handling allows for better high availability - the routers can be small, reliable and widely dispersed as they are so focused and don't need to process messages, only routes. A small amount of message processing in order to allow filtering (ie dropping messages that aren't wanted by any server) is worthwhile to include as it improves the overall efficiency of the system.
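As a rough illustration of the append-only journal tip above, here is a minimal Java sketch. The class name AppendOnlyJournal, the length-prefixed record layout, and the choice to force every write to disk are all assumptions made for this example; they are not taken from the talk or from any particular messaging server.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical append-only journal: length-prefixed records, always written at the end of the file. */
final class AppendOnlyJournal implements AutoCloseable {
    private final Path path;
    private final FileChannel channel;

    AppendOnlyJournal(Path path) {
        this.path = path;
        try {
            // APPEND positions every write at the end of the file; nothing is updated in place.
            channel = FileChannel.open(path, StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends one record: a 4-byte length followed by the UTF-8 payload. */
    void append(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer frame = ByteBuffer.allocate(Integer.BYTES + payload.length);
        frame.putInt(payload.length).put(payload).flip();
        try {
            while (frame.hasRemaining()) {
                channel.write(frame);
            }
            channel.force(false); // assumption: durability per record; batching would be faster
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Replays all complete records in write order, e.g. for recovery after a restart. */
    List<String> replay() {
        List<String> messages = new ArrayList<>();
        try (FileChannel in = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer header = ByteBuffer.allocate(Integer.BYTES);
            while (readFully(in, header)) {
                header.flip();
                ByteBuffer payload = ByteBuffer.allocate(header.getInt());
                header.clear();
                if (!readFully(in, payload)) {
                    break; // truncated final record, e.g. a crash in the middle of a write
                }
                messages.add(new String(payload.array(), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return messages;
    }

    /** Fills the buffer completely; returns false if end of file is reached first. */
    private static boolean readFully(FileChannel in, ByteBuffer buffer) throws IOException {
        while (buffer.hasRemaining()) {
            if (in.read(buffer) < 0) {
                return false;
            }
        }
        return true;
    }

    @Override
    public void close() {
        try {
            channel.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because writes are strictly sequential, the disk never has to seek to update an earlier record, which is where the efficiency comes from. A real journal would typically also batch the force calls, add checksums, and compact or roll old files.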
102 performance engineering questions every software development team should ask (Page last updated August 2015, Added 2015-11-30, Author Todd DeCapua, Publisher TechBeacon). Tips:
- Size the servers: how many for each tier serving how many users and/or transactions.
- Optimize the servers: tune hardware configurations and system resources.
- Plan capacity: What is the headroom, how can you grow it, can you reduce the server count?
- Does the ISP provide the SLA required?
- Are vulnerabilities assessed and fixed; how do you handle DoS; do you have intrusion detection; how quickly can you recover from failure?
- Assess user interaction profiles and ensure that the communication patterns and needs are adequately handled including for failover modes - this should include optimizing load balancer configurations and system timeouts.
- Focus on shared resources (both internally and externally shared) - these can be severe constraints; consider failover scenarios where resources are too constrained or denied.
- Does your system handle the required throughput at all stages from client through to backends, including any DMZs?
- Some important resources to monitor: connections (& connections pools), file descriptors, processes, threads (and thread release), listen queues, page (transaction/request) handling rates, caches, queues, sessions, CPU, memory, IO, context switches, paging.
- Clustering: how is session management handled especially for load balanced services, what is the capacity?
- Have different datasets been tested especially large datasets?
- What are upper limits (eg firewall connection limit)?
- Are filters applied? If so, they should be applied as early as possible (to reduce downstream loads).
- Alert configuration is an important component in effectively managing a system.
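A few of the resources listed above (threads, memory, CPU) can be sampled from inside a JVM with the standard java.lang.management beans. The sketch below is only a starting point: the class name ResourceSnapshot and the metric key names are invented for this example, and resources such as connection pools, listen queues or file descriptors need container-specific or OS-specific tooling that is not shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that samples a handful of JVM-level resource metrics. */
final class ResourceSnapshot {
    private ResourceSnapshot() {}

    static Map<String, Number> sample() {
        Map<String, Number> values = new LinkedHashMap<>();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();

        values.put("threads.live", ManagementFactory.getThreadMXBean().getThreadCount());
        values.put("threads.peak", ManagementFactory.getThreadMXBean().getPeakThreadCount());
        values.put("heap.usedBytes", heap.getUsed());
        values.put("heap.committedBytes", heap.getCommitted());
        values.put("cpu.availableProcessors", Runtime.getRuntime().availableProcessors());
        // The load average is negative on platforms that cannot provide it.
        values.put("cpu.systemLoadAverage",
                ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage());
        return values;
    }
}
```

Calling ResourceSnapshot.sample() on a schedule and shipping the values to whatever monitoring system is in use gives a history that alert thresholds can be set against.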
You're probably wrong about caching (Page last updated September 2015, Added 2015-11-30, Author Mike Solomon, Publisher msol). Tips:
- Caching has direct benefits - improved latency & reduced load on the external datasource; and direct costs - more memory. It also has indirect costs - handling cache synchronisation.
- Every cache update and access requires a decision about the difference between the cache data and the persistent store.
- Cached data needs the same security and access control as the persisted data it caches - this is often missed.
- Cache expiry can introduce race conditions.
- When cache misses increase, latency increases, throughput can drop, and the load on the persistent store increases, possibly to the point of overload.
- Cached objects are long-lived so get promoted to the old generation, which can interfere with GC times, especially if many items expire while in the old generation.
- Loading caches can take time during which full service may not be available.
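The expiry race and the cost of misses mentioned above can be made concrete with a small sketch. Everything here is an assumption for illustration: the class name ExpiringCache, the fixed time-to-live policy, and the injected clock (which only exists to make the behaviour testable). It uses ConcurrentHashMap.compute so that the "is it expired?" check and the reload happen atomically per key, which avoids two threads both seeing an expired entry and both hitting the persistent store.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical read-through cache with a fixed time-to-live per entry. */
final class ExpiringCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // reads from the persistent store
    private final long ttlMillis;
    private final LongSupplier clock;      // e.g. System::currentTimeMillis
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    ExpiringCache(Function<K, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    V get(K key) {
        // compute() runs atomically per key, so only one caller reloads an expired entry.
        // The loader runs while the key is locked: it should be quick and must not call back into this cache.
        Entry<V> entry = entries.compute(key, (k, current) -> {
            long now = clock.getAsLong();
            if (current != null && now < current.expiresAtMillis) {
                hits.incrementAndGet();
                return current;
            }
            misses.incrementAndGet();
            return new Entry<>(loader.apply(k), now + ttlMillis);
        });
        return entry.value;
    }

    /** Fraction of lookups served from the cache; 0.0 before any lookup. */
    double hitRatio() {
        long h = hits.get();
        long total = h + misses.get();
        return total == 0 ? 0.0 : (double) h / total;
    }
}
```

Tracking the hit ratio matters because a falling ratio is the early warning for the overload scenario described above. This sketch does not address access control on cached values, synchronisation with writes to the store, or memory limits, all of which a production cache would need.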
The Core Activities of Performance Testing (Page last updated November 2015, Added 2015-11-30, Author Sabrina Nisha, Publisher DZone). Tips:
- Identify the complete configuration of the test and production environments: machine configurations, network architecture, DNS, installed software and licenses, storage, logging, load balancing, monitoring, request types, background processes, external dependencies.
- Define what is acceptable performance before starting testing so that you know what the gap is.
- Plan performance tests so that they are representative of the real world, otherwise you'll be testing the wrong things. User simulation, request simulation, data volumes and types should all be realistic.
- Determine the maximum load you can generate before reaching a bottleneck.
- Validate that tests produce realistic results.
9.5 Low Latency Decision as a Service Design Patterns (Page last updated October 2015, Added 2015-11-30, Author Itai Frenkel, Publisher Forter). Tips:
- Measure the latency of all requests; know what percentile buckets latencies fall into (split into 90%, 99%, 99.9%, etc).
- If you have a timeout on the API, you need to specify what the client should do if that is encountered; if retries are allowed, your system needs to handle a retry idempotently; a retry should not just overwrite the initial attempt results, as you need to know how far that succeeded internally and possibly also that there was more than one attempt.
- Gracefully degrade API failures by having failures delegated to return values from the upstream system.
- Handle overload by prioritising traffic rather than rejecting some traffic.
- Time your request from the start, and allow intermediate subrequest timeouts to have an upper and lower bound - then you can use the "request time so far" to determine actual timeouts to use between the upper and lower bounds.
- High availability techniques include: fail-fast; termination of a process when it is in an undefined state; auto-restart; hardware failover; load balancing with awareness of healthy instances; standby services in a different datacentre.
- For low latency (hundreds of milliseconds and below) timestamp the request as it proceeds through each subsystem and plot that.
- CPU graphs are too large grained to provide detail on low latency requests.
- Enemies of consistent latency: cloud host flakiness; logging (IO stalls); eventing (full queues); regexes; the TCP Nagle algorithm combined with TCP delayed ACKs (turn off Nagle by setting TCP_NODELAY to true); GCs; LRU caches; memory leaks; DNS name resolution.
- Try to keep indexes in-memory - this may need configuration changes.
- Data stores have a large number of ways to make them perform better to achieve low latency responses; you need to master these, including choosing the right data store for each use-case.
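Two of the tips above lend themselves to a short worked example: bucketing latencies by percentile, and deriving a subrequest timeout from the request time so far. The sketch below is one possible reading of those tips, not the article's implementation. The class name LatencyBudget, the nearest-rank percentile definition, and the clamping rule (remaining budget limited to the range between the lower and upper bound) are all assumptions.

```java
import java.util.Arrays;

/** Hypothetical helpers for latency percentiles and per-subrequest timeouts. */
final class LatencyBudget {
    private LatencyBudget() {}

    /** Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it. */
    static long percentile(long[] latenciesMillis, double pct) {
        if (latenciesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (pct <= 0.0 || pct > 100.0) {
            throw new IllegalArgumentException("pct must be in (0, 100]");
        }
        long[] sorted = latenciesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /**
     * Timeout for the next subrequest: the time left in the overall budget,
     * clamped so it is never below lowerBoundMillis or above upperBoundMillis.
     */
    static long subrequestTimeoutMillis(long totalBudgetMillis, long elapsedMillis,
                                        long lowerBoundMillis, long upperBoundMillis) {
        long remaining = totalBudgetMillis - elapsedMillis;
        return Math.max(lowerBoundMillis, Math.min(upperBoundMillis, remaining));
    }
}
```

With a 500 ms overall budget and subrequest bounds of 50 ms and 200 ms, a subrequest started 100 ms into the request gets the full 200 ms, one started at 400 ms gets the remaining 100 ms, and one started at 480 ms still gets the 50 ms minimum. Whether the lower bound should be allowed to push a request past its overall budget is a design choice that depends on the API contract.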
Building Globally Distributed, Mission Critical Applications: Lessons From the Trenches Part 1 (Page last updated August 2015, Added 2015-11-30, Author Kris Beevers, Publisher highscalability). Tips:
- Don't optimize your code, optimize your architecture. Modern systems don't depend on fast code, they depend on managing the communication between interacting systems, and the horizontal scalability of each component.
- Scale horizontally before you worry about efficient code. Servers are fast and cheap, and they're always getting faster and cheaper - developer time isn't. Worry about code optimization only when you find places it really matters.
- If you architect your system correctly with separate intercommunicating components that scale horizontally, then the same system works from one server to hundreds - it should be mainly a different configuration, not a different architecture.
- Multi-datacenter application delivery is HARD. Communications outside the LAN are much less reliable. Plan for server failures and communication failures.
- Design your architecture with the assumption that your datacenters will lose connectivity with each other frequently. Ensure rapid re-convergence when communication is restored.
- Latency sensitive critical messages need robust message queueing systems. Non-critical telemetry can be lighter weight but less robust to network outages. Modern datastores with robust WAN replication are worth considering.
- Consistent hashing is a killer tool for distribution. It's easy to implement, and it scales gracefully with your infrastructure, minimizing re-striping as you add or remove nodes in a subsystem.
- Measure and monitor everything - what you don't measure, you can't understand.
- Instrument your application to understand database response times, messaging delays, cache hit ratios, memory fragmentation, disk I/O, etc.
- Ensure you have baselines for all your measurements. When something exceptional happens compare against the baselines to identify what happened.
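The consistent hashing tip above can be sketched in a few dozen lines of Java. This is a generic textbook-style ring, not the author's system: the class name ConsistentHashRing, the use of virtual nodes, and the choice of MD5 (used here only to spread keys, not for security) are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical consistent hash ring with virtual nodes. */
final class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    /** Places the node at several points on the ring to even out the key distribution. */
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(node + "#" + i), node);
        }
    }

    /** Returns the owner of the first ring point at or after the key's hash, wrapping around. */
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes in ring");
        }
        Map.Entry<Long, String> owner = ring.ceilingEntry(hash(key));
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }

    /** First 8 bytes of the MD5 digest, interpreted as a long. */
    private static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The property that makes this useful for distribution is that adding a node only moves keys onto the new node, and removing it moves only that node's keys back; keys never shuffle between the nodes that stayed. That is the "minimizing re-striping" behaviour the tip refers to.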
Last Updated: 2020-12-28
Copyright © 2000-2020 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
RSS Feed: http://www.JavaPerformanceTuning.com/newsletters.rss