Tips May 2020

https://www.youtube.com/watch?v=0cyy7PDnCOo
How to fail with Serverless (Page last updated May 2020, Added 2020-05-27, Author Jeremy Daly, Publisher Failover Conf). Tips:

Serverless apps are stateless and uncoordinated (so require buses/queues/pub-sub/state machines)
The serverless underlying infrastructure handles uptime, compute and scaling, but you need to handle app resilience (absorbing problems while continuing to provide acceptable service - not preventing failure but dealing with it gracefully)
Serverless typically runs under a single-concurrency model (each function instance processes one request at a time) so you need to consider that limitation
Serverless has no sticky-sessions nor guaranteed lifespan
Things to consider for serverless error handling in addition to normal app error handling: what if there is a network issue that prevents you writing logs; what if the function container crashes; what if the function never runs despite a triggering event happening? Serverless error types include: unhandled exceptions, function timeouts, out-of-memory errors, throttling errors
Use cloud capabilities, rather than building them into the app, to manage errors, retries, network failures, routing, failover and redundancy. Return errors to the invoking service rather than handling them, and rely on configurable cloud retries and dead letter queues
By reducing the scope of a serverless function to single-purpose, you can fail, scale and throttle each function independently
Retries are a vital part of every distributed system. Serverless (usually) guarantees "at least once" delivery. But that means the same event could be delivered more than once - make sure your serverless function is idempotent
Dead letter queues are important to monitor and handle/alert on so that events are not missed (your serverless function may never even have seen the event due to errors prior to receiving the event)
Serverless invocation types: synchronous (request/response; errors are returned to the client, which is responsible for retries); asynchronous (the client sends the event and disconnects; the event is queued for the serverless function, which is retried up to N times or for M seconds (configurable, and dependent on throttling), and the event is then sent to a dead letter queue or on-failure destination if not processed; you can route based on success and failure); stream-based (events are pushed in batches synchronously from eg Kinesis or DynamoDB, with retries N times or for M seconds; you need to handle batches that contain one bad event, preferably using a cloud capability eg BisectBatchOnFunctionError); poller-based (a poller pulls synchronously from a queue, probably in batches)
Use automatic retries with exponential backoffs
Use eventual consistency either with queues or appropriate datastores
The serverless circuit breaker pattern keeps an external "status check" state (in a cache or datastore) for the service calls being protected by the circuit breaker: mark the status as failed when the number of failed calls within a certain period reaches a threshold, then check occasionally until the service works again and set the status back to working
Buffer and throttle events being called to other services from the serverless function so that you don't overwhelm those systems
Serverless works best with asynchronous patterns to decouple components
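
The circuit breaker tip above can be sketched in Java roughly as follows. This is a minimal illustration rather than the speaker's implementation: StatusStore, InMemoryStatusStore and ExternalStateCircuitBreaker are hypothetical names, the in-memory map merely stands in for a real external cache or datastore, and the read-modify-write updates would need to be atomic in a real shared store. The current time is passed in explicitly so the behaviour is easy to test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for an external cache or datastore shared by all function instances. */
interface StatusStore {
    String get(String key);
    void put(String key, String value);
}

/** In-memory implementation for illustration only; real state must live outside the function. */
class InMemoryStatusStore implements StatusStore {
    private final Map<String, String> data = new ConcurrentHashMap<>();
    public String get(String key) { return data.get(key); }
    public void put(String key, String value) { data.put(key, value); }
}

class ExternalStateCircuitBreaker {
    private final StatusStore store;
    private final String service;           // name of the protected downstream service
    private final int failureThreshold;     // failures within the window that mark the service failed
    private final long windowMillis;        // length of the failure-counting window
    private final long probeIntervalMillis; // wait before probing a service marked as failed

    ExternalStateCircuitBreaker(StatusStore store, String service, int failureThreshold,
                                long windowMillis, long probeIntervalMillis) {
        this.store = store;
        this.service = service;
        this.failureThreshold = failureThreshold;
        this.windowMillis = windowMillis;
        this.probeIntervalMillis = probeIntervalMillis;
    }

    /** Runs the action unless the service is marked failed; returns the fallback otherwise. */
    <T> T call(Supplier<T> action, T fallback, long nowMillis) {
        long openedAt = readLong("openedAt", -1);
        boolean open = openedAt >= 0;
        if (open && nowMillis - openedAt < probeIntervalMillis) {
            return fallback; // still marked failed: skip the downstream call entirely
        }
        try {
            T result = action.get(); // a normal call, or an occasional probe when marked failed
            store.put(key("openedAt"), "-1"); // success: mark the service as working
            store.put(key("failures"), "0");
            return result;
        } catch (RuntimeException e) {
            recordFailure(nowMillis, open);
            return fallback;
        }
    }

    boolean isOpen() { return readLong("openedAt", -1) >= 0; }

    private void recordFailure(long nowMillis, boolean wasOpen) {
        long windowStart = readLong("windowStart", nowMillis);
        long failures = readLong("failures", 0);
        if (nowMillis - windowStart >= windowMillis) { // window expired: start counting afresh
            windowStart = nowMillis;
            failures = 0;
        }
        failures++;
        store.put(key("windowStart"), Long.toString(windowStart));
        store.put(key("failures"), Long.toString(failures));
        if (wasOpen || failures >= failureThreshold) {
            store.put(key("openedAt"), Long.toString(nowMillis)); // mark (or keep) as failed
        }
    }

    private String key(String field) { return service + ":" + field; }

    private long readLong(String field, long defaultValue) {
        String value = store.get(key(field));
        return value == null ? defaultValue : Long.parseLong(value);
    }
}
```

Because the status lives in the store rather than in the function instance, every instance that shares the store sees the same failed or working status, which is the point of keeping the state external in a stateless environment.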

https://www.capitalone.com/tech/cloud/serverless-streaming/
Scaling to Billions of Requests-The Serverless Way (Page last updated December 2019, Added 2020-05-27, Author Vijay Bantanur, Maharshi Jha, Publisher Capital One). Tips:

Auto scaling should be a basic design consideration of any modern architecture
Throttle requests to protect downstream systems
Fault tolerance is a critical requirement if you don't want to lose data when your backend system is down
Monitoring is crucial for both synchronous and asynchronous systems
Most distributed fast data systems guarantee "at-least-once" delivery. To filter out duplicate messages, use hashing or other methodologies to implement message deduplication with sub-millisecond latency
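
As a rough illustration of hash-based deduplication under "at-least-once" delivery, the Java sketch below hashes each message payload and records the hash, so a repeated delivery can be detected and skipped. It is not the article's implementation: MessageDeduplicator is a hypothetical name, SHA-256 is just one possible hash, and the in-memory set stands in for whatever shared low-latency store a real system would use (an in-memory set is lost whenever a function instance is recycled).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class MessageDeduplicator {
    // Stand-in for a shared low-latency store of hashes already processed (assumption)
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    /** Returns true the first time a payload is seen and false for any repeat delivery. */
    boolean firstDelivery(String payload) {
        return seen.add(sha256Hex(payload));
    }

    /** Hex-encoded SHA-256 digest of the UTF-8 bytes of the payload. */
    static String sha256Hex(String payload) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(payload.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

A consumer would call firstDelivery(payload) before doing any work and simply acknowledge the message when it returns false, which makes processing idempotent with respect to redelivered messages.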

https://www.simform.com/serverless-performance/
Serverless Performance Tuning with AWS Lambda (Page last updated April 2019, Added 2020-05-27, Author Jignesh Solanki, Publisher SIMFORM). Tips:

Simply choosing the smallest memory size that is sufficient to run your function isn't optimal. Smaller memory often increases the time taken to process (also, Lambda billing is in 100ms increments, so you may end up paying more if you take longer). You should test to find the optimal performance (and cost) memory size for Serverless functions
Without provisioned concurrency, Serverless Java has a slower startup than some other languages, though it has the best performance after startup
Serverless tips: Minimize the deployment package size of your function; Make sure you use the modularity of the AWS SDK for Java; Reduce the complexity of your dependencies; Lazily load variables so that your function stays warm for several minutes.
Serverless container reuse: Store and reference locally any externalized configuration or dependencies that your code retrieves after initial execution; limit the re-initialization of variables on every invocation and use static initialization, global/static variables and singletons instead; keep alive and reuse connections (HTTP, database, etc.) that were established during a previous invocation
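
The container reuse tip can be sketched in plain Java as below. To stay self-contained the sketch does not use the AWS SDK or the Lambda handler interface: ExpensiveClient, ReusingHandler, handleRequest and the "prod" configuration value are all hypothetical, and the counters exist only to show that initialization happens once per execution environment rather than once per invocation.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical expensive resource, for example an HTTP or database client. */
class ExpensiveClient {
    static final AtomicInteger CREATED = new AtomicInteger();

    ExpensiveClient() {
        CREATED.incrementAndGet(); // imagine connection setup happening here
    }

    String query(String input) {
        return "result:" + input;
    }
}

class ReusingHandler {
    static final AtomicInteger CONFIG_LOADS = new AtomicInteger();

    // Static field: created once when the class loads, then reused by every warm invocation
    private static final ExpensiveClient CLIENT = new ExpensiveClient();

    // Externalized configuration, fetched on first use and cached locally afterwards
    private static volatile String cachedConfig;

    private static String config() {
        String current = cachedConfig;
        if (current == null) {
            synchronized (ReusingHandler.class) {
                if (cachedConfig == null) {
                    cachedConfig = loadConfig();
                }
                current = cachedConfig;
            }
        }
        return current;
    }

    // Stand-in for a remote configuration lookup (assumption)
    private static String loadConfig() {
        CONFIG_LOADS.incrementAndGet();
        return "prod";
    }

    /** Per-invocation work only; no client construction or configuration fetch on the warm path. */
    String handleRequest(String input) {
        return CLIENT.query(input) + "@" + config();
    }
}
```

Calling handleRequest repeatedly in the same JVM constructs the client once and loads the configuration once; only a cold start pays those costs again.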

https://lumigo.io/blog/how-to-optimize-aws-lambda-performance/
How to optimize AWS Lambda performance (Page last updated January 2020, Added 2020-05-27, Author Efi Merdler-Kravitz, Publisher Lumigo). Tips:

Lambda requires careful design to get the best performance out of the computation capabilities it provides.
Lambda Function RAM allocation allocates a linearly proportional amount of CPU power - every 1,792 MB is the equivalent of one full vCPU (one vCPU-second of credits per second). If you have a single threaded app, you shouldn't select more than 1.8 GB RAM, as it cannot make use of the additional CPU
If you select less than 1.8 GB RAM for Lambda and have multi-threading code which is CPU bound, it won't help in reducing execution time because you will only have one vCPU at that memory
Putting the smallest RAM for a lambda function may reduce the memory cost but increase latency because of the proportional allocation of RAM and CPU. If your application is CPU-bound, increasing the memory makes sense as it will reduce the execution time drastically and save on cost per execution
If a Lambda function takes up the entire concurrency execution limit of the account, other functions may be impacted by throttling errors. So it is recommended to always configure "reserved concurrency", applying a bulkhead pattern.
Provisioned Concurrency for Lambda gives options to provision an execution environment in advance when creating a function. Lambda can also be auto-scaled based on CloudWatch metrics or scheduled for a particular time or day depending on requirements.
When invoking a Lambda function for the first time, it downloads the code from S3, downloads all the dependencies, creates a container and starts the application before it executes the code. This whole duration (except the execution of code) is the cold start time.
Lambda optimization: interpreted languages start quicker but compiled languages perform better after start; provisioned concurrency can start Java quicker; use the default network environment unless you need a VPC resource with a private IP, because ENIs take significant time to set up and add to the cold start time; remove all dependencies that are not required to run the function; use global/static variables and singleton objects, as these remain alive until the container goes down; define DB connections at the global level so that they can be reused for subsequent invocations; if your Lambda in a VPC is calling an AWS resource, avoid DNS resolution; put your dependency .jar files in a separate /lib directory rather than alongside the function code, as this speeds up the package unpacking process
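
The memory and CPU tips above lend themselves to a small worked calculation. The Java sketch below combines the 1,792 MB per vCPU figure with the 100ms billing increment noted earlier in this newsletter to estimate the cost of one invocation. LambdaCostEstimator is a hypothetical helper, and the price per GB-second is an assumed figure for illustration only; check current pricing and measure real durations before drawing conclusions.

```java
class LambdaCostEstimator {
    // Assumed price for illustration only; not taken from the articles above
    static final double ASSUMED_PRICE_PER_GB_SECOND = 0.0000166667;
    static final int FULL_VCPU_MB = 1792;       // memory size equivalent to one full vCPU
    static final long BILLING_INCREMENT_MS = 100; // billing granularity stated in the tips

    /** Fraction of a vCPU allocated at the given memory size (linear allocation). */
    static double vcpuShare(int memoryMb) {
        return (double) memoryMb / FULL_VCPU_MB;
    }

    /** Duration rounded up to the next billing increment. */
    static long billedMillis(long durationMs) {
        long increments = (durationMs + BILLING_INCREMENT_MS - 1) / BILLING_INCREMENT_MS;
        return increments * BILLING_INCREMENT_MS;
    }

    /** GB-seconds billed for one invocation. */
    static double billedGbSeconds(int memoryMb, long durationMs) {
        double gigabytes = memoryMb / 1024.0;
        double seconds = billedMillis(durationMs) / 1000.0;
        return gigabytes * seconds;
    }

    /** Estimated cost of one invocation at the assumed price. */
    static double costPerInvocation(int memoryMb, long durationMs) {
        return billedGbSeconds(memoryMb, durationMs) * ASSUMED_PRICE_PER_GB_SECOND;
    }
}
```

As a hypothetical example, a CPU-bound function measured at 1,400 ms with 128 MB and at 100 ms with 1,792 MB bills 0.175 GB-seconds in both cases, so under these made-up measurements the larger size costs the same per invocation while responding 14 times faster. Real functions rarely scale this cleanly, which is why testing several memory sizes is recommended.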

Jack Shirazi

Last Updated: 2025-10-27
Copyright © 2000-2025 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
URL: http://www.JavaPerformanceTuning.com/news/newtips234.shtml
RSS Feed: http://www.JavaPerformanceTuning.com/newsletters.rss