"Being able to scale is challenging. Being able to autoscale requires understanding which metrics determine when you need more (or fewer) resources allocated. Architecting a queueable application means designing it differently from the ground up."
"There are three options for handling as much traffic as possible, and they can be used in combination: scale (adding sufficient additional capacity to handle the traffic increase); overprovision (already having sufficient additional capacity to handle any traffic increase); queue (temporarily holding requests somewhere and processing them when resources are available)."
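
To make the third option concrete, here is a minimal sketch of queueing in Python, assuming an in-process bounded queue and a fixed pool of worker threads. The names `process_with_queue`, `handler`, `workers`, and `max_pending` are hypothetical and chosen only for illustration; a real deployment would more likely use an external message broker than an in-memory queue.

```python
import queue
import threading


def process_with_queue(requests, handler, workers=2, max_pending=100):
    """Hold requests in a bounded queue and drain them with a fixed worker pool.

    Illustrative only: `handler` stands in for whatever processes a request.
    Returns (results, errors); result order is not guaranteed.
    """
    pending = queue.Queue(maxsize=max_pending)  # bounded: put() blocks when full
    results, errors = [], []
    lock = threading.Lock()

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work for this worker
                break
            try:
                out = handler(item)
                with lock:
                    results.append(out)
            except Exception as exc:  # record the failure and keep draining
                with lock:
                    errors.append((item, exc))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()

    # Producer side: requests wait in the queue until a worker is free.
    for req in requests:
        pending.put(req)

    # One sentinel per worker so every thread exits cleanly.
    for _ in threads:
        pending.put(None)
    for t in threads:
        t.join()

    return results, errors
```

For example, `process_with_queue(range(10), lambda x: x * 2)` processes ten requests with two workers. Because `max_pending` bounds the queue, a burst of traffic makes the producer wait rather than overwhelming the workers, which is the trade the queue option makes: requests take longer, but capacity stays fixed.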