"Avoiding waiting by: eliminating synchronization (eg partitioning the data per core and processing just on the core); use wait-free algorithms; keep the code in user-space (avoid kernel calls or bypass it if possible); avoid context-switching (use dedicated thread-to-core); use async/non-blocking IO; use busy polling; make shared data structures read-only; use single-producer+single-consumer queues to transfer between cores; use TCP_NODELAY; don't process requests as they come in from the network, take them off the queue and process separately so that longer latency requests don't block the queue"