"You need monitoring, or you are blind to what is happening in your system. But monitoring only gets you the data; to get full observability, you need to put that data to use through analyses, thresholds, alerts, consoles, and profiles."
"For load distribution by a coordinator, choosing the least busy server optimizes performance. But finding the least busy server has a coordination cost, and the Universal Scalability Law limits how far that coordination scales. At low parallelism, coordination makes latency more predictable; at high parallelism, it degrades throughput. A compromise strategy is for the coordinator to choose two servers at random and then pick the less busy of the two."
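
The "choose two at random" compromise can be sketched in a few lines. This is a minimal illustration and not a production balancer: the names `pick_two_choices`, `pick_random`, and `simulate` are hypothetical, `loads` is assumed to be a per-server count of assigned requests, and requests never complete, so the simulation only shows how evenly each policy spreads work.

```python
import random


def pick_two_choices(loads, rng=random):
    """Sample two distinct servers and return the index of the less loaded one."""
    a, b = rng.sample(range(len(loads)), 2)
    return a if loads[a] <= loads[b] else b


def pick_random(loads, rng=random):
    """Baseline with no coordination: return any server index uniformly at random."""
    return rng.randrange(len(loads))


def simulate(pick, n_servers=100, n_requests=10_000, seed=0):
    """Assign requests one by one using `pick` and return the busiest server's load.

    Assumption: requests never finish, so loads only grow. This isolates the
    placement policy from service-time effects.
    """
    rng = random.Random(seed)
    loads = [0] * n_servers
    for _ in range(n_requests):
        loads[pick(loads, rng)] += 1
    return max(loads)


if __name__ == "__main__":
    print("random placement, max load:", simulate(pick_random))
    print("two random choices, max load:", simulate(pick_two_choices))
```

The coordinator inspects only two load values per request instead of scanning every server, so the coordination cost stays constant as the number of servers grows, while the busiest server typically ends up much closer to the average load than under purely random placement.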