Patterns | DNS | CDN | Load Balancer | Auto Scaling Group | Instance | Container | Application | Datastore |
---|---|---|---|---|---|---|---|---|
Fail fast - Timeout | ||||||||
Circuit breaker | ||||||||
Retry with exponential backoff and Jitter | ||||||||
Feature Toggle | ||||||||
Bulkheads | ||||||||
Handshaking - Throttling | ||||||||
Decoupling middleware | ||||||||
Shed load | ||||||||
✅: checked - ❓: we don't know - ❌: to do - ✔️: not required
Source: Release It! (Second Edition)
The Timeouts pattern is useful when you need to protect your system from someone else’s failure. Fail Fast is useful when you need to report why you won’t be able to process some transaction. Fail Fast applies to incoming requests, whereas the Timeouts pattern applies primarily to outbound requests. They’re two sides of the same coin.
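A minimal sketch of the two sides of that coin, assuming Java 11+ and a hypothetical `inventory.example.com` dependency: the outbound call is bounded by Timeouts, while the incoming handler fails fast on input or dependencies it already knows are bad.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class TimeoutVsFailFast {

    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(2))   // Timeouts: bound connection setup
            .build();

    // Timeouts pattern: protect ourselves from a hanging downstream service (outbound).
    static String fetchInventory(String sku) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://inventory.example.com/skus/" + sku))
                .timeout(Duration.ofSeconds(3))      // per-request timeout on the outbound call
                .build();
        return CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Fail Fast pattern: reject an incoming request we already know we cannot serve,
    // before doing any expensive work.
    static String handleOrder(String sku, boolean inventoryUnavailable) {
        if (sku == null || sku.isBlank()) {
            throw new IllegalArgumentException("missing SKU");        // bad input: fail fast
        }
        if (inventoryUnavailable) {
            throw new IllegalStateException("inventory unavailable"); // known-bad dependency: fail fast
        }
        // Only now do the real work, which includes the bounded outbound call above.
        return "accepted";
    }
}
```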
A circuit breaker may also have a “fallback” strategy. Perhaps it returns the last good response or a cached value. It may return a generic answer rather than a personalized one. Or it may even call a secondary service when the primary is not available. Circuit breakers are a way to automatically degrade functionality when the system is under stress.
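An illustrative (not production-ready) breaker along those lines: after a threshold of consecutive failures it opens for a cool-off period and serves the last good response, falling back to a generic supplier when no cached value exists. The threshold, durations, and suppliers are assumptions for the sketch.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

/** Minimal circuit breaker with a fallback strategy (illustrative sketch). */
public class SimpleCircuitBreaker<T> {
    private final int failureThreshold;
    private final Duration openDuration;
    private int consecutiveFailures = 0;
    private Instant openedAt = null;
    private T lastGoodResponse = null;   // fallback: last known-good value

    public SimpleCircuitBreaker(int failureThreshold, Duration openDuration) {
        this.failureThreshold = failureThreshold;
        this.openDuration = openDuration;
    }

    public synchronized T call(Supplier<T> primary, Supplier<T> genericFallback) {
        boolean open = openedAt != null
                && Instant.now().isBefore(openedAt.plus(openDuration));
        if (open) {
            // Circuit is open: degrade instead of hammering the failing dependency.
            return lastGoodResponse != null ? lastGoodResponse : genericFallback.get();
        }
        try {
            T result = primary.get();
            consecutiveFailures = 0;
            openedAt = null;
            lastGoodResponse = result;   // remember for future fallbacks
            return result;
        } catch (RuntimeException e) {
            if (++consecutiveFailures >= failureThreshold) {
                openedAt = Instant.now();   // trip the breaker
            }
            return lastGoodResponse != null ? lastGoodResponse : genericFallback.get();
        }
    }
}
```

The "secondary service" variant from the quote is the same shape: pass the secondary call as the fallback supplier instead of a canned answer.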
At any moment, more than a billion devices could make a request. No matter how strong your load balancers or how fast you can scale, the world can always make more load than you can handle.
...
Services should model TCP’s approach. When load gets too high, start to refuse new requests for work. This is related to Fail Fast. ...
The ideal way to define “load is too high” is for a service to monitor its own performance relative to its SLA. When requests take longer than the SLA, it’s time to shed some load. Failing that, you may choose to keep a semaphore in your application and only allow a certain number of concurrent requests in the system. A queue between accepting connections and processing them would have a similar effect, but at the expense of both complexity and latency. When a load balancer is in the picture, individual instances can use a 503 status code on their health check pages to tell the load balancer to back off for a while. Inside the boundaries of a system or enterprise, it is more efficient to use backpressure to create a balanced throughput of requests across synchronously coupled services. Use load shedding as a secondary measure in these cases.
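A sketch of the semaphore approach from the quote, assuming the JDK's built-in `com.sun.net.httpserver` server; the permit count (100), port, and path are placeholders to tune against your own SLA. When the semaphore is exhausted, the handler answers 503 immediately instead of queuing work, which is also the signal a load balancer health check can use to back off.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

/** Load-shedding sketch: cap concurrent requests, answer 503 when at capacity. */
public class LoadSheddingServer {
    // Assumed capacity: tune to whatever keeps response times inside the SLA.
    private static final Semaphore CAPACITY = new Semaphore(100);

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.setExecutor(Executors.newCachedThreadPool());

        server.createContext("/work", exchange -> {
            if (!CAPACITY.tryAcquire()) {
                // Shed load: refuse new work and tell callers (and the LB) to back off.
                exchange.sendResponseHeaders(503, -1);
                exchange.close();
                return;
            }
            try {
                byte[] body = "done".getBytes();
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
            } finally {
                CAPACITY.release();
                exchange.close();
            }
        });
        server.start();
    }
}
```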