Specifically, a customer has a server that suffers from unnecessary internal congestion under load. By enforcing concurrency limits, this middleware sidesteps thread-based problems (starvation, excessive context switching) and improves overall performance.
- greatly increases performance in certain scenarios
- low or no configuration
- works despite hardware differences
- works despite workload differences
- queueing helps only specific scenarios, but helps massively
- customers should test throughput under load, ideally on a real server; then make an informed decision
- AVOID: customers refreshing on localhost and saying "yeah it doesn't feel faster"
- AVOID: customers adding this to every project because "hey it seems useful"
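The core idea above (cap concurrent work, queue the rest) can be sketched as follows. The actual middleware is .NET; this is an illustrative Python sketch, and all names (`ConcurrencyLimiter`, `try_enter`, `exit`) are assumptions, not the real API:

```python
import threading
from collections import deque

class ConcurrencyLimiter:
    """Illustrative sketch: admit at most max_concurrent requests at once;
    queue the rest instead of letting more threads contend for the server."""

    def __init__(self, max_concurrent):
        self._lock = threading.Lock()
        self._active = 0
        self._max = max_concurrent
        self._waiting = deque()  # events for queued requests (FIFO here)

    def try_enter(self):
        # Returns once the request may run; queues and blocks if at capacity.
        with self._lock:
            if self._active < self._max:
                self._active += 1
                return True
            event = threading.Event()
            self._waiting.append(event)
        event.wait()  # block until a running request exits and hands over its slot
        return True

    def exit(self):
        with self._lock:
            if self._waiting:
                # Hand the slot directly to a queued request (active count unchanged).
                self._waiting.popleft().set()
            else:
                self._active -= 1
```

A real implementation would be async rather than thread-blocking, but the slot-handoff in `exit` shows why queueing avoids waking extra threads just to have them contend.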
- queueing a request allocates ~1 KB
- under load this adds up, resulting in GC pressure and slow collections
- on the plaintext benchmark, queueing results in
  - 4% throughput overhead
  - 22% memory overhead
- this fix eliminates most of the memory overhead
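One common way to eliminate per-request allocation is to pool and reuse queue entries. A minimal sketch, assuming a hypothetical `QueueEntry` holding the per-request queue state (this is illustrative, not the actual fix in the middleware):

```python
from collections import deque

class QueueEntry:
    """Hypothetical per-request queue state (the ~1 KB allocation)."""
    def __init__(self):
        self.reset()

    def reset(self):
        self.request = None

class EntryPool:
    # Reuse entry objects so each queued request doesn't allocate
    # fresh state, removing most of the GC pressure.
    def __init__(self):
        self._free = deque()

    def rent(self):
        return self._free.pop() if self._free else QueueEntry()

    def return_entry(self, entry):
        entry.reset()
        self._free.append(entry)
```

After warm-up, steady-state queueing then allocates nothing: entries cycle between the queue and the pool.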
- the LIFO strategy is a clear improvement (under load: same throughput, 20x better latency)
- however, LIFO can result in degenerate cases where one request gets "bobbled" (repeatedly pushed to the back and never served)
- the fix (that Facebook uses) is running a FIFO "sliplane" to guarantee timely entry when load is low
- reduces request variance
- not sure of the numbers, this would require research and scenario development
- make sure to look at full latency distributions (and uncompleted requests), not just averages
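The LIFO-with-sliplane idea can be sketched as a queue that serves the newest request by default but promotes the oldest one once it has waited too long. Class name, `max_wait` threshold, and structure are all illustrative assumptions, not the actual design:

```python
import time
from collections import deque

class LifoWithSliplane:
    """Illustrative sketch: LIFO for best latency under load, plus a FIFO
    "sliplane" that promotes the oldest waiter once it has waited longer
    than max_wait, so no request is bobbled indefinitely."""

    def __init__(self, max_wait=0.5):
        self._items = deque()  # (enqueue_time, request), newest at the right
        self._max_wait = max_wait

    def enqueue(self, request):
        self._items.append((time.monotonic(), request))

    def dequeue(self):
        if not self._items:
            return None
        oldest_time, _ = self._items[0]
        if time.monotonic() - oldest_time > self._max_wait:
            return self._items.popleft()[1]  # sliplane: oldest request goes first
        return self._items.pop()[1]          # normal path: newest request (LIFO)
```

When load is low the sliplane rarely fires (requests don't wait long), so behavior converges to plain LIFO with bounded worst-case wait.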
- rps varies wildly with MaxConcurrentRequests; setting a proper level is crucial - ideally the middleware adjusts automatically, with little to no user input
- otherwise, customers must be educated and encouraged to tweak it themselves
- within reasonable MaxConcurrentRequests values, throughput can vary up to 2.5x on the same scenario - this makes it easier for customers to test whether queueing actually helps their website
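Automatic adjustment could take the form of a simple hill-climber: nudge the limit, keep the direction if measured throughput improved, reverse otherwise. This is a sketch of one possible approach, not the middleware's actual tuning algorithm; every name and parameter here is an assumption:

```python
class AdaptiveLimit:
    """Illustrative hill-climbing sketch for tuning a concurrency limit
    from periodic throughput (rps) measurements."""

    def __init__(self, initial=8, lo=1, hi=256):
        self.limit = initial
        self._lo, self._hi = lo, hi
        self._last_rps = 0.0
        self._step = 1

    def observe(self, rps):
        # If the last adjustment hurt throughput, reverse direction.
        if rps < self._last_rps:
            self._step = -self._step
        self._last_rps = rps
        self.limit = max(self._lo, min(self._hi, self.limit + self._step))
```

A production version would need smoothing (rps is noisy) and should weigh latency, not just throughput, per the note above about distributions.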
- can this middleware improve real servers, not just test scenarios?
- what ease of use problems do customers actually hit?
- may be difficult to find a willing customer with relevant problems
- still useful to prove negatives (i.e. that it won't help in some specific case)
- can be worked on in parallel with other projects