
Queuing Middleware

Goal

Customer has a slow website; queuing makes it fast.

Specifically, the customer has a server that suffers from unnecessary internal congestion when under load. By enforcing a concurrency limit, this middleware sidesteps thread-based problems (starvation, context switching) and improves overall performance.
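To make the mechanism concrete, here is a minimal sketch in Python/asyncio (the actual middleware targets ASP.NET Core, so every name here is illustrative, not the real API): a semaphore caps how many requests execute at once, and the rest wait in a queue instead of contending for threads.

```python
# Illustrative sketch only; MAX_CONCURRENT, queuing_middleware, and
# next_handler are hypothetical names, not the real middleware's API.
import asyncio

MAX_CONCURRENT = 64                      # assumed cap; the right value matters a lot
gate = asyncio.Semaphore(MAX_CONCURRENT)

async def queuing_middleware(request, next_handler):
    # Requests beyond the cap park here instead of competing for threads,
    # avoiding starvation and context-switch overhead inside the server.
    async with gate:
        return await next_handler(request)
```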

Requirements

Functionality ("can it help?")

  • greatly increases performance in certain scenarios

Ease of use ("just works")

  • low or no configuration
  • works despite hardware differences
  • works despite workload differences

Ease of use ("clarity")

  • queuing helps only in specific scenarios, but helps massively there
  • customers should test throughput under load, ideally on a real server; then make an informed decision
  • AVOID: customers refreshing on localhost and saying "yeah it doesn't feel faster"
  • AVOID: customers adding this to every project because "hey it seems useful"

Project - Reducing Overhead

In pursuit of performance

  • queueing a request allocates ~1kb
  • this adds up, resulting in slow GC calls

Potential Gains

  • on the plaintext benchmark, queuing results in
    • 4% throughput overhead
    • 22% memory overhead
  • this fix eliminates most of the memory overhead (see the sketch after this list)
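The fix isn't spelled out here, so the following is only a plausible shape for it: keep finished waiter objects on a free list and reset them for reuse, so queuing a request usually allocates nothing. A Python sketch of that free-list pattern (ResettableWaiter and WaiterPool are invented names; the real middleware is ASP.NET Core):

```python
# Sketch of the allocation fix: pool and reset per-request waiters instead
# of allocating a fresh ~1 KB object for every queued request.
import asyncio
from collections import deque

class ResettableWaiter:
    def __init__(self):
        self._event = asyncio.Event()

    async def wait(self):
        await self._event.wait()       # parked until a slot frees up

    def release(self):
        self._event.set()              # wake the queued request

    def reset(self):
        self._event.clear()            # make the object reusable

class WaiterPool:
    def __init__(self):
        self._free = deque()

    def rent(self):
        # Reuse a pooled waiter when possible; allocate only on a cold pool.
        return self._free.pop() if self._free else ResettableWaiter()

    def give_back(self, waiter):
        waiter.reset()
        self._free.append(waiter)
```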

Project - Sliplane for LIFO queue

In pursuit of consistent performance

  • the LIFO strategy is a clear improvement (under load: same throughput, 20x better latency)
  • however, LIFO can produce a degenerate case where one request gets "bobbled" (stuck at the bottom of the stack, never served)
  • the fix (the one Facebook uses) is running a FIFO "sliplane" to guarantee timely entry when load is low (sketched below)
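The sliplane mechanics aren't spelled out here; one plausible reading is a single queue that drains FIFO while shallow (timely entry at low load) and flips to LIFO once a backlog forms. A hypothetical Python sketch (the class name and threshold are assumptions):

```python
# Hypothetical sketch of LIFO-with-sliplane: FIFO while the queue is shallow
# so no request is stranded, LIFO once a backlog forms so the newest (most
# likely still-wanted) requests are served first.
from collections import deque

class SliplaneQueue:
    def __init__(self, backlog_threshold=8):   # threshold is invented
        self._items = deque()
        self._backlog_threshold = backlog_threshold

    def enqueue(self, request):
        self._items.append(request)

    def dequeue(self):
        if len(self._items) > self._backlog_threshold:
            return self._items.pop()       # under load: newest first (LIFO)
        return self._items.popleft()       # low load: oldest first (FIFO sliplane)
```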

Potential Gains

  • reduces request variance
  • the exact numbers are unknown; this would require research and scenario development
  • make sure to look at latency distributions (and uncompleted requests), not just the average

Potential project - Configuring MaxConcurrentRequests

In pursuit of ease of use

  • throughput (requests per second) varies wildly with MaxConcurrentRequests; setting a proper level is crucial
  • ideally the middleware adjusts the limit automatically, with little to no user input (see the sketch after this list)
  • otherwise, customers must be educated and encouraged to tune it themselves
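Nothing here specifies a tuning algorithm; one simple shape for "adjusts automatically" is a hill climber that nudges the limit every measurement window and reverses direction when throughput drops. A hypothetical sketch:

```python
# Hypothetical autotuner for MaxConcurrentRequests: hill-climb on measured
# throughput, reversing direction whenever the last adjustment hurt.
class ConcurrencyTuner:
    def __init__(self, initial_limit=16):
        self.limit = initial_limit         # current MaxConcurrentRequests
        self._last_rps = 0.0
        self._step = 1                     # +1 = raising the limit, -1 = lowering

    def on_sample(self, rps):
        # Call once per measurement window with observed requests/sec.
        if rps < self._last_rps:
            self._step = -self._step       # last move made things worse
        self.limit = max(1, self.limit + self._step)
        self._last_rps = rps
```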

Potential Gains

  • within reasonable MaxConcurrentRequests values, throughput can vary up to 2.5x on the same scenario
  • auto-adjustment also makes it easier for customers to test whether queuing actually helps their website

Potential project - Customer Testing

Helps validate assumptions

  • can this middleware improve real servers, not just test scenarios?
  • what ease of use problems do customers actually hit?

Side Notes

  • may be difficult to find a willing customer with relevant problems
    • still useful to prove negatives (i.e., that it won't help in some specific case)
  • can be worked on in parallel with other projects