Last active
May 16, 2023 21:00
Screenshotbot Stack

I'm going to tell you about my unconventional stack, and the reasoning
behind it. You may or may not agree with it, but let's give it a try.

Most of the modern backend web frameworks were started maybe 15-20
years ago, and the same patterns have continued to be used. But things
have changed since then:
* Disk is a lot faster (NVMe)
* Disk is a lot more robust (EBS io2)
* RAM is super cheap
* You can rent machines with hundreds of cores if your heart desires.

All this means that the historical design considerations don't really
apply anymore, at least for small apps. You don't need multiple
servers connecting to a DB to scale an app to a reasonable level. You
don't need the robustness of a DB, because you have EBS. You don't
need clever disk-based indices and such, because you can just keep
everything in RAM and use in-memory data structures as indices. And if
you don't need a DB, your entire app can live on one single machine,
running as a single multi-threaded process. (So you can avoid all
your async worker infrastructure, because async workers are just
threads running in your process.) If you need to scale, rent a bigger
machine. Will you hit a limit eventually? Probably, but by then you're
well on your way to success, so you can invest in re-architecting
things. What if your machine dies? Well, with EBS, AWS will
automatically recreate it for you with the same EBS volume.
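
To make the "indices are just data structures, workers are just
threads" point concrete, here is a minimal sketch. It is in Python
rather than Common Lisp, and it is not the Screenshotbot code: the
Store class, handle_signup and send_welcome are hypothetical names
made up for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Store:
    """Hypothetical in-memory store: a list of records plus a dict as an index."""

    def __init__(self):
        self._lock = threading.Lock()
        self._users = []
        self._by_email = {}  # in-memory index: email -> user record

    def add_user(self, name, email):
        user = {"name": name, "email": email, "welcomed": False}
        with self._lock:
            self._users.append(user)
            self._by_email[email] = user
        return user

    def find_by_email(self, email):
        with self._lock:
            return self._by_email.get(email)

    def mark_welcomed(self, email):
        with self._lock:
            self._by_email[email]["welcomed"] = True


store = Store()
workers = ThreadPoolExecutor(max_workers=4)  # the "async workers"


def send_welcome(email):
    # Stand-in for slow background work (sending mail, resizing images, ...).
    store.mark_welcomed(email)
    return email


def handle_signup(name, email):
    """Request handler: update the store, then hand slow work to a thread."""
    store.add_user(name, email)
    # The background job runs in the same process and sees the same objects,
    # so there is no queue service and nothing to serialize.
    return workers.submit(send_welcome, email)
```

The lookup in find_by_email is a plain dictionary access, which is
what "in-memory data structures as indices" amounts to at this scale.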

Wait, if your whole app is a single process serving code, that means
you can do clever things. You no longer need to keep "object ids" for
every object that is sent as part of APIs. Your web pages can just be
simple closures that transparently reference objects in
memory. (Imagine a multi-page form, where you don't have to serialize
the state between pages.)
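
Here is roughly what the multi-page form could look like, sketched in
Python under stated assumptions: register, dispatch and start_signup
are hypothetical helpers, not part of any real framework, and a real
version would also need to expire old closures.

```python
import secrets

# Hypothetical registry mapping an opaque URL token to a closure.
_pages = {}


def register(handler):
    """Store a closure and return a URL that will invoke it."""
    token = secrets.token_urlsafe(8)
    _pages[token] = handler
    return "/page/" + token


def dispatch(url, form):
    """Find the closure behind a URL and call it with the submitted form."""
    token = url.rsplit("/", 1)[-1]
    return _pages[token](form)


def start_signup():
    answers = {}  # lives only in memory, captured by the closures below

    def step1(form):
        answers["name"] = form["name"]
        return register(step2)  # URL of the next page

    def step2(form):
        answers["plan"] = form["plan"]
        return dict(answers)  # final result; nothing was ever serialized

    return register(step1)
```

Each step sees the same answers dictionary because the closures
capture it. No session table, hidden form fields or object ids are
involved.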

All of this sounds nice, but making it work does require the right
stack. First, if you have a long running in-memory process, you need
to be able to deploy code without restarting that process. So you need
a language where you can reliably do hot-reload. The only language I
know of where hot-reload is not just an add-on, but part of the
standard and everything is designed around it is... Common Lisp.
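
For readers who haven't seen hot-reload, here is a loose Python
analogue of the idea: new definitions are installed into a running
process, and callers pick them up on the next call. The app module and
the load helper are hypothetical. Common Lisp goes further than this
sketch (for example, the standard also specifies what happens to
existing instances when a class is redefined), so treat it only as an
illustration of the concept.

```python
import types

app = types.ModuleType("app")  # hypothetical module holding the live code


def load(source):
    """Compile new source and install its definitions into the running module."""
    exec(compile(source, "<live>", "exec"), app.__dict__)


def handle_request(name):
    # Look the handler up at call time, so a reload takes effect immediately
    # without restarting the process or losing in-memory state.
    return app.greet(name)


load("def greet(name):\n    return 'Hello, ' + name\n")
first = handle_request("Ada")

# "Deploy" a new version of greet while the process keeps running.
load("def greet(name):\n    return 'Hi there, ' + name + '!'\n")
second = handle_request("Ada")
```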

Okay, now we need to figure out an in-memory store. But we also need it
to be persistent. For this, there's a library called
bknr.datastore. Essentially, everything is in memory. So any GET
requests will be served without any disk IO (side note: this also
means that things like AsyncIO aren't that useful, since your code is
mostly CPU bound rather than IO bound). When you modify your data
(create new objects, modify fields of existing objects), the framework
just writes a transaction to disk. Every now and then you take a
snapshot of all objects that are part of your store. If your process
crashes, it just reloads the snapshot and replays the transaction
logs. Effectively, your database is inside the process that serves
your web app. (But unlike SQLite, you have access to the objects, and
you have a lot of flexibility over what the objects look like: it's
not just a table of rows.) This framework obviously gets complicated,
but it'll take a while before you need the additional complexity.
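
The snapshot-plus-transaction-log idea can be sketched in a few lines.
This is a Python illustration of the general technique, not the
bknr.datastore API: the Store class, the JSON file formats and the
"put"/"delete" operations are all assumptions made for the example.

```python
import json
import os
import tempfile


class Store:
    """Hypothetical store: objects in RAM, snapshot + append-only log on disk."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "transactions.log")
        self.objects = {}
        self._restore()
        self._log = open(self.log_path, "a", encoding="utf-8")

    def _apply(self, tx):
        if tx["op"] == "put":
            self.objects[tx["id"]] = tx["value"]
        elif tx["op"] == "delete":
            self.objects.pop(tx["id"], None)

    def _restore(self):
        # Load the last snapshot, then replay every logged transaction.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as f:
                self.objects = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    self._apply(json.loads(line))

    def execute(self, tx):
        # Make the transaction durable first, then change memory.
        self._log.write(json.dumps(tx) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self._apply(tx)

    def get(self, key):
        return self.objects.get(key)  # reads never touch the disk

    def snapshot(self):
        # Write all objects to a temp file, swap it in, then start a fresh log.
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.objects, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snapshot_path)
        self._log.close()
        self._log = open(self.log_path, "w", encoding="utf-8")

    def close(self):
        self._log.close()


# Usage: write, snapshot, write more, then "crash" and recover.
directory = tempfile.mkdtemp()
store = Store(directory)
store.execute({"op": "put", "id": "u1", "value": {"name": "Ada"}})
store.snapshot()
store.execute({"op": "put", "id": "u2", "value": {"name": "Bob"}})
store.execute({"op": "delete", "id": "u1"})
store.close()

recovered = Store(directory)  # reloads the snapshot, replays the log
```

In this sketch the put and delete operations are idempotent, so
replaying a log entry that is already reflected in the snapshot does
no harm.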

(PS. I suspect you'll never use this stack, but I like talking about
it whenever I get the chance. I use this stack here:
https://github.com/screenshotbot/screenshotbot-oss)

Followup comments with relevant information:
* Sanjeev suggests that you don't want an architecture that you can't
  incrementally scale.

My response:

In most cases, you don't have to rewrite! You can still treat your
processes very much like MySQL instances, if you architect them
carefully. E.g., you can have one process replicate the transaction log
from another process (so you can have multiple servers serving read
requests) and you can shard your data. (For instance, you might
have a frontend router that routes to a specific backend based on
which customer is making the request.) I'm not at that stage though,
so I'm probably going to hit issues when I get there, but I think it
can be done 🙂
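
The frontend router from that example could be as small as the
following sketch. It is Python, and it is hypothetical: the backend
URLs and the backend_for function are made up, and this is not
something the stack described here does today.

```python
import hashlib

# Hypothetical backend processes, each holding one shard of the customers.
BACKENDS = [
    "http://backend-0.internal:8080",
    "http://backend-1.internal:8080",
    "http://backend-2.internal:8080",
]


def backend_for(customer_id, backends=BACKENDS):
    """Pick the backend that owns this customer's data."""
    # Use a stable hash so every router instance makes the same choice
    # (Python's built-in hash() for strings differs between processes).
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

Hashing modulo the number of backends moves many customers around
whenever a backend is added or removed. An explicit customer-to-backend
table is one way to avoid that, at the cost of storing the table
somewhere.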

Also, sometimes having things in memory lets you scale in ways that
you can't with databases. For instance, the solution to this
problem:
[link to ex-fb eng post],
used some clever in-memory data structures that can't easily be done
with databases.

I will add that the 80-20 rule still applies. 80% of the scalability
problem is created by 20% of the different "types" of objects being
stored. (In my case it's far less than 20%: three different object
types account for most of the scalability considerations.) So, a
simple scaling solution, if everything else fails, is to just move
these objects to MySQL. I don't think I'm going to need it, ever, but
it's still an option without crazy re-architecting.