The goal of this document is to describe GDAL's current caching mechanisms, how they fail at scale, and how distributed caching could help. We'll use COG headers as the running example since that is the format I know best, but the same reasoning applies to other formats like Zarr.
GDAL maintains two caching mechanisms: the "VSI cache", which caches recently accessed I/O through any VSI driver (e.g. a COG header read over HTTP), and the "block cache", which caches raster blocks. I am not familiar with the implementation details of either cache; the important thing for this discussion is that each cache is local to a single process.
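To make the per-process point concrete, here is a minimal toy model in Python. It does not use GDAL and is not how GDAL implements either cache; `LocalLRUCache`, `count_origin_fetches`, and the example URL are hypothetical names invented for illustration. Each simulated process gets its own small LRU cache, and we count how many times the same COG header has to be fetched from the origin.

```python
from collections import OrderedDict


class LocalLRUCache:
    """Toy stand-in for a process-local cache (NOT GDAL's implementation)."""

    def __init__(self, max_entries=16):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, key, fetch):
        # Cache hit: mark the entry as most recently used and return it.
        if key in self._entries:
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: go to the origin, store the result, evict the oldest entry if full.
        value = fetch(key)
        self._entries[key] = value
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
        return value


def count_origin_fetches(num_processes, reads_per_process, url, shared=False):
    """Count origin requests when every simulated process reads the same header."""
    origin_fetches = 0

    def fetch_header(key):
        # Hypothetical origin request, e.g. an HTTP range read of a COG header.
        nonlocal origin_fetches
        origin_fetches += 1
        return b"header-bytes-for-" + key.encode()

    if shared:
        # One cache visible to every process (the distributed-cache idea).
        shared_cache = LocalLRUCache()
        caches = [shared_cache] * num_processes
    else:
        # One independent cache per process (the situation described above).
        caches = [LocalLRUCache() for _ in range(num_processes)]

    for cache in caches:
        for _ in range(reads_per_process):
            cache.get(url, fetch_header)
    return origin_fetches


url = "https://example.com/scene.tif"  # placeholder URL
local_fetches = count_origin_fetches(8, 100, url)
shared_fetches = count_origin_fetches(8, 100, url, shared=True)
```

In this model, origin requests grow with the number of processes when caches are local, while a cache shared across processes fetches the header once. Real deployments add eviction pressure, network latency to the shared cache, and invalidation concerns that the sketch ignores.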