@ZoomTen
Created April 28, 2024 06:45

    181 changes: 181 additions & 0 deletions MINIO_S3_NOTES.md
    # Notes on uploading to MinIO through HTTP

    ## Some definitions

    <dl>
    <dt>iso_datetime</dt>
    <dd>
    Date and time in ISO 8601 basic format (UTC), e.g. <code>20240428T051943Z</code> = <b>2024</b>-<b>04</b>-<b>28</b> (time <b>T</b>) <b>05</b>:<b>19</b>:<b>43</b> (UTC <b>Z</b>)
    </dd>
    <dt>iso_date</dt>
    <dd>
    Date in ISO 8601 basic format (UTC), e.g. <code>20240428</code> = <b>2024</b>-<b>04</b>-<b>28</b>
    </dd>
    <dt>data</dt>
    <dd>
    The data itself
    </dd>
    <dt>data_sha256</dt>
    <dd>
    SHA-256 hash of the data to upload, as a lowercase hex string
    </dd>
    <dt>bucket_name</dt>
    <dd>
    Name of the bucket to upload into
    </dd>
    <dt>file_path</dt>
    <dd>
    File path including any prefixes
    </dd>
    <dt>auth_string</dt>
    <dd>
    See below
    </dd>
    <dt>credential</dt>
    <dd>
    See below
    </dd>
    <dt>string_to_sign</dt>
    <dd>
    See below
    </dd>
    <dt>signed_header_list</dt>
    <dd>
    Which headers should be considered in the signature check&mdash;all lowercase, no space, separated by <code>;</code> semicolons, <strong>sorted alphabetically</strong>, e.g. <code>host;x-amz-content-sha256;x-amz-date</code>
    </dd>
    <dt>access_key</dt>
    <dd>
    Access ID
    </dd>
    <dt>secret_key</dt>
    <dd>
    A secret associated with the Access ID
    </dd>
    <dt>region</dt>
    <dd>
    e.g. <code>us-east-1</code>
    </dd>
    <dt>service</dt>
    <dd>
    Usually <code>s3</code>.
    </dd>
    </dl>

    Parameters to fill in are written in angle brackets, e.g. `<param>`.
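A minimal sketch of how `iso_datetime`, `iso_date` and `data_sha256` could be produced with the Python standard library. The helper names `amz_timestamps` and `sha256_hex` are made up for illustration.

```python
import hashlib
from datetime import datetime, timezone


def amz_timestamps(now: datetime) -> tuple[str, str]:
    """Return (iso_datetime, iso_date) in ISO 8601 basic format, UTC."""
    utc = now.astimezone(timezone.utc)
    return utc.strftime("%Y%m%dT%H%M%SZ"), utc.strftime("%Y%m%d")


def sha256_hex(data: bytes) -> str:
    """Lowercase hex SHA-256 digest, i.e. data_sha256."""
    return hashlib.sha256(data).hexdigest()


# Values matching the upload example in these notes.
iso_datetime, iso_date = amz_timestamps(
    datetime(2024, 4, 28, 5, 19, 43, tzinfo=timezone.utc)
)
data_sha256 = sha256_hex(b"Hello World")
```

In a real upload you would pass `datetime.now(timezone.utc)` instead of a fixed timestamp.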

    ## Basic uploading

    ```http
    PUT /<bucket_name>/<file_path> HTTP/1.1
    Host: localhost:9000
    Connection: Keep-Alive
    Content-Length: 11
    User-Agent: Doesn't matter
    Authorization: <auth_string>
    X-Amz-Date: <iso_datetime>
    X-Amz-Content-SHA256: <data_sha256>

    <data>
    ```

    Example:
    ```http
    PUT /addons/admin/test HTTP/1.1
    Host: localhost:9000
    Connection: Keep-Alive
    content-length: 11
    user-agent: My-Frontend
    authorization: AWS4-HMAC-SHA256 Credential=WqXXrGvJwnOC0RiXjhv9/20240428/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=30c58171ebfe6f6517697c7459dd8cd3c21b97ab3d8ab4eef0de031852879a0d
    x-amz-date: 20240428T051943Z
    x-amz-content-sha256: a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e

    Hello World
    ```
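To make the wire format concrete, here is a rough sketch that assembles the raw request bytes. `build_put_request` is a hypothetical helper, it assumes `file_path` needs no URL-encoding, and in practice you would normally let an HTTP library do this part. The point is the header layout and the blank line (`\r\n\r\n`) separating headers from the body.

```python
def build_put_request(
    host: str,
    bucket_name: str,
    file_path: str,
    data: bytes,
    auth_string: str,
    iso_datetime: str,
    data_sha256: str,
) -> bytes:
    """Assemble a raw HTTP/1.1 PUT request for an object upload."""
    head = "\r\n".join([
        f"PUT /{bucket_name}/{file_path} HTTP/1.1",
        f"Host: {host}",
        f"Content-Length: {len(data)}",
        f"Authorization: {auth_string}",
        f"X-Amz-Date: {iso_datetime}",
        f"X-Amz-Content-SHA256: {data_sha256}",
    ])
    # An empty line terminates the header section; the body follows as-is.
    return head.encode("ascii") + b"\r\n\r\n" + data


raw = build_put_request(
    host="localhost:9000",
    bucket_name="addons",
    file_path="admin/test",
    data=b"Hello World",
    auth_string="<auth_string>",  # placeholder, see the signing sections
    iso_datetime="20240428T051943Z",
    data_sha256="a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e",
)
```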

    ### Format of auth_string

    ```xml
    AWS4-HMAC-SHA256 Credential=<credential>, SignedHeaders=<signed_header_list>, Signature=<signature>
    ```

    ### Format of credential

    ```xml
    <access_key>/<iso_date>/<region>/<service>/aws4_request
    ```

    `aws4_request` terminates the credential string.
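The two formats above are plain string templates, so a sketch is short. `build_credential` and `build_auth_string` are illustrative names, not part of any library.

```python
def build_credential(access_key: str, iso_date: str, region: str, service: str = "s3") -> str:
    """<access_key>/<iso_date>/<region>/<service>/aws4_request"""
    return f"{access_key}/{iso_date}/{region}/{service}/aws4_request"


def build_auth_string(credential: str, signed_header_list: str, signature: str) -> str:
    """Value of the Authorization header."""
    return (
        "AWS4-HMAC-SHA256 "
        f"Credential={credential}, "
        f"SignedHeaders={signed_header_list}, "
        f"Signature={signature}"
    )


# Reproduces the Authorization header from the upload example.
credential = build_credential("WqXXrGvJwnOC0RiXjhv9", "20240428", "us-east-1")
auth_string = build_auth_string(
    credential,
    "host;x-amz-content-sha256;x-amz-date",
    "30c58171ebfe6f6517697c7459dd8cd3c21b97ab3d8ab4eef0de031852879a0d",
)
```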

    ## Creating a signature

    This one's a doozy.

    Consider three elements:
    1. The "canonical request"
    2. The string for signing
    3. The finished signature.

    ### Canonical request

    It's a list of the following, each separated by a Unix newline (`\n`):
    1. HTTP method, in all caps
    2. Canonical URI/path (everything before the `?`), so `/<bucket_name>/<file_path>`
    3. The query string (everything after the `?`); if there is none, this line is left empty
    4. The HTTP headers that factor into the signing, one per line and sorted by key. Keys are all lowercase and there is no space between the colon and the value, so `key:value`.
    5. An empty line. Every header line in item 4 ends with its own newline, so the header block is followed by a blank line.
    6. `signed_header_list` (again)
    7. `data_sha256`

    Example (note that it uses a later `x-amz-date` than the upload example above, so it belongs to a different request and signature):
    ```
    PUT
    /addons/admin/test

    host:localhost:9000
    x-amz-content-sha256:a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e
    x-amz-date:20240428T054729Z

    host;x-amz-content-sha256;x-amz-date
    a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e
    ```
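The seven-item list can be sketched as follows. `build_canonical_request` is a hypothetical helper; it assumes the path and query string are already URL-encoded and that only the headers passed in are meant to be signed.

```python
def build_canonical_request(
    method: str,
    path: str,
    query: str,
    headers: dict[str, str],
    data_sha256: str,
) -> tuple[str, str]:
    """Return (canonical_request, signed_header_list)."""
    # Lowercase the keys, trim the values, sort by key.
    canonical = sorted((k.lower(), v.strip()) for k, v in headers.items())
    # Each header line carries its own trailing newline...
    header_block = "".join(f"{k}:{v}\n" for k, v in canonical)
    signed_header_list = ";".join(k for k, _ in canonical)
    # ...so joining with "\n" yields the empty line after the headers.
    request = "\n".join([
        method.upper(),
        path,
        query,  # empty string when there is no query
        header_block,
        signed_header_list,
        data_sha256,
    ])
    return request, signed_header_list


payload_hash = "a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e"
canonical_request, signed_header_list = build_canonical_request(
    "PUT",
    "/addons/admin/test",
    "",
    {
        "Host": "localhost:9000",
        "X-Amz-Date": "20240428T054729Z",
        "X-Amz-Content-SHA256": payload_hash,
    },
    payload_hash,
)
```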

    ### String for Signing (string_to_sign)

    It's another newline-separated list like the last one, but it contains:
    1. The signing algorithm, usually `AWS4-HMAC-SHA256`
    2. `iso_datetime`
    3. `credential`, without the leading `<access_key>/`
    4. SHA-256 of the canonical request string, as a lowercase hex string

    Example:
    ```
    AWS4-HMAC-SHA256
    20240428T054729Z
    20240428/us-east-1/s3/aws4_request
    5121c23da563a010520d10507c1768170e9007a0655d9d973d460c9b6f7c79dc
    ```
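As a sketch, the four lines can be put together like this. `build_string_to_sign` is an illustrative name, and it assumes the canonical request is hashed as UTF-8 bytes.

```python
import hashlib


def build_string_to_sign(
    iso_datetime: str,
    region: str,
    service: str,
    canonical_request: str,
) -> str:
    """Assemble string_to_sign from the canonical request."""
    iso_date = iso_datetime[:8]  # the YYYYMMDD part of iso_datetime
    # The credential without the leading "<access_key>/".
    scope = f"{iso_date}/{region}/{service}/aws4_request"
    hashed_request = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()
    return "\n".join(["AWS4-HMAC-SHA256", iso_datetime, scope, hashed_request])


string_to_sign = build_string_to_sign(
    "20240428T054729Z", "us-east-1", "s3", "<canonical request goes here>"
)
```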

    ### The finished signature

    Keep in mind:
    * In both functions, the first argument is the HMAC key and the second is the message.
    * `hmac[sha256]` returns the **raw data**, e.g. a byte array like [0x10, 0x20, 0x3f]
    * `hmac_STRING[sha256]` returns the **hex string**, e.g. "10203f"

    **`signature` defined here ↓**

    ```cpp
    date_key =
    hmac[sha256](concat("AWS4", <secret_key>), <iso_date>)

    date_region_key =
    hmac[sha256](date_key, <region>)

    date_region_service_key =
    hmac[sha256](date_region_key, <service>)

    signing_key =
    hmac[sha256](date_region_service_key, "aws4_request")

    signature =
    hmac_STRING[sha256](signing_key, <string_to_sign>)
    ```
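The same chain, sketched with Python's standard `hmac` and `hashlib` modules. `hmac_sha256` and `sign` are names invented for this sketch, and all strings are assumed to be encoded as UTF-8 before hashing.

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, message: str) -> bytes:
    """hmac[sha256] from the pseudocode: returns the raw digest bytes."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def sign(secret_key: str, iso_date: str, region: str, service: str, string_to_sign: str) -> str:
    """Derive the signing key step by step and return the hex signature."""
    date_key = hmac_sha256(("AWS4" + secret_key).encode("utf-8"), iso_date)
    date_region_key = hmac_sha256(date_key, region)
    date_region_service_key = hmac_sha256(date_region_key, service)
    signing_key = hmac_sha256(date_region_service_key, "aws4_request")
    # hmac_STRING[sha256] from the pseudocode: hex string instead of raw bytes.
    return hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()


# "example-secret" is a made-up secret key, used only for illustration.
signature = sign(
    "example-secret",
    "20240428",
    "us-east-1",
    "s3",
    "AWS4-HMAC-SHA256\n20240428T054729Z\n20240428/us-east-1/s3/aws4_request\n"
    "5121c23da563a010520d10507c1768170e9007a0655d9d973d460c9b6f7c79dc",
)
```

The resulting `signature` is what goes into the `Signature=` part of `auth_string`.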