Skip to content

Instantly share code, notes, and snippets.

@rain-1
rain-1 / llama-home.md
Last active June 24, 2025 11:12
How to run Llama 13B with a 6GB graphics card

This worked on 14/May/23. The instructions will probably require updating in the future.

llama is a text prediction model similar to GPT-2, and the version of GPT-3 that has not been fine tuned yet. It is also possible to run fine tuned versions (like alpaca or vicuna with this. I think. Those versions are more focused on answering questions)

Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.

It is possible to run LLama 13B with a 6GB graphics card now! (e.g. a RTX 2060). Thanks to the amazing work involved in llama.cpp. The latest change is CUDA/cuBLAS which allows you pick an arbitrary number of the transformer layers to be run on the GPU. This is perfect for low VRAM.

  • Clone llama.cpp from git, I am on commit 08737ef720f0510c7ec2aa84d7f70c691073c35d.
@superseb
superseb / k3s-etcd-commands.md
Last active November 3, 2025 12:55
k3s etcd commands

k3s etcd commands

etcd

Setup etcdctl using the instructions at https://github.com/etcd-io/etcd/releases/tag/v3.4.13 (changed path to /usr/local/bin):

Note: if you want to match th etcdctl binaries with the embedded k3s etcd version, please run the curl command for getting the version first and adjust ETCD_VER below accordingly:

curl -L --cacert /var/lib/rancher/k3s/server/tls/etcd/server-ca.crt --cert /var/lib/rancher/k3s/server/tls/etcd/server-client.crt --key /var/lib/rancher/k3s/server/tls/etcd/server-client.key https://127.0.0.1:2379/version