Omkarnath Omie

This worked on 14/May/23. The instructions will probably require updating in the future.

llama is a text prediction model similar to GPT-2, and the version of GPT-3 that has not been fine tuned yet. It is also possible to run fine tuned versions (like alpaca or vicuna with this. I think. Those versions are more focused on answering questions)

Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.

It is possible to run LLama 13B with a 6GB graphics card now! (e.g. a RTX 2060). Thanks to the amazing work involved in llama.cpp. The latest change is CUDA/cuBLAS which allows you pick an arbitrary number of the transformer layers to be run on the GPU. This is perfect for low VRAM.

Clone llama.cpp from git, I am on commit 08737ef720f0510c7ec2aa84d7f70c691073c35d.

Linux tips

On Thursday 25 April 2021, I entirely switched from macOS to Linux: https://maelvls.dev/evolution-of-my-home-office/

I took note of every adjustment I had to make along the way. I currently use Ubuntu 21.10 "vanilla".

Replace Ubuntu's Dock
Correct key repetition speed and mouse speed

	# Ask for the user password
	# Script only works if sudo caches the password for a few minutes
	sudo true

	# Install kernel extra's to enable docker aufs support
	sudo apt-get -y install linux-image-extra-$(uname -r)

	# Add Docker PPA and install latest version
	sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9
	sudo sh -c "echo deb https://get.docker.io/ubuntu docker main > /etc/apt/sources.list.d/docker.list"