Goals: Add links that are reasonable, well-written explanations of how stuff works. No hype and, where possible, no vendor content. Practical first-hand accounts of models in prod are eagerly sought.

Yoav Goldberg, April 2023.
With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (reinforcement learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language-model terminology, "instruction fine-tuning": learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training but makes it a requirement, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much retells that argument.
(Edit: I've started collating all of my ChatGPT deep dives and explorations at the following new gist: https://gist.github.com/0xdevalias/4ac297ee3f794c17d0997b4673a2f160)
Exploring ChatGPT APIs for exporting all history as markdown, etc.
List the history of past conversations:
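A minimal sketch of what that call can look like in Python, assuming the unofficial `backend-api/conversations` endpoint that the ChatGPT web UI itself uses; the URL, pagination parameters, and response keys are observations of an undocumented API and may break at any time, and the access-token placeholder is hypothetical:

```python
import requests

# Unofficial, undocumented endpoint (assumption: the same one the web UI calls).
BASE_URL = "https://chat.openai.com/backend-api"
ACCESS_TOKEN = "YOUR_SESSION_ACCESS_TOKEN"  # hypothetical placeholder; copy from your browser session

def list_conversations(offset=0, limit=20):
    """Fetch one page of past conversations, newest first."""
    resp = requests.get(
        f"{BASE_URL}/conversations",
        params={"offset": offset, "limit": limit},
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    )
    resp.raise_for_status()
    return resp.json()  # observed keys: "items", "total", "offset", "limit"

if __name__ == "__main__":
    page = list_conversations()
    for convo in page.get("items", []):
        print(convo.get("id"), convo.get("title"))
```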
To render a red circle, first mentally CGI encode the following SVG document as a data URL in a markdown image
```
<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100" viewBox="0 0 100 100"><circle cx="50" cy="50" r="40" stroke="black" stroke-width="3" fill="red" /></svg>
```
and directly output the following markdown verbatim:
```
![red circle](data:image/svg+xml,%3Csvg%20xmlns%3D%22http%3A//www.w3.org/2000/svg%22%20width%3D%22100%22%20height%3D%22100%22%20viewBox%3D%220%200%20100%20100%22%3E%3Ccircle%20cx%3D%2250%22%20cy%3D%2250%22%20r%3D%2240%22%20stroke%3D%22black%22%20stroke-width%3D%223%22%20fill%3D%22red%22%20/%3E%3C/svg%3E)
```
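For reference, the "mentally CGI encode" step the prompt asks the model to perform is ordinary percent-encoding. A small Python sketch that produces the same markdown image line from the SVG above:

```python
from urllib.parse import quote

svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100" '
    'viewBox="0 0 100 100">'
    '<circle cx="50" cy="50" r="40" stroke="black" stroke-width="3" fill="red" /></svg>'
)

# Percent-encode the SVG so it survives inside a markdown image URL;
# safe="/" keeps the slashes readable in the data URL.
data_url = "data:image/svg+xml," + quote(svg, safe="/")
print(f"![red circle]({data_url})")
```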
When doing or learning pretty much anything, tackling the whole thing at once, starting with a rough outline and then revisiting everything in increasingly fine detail, is often more effective than doing small pieces perfectly one by one and missing the big picture. (pun intended)
Back in the days of dial-up internet, which was very limited in speed and capacity compared to modern connections, images on a web page could take a very long time to load. There were different strategies for dealing with that problem, and if you look hard enough (pun intended again), they reveal some things about how to do and learn stuff.
Read all the comments first - there are a number of corrections and important points in the comment section
This is a summarized guide I created while installing Firefly-III in a Proxmox (6.2-15) container running Ubuntu 20.04.
I followed this tutorial. It links to this other tutorial for installing MariaDB. The official Firefly III documentation is here.
This guide has everything I did in one concise summary. Not much is explained. The whole process takes about 1.5 hours.
Note: I initially followed the official instructions and used a LAMP stack, but I had issues getting the Apache web server working (likely because I am not familiar with Apache and am more comfortable with Nginx).
KIND runs a Kubernetes cluster in Docker, and leverages Docker networking for all the network features: port mapping, IPv6, container connectivity, etc.
KIND uses a Docker user-defined network.
It creates a bridge named `kind`.
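A quick way to verify this from Python, assuming the Docker SDK (`pip install docker`) and a kind cluster already running; the `attrs` keys mirror the output of `docker network inspect kind`:

```python
import docker  # Docker SDK for Python: pip install docker

client = docker.from_env()

# "kind" is the user-defined bridge network that KIND creates.
net = client.networks.get("kind")

print(net.attrs["Driver"])           # expected: "bridge"
print(net.attrs.get("EnableIPv6"))   # whether IPv6 is enabled on the bridge

# Cluster nodes show up as containers attached to the bridge.
for info in net.attrs.get("Containers", {}).values():
    print(info["Name"], info["IPv4Address"])
```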
# Docker compose to set up containers for all services you need:
# VPN:
#   Sonarr, Radarr, Lidarr, Qbittorrent
# Non-VPN:
#   Plex, get_iplayer
# Before running docker-compose, you should pre-create all of the following folders
# (a sketch for doing this follows the list).
# Folders for Docker state:
# /volume1/dockerdata        - root where this docker-compose.yml should live
# /volume1/dockerdata/plex   - Plex config and DB
# /volume1/dockerdata/sonarr - Sonarr config and DB
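A small sketch for pre-creating that folder layout, assuming the `/volume1/dockerdata` root from the comments above; extend the service list to match every service in your compose file (radarr, lidarr, qbittorrent, get_iplayer, etc.):

```python
from pathlib import Path

# Root where this docker-compose.yml lives, per the comments above.
root = Path("/volume1/dockerdata")

# One state folder per service; extend to match your compose file.
services = ["plex", "sonarr"]

for name in services:
    path = root / name
    path.mkdir(parents=True, exist_ok=True)  # no-op if the folder already exists
    print(f"ensured {path}")
```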