Goals: Add links that give reasonable, clear explanations of how things work. No hype and, where possible, no vendor content. Practical first-hand accounts are preferred (super rare at this point).
My own notes from a few months back.
- Survey of LLMs
- Self-attention and transformer networks
- What are embeddings
- The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)
- Catching up on the weird world of LLMs
- Attention Is All You Need
- Scaling Laws for Neural Language Models
- BERT
- Language Models are Unsupervised Multi-Task Learners
- Training Language Models to Follow Instructions
- Language Models are Few-Shot Learners
- Why host your own LLM?
- How to train your own LLMs
- Training Compute-Optimal Large Language Models
- OPT-175B Logbook
- The case for GZIP Classifiers and more on nearest neighbors algos
- Meta Recsys Using and extending Word2Vec
- The State of GPT (YouTube)
- What is ChatGPT doing and why does it work?
- How is LlamaCPP Possible?
- On Prompt Engineering
- Transformers from Scratch
- Building LLM Applications for Production
- Challenges and Applications of Large Language Models
- All the Hard Stuff Nobody talks about when building products with LLMs
- Scaling Kubernetes to run ChatGPT
- Numbers every LLM Developer should know
Thanks to everyone who added suggestions on Twitter, Mastodon, and Bluesky.
A Visual and Interactive Guide to the Basics of Neural Networks - https://jalammar.github.io/visual-interactive-guide-basics-n...
A Visual And Interactive Look at Basic Neural Network Math - https://jalammar.github.io/feedforward-neural-networks-visua...
Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention) - https://jalammar.github.io/visualizing-neural-machine-transl...
The Illustrated Transformer - https://jalammar.github.io/illustrated-transformer/
The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) - https://jalammar.github.io/illustrated-bert/
The Illustrated GPT-2 (Visualizing Transformer Language Models) - https://jalammar.github.io/illustrated-gpt2/
How GPT3 Works - Visualizations and Animations - https://jalammar.github.io/how-gpt3-works-visualizations-ani...
The Illustrated Retrieval Transformer - https://jalammar.github.io/illustrated-retrieval-transformer...
The Illustrated Stable Diffusion - https://jalammar.github.io/illustrated-stable-diffusion/
If you want to learn how to code them, this book is great: https://d2l.ai/chapter_attention-mechanisms-and-transformers...