Goals: add links that are reasonable, clear explanations of how things work. No hype, and no vendor content if possible. Practical first-hand accounts and experience preferred (still super rare at this point).
My own notes from a few months back.
- Survey of LLMs
- Self-attention and transformer networks
- What are embeddings
- The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)
- Catching up on the weird world of LLMs
- Attention Is All You Need
- Scaling Laws for Neural Language Models
- BERT
- Language Models are Unsupervised Multitask Learners
- Training Language Models to Follow Instructions
- Language Models are Few-Shot Learners
- Why host your own LLM?
- How to train your own LLMs
- Training Compute-Optimal Large Language Models
- OPT-175B Logbook
- The case for GZIP classifiers, and more on nearest-neighbor algorithms (a minimal sketch of the idea follows this list)
- Meta RecSys: using and extending Word2Vec
- The State of GPT (YouTube)
- What is ChatGPT doing and why does it work
- How is LLaMa.cpp possible?
- On Prompt Engineering
- Transformers from Scratch
- Building LLM Applications for Production
- Challenges and Applications of Large Language Models
- All the Hard Stuff Nobody talks about when building products with LLMs
- Scaling Kubernetes to run ChatGPT
- Numbers every LLM Developer should know
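
As a rough illustration of the gzip-classifier idea linked above: the trick is that gzip compresses two texts about the same topic better together than apart, so compressed size works as a crude similarity measure. Below is a minimal sketch of that approach (normalized compression distance plus a k-nearest-neighbor vote); the function names and toy data are my own, not taken from the linked post.

```python
import gzip
import numpy as np

def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two strings, using gzip as the compressor."""
    ca = len(gzip.compress(a.encode()))
    cb = len(gzip.compress(b.encode()))
    cab = len(gzip.compress((a + " " + b).encode()))
    return (cab - min(ca, cb)) / max(ca, cb)

def classify(query: str, train_texts: list[str], train_labels: list[str], k: int = 3) -> str:
    """Label a query by majority vote among its k nearest training texts under NCD."""
    distances = [ncd(query, t) for t in train_texts]
    nearest = np.argsort(distances)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Toy usage (hypothetical data, just to show the shape of the API)
train_texts = ["the cat sat on the mat", "dogs love to fetch sticks",
               "stock prices fell sharply today", "the market rallied after earnings"]
train_labels = ["pets", "pets", "finance", "finance"]
print(classify("the kitten chased a ball of yarn", train_texts, train_labels, k=1))  # -> "pets"
```

No training, no parameters: the whole "model" is the training set plus a compressor, which is what makes it a nice baseline to compare against neural classifiers.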
Thanks to everyone who added suggestions on Twitter, Mastodon, and Bluesky.
The only 5 links you need to understand the Transformer (a minimal self-attention sketch follows the list):
(1) https://youtube.com/watch?v=kCc8FmEb1nY Let's build GPT: from scratch, in code, spelled out by @karpathy
(2) https://youtube.com/watch?v=iDulhoQ2pro Attention Is All You Need explained by @ykilcher
(3) https://jalammar.github.io/illustrated-transformer/ Illustrated Transformer by @jayalammar
(4) https://jaykmody.com/blog/gpt-from-scratch/ GPT in 60 Lines of NumPy by @jaykmody
(5) https://ig.ft.com/generative-ai/ - Generative AI Visualization by the Financial Times
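
All five of those links build up to the same core operation: scaled dot-product self-attention. Here is a minimal single-head NumPy sketch (random toy weights, no masking, no multi-head logic, names are my own) just to pin down the shapes:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x: (seq_len, d_model) token embeddings
    w_q, w_k, w_v: (d_model, d_k) projection matrices
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v      # project tokens into queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])  # similarity between every pair of positions
    weights = softmax(scores, axis=-1)       # each row is a distribution over positions
    return weights @ v                       # weighted mix of values per position

# Toy usage: 4 tokens, model width 8, head width 4
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (4, 4)
```

Everything else in a Transformer block (multiple heads, causal masking, residuals, layer norm, the MLP) is layered around this one matrix recipe, which is why the links above keep returning to it.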