LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copy-pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.
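
To make the "rediscovering from scratch" point concrete, here is a minimal sketch of query-time retrieval. It is not how any particular product works: the token-overlap score stands in for an embedding search, and the names `tokenize`, `score`, `retrieve`, and `build_prompt` are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, chunk: str) -> int:
    """Toy relevance score: occurrences of distinct query tokens in the chunk.

    Assumption: a real RAG system would use embeddings or BM25 here.
    """
    chunk_counts = Counter(tokenize(chunk))
    return sum(chunk_counts[token] for token in set(tokenize(query)))


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k highest-scoring chunks for this one query."""
    ranked = sorted(chunks, key=lambda chunk: score(query, chunk), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Assemble the prompt that would be sent to the LLM (hypothetical layout).

    Nothing is stored between calls: every question re-ranks the same raw chunks.
    """
    context = "\n".join(retrieve(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Calling `build_prompt` twice with the same question does the same ranking work twice and leaves `chunks` untouched, which is the lack of accumulation the paragraph above describes.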

@greenstevester
greenstevester / how-to-setup-ollama-on-a-macmini.md
Last active May 13, 2026 01:56
April 2026 TLDR setup for Ollama + Gemma 4 12B on a Mac mini (Apple Silicon) — auto-start, preload, and keep-alive

April 2026 TLDR Setup for Ollama + Gemma 4 on a Mac mini (Apple Silicon)

Prerequisites

  • Mac mini with Apple Silicon (M1/M2/M3/M4/M5)
  • At least 16GB unified memory for Gemma 4 (default 8B)
  • macOS with Homebrew installed
@burkeholland
burkeholland / ainstall.md
Last active May 18, 2026 14:21
Ultralight Orchestration

A minimal multi-agent system in which an orchestrator, a planner, a coder, and a designer work together, orchestrating Claude, Codex, and Gemini.

Instructions

Install all agents listed below into VS Code Insiders...

Title Type Description
@bddicken
bddicken / Contour.html
Last active April 24, 2025 13:34
Contour Images
<html>
<head>
<script is:inline defer src="https://cdn.jsdelivr.net/npm/img-comparison-slider@8/dist/index.js"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/img-comparison-slider@8/dist/styles.css" />
<style>
.slider {
width: 70%;
@adtac
adtac / Dockerfile
Last active December 26, 2025 00:20
#!/usr/bin/env docker run
#!/usr/bin/env -S bash -c "docker run -p 8080:8080 -it --rm \$(docker build --progress plain -f \$0 . 2>&1 | tee /dev/stderr | grep -oP 'sha256:[0-9a-f]*')"
# syntax = docker/dockerfile:1.4.0
FROM node:20
WORKDIR /root
RUN npm install sqlite3
@jonstuebe
jonstuebe / ReSocket.ts
Created December 22, 2023 15:33
A WebSocket class with reconnection support
class ReSocket {
private token: string | undefined;
private socket: WebSocket | undefined;
private listeners: boolean = false;
private currentAttempt: number = 0;
private backoffTime: number = 1000;
private maxAttempts: number = 30;
private timer: NodeJS.Timeout | undefined;
private messageFn: (data: any) => void = (data) => {
//
@hyperupcall
hyperupcall / settings.jsonc
Last active March 17, 2026 09:13
VSCode config to disable popular extensions' annoyances (telemetry, notifications, welcome pages, etc.)
// I'm tired of extensions that automatically:
// - show welcome pages / walkthroughs
// - show release notes
// - send telemetry
// - recommend things
//
// This disables all of that stuff.
// If you have more config, leave a comment so I can add it!!
{
@rain-1
rain-1 / llama-home.md
Last active March 1, 2026 16:35
How to run Llama 13B with a 6GB graphics card

This worked on 14/May/23. The instructions will probably require updating in the future.

llama is a text prediction model similar to GPT-2 and to the version of GPT-3 that has not been fine-tuned yet. It is also possible to run fine-tuned versions (like alpaca or vicuna) with this, I think. Those versions are more focused on answering questions.

Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.

It is possible to run Llama 13B with a 6GB graphics card now (e.g. an RTX 2060), thanks to the amazing work involved in llama.cpp. The latest change is CUDA/cuBLAS support, which allows you to pick an arbitrary number of the transformer layers to be run on the GPU. This is perfect for low VRAM.

  • Clone llama.cpp from git. I am on commit 08737ef720f0510c7ec2aa84d7f70c691073c35d.
@gaearon
gaearon / 00-README-NEXT-SPA.md
Last active January 29, 2026 09:20
Next.js SPA example with dynamic client-only routing and static hosting

Next.js client-only SPA example

Made this example to show how to use Next.js router for a 100% SPA (no JS server) app.

You use the Next.js router as you normally would, but don't define getStaticProps and such. Instead, you do client-only fetching with swr, react-query, or similar methods.

You can generate an HTML fallback for the page if there's something meaningful to show before you "know" the params. (Remember, HTML is static, so it can't respond to a dynamic query. But it can be different per route.)

Don't like Next? Here's how to do the same in Gatsby.

@matheuscorreia
matheuscorreia / removeGlassDoorContentWall.js
Created February 27, 2023 20:49
A Tampermonkey user script to remove the content wall from Glassdoor review pages
// ==UserScript==
// @name Remove Glassdoor reviews content wall
// @namespace mailto:matheuscorreia2005@gmail.com
// @version 1.0
// @description Get rid of the annoying review content wall when browsing glassdoor.
// @author Matheus Correia
// @match https://www.glassdoor.com/Reviews/*
// @icon https://www.glassdoor.com/favicon.ico
// @grant none
// @run-at document-end