Christian Boyle christianboyle

LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copy-pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

April 2026 TLDR setup for Ollama + Gemma 4 on a Mac mini (Apple Silicon) — auto-start, preload, and keep-alive


Prerequisites

Mac mini with Apple Silicon (M1/M2/M3/M4/M5)
At least 16GB unified memory for Gemma 4 (default 8B)
macOS with Homebrew installed
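
The preload and keep-alive steps named in the title can be sketched against Ollama's local HTTP API. This is a minimal sketch, not the gist's own method: it assumes Ollama is listening on its default port 11434, and the model tag passed in is a placeholder you would replace with whatever tag your installation actually uses. Sending a generate request with no prompt asks Ollama to load the model, and `keep_alive: -1` asks it to keep the model in memory indefinitely.

```typescript
// Sketch only: helper names are hypothetical, and the model tag is supplied by the caller.
interface PreloadRequest {
  model: string;
  keep_alive: number | string; // -1 = keep loaded indefinitely, "10m" = ten minutes
}

// Build the request body that loads a model without generating any text.
function buildPreloadRequest(
  model: string,
  keepAlive: number | string = -1,
): PreloadRequest {
  return { model, keep_alive: keepAlive };
}

// POST the body to a locally running Ollama server (default port assumed).
async function preloadModel(
  model: string,
  baseUrl: string = "http://localhost:11434",
): Promise<boolean> {
  const res = await fetch(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildPreloadRequest(model)),
  });
  return res.ok;
}
```

Calling `preloadModel("<your-model-tag>")` once after the server starts would warm the model so the first real request does not pay the load time.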

Ultralight Orchestration

A minimal multi-agent system in which an orchestrator, a planner, a coder, and a designer work together, orchestrating between Claude, Codex, and Gemini.

Instructions

Install all agents listed below into VS Code Insiders...


This worked on 14/May/23. The instructions will probably require updating in the future.

llama is a text prediction model similar to GPT-2 and to the version of GPT-3 that has not been fine-tuned yet. It is also possible, I think, to run fine-tuned versions (like alpaca or vicuna) with this. Those versions are more focused on answering questions.

Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.

It is possible to run LLaMA 13B with a 6GB graphics card now (e.g. an RTX 2060), thanks to the amazing work involved in llama.cpp. The latest change is CUDA/cuBLAS support, which allows you to pick an arbitrary number of the transformer layers to be run on the GPU. This is perfect for low VRAM.
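
As a rough way to reason about how many layers to offload, the arithmetic can be sketched as below. This is an illustrative estimate under stated assumptions, not something taken from llama.cpp: the helper name, the per-layer size, and the reserved headroom are all hypothetical, and you would measure real values for your quantization. The only model fact used is that LLaMA 13B has 40 transformer layers. The result is the kind of number you would pass to llama.cpp's GPU layer option (check `--help` on your build for the exact flag).

```typescript
// Hypothetical helper: estimate how many transformer layers fit in VRAM.
// All sizes are in MiB and must be measured or estimated by the caller.
function layersThatFit(
  vramMiB: number,      // total VRAM on the card
  perLayerMiB: number,  // assumed size of one offloaded layer
  reservedMiB: number,  // headroom for scratch buffers, the desktop, etc.
  totalLayers: number,  // 40 for LLaMA 13B
): number {
  if (perLayerMiB <= 0) {
    throw new RangeError("perLayerMiB must be positive");
  }
  const usable = Math.max(0, vramMiB - reservedMiB);
  // Never report more layers than the model has.
  return Math.min(totalLayers, Math.floor(usable / perLayerMiB));
}

// Example with made-up sizes: a 6 GiB card, 200 MiB per layer, 1 GiB reserved.
const estimate = layersThatFit(6144, 200, 1024, 40);
console.log(`offload about ${estimate} layers`);
```

If the estimate turns out too optimistic and the run fails with an out-of-memory error, lowering the layer count by a few and retrying is the simple fix.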

Clone llama.cpp from git. I am on commit 08737ef720f0510c7ec2aa84d7f70c691073c35d.

Next.js client-only SPA example

Made this example to show how to use Next.js router for a 100% SPA (no JS server) app.

You use the Next.js router as usual, but don't define getStaticProps and the like. Instead, you do client-only fetching with swr, react-query, or similar methods.

You can generate an HTML fallback for the page if there's something meaningful to show before you "know" the params. (Remember, HTML is static, so it can't respond to a dynamic query. But it can be different per route.)
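
The client-only fetching described above hinges on not fetching until the router actually knows the params. A minimal sketch of that gate, assuming a dynamic route with an `id` param and a hypothetical external API at `https://api.example.com` (neither is from the original example): the helper returns `null` until the params are known, and swr skips the request when its key is `null`.

```typescript
// Shape of router.query in the Next.js pages router.
type Query = Record<string, string | string[] | undefined>;

// Hypothetical helper: return a fetch key only once the params are known.
function postKey(isReady: boolean, query: Query): string | null {
  if (!isReady) return null; // static HTML fallback is showing, params unknown
  const id = query.id;
  if (typeof id !== "string" || id === "") return null;
  return `https://api.example.com/posts/${encodeURIComponent(id)}`;
}

// Intended use inside a page component (not runnable outside React):
//   const router = useRouter();
//   const { data } = useSWR(postKey(router.isReady, router.query), fetcher);
```

Keeping the gate as a pure function makes it easy to test without rendering anything, and the same key works with react-query's `enabled` option if you prefer that library.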

Don't like Next? Here's how to do the same in Gatsby.

	<html>

	<head>

	<script is:inline defer src="https://cdn.jsdelivr.net/npm/img-comparison-slider@8/dist/index.js"></script>
	<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/img-comparison-slider@8/dist/styles.css" />

	<style>
	.slider {
	width: 70%;

	#!/usr/bin/env -S bash -c "docker run -p 8080:8080 -it --rm \$(docker build --progress plain -f \$0 . 2>&1 | tee /dev/stderr | grep -oP 'sha256:[0-9a-f]*')"

	# syntax = docker/dockerfile:1.4.0

	FROM node:20

	WORKDIR /root

	RUN npm install sqlite3

	class ReSocket {
	private token: string | undefined;
	private socket: WebSocket | undefined;
	private listeners: boolean = false;
	private currentAttempt: number = 0;
	private backoffTime: number = 1000;
	private maxAttempts: number = 30;
	private timer: NodeJS.Timeout | undefined;
	private messageFn: (data: any) => void = (data) => {
	//

	// I'm tired of extensions that automatically:
	// - show welcome pages / walkthroughs
	// - show release notes
	// - send telemetry
	// - recommend things
	//
	// This disables all of that stuff.
	// If you have more config, leave a comment so I can add it!!

	{

	// ==UserScript==
	// @name Remove Glassdoor reviews content wall
	// @namespace mailto:matheuscorreia2005@gmail.com
	// @version 1.0
	// @description Get rid of the annoying review content wall when browsing glassdoor.
	// @author Matheus Correia
	// @match https://www.glassdoor.com/Reviews/*
	// @icon https://www.glassdoor.com/favicon.ico
	// @grant none
	// @run-at document-end