Steve Severance sseveran

LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copied and pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and it generates an answer. This works, but the LLM rediscovers knowledge from scratch on every question. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing accumulates. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.
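
To make the query-time pattern concrete, here is a minimal sketch of the retrieval step in plain Python. It is an illustration under stated assumptions, not how any of the products named above are implemented: the bag-of-words cosine score stands in for a real embedding model, and `CHUNKS`, `retrieve`, and `answer` are hypothetical names. The point it shows is that every call starts again from the raw chunks and nothing is kept between questions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Rank chunks against the query and return the top k."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


# Hypothetical document chunks (assumption, for illustration only).
CHUNKS = [
    "The wiki page links related notes together.",
    "Retrieval selects chunks by similarity to the query.",
    "Bananas are rich in potassium.",
]


def answer(query, chunks):
    """Stateless query path: retrieval is redone from scratch on each call."""
    context = retrieve(query, chunks, k=1)
    # A real system would pass `context` to an LLM here; this sketch returns it.
    return context
```

Calling `answer` twice with the same question repeats the same ranking work, which is the lack of accumulation the paragraph above describes.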

tamuhey / tokenizations_post.md
Last active July 27, 2024 14:46
How to calculate the alignment between BERT and spaCy tokens effectively and robustly

site: https://tamuhey.github.io/tokenizations/

Natural Language Processing (NLP) has made great progress in recent years because of neural networks, which allow us to solve various tasks with end-to-end architectures. However, many NLP systems still require language-specific pre- and post-processing, especially in tokenization. One such process is calculating the correspondence between two tokenizations of the same text (e.g. BERT vs. spaCy). In this article, I describe an algorithm that simplifies this calculation, and I introduce Python and Rust libraries that implement it. Here are the library and the demo site links:
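
As a rough illustration of what a token correspondence looks like, the sketch below aligns two tokenizations by comparing their characters. It is a simplified stand-in, not the algorithm or the API of the libraries mentioned above: it assumes lowercasing and stripping the WordPiece `##` prefix is enough normalization, and it uses the standard library's `difflib.SequenceMatcher` to match characters. The names `normalize`, `char_to_token`, and `align` are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(token):
    """Crude normalization (assumption): lowercase and drop a WordPiece '##' prefix."""
    token = token.lower()
    if token.startswith("##") and len(token) > 2:
        return token[2:]
    return token


def char_to_token(tokens):
    """For each character of the concatenated tokens, record its token index."""
    return [i for i, tok in enumerate(tokens) for _ in tok]


def align(tokens_a, tokens_b):
    """Return a2b, where a2b[i] lists the tokens_b indices sharing characters with tokens_a[i]."""
    a = [normalize(t) for t in tokens_a]
    b = [normalize(t) for t in tokens_b]
    owner_a, owner_b = char_to_token(a), char_to_token(b)
    matcher = SequenceMatcher(None, "".join(a), "".join(b), autojunk=False)
    a2b = [set() for _ in a]
    for block in matcher.get_matching_blocks():
        # Each matched character links the token owning it in a to the one in b.
        for offset in range(block.size):
            a2b[owner_a[block.a + offset]].add(owner_b[block.b + offset])
    return [sorted(indices) for indices in a2b]


spacy_like = ["New", "York"]
bert_like = ["new", "yo", "##rk"]
a2b = align(spacy_like, bert_like)
```

Here `a2b` maps the first word to the first subword and the second word to the last two subwords. Swapping the arguments gives the reverse mapping.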

piyueh / tf_keras_tfp_lbfgs.py
Last active April 28, 2025 04:09
Optimize TensorFlow & Keras models with L-BFGS from TensorFlow Probability
#! /usr/bin/env python
# -*- coding: utf-8 -*-
# vim:fenc=utf-8
#
# Copyright © 2019 Pi-Yueh Chuang <pychuang@gwu.edu>
#
# Distributed under terms of the MIT license.
"""An example of using tfp.optimizer.lbfgs_minimize to optimize a TensorFlow model.
Daenyth / SlickUpsert.scala
Created February 26, 2018 20:59
A slick profile extension to allow native postgres batch upsert
import com.github.tminglei.slickpg.ExPostgresProfile
import slick.SlickException
import slick.ast.ColumnOption.PrimaryKey
import slick.ast.{ColumnOption, FieldSymbol, Insert, Node, Select}
import slick.compiler.{InsertCompiler, Phase, QueryCompiler}
import slick.dbio.{Effect, NoStream}
import slick.jdbc.InsertBuilderResult
import slick.lifted.Query
// format: off
ZeccaLehn / pythonActualPrice.ipynb
Last active July 10, 2023 10:10
Py: Adjust Splits and Dividends from Real Prices using Quandl Finance Data
var active = false;
function changeRefer(details) {
  if (!active) return;
  for (var i = 0; i < details.requestHeaders.length; ++i) {
    if (details.requestHeaders[i].name === 'Referer') {
      details.requestHeaders[i].value = 'http://www.google.com/';
      break;
    }