@ubergarm
ubergarm / DeepSeek-R1-Quantized-GGUF-Gaming-Rig-Inferencing-Fast-NVMe-SSD.md
Last active June 16, 2025 23:57
Run DeepSeek R1 671B unsloth GGUF locally with ktransformers or llama.cpp on high end gaming rig!

tl;dr;

UPDATE Mon Mar 10 10:51:31 AM EDT 2025 Check out the newer ktransformers guide for how to get it running faster! About 3.5 tok/sec on this same gaming rig. Big thanks to Supreeth Koundinya with analyticsindiamag.com for the article!

You can run the real deal big boi R1 671B locally off a fast NVMe SSD even without enough RAM+VRAM to hold the 212GB dynamically quantized weights. No, it is not swap, and it won't kill your SSD's read/write cycle lifetime. No, this is not a distill model. It works fairly well despite quantization (check the unsloth blog for details on how they did that).

The basic idea is that most of the model itself is not loaded into RAM on startup, but mmap'd. Then kv cache will take up some RAM. Most of your system RAM is left available to serve as disk cache for whatever experts/weights are currently most u

@Artefact2
Artefact2 / README.md
Last active June 22, 2025 15:51
GGUF quantizations overview

Which GGUF is right for me? (Opinionated)

Good question! I am collecting human data on how quantization affects outputs. See here for more information: ggml-org/llama.cpp#5962

In the meantime, use the largest that fully fits in your GPU. If you can comfortably fit Q4_K_S, try using a model with more parameters.

llama.cpp feature matrix

See the wiki upstream: https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix

@DanielAugusto191
DanielAugusto191 / LLVM 17.md
Created November 8, 2023 22:17
LLVM 17.md

LLVM, which originally stood for Low Level Virtual Machine, is a compiler infrastructure that can represent and optimize code in an assembly-like intermediate representation (IR).

To compile to LLVM we can use:

clang -S -emit-llvm {input} -o {output}.ll

It will produce human-readable LLVM IR, much like assembly, with registers, calls, and functions. The -c form below writes binary bitcode instead.

clang -c -emit-llvm {input} -o {output}.bc
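As a small input to try with the commands above, a C file along these lines should be enough to see functions, registers, and calls in the emitted output. The file name `square.c` and both function names are illustrative assumptions, not part of the original note.

```c
/* square.c: hypothetical input file for the clang -emit-llvm commands. */
#include <stdio.h>

/* A tiny function, so the emitted IR contains one short function definition. */
int square(int x) {
    return x * x;
}

/* Calls square() and printf(), so the emitted IR also contains call instructions. */
void print_square(int x) {
    printf("%d\n", square(x));
}
```

No `main` is needed here, because neither command links an executable.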
@apla
apla / convert-jscad-v1.js
Last active March 18, 2024 15:13
WIP Convert JSCAD V1 => V2
const fs = require('fs').promises;
const { parse } = require('acorn');
const acornWalk = require('acorn-walk');
// Important limitations:
// Variables in attributes cannot be processed automatically; the converter will throw in that case
// Whitespace and source comments are sometimes lost, especially in the first argument
// Constructs like x.difference(y), z.union(a, b, c) are probably not supported - they need parameter reordering
// If something does not work and you want to add support for it, modify `cluster.chunks.reduce` in `processCluster`
@pervognsen
pervognsen / shift_dfa.md
Last active May 23, 2025 10:30
Shift-based DFAs

A traditional table-based DFA implementation looks like this:

uint8_t table[NUM_STATES][256];

uint8_t run(const uint8_t *start, const uint8_t *end, uint8_t state) {
    for (const uint8_t *s = start; s != end; s++)
        state = table[state][*s];
    return state;
}
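To make the table-based loop above concrete, here is a minimal sketch of a two-state DFA that tracks whether the input contains an odd number of `a` bytes. The state names, `init_table`, and `has_odd_a_count` are assumptions made up for illustration. This is the traditional table-based form, not the shift-based variant.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical two-state machine: parity of the number of 'a' bytes seen. */
enum { EVEN = 0, ODD = 1, NUM_STATES = 2 };

static uint8_t table[NUM_STATES][256];

/* Fill the transition table: every byte keeps the current state,
   except 'a', which flips EVEN <-> ODD. */
static void init_table(void) {
    for (int state = 0; state < NUM_STATES; state++) {
        for (int byte = 0; byte < 256; byte++)
            table[state][byte] = (uint8_t)state;
    }
    table[EVEN]['a'] = ODD;
    table[ODD]['a'] = EVEN;
}

/* Same loop as in the text: one table lookup per input byte. */
static uint8_t run(const uint8_t *start, const uint8_t *end, uint8_t state) {
    for (const uint8_t *s = start; s != end; s++)
        state = table[state][*s];
    return state;
}

/* Illustrative wrapper: returns 1 if text has an odd number of 'a' bytes. */
static int has_odd_a_count(const char *text) {
    const uint8_t *start = (const uint8_t *)text;
    return run(start, start + strlen(text), EVEN) == ODD;
}
```

Call `init_table()` once before using `run`. After that, `has_odd_a_count("banana")` returns 1, since the string holds three `a` bytes. Each step's lookup depends on the state produced by the previous step.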
@bpsib
bpsib / BBC-Radio-HLS.m3u
Last active June 20, 2025 10:21 — forked from stengland/BBC-Radio.m3u
BBC Radio Streams
#EXTM3U
#EXTINF:-1,BBC - Radio 1
http://as-hls-ww-live.akamaized.net/pool_01505109/live/ww/bbc_radio_one/bbc_radio_one.isml/bbc_radio_one-audio%3d96000.norewind.m3u8
#EXTINF:-1,BBC - Radio 1Xtra
http://as-hls-ww-live.akamaized.net/pool_92079267/live/ww/bbc_1xtra/bbc_1xtra.isml/bbc_1xtra-audio%3d96000.norewind.m3u8
#EXTINF:-1,BBC - Radio 1Dance
http://as-hls-ww-live.akamaized.net/pool_62063831/live/ww/bbc_radio_one_dance/bbc_radio_one_dance.isml/bbc_radio_one_dance-audio%3d96000.norewind.m3u8
#EXTINF:-1,BBC - Radio 1 Anthems (UK Only)
http://as-hls-uk-live.akamaized.net/pool_904/live/uk/bbc_radio_one_anthems/bbc_radio_one_anthems.isml/bbc_radio_one_anthems-audio%3d96000.norewind.m3u8
#EXTINF:-1,BBC - Radio 2
// Copyright (C) 2019, Dan Ravensloft
// SPDX-License-Identifier: GPL-3.0-or-later
library(74series) {
// 7400 quad 2-input NAND gate
cell(7400_4xNAND2) {
area: 3;
pin(A) { direction: input; }
pin(B) { direction: input; }
pin(Y) { direction: output; function: "(A*B)'"; }
}
@rikka0w0
rikka0w0 / ft2232_to_digilent_jtag.md
Last active June 19, 2025 16:00
FT2232 to Digilent JTag for Xilinx FPGAs (ISE/Vivado)

The Digilent JTag uses an FT2232, but its configuration EEPROM contains secret data that is needed for it to be recognized by Xilinx ISE/Vivado. The following method only works on Linux (tested on Ubuntu 16.04), but the patched FT2232 dongle also works on Windows. Since WSL1 does not provide USB device access, the method will not work under WSL1.

DON'T use FT_Prog on an official Digilent cable, as it can trash the firmware! The official EEPROM contains secret data that FT_Prog cannot handle correctly.

Here are the steps to create a Digilent-like JTag that can be used in Xilinx ISE and Vivado:

  1. Install the required software: sudo apt-get install libftdi1 ftdi-eeprom
  2. Create a file "flash_digilent.conf" with the following content:
@AveYo
AveYo / .. MediaCreationTool.bat ..md
Last active June 16, 2025 08:54
Universal MediaCreationTool wrapper for all MCT Windows 10 versions - MOVED TO github.com/AveYo/MediaCreationTool.bat
@jovimon
jovimon / gist:524e116471f249626fd2ccd141f3fe05
Last active July 15, 2024 08:23
compile realtek network driver for pfsense 2.4.x

How to compile and install the latest Realtek network driver in pfSense 2.4.x (FreeBSD 11.1)

  1. Download FreeBSD 11.1 VMDK and create a VM with it as HDD.

  2. Get FreeBSD source tree for your exact FreeBSD version and uncompress it to /usr/src:

    fetch -o /tmp ftp://ftp.freebsd.org/pub/`uname -s`/releases/`uname -m`/`uname -r | cut -d'-' -f1,2`/src.txz
    tar -C / -xvf /tmp/src.txz
    
  3. Download the latest Realtek network driver (you need to enter an email address).