Udi Finkelstein udif

tl;dr;

UPDATE Mon Mar 10 10:51:31 AM EDT 2025 Check out the newer ktransformers guide for how to get it running faster! About 3.5 tok/sec on this same gaming rig. Big thanks to Supreeth Koundinya with analyticsindiamag.com for the article!

You can run the real deal big boi R1 671B locally off a fast NVMe SSD even without enough RAM+VRAM to hold the 212GB dynamically quantized weights. No, it is not swap, and it won't kill your SSD's read/write cycle lifetime. No, this is not a distill model. It works fairly well despite quantization (check the unsloth blog for details on how they did that).

The basic idea is that most of the model itself is not loaded into RAM on startup, but mmap'd. Then kv cache will take up some RAM. Most of your system RAM is left available to serve as disk cache for whatever experts/weights are currently most u
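As a rough illustration of the mmap idea, here is a minimal Python sketch. It is not how llama.cpp loads GGUF files; the demo file, its size, and the helper names `make_demo_file` and `read_slice` are all hypothetical. It only shows that mapping a file does not read it up front: the OS pulls in the pages you actually touch and keeps them in the page cache.

```python
import mmap
import os
import tempfile


def make_demo_file(size=1 << 20):
    """Create a hypothetical stand-in for a weights file: a repeating 0..255 byte pattern."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(range(256)) * (size // 256))
    return path


def read_slice(path, offset, length):
    """Map the file read-only and copy out one slice.

    Mapping does not load the file; only the pages backing the
    requested slice need to be read from disk (or the page cache).
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return bytes(view[offset:offset + length])


if __name__ == "__main__":
    demo = make_demo_file()
    try:
        print(read_slice(demo, 256 * 10 + 5, 3))
    finally:
        os.remove(demo)
```

Because the mapping is read-only, touching it never writes to the SSD, which is why this access pattern differs from swap.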

Which GGUF is right for me? (Opinionated)

Good question! I am collecting human data on how quantization affects outputs. See here for more information: ggml-org/llama.cpp#5962

In the meantime, use the largest that fully fits in your GPU. If you can comfortably fit Q4_K_S, try using a model with more parameters.

llama.cpp feature matrix

See the wiki upstream: https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix

LLVM (the name originally stood for Low Level Virtual Machine) is a compiler infrastructure whose intermediate representation (IR) can represent and optimize code at a level close to assembly.

To compile to human-readable LLVM IR we can use:

clang -S -emit-llvm {input} -o {output}.ll

It produces textual IR, which looks much like assembly with registers, calls, and functions.

To produce binary bitcode instead:

clang -c -emit-llvm {input} -o {output}.bc

A traditional table-based DFA implementation looks like this:

uint8_t table[NUM_STATES][256];

uint8_t run(const uint8_t *start, const uint8_t *end, uint8_t state) {
    for (const uint8_t *s = start; s != end; s++)
        state = table[state][*s];
    return state;
}
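To make the table lookup concrete, here is a small Python sketch of the same loop with a filled-in table. The three-state automaton (state 2 means "the input ends with ab") and the helper name `build_ends_with_ab_table` are made up for illustration; only the `run` loop mirrors the C code above.

```python
def build_ends_with_ab_table():
    """Build a 3-state table: 0 = start, 1 = last byte was 'a', 2 = input ends with 'ab'."""
    num_states = 3
    # Every unspecified transition falls back to state 0.
    table = [[0] * 256 for _ in range(num_states)]
    for state in range(num_states):
        table[state][ord("a")] = 1  # an 'a' always starts a possible match
    table[1][ord("b")] = 2          # 'a' followed by 'b' completes the match
    return table


def run(table, data, state=0):
    """One table lookup per input byte, exactly like the C loop."""
    for byte in data:
        state = table[state][byte]
    return state


if __name__ == "__main__":
    table = build_ends_with_ab_table()
    for sample in (b"xxab", b"aba", b""):
        print(sample, run(table, sample))
```

Each step costs one indexed load, and the next load depends on the state produced by the previous one, so the loop is bound by that dependency chain rather than by arithmetic.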

The Digilent JTAG cable uses an FT2232, but its configuration EEPROM contains secret data that is needed for it to be recognized by Xilinx ISE/Vivado. The following method only works on Linux (tested on Ubuntu 16.04), but the patched FT2232 dongle also works on Windows. Since WSL1 does not provide USB device access, the following method will not work on WSL1.

DON'T use FT_Prog on an official Digilent cable, as it can trash the firmware! The official EEPROM contains secret data that FT_Prog cannot handle correctly.

Here are the steps to create a Digilent-like JTAG cable that can be used in Xilinx ISE and Vivado:

Install the software: sudo apt-get install libftdi1 ftdi-eeprom
Create a file "flash_digilent.conf" with the following content:

We did it! We broke gist.github.com ;) So head over to the new home! Thank you all!
2021.10.20: https://github.com/AveYo/MediaCreationTool.bat now open for interaction

Not just a Universal MediaCreationTool wrapper script with ingenious support for business editions,

A powerful yet simple Windows 10 / 11 deployment automation tool as well!

discuss on MDL

How to compile and install the latest Realtek network driver in pfSense 2.4.x (FreeBSD 11.1)

Download the FreeBSD 11.1 VMDK and create a VM with it as the HDD.

Get the FreeBSD source tree for your exact FreeBSD version and uncompress it to /usr/src:

fetch -o /tmp ftp://ftp.freebsd.org/pub/`uname -s`/releases/`uname -m`/`uname -r | cut -d'-' -f1,2`/src.txz
tar -C / -xvf /tmp/src.txz

Download the latest Realtek network driver (you need to enter an email address).

	const fs = require('fs').promises;

	const { parse } = require('acorn');
	const acornWalk = require('acorn-walk');

	// Important limitations:
	// Variables in attributes cannot be processed automatically; the converter will throw in that case
	// Whitespace and source comments are sometimes lost, especially in the first argument
	// Constructs like x.difference(y), z.union(a, b, c) are probably not supported - they need parameter reordering
	// If something does not work and you want to add support for it, modify `cluster.chunks.reduce` in `processCluster`

	#EXTM3U
	#EXTINF:-1,BBC - Radio 1
	http://as-hls-ww-live.akamaized.net/pool_01505109/live/ww/bbc_radio_one/bbc_radio_one.isml/bbc_radio_one-audio%3d96000.norewind.m3u8
	#EXTINF:-1,BBC - Radio 1Xtra
	http://as-hls-ww-live.akamaized.net/pool_92079267/live/ww/bbc_1xtra/bbc_1xtra.isml/bbc_1xtra-audio%3d96000.norewind.m3u8
	#EXTINF:-1,BBC - Radio 1Dance
	http://as-hls-ww-live.akamaized.net/pool_62063831/live/ww/bbc_radio_one_dance/bbc_radio_one_dance.isml/bbc_radio_one_dance-audio%3d96000.norewind.m3u8
	#EXTINF:-1,BBC - Radio 1 Anthems (UK Only)
	http://as-hls-uk-live.akamaized.net/pool_904/live/uk/bbc_radio_one_anthems/bbc_radio_one_anthems.isml/bbc_radio_one_anthems-audio%3d96000.norewind.m3u8
	#EXTINF:-1,BBC - Radio 2

	// Copyright (C) 2019, Dan Ravensloft
	// SPDX-License-Identifier: GPL-3.0-or-later
	library(74series) {
	// 7400 quad 2-input NAND gate
	cell(7400_4xNAND2) {
	area: 3;
	pin(A) { direction: input; }
	pin(B) { direction: input; }
	pin(Y) { direction: output; function: "(A*B)'"; }
	}