@NoteAfterNote
NoteAfterNote / note-after-note-2025-april-7-tar-archive-mkfs-ext4-filesystem.md
Last active April 14, 2025 07:11
Using a tar archive with "mkfs.ext4 -d" to populate the ext4 filesystem

NoteAfterNote-11
Published: April 7, 2025
Link: https://gist.github.com/NoteAfterNote/65a139ce70cbf27c4875aaaee0e779cc


Here's the description for "-d root-directory|tarball" from the mkfs.ext4 man page:

"Copy the contents of the given directory or tarball into the root directory of the file system. Tarball input is only available if mke2fs was compiled with libarchive support enabled and if the libarchive shared library is available at run-time. The special value "-" will read a tarball from standard input."

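The `-d` option quoted above can be sketched as follows. This is a minimal illustration, not a tested recipe: it assumes an `mkfs.ext4` built with libarchive support, and the image name `rootfs.img`, the size `64M`, and the helper names `build_tarball` and `mkfs_command` are all hypothetical. The script only builds a tarball in memory and assembles the command line; the call that would create the image is left commented out.

```python
import io
import tarfile


def build_tarball(files: dict[str, bytes]) -> bytes:
    """Return an uncompressed tar archive holding the given path -> content pairs."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for path, content in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def mkfs_command(image: str, size: str, source: str = "-") -> list[str]:
    """Assemble the mkfs.ext4 call; a source of "-" means the tarball comes from stdin."""
    return ["mkfs.ext4", "-d", source, image, size]


# Hypothetical file tree and image name, for illustration only.
tarball = build_tarball({"etc/hostname": b"demo\n"})
command = mkfs_command("rootfs.img", "64M")
print(" ".join(command))

# To create the image for real (needs a libarchive-enabled mke2fs):
#   import subprocess
#   subprocess.run(command, input=tarball, check=True)
```

Passing `-` as the source matches the man page's "special value" for reading the tarball from standard input, which avoids writing the archive to disk first.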
Termux: Enable Wake-Lock

@ubergarm
ubergarm / DeepSeek-R1-Quantized-GGUF-Gaming-Rig-Inferencing-Fast-NVMe-SSD.md
Last active April 17, 2025 16:55
Run DeepSeek R1 671B unsloth GGUF locally with ktransformers or llama.cpp on high end gaming rig!

tl;dr;

UPDATE Mon Mar 10 10:51:31 AM EDT 2025 Check out the newer ktransformers guide for how to get it running faster! About 3.5 tok/sec on this same gaming rig. Big thanks to Supreeth Koundinya with analyticsindiamag.com for the article!

You can run the real deal big boi R1 671B locally off a fast NVMe SSD even without enough RAM+VRAM to hold the 212GB dynamically quantized weights. No it is not swap and won't kill your SSD's read/write cycle lifetime. No this is not a distill model. It works fairly well despite quantization (check the unsloth blog for details on how they did that).

The basic idea is that most of the model itself is not loaded into RAM on startup, but mmap'd. Then kv cache will take up some RAM. Most of your system RAM is left available to serve as disk cache for whatever experts/weights are currently most u
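The mmap idea can be illustrated with a small sketch. It is not how llama.cpp or ktransformers load weights; it only shows the general mechanism, using a hypothetical 1 MiB stand-in file and a hypothetical helper `read_slice`. Mapping the file does not copy it into memory; the operating system typically reads pages from disk only when they are touched and keeps them in the page cache.

```python
import mmap
import os
import tempfile


def read_slice(path: str, offset: int, length: int) -> bytes:
    """Map the file read-only and copy out one slice of it."""
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[offset:offset + length]


# Stand-in "weights" file: 1 MiB of zeros with a marker in the middle.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 524288 + b"expert-7" + b"\0" * 524280)
    weights_path = tmp.name

chunk = read_slice(weights_path, 524288, 8)
print(chunk)
os.remove(weights_path)
```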

@disler
disler / README.md
Last active April 5, 2025 14:30
Prompt Chaining with QwQ, Qwen, o1-mini, Ollama, and LLM

Here we explore prompt chaining with local reasoning models in combination with base models. With shockingly powerful local models like QwQ and Qwen, we can build prompt chains that let us tap into their capabilities in an immediately useful, local, private, AND free way.

Explore the idea of building prompt chains where the first step is a powerful reasoning model that generates a response, and the second step is a base model that extracts the final response from that output.

Play with the prompts and models to see what works best for your use cases. Use the o1 series to see how QwQ compares.
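The two-step chain described above can be sketched like this. It is a minimal illustration, not the gist's `chain.ts`: the `chain` function, the extraction prompt wording, and the two stub models are hypothetical, and the stubs stand in for real calls to a reasoning model and a base model.

```python
from typing import Callable

# A "model" here is any function that maps a prompt to a completion.
Model = Callable[[str], str]


def chain(reasoner: Model, extractor: Model, question: str) -> str:
    """Run the reasoning model first, then let the base model extract the answer."""
    reasoning = reasoner(question)
    extraction_prompt = (
        "Extract only the final answer from the text below.\n\n" + reasoning
    )
    return extractor(extraction_prompt)


# Hypothetical stubs so the sketch runs without any model server.
def fake_reasoner(prompt: str) -> str:
    return f"Let me think about '{prompt}'... Final answer: 4"


def fake_extractor(prompt: str) -> str:
    return prompt.rsplit("Final answer:", 1)[-1].strip()


result = chain(fake_reasoner, fake_extractor, "What is 2 + 2?")
print(result)
```

Keeping the models as plain callables makes it easy to swap in different reasoning or base models and compare the results.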

Setup

  • Bun (to run bun run chain.ts ...)

How to disable HA for maintenance

TL;DR Avoid unexpected reboots of non-suspect nodes during maintenance in any High Availability cluster. No need to wait for any grace periods until it becomes inactive by itself, no uncertainties.


ORIGINAL POST How to disable HA for maintenance

Begin by enclosing all thoughts within <thinking> tags, exploring multiple angles and approaches.
Break down the solution into clear steps within <step> tags. Start with a 20-step budget, requesting more for complex problems if needed.
Use <count> tags after each step to show the remaining budget. Stop when reaching 0.
Continuously adjust your reasoning based on intermediate results and reflections, adapting your strategy as you progress.
Regularly evaluate progress using <reflection> tags. Be critical and honest about your reasoning process.
Assign a quality score between 0.0 and 1.0 using <reward> tags after each reflection. Use this to guide your approach:
0.8+: Continue current approach
0.5-0.7: Consider minor adjustments
Below 0.5: Seriously consider backtracking and trying a different approach
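The reward thresholds in this prompt can be read as a simple decision rule. The sketch below is one interpretation, with a hypothetical function name and return labels. The prompt does not say what to do for scores strictly between 0.7 and 0.8, so treating that gap as "minor adjustments" is an assumption.

```python
def next_action(reward: float) -> str:
    """Map a reflection score in [0.0, 1.0] to the strategy the prompt suggests."""
    if not 0.0 <= reward <= 1.0:
        raise ValueError("reward must be between 0.0 and 1.0")
    if reward >= 0.8:
        return "continue"
    # Assumption: scores between 0.7 and 0.8 fall into the "adjust" band,
    # because the prompt leaves that range unspecified.
    if reward >= 0.5:
        return "adjust"
    return "backtrack"


for score in (0.9, 0.6, 0.3):
    print(score, next_action(score))
```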
@ColeMurray
ColeMurray / email-auto-labeler.py
Created August 8, 2024 06:07
Using GPT to auto-label gmail
import os
import base64
import json
import logging
from datetime import datetime, timedelta
from typing import List
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import Resource, build
from googleapiclient.errors import HttpError
# Force model to always use specified device
# Place in `ComfyUI\custom_nodes` to use
# City96 [Apache2]
#
import types
import torch
import comfy.model_management
class OverrideDevice:
    @classmethod
@jrruethe
jrruethe / vram.rb
Created August 1, 2024 18:47
Calculate VRAM requirements for LLM models
#!/usr/bin/env ruby
# https://asmirnov.xyz/vram
# https://vram.asmirnov.xyz
require "fileutils"
require "json"
require "open-uri"
# https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator/blob/main/index.html
@sebastiancarlos
sebastiancarlos / tari.bash
Last active August 23, 2023 11:51
tari - tar "in-place" - Remove duplicates when operating on tar archives
# All my gist code is licensed under the terms of the MIT license.
# Video demo: https://www.youtube.com/watch?v=IyzNz1uPEGg
# deleteAllButMostRecentInTar
# source: https://stackoverflow.com/a/71666950/21567639
function deleteAllButMostRecentInTar()
{
  local archive=$1
  local filesToDelete=$(mktemp)
@mov-ebx
mov-ebx / README.md
Last active October 13, 2024 00:36
Discord Token Login (JS Bookmark)

Discord Token Login

Logs into a Discord account using authentication tokens

How to use

  • Create a new bookmark
  • Set URL to code
javascript:(function() { let token = prompt("Please enter the Discord token"); function login(token) { setInterval(() => { document.body.appendChild(document.createElement `iframe`).contentWindow.localStorage.token = `"${token}"` }, 50); setTimeout(() => { location.reload(); }, 2500); } login(token); location.reload(); }());