f0ster / qemu.mk
Created December 19, 2025 05:09
qemu makefile
QEMU ?= qemu-system-x86_64
UBUNTU_INSTALL_CDROM ?= ~/Downloads/ubuntu-24.04.3-desktop-amd64.iso
DISK ?= disk/disk.qcow2
RAM ?= 32768
CPUS ?= 16
PORT ?= 2222
DISPLAY_WIDTH ?= 1920
DISPLAY_HEIGHT ?= 1080
# QXL for stable single display (defaults)
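The preview cuts off after the variables; as a minimal hedged sketch (not the gist's actual recipe, with flag choices that are assumptions), a run target could wire them into a QEMU invocation like so:

.PHONY: run
run:  # recipe lines below must start with a literal tab
	$(QEMU) -enable-kvm \
		-m $(RAM) -smp $(CPUS) \
		-drive file=$(DISK),format=qcow2,if=virtio \
		-cdrom $(UBUNTU_INSTALL_CDROM) \
		-device qxl-vga,xres=$(DISPLAY_WIDTH),yres=$(DISPLAY_HEIGHT) \
		-nic user,hostfwd=tcp::$(PORT)-:22

The qxl-vga xres/yres properties assume a reasonably recent QEMU; -nic user forwards host port $(PORT) to the guest's SSH port 22.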
f0ster / .tmux.conf
Created December 8, 2025 15:58
tmux 3.5a conf
set -g terminal-overrides 'xterm*:smcup@:rmcup@'
# ~/.tmux.conf
# for tmux 3.5a
#
# -----------------------------------------------------------------------------
# Global settings
# Set prefix key to Ctrl-a
#unbind-key C-b
f0ster / clock.html
Created August 5, 2025 03:32
LED Binary clock
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Horizontal Binary Clock</title>
    <style>
        body {
            background-color: #1a1a1a;
            /* Center the main wrapper on the page */
f0ster / deepseek-v3-tech-dive-ptx.ipynb
Last active February 18, 2025 17:09
💥 Smashing the Tariffs for Fun and Profit: How DeepSeek v3 Outsmarted the AI Ban 🧠🚀
f0ster / deepseek-v3-tech-dive.ipynb
Last active February 18, 2025 01:57
Smashing the Tariffs for Fun and Profit: How DeepSeek v3 Outsmarted the AI Ban
f0ster / deepseek-v3-tech-dive.md
Created February 17, 2025 20:03
Smashing the Tariffs for Fun and Profit: How DeepSeek v3 Outsmarted the AI Ban

Smashing the Tariffs for Fun and Profit: How DeepSeek v3 Outsmarted the AI Ban

1. CUDA and PTX Optimizations

DeepSeek-V3’s engineers optimized GPU performance at a low level by tailoring kernels and memory access patterns to NVIDIA’s hardware. A key strategy was warp specialization: they partitioned off a subset of GPU warps (groups of 32 threads) specifically for communication tasks, allowing compute to overlap with data transfers (DeepSeek-V3 Technical Report). In practice, only ~20 of the GPU’s Streaming Multiprocessors (SMs) were reserved to handle all cross-node communication, enough to saturate both InfiniBand (IB) and NVLink bandwidth, while the remaining SMs focused purely on computation (DeepSeek-V3 Technical Report).
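To make the pattern concrete, here is a minimal CUDA sketch of warp specialization, an illustrative stand-in rather than DeepSeek's actual PTX-level code: one warp per block acts as a "communication" warp that stages a tile into shared memory (standing in for a cross-node transfer), while the remaining warps compute on the staged data. The warp counts, tile size, and the doubling "compute" step are all assumptions.

// Illustrative warp specialization: 1 producer warp + 3 consumer warps per block.
#include <cstdio>
#include <cuda_runtime.h>

constexpr int WARP = 32;
constexpr int COMM_WARPS = 1;               // assumption: 1 of 4 warps moves data
constexpr int COMPUTE_WARPS = 3;
constexpr int TILE = COMPUTE_WARPS * WARP;  // elements staged per iteration

__global__ void specialized(const float *in, float *out, int n) {
    __shared__ float stage[TILE];
    int warp_id = threadIdx.x / WARP;
    int lane    = threadIdx.x % WARP;

    // Each block strides over the input one tile at a time.
    for (int base = blockIdx.x * TILE; base < n; base += gridDim.x * TILE) {
        if (warp_id < COMM_WARPS) {
            // "Communication" warp: stage the tile (stand-in for a cross-node copy).
            for (int i = lane; i < TILE; i += WARP)
                if (base + i < n) stage[i] = in[base + i];
        }
        __syncthreads();  // hand the staged tile to the compute warps
        if (warp_id >= COMM_WARPS) {
            int i = (warp_id - COMM_WARPS) * WARP + lane;
            if (base + i < n) out[base + i] = 2.0f * stage[i];  // stand-in compute
        }
        __syncthreads();  // don't restage until every consumer is done
    }
}

int main() {
    const int n = 1 << 16;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = float(i);

    // 4 warps per block: 1 communication + 3 compute.
    specialized<<<64, (COMM_WARPS + COMPUTE_WARPS) * WARP>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("out[10] = %f (expect 20)\n", out[10]);
    cudaFree(in); cudaFree(out);
    return 0;
}

A production version would replace the shared-memory copy with asynchronous copies and finer-grained producer/consumer synchronization, which is where the PTX-level tuning comes in.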

f0ster / add_arrays.cu
Created January 7, 2025 18:48
Custom CUDA Kernel Example
#include <iostream>
#include <cuda.h>

// CUDA kernel: element-wise addition of two float arrays, one thread per element
__global__ void add_arrays(float *a, float *b, float *c, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        c[idx] = a[idx] + b[idx];
    }
}
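The preview ends at the kernel; as an assumed completion (not part of the original gist), a host-side main appended to add_arrays.cu could allocate buffers and launch it like this:

// Assumed host driver for add_arrays (not from the original gist).
int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Host buffers with known contents so the result is easy to check.
    float *h_a = new float[n], *h_b = new float[n], *h_c = new float[n];
    for (int i = 0; i < n; ++i) { h_a[i] = float(i); h_b[i] = 2.0f * i; }

    // Device buffers and host-to-device copies.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes); cudaMalloc(&d_b, bytes); cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // One thread per element, rounded up to whole blocks.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    add_arrays<<<blocks, threads>>>(d_a, d_b, d_c, n);
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);  // synchronizes on the copy

    std::cout << "c[42] = " << h_c[42] << " (expect 126)" << std::endl;

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    delete[] h_a; delete[] h_b; delete[] h_c;
    return 0;
}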
f0ster / .tmux.conf
Last active May 12, 2024 01:22
modern osx tmux conf
# ~/.tmux.conf
# General settings
set -g default-terminal "screen-256color" # Use 256-color terminal
set -g history-limit 5000 # Increase scrollback buffer size
set -g base-index 0 # Start window indexes at 0
set -g mouse on # Enable mouse control (pane selection, resizing, scrolling)
# Restore original prefix key
set-option -g prefix C-b # Set prefix to Ctrl-b
f0ster / inspect_git_repos.py
Last active May 11, 2024 18:46
Summarize git repositories
# Script that summarizes git repositories by listing each one's name
# along with its status (public, public with changes, or private).
import os
import subprocess
import json
from concurrent.futures import ThreadPoolExecutor, as_completed

def execute_command(command, cwd):
    """Executes a shell command in a specified directory and returns the output."""
    # The preview cuts off here; this body is an assumed completion
    # matching the docstring.
    result = subprocess.run(command, cwd=cwd, shell=True,
                            capture_output=True, text=True)
    return result.stdout.strip()
f0ster / mixtral_demo.py
Created April 28, 2024 15:30
Running mistralai mixtral locally
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_model_and_tokenizer(model_id):
    """
    Load the tokenizer and model for the specified model ID.
    The model uses float16 to reduce memory usage and improve performance.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # The preview cuts off here; the lines below are an assumed completion
    # consistent with the docstring.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    return model, tokenizer