Kohaku-Blueleaf KohakuBlueleaf

KohakuBlueleaf / imm.py
Created March 12, 2025 08:36
A simple implementation of IMM https://arxiv.org/pdf/2503.07565
"""
A Minimal Implementation of IMM (Inductive Moment Matching)
"""
import math
import torch
import torch.nn.functional as F
def compute_mmd_loss_fully_vectorized(
KohakuBlueleaf / grpo.py
Last active July 21, 2025 17:09
10% GSM8K accuracy gain within 15 min
## Note
## If vLLM runs on the same GPU, remember to set a low gpu_memory_utilization to avoid OOM
## For larger models, consider using multi-GPU or CPU offloading
## AnySchedule: https://github.com/KohakuBlueleaf/AnySchedule
## LyCORIS: https://github.com/KohakuBlueleaf/LyCORIS
## The following code can perform reasonable training on the Llama-3.2-1B-Instruct model with the GSM8K dataset,
## with noticeable improvement on each reward function
from itertools import chain
import re
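The preview is cut off before any reward function appears, so the following is only a minimal sketch of what a correctness reward for GSM8K could look like, not the code from grpo.py. It relies on the fact that GSM8K reference solutions end with `#### <final answer>`. The helper names (`extract_gold`, `extract_prediction`, `correctness_reward`) and the "take the last number in the completion" heuristic are assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical helpers for illustration; they are NOT taken from grpo.py.
# Matches integers and decimals, optionally negative, with thousands separators.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_gold(answer: str) -> str:
    """Return the final answer from a GSM8K reference solution ('... #### 42')."""
    return answer.split("####")[-1].strip().replace(",", "")


def extract_prediction(completion: str) -> Optional[str]:
    """Assumed heuristic: treat the last number in the completion as the answer."""
    matches = _NUMBER.findall(completion)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def correctness_reward(completion: str, answer: str) -> float:
    """1.0 if the predicted number equals the gold number, else 0.0."""
    pred = extract_prediction(completion)
    if pred is None:
        return 0.0
    try:
        return 1.0 if float(pred) == float(extract_gold(answer)) else 0.0
    except ValueError:
        # Gold answer or prediction was not parseable as a number.
        return 0.0
```

A binary reward like this is usually combined with softer signals (for example, a format reward), which would be consistent with the note above mentioning improvement on "each reward function".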