cooperleong00 / white-box_jailbreak.py
Last active May 22, 2025 03:29
white-box LLM jailbreak using weight orthogonalization
# %%
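import os
# Restrict the process to GPU 0; set before importing torch so CUDA sees it.
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
import random
random.seed(42)  # fixed seed so Python-level random sampling is reproducible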
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from datasets import load_dataset
# %%