#!/bin/bash
module load rocm/6.2.4
module load libfabric/1.20
module load gcc/11.2.0
export CC=$(which gcc)
export CUDAHOSTCXX=$(which g++)
export CXX=$(which g++)
Config: max_steps: 17500, lora_r: 64, lr: 0.0002, bf16: False, lora_modules: all, bits: 4, full_finetune: False, lora_dropout: 0.0, warmup_steps: 100, compress_statistics: True, dataset: NaN, gradient_accumulation_steps: NaN, learning_rate: NaN, quant_type: fp4, adam_beta2: 0.999, update_freq: 6
eval_bert_f1 mean (SE): 64.8716 (21.8331). 95% CI (22.079, 107.664). Sample size: 4
eval_rougeL mean (SE): 33.1083 (19.1162). 95% CI (-4.359, 70.576). Sample size: 4
================================================================================
Config: max_steps: 17500, lora_r: 64, lr: 0.0002, bf16: False, lora_modules: all, bits: 4, full_finetune: False, lora_dropout: 0.0, warmup_steps: 100, compress_statistics: False, dataset: NaN, gradient_accumulation_steps: NaN, learning_rate: NaN, quant_type: fp4, adam_beta2: 0.999, update_freq: 6
eval_bert_f1 mean (SE): 67.0044 (22.3593). 95% CI (23.180, 110.829).
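As a sanity check on how these intervals are formed, they match a plain normal approximation, CI = mean ± 1.96 × SE. A small helper (hypothetical, not part of the logs) reproduces the eval_bert_f1 interval above:

def ci95(mean: float, se: float) -> tuple:
    """95% normal-approximation confidence interval: mean +/- 1.96 * SE."""
    half = 1.96 * se
    return (round(mean - half, 3), round(mean + half, 3))

print(ci95(64.8716, 21.8331))  # (22.079, 107.664), matching the logged eval_bert_f1 CI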
This table contains data from multiple software versions. Some hyperparameters are listed as "NaN", meaning they did not exist in that software version. The best 7B result is 40.08 MMLU.
================================================================================
Config: learning_rate: 0.005, adam_beta2: 0.999, lora_dropout: 0.0, max_grad_norm: 1.0, max_steps: 7320, lr_scheduler_type: <SchedulerType.COSINE: cosine>, weight_decay: 0.0, base_model: /gscratch/zlab/llama/7B, quant_type: nf4, gradient_accumulation_steps: 6, per_device_train_batch_size: 2
acc mean (SE): 0.2290 (nan). 95% CI (nan, nan). Sample size: 1
================================================================================
Config: learning_rate: 0.0002, adam_beta2: 0.999, lora_dropout: 0.0, max_grad_norm: 0.3, max_steps: 9750, lr_scheduler_type: <SchedulerType.CONSTANT: constant>, weight_decay: 0.0, base_model: NaN, quant_type: nf4, gradient_accumulation_steps: 2, per_device_train_batch_size: 8
acc mean (SE): 0.2290 (0.0
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MAX_NEW_TOKENS = 128
model_name = 'facebook/opt-6.7b'
text = """Hello, I am a prompt. Who are you?"""

tokenizer = AutoTokenizer.from_pretrained(model_name)
input_ids = tokenizer(text, return_tensors="pt").input_ids
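The embedded snippet is cut off before the model is actually loaded. A minimal sketch of how it could continue, assuming the standard transformers 8-bit loading path through bitsandbytes (load_in_8bit=True); the original continuation is not shown:

# Assumed continuation: load the model with 8-bit weights and generate.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map='auto',   # place layers on the available GPU(s)
    load_in_8bit=True,   # LLM.int8() quantization via bitsandbytes
)
generated_ids = model.generate(input_ids.to(model.device), max_new_tokens=MAX_NEW_TOKENS)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))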
import torch
import bitsandbytes as bnb
from heapq import heappush, heappop, heapify

a = torch.normal(0, 0.5, size=(1024, 1024), device='cuda')

def get_compression(x: torch.Tensor) -> float:
    """Returns the compression rate of Huffman Coding."""
    assert x.device.type == 'cuda'
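    # The gist is truncated here. What follows is an assumed completion sketch,
    # not necessarily the original code: it uses bitsandbytes' blockwise 8-bit
    # quantizer as the symbol source and the fact that the expected Huffman code
    # length equals the sum of the merged (internal) node probabilities.
    qx, _ = bnb.functional.quantize_blockwise(x)                  # uint8 symbols
    counts = torch.bincount(qx.flatten().long(), minlength=256).float()
    probs = [p for p in (counts / counts.sum()).tolist() if p > 0]

    heap = list(probs)
    heapify(heap)
    avg_bits = 0.0
    while len(heap) > 1:
        merged = heappop(heap) + heappop(heap)                    # combine the two rarest symbols
        avg_bits += merged
        heappush(heap, merged)
    return 8.0 / avg_bits                                         # 8-bit symbols vs. expected Huffman bits

# Example usage: print(get_compression(a))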