---
config:
  xyChart:
    width: 900
    height: 500
  themeVariables:
    xyChart:
      plotColorPalette: "#3366cc, #cc3366"
---
xychart-beta
title "kimi-linear 48B.A3B Q6_K @ gfx1151: pp512 (blue) and tg128 (red) t/s vs context depth"
x-axis "context depth (tokens)" [0, 16384, 32768, 49152, 65536, 81920, 98304, 114688]
y-axis "t/s" 0 --> 650
line [602.85, 392.18, 288.20, 228.47, 189.02, 161.29, 140.44, 124.52]
line [54.96, 51.17, 47.88, 44.94, 42.31, 39.90, 37.95, 36.12]
(.venv) kyle@dev:~/src/llama.cpp$ ./build/bin/llama-bench -m kimi-Q6_K.gguf -ngl 99 --mmap 0 \
-p 512 -n 128 -d 0-262144+16384
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 122880 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 122880 MiB
| model | size | params | backend | ngl | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---: | --------------: | -------------------: |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | pp512 | 602.85 ± 4.89 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | tg128 | 54.96 ± 0.11 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | pp512 @ d16384 | 392.18 ± 2.18 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | tg128 @ d16384 | 51.17 ± 0.05 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | pp512 @ d32768 | 288.20 ± 0.94 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | tg128 @ d32768 | 47.88 ± 0.04 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | pp512 @ d49152 | 228.47 ± 0.82 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | tg128 @ d49152 | 44.94 ± 0.03 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | pp512 @ d65536 | 189.02 ± 0.38 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | tg128 @ d65536 | 42.31 ± 0.04 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | pp512 @ d81920 | 161.29 ± 0.56 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | tg128 @ d81920 | 39.90 ± 0.02 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | pp512 @ d98304 | 140.44 ± 0.21 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | tg128 @ d98304 | 37.95 ± 0.02 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | pp512 @ d114688 | 124.52 ± 0.14 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | tg128 @ d114688 | 36.12 ± 0.03 |
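
The two `line [...]` series in the chart can be regenerated from the table rather than copied by hand. The following is a minimal sketch, not part of llama.cpp: the helper names `parse_bench` and `mermaid_line` are made up for illustration, and the regex assumes the `test` and `t/s` columns look exactly like the rows above (`ppN` or `tgN`, an optional `@ dDEPTH`, then `value ± error`). Only the first four rows are embedded to keep the example short.

```python
import re

# Subset of the llama-bench rows above (depth 0 and depth 16384 only).
BENCH_OUTPUT = """\
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | pp512 | 602.85 ± 4.89 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | tg128 | 54.96 ± 0.11 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | pp512 @ d16384 | 392.18 ± 2.18 |
| kimi-linear 48B.A3B Q6_K | 37.59 GiB | 49.12 B | ROCm | 99 | 0 | tg128 @ d16384 | 51.17 ± 0.05 |
"""

# Matches the "test" cell (pp512, tg128 @ d16384, ...) and the mean of the "t/s" cell.
ROW = re.compile(r"\|\s*(pp|tg)\d+(?:\s*@\s*d(\d+))?\s*\|\s*([\d.]+)\s*±")


def parse_bench(text):
    """Return {"pp": {depth: t/s}, "tg": {depth: t/s}} from llama-bench table rows."""
    series = {"pp": {}, "tg": {}}
    for line in text.splitlines():
        match = ROW.search(line)
        if match is None:
            continue  # header, separator, or log line
        kind = match.group(1)
        depth = int(match.group(2) or 0)  # rows without "@ dN" are depth 0
        series[kind][depth] = float(match.group(3))
    return series


def mermaid_line(values):
    """Format a list of throughputs as a Mermaid xychart `line` statement."""
    return "line [" + ", ".join(f"{v:.2f}" for v in values) + "]"


if __name__ == "__main__":
    series = parse_bench(BENCH_OUTPUT)
    for kind in ("pp", "tg"):
        depths = sorted(series[kind])
        print(mermaid_line([series[kind][d] for d in depths]))
```

Feeding the full table through `parse_bench` should yield the same eight values per series that the chart uses, in depth order.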