vllm-omni PR #3118 — evidence: pytest 29/29 + HTTP migration sanity (commit 8c5c4cda)
================================================================
VoxCPM2 PR #3118 — review-round evidence pack
Branch HEAD: 8c5c4cda
https://github.com/vllm-project/vllm-omni/pull/3118
================================================================
Contents:
Part 1 — pytest -v on tests/entrypoints/openai_api/test_serving_speech_voxcpm2.py
Part 2 — live HTTP curl checks of the deployed image, including:
* NEW vs OLD shape (extra_params migration)
* 400 range guard, 400 type guard
* 400 length cap on /v1/audio/speech (P2 fix)
* cfg A/B in pure-text mode and Hi-Fi mode (P1 fix)
================================================================
Part 1 — Unit test run (29/29)
================================================================
============================= test session starts ==============================
platform linux -- Python 3.12.13, pytest-9.0.3, pluggy-1.6.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /tmp/pr-final
configfile: pyproject.toml
plugins: mock-3.15.1, asyncio-1.3.0, hydra-core-1.3.2, typeguard-4.5.1, anyio-4.13.0
asyncio: mode=Mode.AUTO, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collecting ... collected 29 items
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_voxcpm2_model_type_detection PASSED [ 3%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_voxcpm2_accepts_any_text_input PASSED [ 6%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_build_prompt_text_only PASSED [ 10%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_build_prompt_prepends_instructions PASSED [ 13%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_build_prompt_strips_instructions_whitespace PASSED [ 17%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_build_prompt_stashes_cfg_value PASSED [ 20%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_build_prompt_omits_cfg_value_when_extra_params_missing PASSED [ 24%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_build_prompt_omits_cfg_value_when_extra_params_has_other_keys PASSED [ 27%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_build_prompt_instructions_and_cfg_together PASSED [ 31%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_build_prompt_hifi_cloning_ref_audio_ref_text_cfg PASSED [ 34%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_build_prompt_hifi_mode_ignores_instructions PASSED [ 37%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_validate_rejects_overlong_instructions PASSED [ 41%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_validate_accepts_at_limit_instructions PASSED [ 44%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_prepare_speech_generation_runs_validator_for_voxcpm2 PASSED [ 48%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_accepts_range[0.1] PASSED [ 51%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_accepts_range[0.5] PASSED [ 55%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_accepts_range[1.5] PASSED [ 58%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_accepts_range[2.0] PASSED [ 62%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_accepts_range[2.7] PASSED [ 65%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_accepts_range[5.0] PASSED [ 68%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_accepts_range[10.0] PASSED [ 72%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_rejects_out_of_range[0.0] PASSED [ 75%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_rejects_out_of_range[-1.0] PASSED [ 79%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_rejects_out_of_range[10.5] PASSED [ 82%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_rejects_out_of_range[100.0] PASSED [ 86%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_rejects_non_numeric[abc] PASSED [ 89%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_rejects_non_numeric[None] PASSED [ 93%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_rejects_non_numeric[bad2] PASSED [ 96%]
entrypoints/openai_api/test_serving_speech_voxcpm2.py::TestVoxCPM2Serving::test_cfg_value_rejects_non_numeric[bad3] PASSED [100%]
=============================== warnings summary ===============================
../vllm_omni/__init__.py:19
/tmp/pr-final/vllm_omni/__init__.py:19: RuntimeWarning: Failed to import version from _version.py: No module named 'vllm_omni._version'
This typically happens in development mode before building.
Using fallback version 'dev'.
from .version import __version__, __version_tuple__ # isort:skip # noqa: F401
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute
<frozen importlib._bootstrap>:488
<frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute
../../../usr/local/lib/python3.12/dist-packages/torch/jit/_script.py:362: 14 warnings
/usr/local/lib/python3.12/dist-packages/torch/jit/_script.py:362: DeprecationWarning: `torch.jit.script_method` is deprecated. Please switch to `torch.compile` or `torch.export`.
warnings.warn(
../vllm_omni/entrypoints/openai/protocol/audio.py:125
/tmp/pr-final/vllm_omni/entrypoints/openai/protocol/audio.py:125: PydanticDeprecatedSince20: Support for class-based `config` is deprecated, use ConfigDict instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.12/migration/
class CreateAudio(BaseModel):
../../../usr/local/lib/python3.12/dist-packages/torch/jit/_script.py:1480
/usr/local/lib/python3.12/dist-packages/torch/jit/_script.py:1480: DeprecationWarning: `torch.jit.script` is deprecated. Please switch to `torch.compile` or `torch.export`.
warnings.warn(
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
--- Running Summary
======================= 29 passed, 19 warnings in 2.68s ========================
============================================================
VoxCPM2 PR #3118 — Live HTTP migration sanity check
Branch HEAD: 8c5c4cda
Pod image:
Image SHA:
Date: 2026-04-25T08:10:02Z
============================================================
--- /v1/models ---
HTTP 200
served: voxcpm2
root: openbmb/VoxCPM2
--- 1. NEW shape: extra_params.cfg_value=2.7 (in-range) ---
HTTP 200, time=1.040353s, bytes=46124
--- 2. OLD shape: top-level cfg_value=2.7 (silently dropped after migration; field no longer in schema) ---
HTTP 200, time=1.252619s, bytes=107564
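
The two request shapes compared in checks 1 and 2 can be sketched as plain JSON bodies. This is a minimal illustration, not the script that produced the log: the `model` and `input` field names are assumed to follow the OpenAI speech API, `voxcpm2` is the served name reported by /v1/models above, and the input text is a placeholder.

```python
import json


def new_shape_payload(text, cfg_value):
    """Post-migration body: cfg_value is nested under extra_params (check 1)."""
    return {
        "model": "voxcpm2",            # served name from /v1/models
        "input": text,                 # assumed OpenAI-style field name
        "extra_params": {"cfg_value": cfg_value},
    }


def old_shape_payload(text, cfg_value):
    """Pre-migration body: cfg_value at the top level (check 2).

    Per the log, this field is no longer in the schema and is silently dropped.
    """
    return {"model": "voxcpm2", "input": text, "cfg_value": cfg_value}


new_body = new_shape_payload("Placeholder sentence.", 2.7)
old_body = old_shape_payload("Placeholder sentence.", 2.7)
print(json.dumps(new_body, indent=2))
print(json.dumps(old_body, indent=2))
```

Because the old shape still returns HTTP 200, a client that has not migrated gets no error, only the default guidance value.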
--- 3. Range guard: extra_params.cfg_value=15.0 (out of range, expect 400) ---
HTTP 400
{
"error": {
"message": "extra_params['cfg_value']=15.0 out of range (0.1-10.0)",
"type": "BadRequestError",
"param": null,
"code": 400
}
}
--- 4. Type guard: extra_params.cfg_value="abc" (non-numeric, expect 400) ---
HTTP 400
{
"error": {
"message": "extra_params['cfg_value'] must be a number: could not convert string to float: 'abc'",
"type": "BadRequestError",
"param": null,
"code": 400
}
}
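
The range and type guards in checks 3 and 4 behave like the following sketch. It is an assumption-laden reconstruction from the two error messages, not the PR's code: `parse_cfg_value` is a hypothetical name, and `ValueError` stands in for whatever exception the server maps to a 400 response.

```python
CFG_MIN, CFG_MAX = 0.1, 10.0  # bounds quoted in the check 3 error message


def parse_cfg_value(extra_params):
    """Return cfg_value as a float, or None when the key is absent.

    Hypothetical helper: raises ValueError where the server would return 400.
    """
    if not extra_params or "cfg_value" not in extra_params:
        return None
    raw = extra_params["cfg_value"]
    try:
        value = float(raw)  # rejects "abc" (ValueError) and None (TypeError)
    except (TypeError, ValueError) as exc:
        raise ValueError(
            f"extra_params['cfg_value'] must be a number: {exc}"
        ) from exc
    # A chained comparison also rejects NaN, since every comparison is False.
    if not (CFG_MIN <= value <= CFG_MAX):
        raise ValueError(
            f"extra_params['cfg_value']={value} out of range ({CFG_MIN}-{CFG_MAX})"
        )
    return value
```

Under these assumptions, `parse_cfg_value({"cfg_value": 2.7})` passes, while `15.0` and `"abc"` raise with the same message text shown in the responses above.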
--- 5. Length cap on /v1/audio/speech (single-request, used to bypass; expect 400 now) ---
HTTP 400
{
"error": {
"message": "Instructions too long (max 500 characters)",
"type": "BadRequestError",
"param": null,
"code": 400
}
}
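
The length cap in check 5 amounts to a single bound on the `instructions` field. A minimal sketch, assuming the limit counts characters of the string as received (the log does not say whether whitespace is stripped first); `validate_instructions` is a hypothetical name and `ValueError` stands in for the server's 400 path.

```python
MAX_INSTRUCTIONS_CHARS = 500  # limit quoted in the check 5 error message


def validate_instructions(instructions):
    """Return instructions unchanged, or raise if they exceed the cap."""
    if instructions is not None and len(instructions) > MAX_INSTRUCTIONS_CHARS:
        raise ValueError(
            f"Instructions too long (max {MAX_INSTRUCTIONS_CHARS} characters)"
        )
    return instructions
```

A string of exactly 500 characters is accepted and 501 is rejected, which mirrors the at-limit and overlong unit tests in Part 1.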
--- 6. Decode-loop cfg propagation A/B: same input, only cfg differs ---
cfg=2.5 HTTP 200 time=2.702284s bytes=737324
cfg=2.7 HTTP 200 time=2.814807s bytes=737324
cfg=3.0 HTTP 200 time=2.571122s bytes=737324
--- WAV durations from the cfg sweep (sanity: cfg should affect total length) ---
cfg=2.5 bytes=737324 ~audio=7.68s
cfg=2.7 bytes=737324 ~audio=7.68s
cfg=3.0 bytes=737324 ~audio=7.68s
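
The ~audio figures can be reproduced from the byte counts alone. The sketch below assumes a 44-byte PCM WAV header and 48 kHz, 16-bit, mono samples; the log does not state these parameters, they are inferred because they match the reported durations.

```python
WAV_HEADER_BYTES = 44    # assumed canonical PCM WAV header size
SAMPLE_RATE_HZ = 48_000  # assumed sample rate
BYTES_PER_SAMPLE = 2     # assumed 16-bit samples
CHANNELS = 1             # assumed mono


def estimate_wav_seconds(total_bytes):
    """Estimate audio duration from the size of a PCM WAV response body."""
    payload_bytes = total_bytes - WAV_HEADER_BYTES
    return payload_bytes / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS)


for size in (737324, 860204, 952364):
    print(f"bytes={size} ~audio={estimate_wav_seconds(size):.2f}s")
```

With these assumptions, identical byte counts imply identical durations, so the three pure-text responses above are the same length to the sample.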
============================================================
End
============================================================
============================================================
ADDENDUM: cfg A/B with Hi-Fi clone (ref_audio + ref_text)
============================================================
Pure-text generation (#6 above) converges to a sentence boundary
regardless of cfg, so duration alone is a weak signal there. The
stronger signal is when ref_audio + ref_text are set: with the
decode-loop fix (commit 4e88314), every decode step honors the
guidance value, so output length now varies with cfg. In the run
below, cfg=2.5 yields 8.96s and cfg=2.7/3.0 yield 9.92s. Before
the fix, the cfg value only affected the first patch and total
length barely moved.
cfg=2.5 HTTP 200 time=10.042026s bytes=860204
cfg=2.7 HTTP 200 time=4.438532s bytes=952364
cfg=3.0 HTTP 200 time=3.584766s bytes=952364
WAV durations:
cfg=2.5 bytes=860204 ~audio=8.96s
cfg=2.7 bytes=952364 ~audio=9.92s
cfg=3.0 bytes=952364 ~audio=9.92s