- [Apple to Apple][INC Reference] Gaudi perf issue
- [Expert Consultation][No reference] Sage attention acc issue
- https://arxiv.org/pdf/2505.11594 delta_s
- [Human Steer] task.md cross-project workspace:
- ar - vllm - omni
- ct -> llmc -> vllm: quant primitive -> quant model -> inference model
- setup driver (root) -> user_install_cmd.sh https://github.com/yiliu30/torch-xpu-setup
- Tools (Agent): web, VS Code, Copilot CLI, VS Code (Claude agent), ...
- Skills: how to create skills and examples
- Others
- English Coach: https://github.com/tw93/Waza/blob/main/rules/english.md
- ssh remote-node "Hi," https://github.com/BBuf/SGLang-Auto-Driven-SKILLS/blob/main/skills/h100-sglang-diffusion/SKILL.md
Boundary, Steer
mxfp4-decompress plan doc
MXFP4 Decompression From Fresh main
Summary
Implement MXFP4 decompression on top of a fresh branch created from main. Keep the change set limited to standard MXFP4 support only. Do not add, modify, or test any RCEIL behavior.
Key Changes
main branch in this repo and create a new feature branch before any edits.
MXFP4PackedCompressor.
weight_scale from stored E8M0 uint8 values back to float with the existing MX scale helper.
handling.
weight and decoded weight_scale in the same shape/dtype contract used by the other decompressors.
mxfp4-pack-quantized remains the format for FP4 with group_size=32.
RCEIL MXFP4 behavior.
xfail in the module compress/decompress test and make the existing MXFP4 round-trip cases pass.
reconstruction only.
RCEIL test coverage or touch existing RCEIL utilities unless required by shared code safety.
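The decode path described in Key Changes can be sketched in plain Python as a reference for what the round-trip tests should reconstruct. This is a minimal illustration, not the repo's implementation: every function name here is hypothetical, it does not reproduce the existing MX scale helper, and it assumes that two FP4 (E2M1) codes are packed per byte with the low nibble first, that E8M0 uses a bias of 127, and that 0xFF encodes NaN.

```python
import math

# E2M1 magnitudes indexed by the low 3 bits of an FP4 code; bit 3 is the sign.
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
E8M0_BIAS = 127   # assumed exponent bias for E8M0 scales
GROUP_SIZE = 32   # block size stated in the plan (group_size=32)


def decode_e8m0(byte: int) -> float:
    """Decode one stored E8M0 uint8 scale to a float power of two."""
    if not 0 <= byte <= 0xFF:
        raise ValueError(f"scale byte out of range: {byte}")
    if byte == 0xFF:  # assumed NaN encoding
        return math.nan
    return math.ldexp(1.0, byte - E8M0_BIAS)  # 2 ** (byte - 127)


def decode_fp4(code: int) -> float:
    """Decode one 4-bit E2M1 code (sign bit plus 3-bit magnitude index)."""
    magnitude = FP4_E2M1_MAGNITUDES[code & 0x7]
    return -magnitude if code & 0x8 else magnitude


def unpack_fp4(packed: bytes) -> list[float]:
    """Unpack two FP4 values per byte, low nibble first (assumed layout)."""
    values = []
    for byte in packed:
        values.append(decode_fp4(byte & 0xF))
        values.append(decode_fp4(byte >> 4))
    return values


def dequantize_mxfp4(packed: bytes, scales: bytes,
                     group_size: int = GROUP_SIZE) -> list[float]:
    """Multiply each unpacked FP4 value by the decoded scale of its group."""
    values = unpack_fp4(packed)
    if len(values) != len(scales) * group_size:
        raise ValueError("packed weight length does not match scale count")
    return [v * decode_e8m0(scales[i // group_size])
            for i, v in enumerate(values)]
```

For example, `dequantize_mxfp4(bytes([0x21] * 16), bytes([128]))` decodes one group of 32 values with a scale of 2.0. A tensor-based version would apply the same arithmetic per group along the last dimension.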
Test Plan
behavior.
llm-compressor checkout: /home/yiliu7/workspace/llm-compressor/experimental/mxfp4/qwen3_mxfp4.py.
transformers.
non-corrupted.
MXFP4_RCEIL scenario from the acceptance checklist.
Assumptions
RCEIL code stays untouched unless a minimal shared-code adjustment is unavoidable for the non-RCEIL fix.
the normal load path without adding new format options.