To reproduce, download the .json files with raw responses from here.
Running in standard mode
uv run oai_pydantic.pyProcessing oai_response_1024.json...
model_validate taken 5.02ms
model_construct taken 5.86ms
model_validate_json taken 2.18ms
Processing oai_response_8192.json...
model_validate taken 12.85ms
model_construct taken 37.14ms
model_validate_json taken 15.86ms
Processing oai_response_16384.json...
model_validate taken 54.59ms
model_construct taken 52.33ms
model_validate_json taken 31.51ms
Processing oai_response_32768.json...
model_validate taken 127.27ms
model_construct taken 133.14ms
model_validate_json taken 63.51ms- Indeed,
model_validate_jsonis roughly 2x faster thanmodel_validateandmodel_construct. That's nice but this would still be a huge bottleneck in RL training with 4k+ batch sizes - As @baggiponte said, I am quite surprised that
model_validatealways takes ~same time asmodel_construct. Shouldn't the latter not do any validation? Maybe this is not true for submodels? If so, is there a way to also skip validation of submodels - all this data is entirely trusted so skipping validation entirely is fine - Nit: I am not quite sure why I am now measuring 130ms instead of the 180ms I got a couple of days ago on the 32k sample, maybe because I am using
perf_counternow instead of plaintimeor the CPU has a better day today, who knows. But really it's about the order of magnitude which is still the same. - Also, it does seem like constructing the Pydantic model scales ~linearly with the number of logprobs it has to parse.
Ignoring the logprobs field
uv run oai_pydantic.py --ignore-logprobsProcessing oai_response_1024.json...
model_validate taken 2.97ms
model_construct taken 3.38ms
model_validate_json taken 0.26ms
Processing oai_response_8192.json...
model_validate taken 0.03ms
model_construct taken 0.15ms
model_validate_json taken 0.10ms
Processing oai_response_16384.json...
model_validate taken 0.03ms
model_construct taken 0.13ms
model_validate_json taken 0.16ms
Processing oai_response_32768.json...
model_validate taken 0.03ms
model_construct taken 0.14ms
model_validate_json taken 0.26ms- It seems like it's only the logprob parsing that is taking a significant amount of time.
Using the hotfix that we use in prime-rl for now
uv run oai_pydantic.py --use-hotfixProcessing oai_response_1024.json...
model_validate taken 2.38ms
model_construct taken 3.03ms
model_validate_json taken 1.96ms
Processing oai_response_8192.json...
model_validate taken 0.17ms
model_construct taken 0.15ms
model_validate_json taken 13.58ms
Processing oai_response_16384.json...
model_validate taken 0.97ms
model_construct taken 0.14ms
model_validate_json taken 21.81ms
Processing oai_response_32768.json...
model_validate taken 1.83ms
model_construct taken 0.14ms
model_validate_json taken 43.73msOur hotfix essentially skips whatever Pydantic does to the logprobs field so we are still quick. Interestingly (I hadn't tested this before), model_validate_json does not seem to profit from it.
Super interested to hear where people think the bottleneck is and if we can find a more elegant general solution!:)
Thanks @rahuliyer95, what you've said makes loads of sense.
I've no idea why stainlessapi/openai have that
model_constructmethod and don't usemodel_validate_json.In a hurry to respond, I didn't check if it was a vanilla
BaseModel.@mikasenghaas it's also worth noting that you're conflating the time taken to decode the JSON with the time taken to run
model_construct- that won't have much effect here sincemodel_constructis so slow, but if you used a saner approach like just callingmodel_validate, it makes quite a lot of difference:this code
gives:
If you really care about performance but need validation, you can shave ~33% off validation time by using typed dicts and
TypeAdaptertype typed adapter example
gives:
Or if you just want to parse the JSON, you can almost halve the time again: