@RafalWilinski
Created August 16, 2024 12:29
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 1 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 1: 3147.1813 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 1: 3418.5543 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 1: 3418.7064 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 1: 4436.7569 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 1: 4677.3473 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 2: 2659.2542 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 2 failed
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 2: 2807.2270 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 1: 7188.8446 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 1: 7595.7037 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 2: 4646.0808 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 1 failed
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 1: 8525.0953 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 2: 4169.1336 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 1 failed
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 3: 2632.5129 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 3: 2914.4253 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 2: 4783.4380 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 3 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 3: 2956.4669 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 2: 4375.7365 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 4: 2833.7894 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 3: 3902.6023 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 4: 4154.9833 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 4: 3299.4242 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 1: 14174.8062 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 5: 2690.4431 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 2: 8191.3049 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 2 failed
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 5: 2582.8440 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 3: 6127.9217 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 2 failed
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 4 failed
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 4: 3619.2235 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 5: 3516.5341 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 1: 17952.7829 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 3: 6076.2048 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 6: 3579.3428 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 2: 9729.5662 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 1: 18281.4510 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 6: 3411.8251 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 4: 3507.0821 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 5 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 6: 2315.2041 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 5: 4175.3845 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 2: 3028.6294 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 1: 21229.6122 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 7: 3064.3334 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 7: 2765.6302 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 3: 7066.7889 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 1: 22488.0154 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 5: 4067.8945 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 3 failed
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 2: 5014.0017 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 3 failed
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 6 failed
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 6: 3658.3095 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 7: 4558.6485 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 4: 6441.8078 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 8: 3480.6049 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 8: 3274.7310 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 3: 4242.1633 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 7 failed
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 9: 2432.8221 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 2: 13583.2456 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 7: 3804.1768 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 4: 2674.9560 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 3: 9761.3454 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 2: 6302.4231 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 2: 7595.3121 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 8: 4607.8062 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 3: 5758.3642 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 4: 6626.2473 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 8 failed
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 9: 5182.4762 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 10: 2925.3471 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 6: 7476.1929 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 4 failed
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 4 failed
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 5: 6604.8563 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 8: 3662.7228 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 5: 4066.5810 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 9: 2918.4372 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 11: 2949.1120 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 9 failed
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 3: 4314.1000 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 10: 3519.4675 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 7: 3525.9519 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 9: 3711.1267 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 10: 3203.3688 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 5: 6251.8516 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 6: 4664.9846 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 11: 2171.5785 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 6: 4017.9716 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 12: 2889.0328 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 10 failed
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 4: 3921.1703 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 4: 8012.1432 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 3: 8389.4163 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 5 failed
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 4: 9977.2648 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 11: 3070.5164 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 12: 2592.1205 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 8: 4562.1544 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 13: 3559.7935 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 5 failed
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 10: 4816.9105 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 5: 2892.0143 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 11 failed
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 3: 12535.8868 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 7: 4744.1451 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 12: 3407.0734 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 9: 3339.3119 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 14: 2564.2400 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 6: 7222.8755 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 5: 5651.7782 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 13: 4397.3045 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 11: 3684.8485 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 6 failed
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 6: 4137.6680 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 12 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 13: 3322.2592 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 15: 2978.6659 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 8: 5207.4068 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 5: 8623.9379 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 10: 4616.4304 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 7: 2714.0508 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 6 failed
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 7: 11376.1790 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 12: 3865.3786 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 13 failed
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 14: 4877.0330 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 16: 2778.6930 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 14: 2921.6659 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 6: 5947.0545 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 4: 12090.4843 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 7: 6727.7601 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 11: 3333.1671 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 7 failed
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 15: 2860.8213 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 8: 4201.2144 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 13: 3507.1316 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 8: 3981.1689 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 14 failed
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 17: 3807.2798 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 4: 11799.7873 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 15: 4325.6187 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 16: 2540.9875 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 9: 7474.5186 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 9: 2530.8544 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 9: 2914.2297 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 14: 3405.7932 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 7 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 16: 2349.6256 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 8: 5438.2414 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 12: 4694.9164 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 18: 3298.4343 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 5: 6246.8020 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 7: 6971.1049 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 15 failed
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 17: 2796.3284 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 10: 3304.3286 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 10: 2953.2921 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 17: 2870.3550 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 18: 2344.0021 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 13: 3735.1358 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 15: 4107.4852 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 19: 3846.0039 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 8: 3707.2944 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 16 failed
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 11: 3076.9135 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 18: 2840.8973 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 19: 2386.0864 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 9: 6269.1252 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 8 failed
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 6: 14829.6558 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 20: 2720.7783 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 14: 3807.6745 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 19: 2321.7915 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 1: 62615.0067 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 16: 4283.4756 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 12: 2985.3933 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 20: 2741.6523 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 6: 7935.5524 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 17 failed
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 9: 4662.9008 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 11: 7843.8265 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 15: 2960.8375 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 1: 65277.2063 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 21: 4166.4160 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 20: 3218.6660 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 21: 3168.1070 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 10: 13125.6265 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 13: 4091.0881 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 18 failed
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 10: 6501.7430 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 17: 5685.5348 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 16: 3356.6503 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 9 failed
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 10: 4723.2252 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 22: 2485.0333 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 12: 4898.2608 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 11: 2550.3753 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 21: 3758.6451 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 5: 17629.6614 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 14: 3029.4267 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 19 failed
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 22: 5074.1280 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 8 failed
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 23: 2126.5457 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 17: 3270.4966 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 18: 3713.2435 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 15: 2984.4488 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 11: 5825.3639 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 24: 2269.9091 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 23: 3481.8260 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 7: 12818.1818 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 22: 4689.0592 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 12: 5425.4577 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 20 failed
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 25: 2078.7920 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 13: 6366.4104 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 10 failed
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 16: 2819.2843 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 19: 3998.0714 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 2: 11528.7268 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 18: 5045.2704 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 24: 2965.5073 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 23: 3259.0523 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 7: 14316.0505 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 26: 2509.2510 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 21 failed
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 12: 5362.2067 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 17: 2950.7589 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 13: 4064.6947 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 2: 16869.4876 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 25: 2800.9461 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 9 failed
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 14: 4599.0620 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 11: 11768.9088 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 24: 3037.1603 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 27: 2668.0961 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 6: 11089.8787 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 20: 5015.6086 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 19: 4746.2180 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 18: 2992.4988 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 8: 4477.8787 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 22 failed
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 26: 3369.0112 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 28: 2842.5704 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 11 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 25: 3084.7022 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 19: 2598.6592 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 21: 3341.4613 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 8: 10947.0409 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 13: 6765.0070 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 15: 5447.1059 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 20: 4013.3013 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 29: 2676.2413 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 14: 7160.6079 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 27: 2907.6043 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 23 failed
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 20: 2838.0919 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 12: 7518.3398 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 26: 4464.1016 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 3: 11309.3363 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 22: 3875.7418 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 30: 2525.8238 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 28: 3069.4038 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 21: 4588.6203 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 16: 5521.3990 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 31: 2515.7801 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 10 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 27: 3092.5086 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 12 failed
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 14: 5874.7769 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 21: 4123.8337 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 9: 9262.9333 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 24 failed
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 23: 3543.6135 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 13: 4345.6717 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 7: 12135.3050 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 32: 2326.9686 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 22: 2466.1180 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 29: 4608.7522 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 28: 4147.4149 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 25 failed
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 22: 5400.5673 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 33: 2443.6401 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 24: 4055.9308 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 14: 4243.9022 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 3: 17258.8445 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 15: 5599.2614 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 23: 3205.1343 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 15: 10869.2527 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 30: 3104.1154 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 10: 5729.9519 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 26 failed
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 34: 2651.5202 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 23: 3184.6482 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 25: 3460.4939 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 31: 2733.1978 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 24: 2780.4811 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 29: 4600.0718 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 17: 8797.6570 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 11 failed
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 4: 13038.2915 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 16: 4096.9075 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 13 failed
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 9: 16649.5148 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 35: 3585.6450 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 16: 5792.8300 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 24: 4100.0160 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 25: 3209.2724 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 32: 3240.0584 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 26: 3737.1875 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 27 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 30: 3592.8929 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 15: 7033.3411 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 17: 3050.2121 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 18: 4787.0584 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 36: 2586.1102 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 8: 11929.4167 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 11: 7890.1303 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 25: 2772.2104 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 33: 2741.1750 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 28 failed
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 26: 4053.6273 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 31: 3540.1830 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 27: 4117.9922 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 37: 2941.4448 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 4: 11409.6775 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 14 failed
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 34: 2881.1274 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 12 failed
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 18: 4811.6100 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 17: 6858.6624 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 38: 2086.9700 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 29 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 32: 3058.0897 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 19: 5491.8034 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 28: 3870.5877 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 19: 2372.2188 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 35: 2959.2297 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 5: 10491.4702 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 10: 9816.7827 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 27: 5521.2171 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 26: 6927.3648 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 30 failed
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 12: 7818.7686 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 33: 3849.2047 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 36: 2836.2399 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 20: 3152.3385 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 29: 3491.2886 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 15 failed
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 28: 2696.3280 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 16: 11823.8640 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 18: 6058.2753 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 39: 5974.8898 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 31 failed
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 27: 3458.9757 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 13 failed
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 20: 6568.4605 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 34: 2933.9654 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 21: 2471.4506 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 5: 9618.3006 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 29: 2899.8795 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 40: 2928.9447 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 37: 4658.4933 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 28: 3262.9077 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 30: 4742.2267 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 32 failed
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 9: 14947.2188 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 22: 2933.2643 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 6: 8444.7882 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 19: 5048.4388 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 35: 3741.1027 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 13: 7791.9077 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 16 failed
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 30: 2849.1264 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 41: 3189.3448 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 38: 3153.1270 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 11: 11307.8635 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 17: 7643.1678 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 14 failed
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 33 failed
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 31: 3742.3627 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 31: 2350.9723 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 29: 4992.2538 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 36: 3805.6517 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 42: 2583.2070 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 39: 2546.1313 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 21: 8934.2089 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 32: 2720.8520 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 34 failed
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 43: 2100.5548 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 18: 3648.6601 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 7: 6983.2754 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 32: 4016.2120 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 37: 3132.8425 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 17 failed
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 40: 3435.6259 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 23: 8375.4270 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 30: 4425.8925 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 20: 8109.8000 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 44: 2662.8122 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 35 failed
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 22: 4148.7434 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 38: 2484.0473 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 6: 12725.3363 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 24: 2247.5490 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 33: 4481.8525 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 41: 2619.2636 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 33: 3689.6988 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 19: 4443.2652 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 31: 2956.4878 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 15 failed
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 45: 2785.9137 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 36 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 39: 3118.0787 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 34: 2907.9493 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 10: 13722.3873 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 18 failed
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 21: 5199.9322 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 42: 3172.5155 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 34: 3457.2557 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 32: 2956.0416 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 46: 3156.4753 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 20: 4160.2811 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 40: 2691.4076 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 8: 8830.6526 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 25: 5631.0279 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 37 failed
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 43: 2841.2315 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 14: 16320.3462 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 47: 2558.3281 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 35: 3955.7808 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 33: 3669.6682 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 41: 2776.1446 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 26: 2664.0623 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 38 failed
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 16 failed
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 23: 9844.4693 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 21: 4461.5962 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 7: 9342.9545 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 22: 6035.0551 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 44: 3220.7950 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 48: 2868.1842 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 34: 2737.3890 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 19 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 42: 3391.2110 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 39 failed
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 36: 4442.4813 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 45: 2725.1227 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 27: 4607.3350 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 49: 3285.1013 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 9: 7995.1874 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 35: 2818.0893 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 43: 2704.7947 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 35: 11375.9747 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 15: 7933.9853 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 23: 5269.1457 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 22: 5521.8634 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 46: 2778.1943 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 28: 2283.9918 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 17 failed
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 37: 3523.0512 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 40 failed
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 11: 13623.3171 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Run 50: 3274.4755 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] First call time: 3418.7064 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Cold-start penalty: 16.1506%
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Average time: 2943.3405 ms
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Success rate: 100.0000%
[gpt-4o-mini-non-strict-tool (Wide JSON Schema)] Cost: 0.0078
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 36: 3807.9420 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 20 failed
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 36: 3412.1458 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 29: 2553.9740 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 47: 2872.3324 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 24: 9275.0580 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 24: 5298.0248 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 23: 5270.8551 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 16: 5878.0803 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 48: 2415.8161 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 12: 27949.0190 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 41 failed
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 37: 3457.8697 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 38: 4888.0591 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 10: 7538.5403 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 37: 3579.0161 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 8: 12041.6095 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 18 failed
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 24: 3474.4970 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 38: 3303.4992 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 42 failed
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 49: 3929.4250 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 25: 6019.0457 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 38: 3885.3970 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 25: 5399.6194 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 39: 4822.9671 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 17: 5796.6865 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Run 50: 2630.7050 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] First call time: 3418.5543 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Cold-start penalty: 8.5418%
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Average time: 3149.5288 ms
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Success rate: 100.0000%
[gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)] Cost: 0.1313
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 39: 3141.5339 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 21 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 44: 13153.1580 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 39: 2219.4460 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 25: 4250.7824 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 30: 10089.1224 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 43 failed
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 11: 7608.7996 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 26: 4327.4831 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 13: 8577.7692 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 19 failed
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 40: 4220.2717 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 40: 2851.9761 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 45: 2925.1542 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 44 failed
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 40: 4536.9417 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 26: 6380.5902 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 9: 10492.2433 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 26: 4118.2173 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 12: 16695.3194 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 41: 3660.7665 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 41: 3480.5463 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 27: 4968.6286 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 18: 8148.7778 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 22 failed
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 46: 4222.5443 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 45 failed
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 42: 2414.0022 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 31: 8677.3544 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 42: 3525.3913 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 47: 2607.8977 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 41: 5538.1125 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 20 failed
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 27: 6076.5620 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 43: 2635.9756 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 46 failed
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 14: 10293.3595 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 28: 6391.0896 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 32: 3635.4081 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 23 failed
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 43: 3852.1993 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 44: 2886.3609 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 12: 13257.4344 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 42: 4716.1267 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 48: 4786.7262 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 33: 2350.6458 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 27: 11231.1410 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 44: 3863.9278 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 47 failed
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 45: 3187.9882 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 43: 3189.6897 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 13: 11828.4973 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 28: 7762.3638 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 49: 4206.8350 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 29: 6551.1550 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 34: 4337.9633 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 19: 13214.4355 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 46: 2598.8929 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 24 failed
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 48 failed
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 15: 9016.1739 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 45: 4005.1368 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 44: 4286.1834 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Run 50: 3478.6402 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] First call time: 3147.1813 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Cold-start penalty: -12.6515%
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Average time: 3603.0182 ms
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Success rate: 100.0000%
[gpt-4o-mini-non-strict-json (Wide JSON Schema)] Cost: 0.0115
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 35: 2942.7728 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 28: 7344.2710 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 29: 4901.0372 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 49 failed
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 30: 4748.1082 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 36: 2364.0242 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 46: 3623.2399 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 13: 10568.2390 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 45: 2999.9443 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 47: 5084.8080 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 20: 5493.9390 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 25 failed
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Run 50 failed
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Average time: 0.0000 ms
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Success rate: 0.0000%
[gpt-4o-mini-non-strict-tool (Complex JSON Schema)] Cost: 0.0000
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 29: 4501.9012 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 37: 3094.8735 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 14: 10356.5225 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 46: 3523.3833 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 48: 3526.2304 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 21 failed
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 31: 5716.6601 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 30: 6860.3163 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 47: 5789.4109 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 16: 10843.8676 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 49: 3090.0727 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 30: 4588.4385 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 38: 4965.4431 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 47: 4632.7655 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 21: 7719.9845 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 14: 8693.8565 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Run 50: 2900.2303 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] First call time: 18281.4510 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Cold-start penalty: 374.4961%
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Average time: 3852.8140 ms
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Success rate: 100.0000%
[gpt-4o-2024-08-06-strict-json (Wide JSON Schema)] Cost: 0.1023
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 26 failed
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 48: 4621.9805 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 31: 5338.0410 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 39: 2520.3178 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 10: 31186.0437 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 32: 6895.1180 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 22 failed
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 31: 4906.8138 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 48: 4873.5396 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 40: 2945.5059 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 22: 5808.3082 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 33: 3678.4801 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 27 failed
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 41: 2505.3489 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 49: 5826.2081 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 15: 12990.3261 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 32: 4570.9438 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 49: 4404.2674 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 32: 7070.9862 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 42: 2547.7399 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 23 failed
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 23: 4898.3483 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Run 50: 4290.4845 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] First call time: 4436.7569 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Cold-start penalty: 9.1312%
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Average time: 4065.5266 ms
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Success rate: 100.0000%
[gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)] Cost: 0.1964
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 33: 4029.8100 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Run 50: 3643.9479 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] First call time: 4677.3473 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Cold-start penalty: 14.6666%
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Average time: 4079.0854 ms
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Success rate: 100.0000%
[gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)] Cost: 0.1680
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 34: 5896.5717 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 43: 4533.1961 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 28 failed
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 24 failed
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 17: 17610.9289 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 11: 13851.6948 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 44: 2352.8543 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 24: 6353.9433 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 34: 4833.4926 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 35: 4730.2495 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 16: 12268.7773 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 45: 3662.1491 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 33: 11548.5849 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 25: 4212.0881 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 35: 5443.1409 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 29 failed
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 15: 22806.2842 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 46: 3176.3571 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 36: 6242.9034 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 12: 8621.8974 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 47: 2398.0812 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 25 failed
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 34: 6814.4628 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 26: 6601.7746 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 37: 4110.6647 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 18: 12653.1931 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 36: 6181.7088 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 48: 4098.4992 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 16: 7231.3883 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 30 failed
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 49: 2589.3402 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 26 failed
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 35: 6970.1669 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 27: 7392.6030 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Run 50: 2437.9920 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] First call time: 17952.7829 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Cold-start penalty: 296.2323%
[gpt-4o-mini-strict-json (Wide JSON Schema)] Average time: 4530.8732 ms
[gpt-4o-mini-strict-json (Wide JSON Schema)] Success rate: 100.0000%
[gpt-4o-mini-strict-json (Wide JSON Schema)] Cost: 0.0060
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 17: 15534.2748 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 37: 6964.9327 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 13: 11672.2050 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 38: 9224.7128 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 31 failed
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 27 failed
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 28: 5766.8642 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 36: 7175.2801 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 38: 6544.2300 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 39: 6412.3183 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 17: 14235.3007 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 32 failed
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 39: 4256.8705 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 14: 10248.5212 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 29: 5745.2323 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 37: 5706.6171 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 40: 5166.3323 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 18: 15062.5857 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 40: 4434.7749 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 41: 4253.1175 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 38: 6075.9018 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 30: 7636.9091 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 15: 8337.2064 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 41: 4095.2486 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 28 failed
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 33 failed
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 18: 12741.5438 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 42: 4581.4685 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 42: 4006.2202 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 39: 6752.3527 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 31: 5804.1190 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 19: 12244.5663 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 34 failed
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 43: 5583.5931 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 43: 4812.6730 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 19: 35880.0037 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 16: 9520.7198 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 32: 4858.0383 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 40: 4915.3307 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 29 failed
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 44: 4277.5905 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 19: 10824.2902 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 35 failed
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 33: 4839.1776 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 44: 6106.4120 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 45: 4010.1710 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 41: 6547.4590 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 20: 7454.3898 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 30 failed
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 17: 9125.7400 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 34: 4604.1159 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 20: 12436.2850 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 45: 5301.5407 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 46: 3814.5756 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 36 failed
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 42: 5980.5383 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 20: 10383.4641 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 35: 4873.4589 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 31 failed
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 47: 7804.3698 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 43: 6847.6439 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 46: 9686.3300 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 21: 13190.8705 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 18: 11701.2507 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 37 failed
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 21: 11829.0205 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 21: 8688.8286 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 36: 7875.9613 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 32 failed
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 47: 4292.3417 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 48: 6447.8260 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 44: 5212.7115 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 38 failed
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 37: 6382.1805 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 22: 8514.4623 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 48: 4462.9518 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 49: 4432.7123 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 22: 7339.8106 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 19: 9296.9138 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 45: 6083.8300 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 49: 4371.6414 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 33 failed
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 39 failed
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 38: 6035.5745 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Run 50: 7082.7555 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] First call time: 7595.7037 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Cold-start penalty: 29.8940%
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Average time: 5847.6183 ms
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Success rate: 100.0000%
[gpt-4o-mini-non-strict-json (Complex JSON Schema)] Cost: 0.0175
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 46: 5983.1148 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Run 50: 3892.4046 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] First call time: 21229.6122 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Cold-start penalty: 261.8959%
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Average time: 5866.2200 ms
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Success rate: 100.0000%
[gpt-4o-2024-08-06-strict-json (Complex JSON Schema)] Cost: 0.1528
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 23: 8662.7710 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 20: 11148.2900 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 23: 11270.6096 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 47: 4603.2044 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 40 failed
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 34 failed
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 22: 20399.8665 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 48: 4601.8819 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 24: 8954.1691 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 35 failed
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 41 failed
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 21: 9378.4572 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 39: 15719.8370 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 49: 5003.6303 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 42 failed
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 40: 5389.0047 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 36 failed
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 24: 16802.6048 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 25: 11619.7984 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 23: 15625.1179 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Run 50: 8525.8752 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] First call time: 7188.8446 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Cold-start penalty: 13.8485%
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Average time: 6314.3933 ms
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Success rate: 100.0000%
[gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)] Cost: 0.3026
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 22: 9772.2836 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 43 failed
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 41: 6574.4563 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 25: 8853.1653 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 23: 8702.6522 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 44 failed
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 24: 11128.3104 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 37 failed
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 42: 12055.1479 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 26: 8312.9380 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 45 failed
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 24: 8929.8803 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 26: 20283.4502 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 43: 6634.7484 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 25: 13423.6006 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 46 failed
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 27: 8375.1995 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 38 failed
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 27: 8690.7140 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 25: 11156.0464 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 44: 8113.7209 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 47 failed
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 28: 7215.0443 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 39 failed
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 45: 4717.1137 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 26: 8220.1747 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 48 failed
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 26: 14788.2750 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 29: 7576.4550 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 46: 4629.9067 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 40 failed
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 49 failed
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 41 failed
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 30: 7496.5769 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 28: 19538.7735 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 27: 9478.6709 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 27: 11468.4210 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Run 50 failed
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Average time: 0.0000 ms
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Success rate: 0.0000%
[gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)] Cost: 0.0000
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 47: 14678.2241 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 31: 8693.6108 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 29: 9316.3450 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 28: 10813.0009 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 42 failed
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 28: 11400.7830 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 32: 7050.3771 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 48: 11199.7131 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 29: 7845.9365 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 43 failed
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 33: 6607.4702 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 30: 13516.7270 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 49: 5658.1382 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 29: 11876.3023 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 34: 6435.3925 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 44 failed
[gpt-4o-mini-strict-json (Complex JSON Schema)] Run 50: 6620.3114 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] First call time: 22488.0154 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Cold-start penalty: 186.1613%
[gpt-4o-mini-strict-json (Complex JSON Schema)] Average time: 7858.5114 ms
[gpt-4o-mini-strict-json (Complex JSON Schema)] Success rate: 100.0000%
[gpt-4o-mini-strict-json (Complex JSON Schema)] Cost: 0.0146
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 30: 11939.8872 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 45 failed
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 35: 7059.1462 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 31: 13414.0809 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 30: 15164.7845 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 31: 11139.4332 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 36: 7533.1480 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 32: 11877.1129 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 46 failed
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 37: 7859.7675 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 31: 11870.3780 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 32: 13629.7246 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 47 failed
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 33: 13499.1873 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 32: 11320.0640 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 38: 15917.1997 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 48 failed
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 33: 16277.2298 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 34: 12005.0276 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 33: 9607.5640 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 49 failed
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 34: 9185.3037 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 39: 16555.5921 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 34: 10925.2891 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 35: 11315.6497 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Run 50 failed
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Average time: 0.0000 ms
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Success rate: 0.0000%
[gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)] Cost: 0.0000
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 40: 12429.8822 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 36: 11166.1864 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 35: 12332.5123 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 35: 17044.4681 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 41: 7392.0539 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 36: 7086.1957 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 37: 11480.5239 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 42: 6742.2491 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 36: 13173.3425 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 37: 9386.4116 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 43: 7573.1448 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 37: 11004.8667 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 38: 14784.9055 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 38: 10015.0484 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 44: 8607.5567 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 39: 9677.6349 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 45: 9059.5013 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 38: 13758.2957 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 39: 13264.9383 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 46: 7050.6843 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 40: 8876.8895 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 40: 8635.5110 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 39: 12587.6687 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 47: 7951.8687 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 41: 10670.8175 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 48: 6510.1299 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 41: 14401.5246 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 40: 13027.8978 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 42: 9306.6987 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 49: 10266.6656 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 41: 11832.5862 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 42: 17107.8588 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Run 50: 8136.4995 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] First call time: 65277.2063 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Cold-start penalty: 507.6241%
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Average time: 10743.0250 ms
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Success rate: 100.0000%
[gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)] Cost: 0.3004
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 43: 18423.6190 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 42: 15522.3730 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 44: 7500.7399 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 43: 12378.9607 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 45: 11704.0630 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 43: 37952.2628 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 44: 12260.6484 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 46: 12204.5889 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 44: 9721.0005 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 45: 12905.3169 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 46: 14210.3869 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 45: 18671.1077 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 47: 9887.2888 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 47: 38298.6164 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 46: 13455.4005 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 48: 7813.1053 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 48: 14394.2366 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 47: 15014.8566 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 49: 9146.4673 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 49: 14697.1951 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 48: 9702.2503 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Run 50: 12154.5681 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] First call time: 8525.0953 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Cold-start penalty: -33.8339%
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Average time: 12884.3888 ms
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Success rate: 100.0000%
[gpt-4o-mini-non-strict-json (Super Complex JSON Schema)] Cost: 0.0393
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Run 50: 10674.0442 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] First call time: 14174.8062 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Cold-start penalty: 8.6919%
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Average time: 13041.2693 ms
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Success rate: 100.0000%
[gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)] Cost: 0.6619
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 49: 12253.8107 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Run 50: 10549.9986 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] First call time: 62615.0067 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Cold-start penalty: 371.1655%
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Average time: 13289.3869 ms
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Success rate: 100.0000%
[gpt-4o-mini-strict-json (Super Complex JSON Schema)] Cost: 0.0241
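Note on the "Cold-start penalty" lines: the logged figures are consistent with the relative difference between the first call and the average, (first - average) / average * 100. The benchmark source is not part of this log, so the sketch below is an assumption that happens to reproduce the logged values, not the author's implementation. The function name `cold_start_penalty` is hypothetical.

```python
def cold_start_penalty(first_call_ms: float, average_ms: float) -> float:
    """Return how much slower (in percent) the first call was than the average.

    A negative result means the first call was faster than the average.
    """
    if average_ms <= 0:
        # Configurations with no successful runs log an average of 0.0000 ms;
        # a penalty is undefined for them.
        raise ValueError("average_ms must be positive")
    return (first_call_ms - average_ms) / average_ms * 100.0


# (first call ms, average ms, logged penalty %) copied from the summaries above.
logged = [
    (22488.0154, 7858.5114, 186.1613),   # gpt-4o-mini-strict-json, Complex
    (65277.2063, 10743.0250, 507.6241),  # gpt-4o-2024-08-06-strict-json, Super Complex
    (8525.0953, 12884.3888, -33.8339),   # gpt-4o-mini-non-strict-json, Super Complex
    (62615.0067, 13289.3869, 371.1655),  # gpt-4o-mini-strict-json, Super Complex
]

for first_ms, avg_ms, logged_pct in logged:
    computed = cold_start_penalty(first_ms, avg_ms)
    print(f"computed {computed:.4f}%  logged {logged_pct:.4f}%")
```

Read this way, the large penalties appear only on the strict-json configurations, while the non-strict-json ones stay close to or below their average.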
Comprehensive Report:
Report for schema: Complex JSON Schema
Methods sorted by performance (fastest to slowest):
1. gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)
Average time: 4079.0854 ms
Success rate: 100.0000%
Cost: 0.1680
2. gpt-4o-mini-non-strict-json (Complex JSON Schema)
Average time: 5847.6183 ms
Success rate: 100.0000%
Cost: 0.0175
3. gpt-4o-2024-08-06-strict-json (Complex JSON Schema)
Average time: 5866.2200 ms
Success rate: 100.0000%
Cost: 0.1528
4. gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)
Average time: 6314.3933 ms
Success rate: 100.0000%
Cost: 0.3026
5. gpt-4o-mini-strict-json (Complex JSON Schema)
Average time: 7858.5114 ms
Success rate: 100.0000%
Cost: 0.0146
6. gpt-4o-mini-non-strict-tool (Complex JSON Schema)
Average time: 0.0000 ms
Success rate: 0.0000%
Cost: 0.0000
Methods sorted by cost (cheapest to most expensive):
1. gpt-4o-mini-strict-json (Complex JSON Schema)
Cost: 0.0146
Average time: 7858.5114 ms
Success rate: 100.0000%
2. gpt-4o-mini-non-strict-json (Complex JSON Schema)
Cost: 0.0175
Average time: 5847.6183 ms
Success rate: 100.0000%
3. gpt-4o-2024-08-06-strict-json (Complex JSON Schema)
Cost: 0.1528
Average time: 5866.2200 ms
Success rate: 100.0000%
4. gpt-4o-2024-08-06-non-strict-tool (Complex JSON Schema)
Cost: 0.1680
Average time: 4079.0854 ms
Success rate: 100.0000%
5. gpt-4o-2024-08-06-non-strict-json (Complex JSON Schema)
Cost: 0.3026
Average time: 6314.3933 ms
Success rate: 100.0000%
6. gpt-4o-mini-non-strict-tool (Complex JSON Schema)
Cost: 0.0000
Average time: 0.0000 ms
Success rate: 0.0000%
Report for schema: Wide JSON Schema
Methods sorted by performance (fastest to slowest):
1. gpt-4o-mini-non-strict-tool (Wide JSON Schema)
Average time: 2943.3405 ms
Success rate: 100.0000%
Cost: 0.0078
2. gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)
Average time: 3149.5288 ms
Success rate: 100.0000%
Cost: 0.1313
3. gpt-4o-mini-non-strict-json (Wide JSON Schema)
Average time: 3603.0182 ms
Success rate: 100.0000%
Cost: 0.0115
4. gpt-4o-2024-08-06-strict-json (Wide JSON Schema)
Average time: 3852.8140 ms
Success rate: 100.0000%
Cost: 0.1023
5. gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)
Average time: 4065.5266 ms
Success rate: 100.0000%
Cost: 0.1964
6. gpt-4o-mini-strict-json (Wide JSON Schema)
Average time: 4530.8732 ms
Success rate: 100.0000%
Cost: 0.0060
Methods sorted by cost (cheapest to most expensive):
1. gpt-4o-mini-strict-json (Wide JSON Schema)
Cost: 0.0060
Average time: 4530.8732 ms
Success rate: 100.0000%
2. gpt-4o-mini-non-strict-tool (Wide JSON Schema)
Cost: 0.0078
Average time: 2943.3405 ms
Success rate: 100.0000%
3. gpt-4o-mini-non-strict-json (Wide JSON Schema)
Cost: 0.0115
Average time: 3603.0182 ms
Success rate: 100.0000%
4. gpt-4o-2024-08-06-strict-json (Wide JSON Schema)
Cost: 0.1023
Average time: 3852.8140 ms
Success rate: 100.0000%
5. gpt-4o-2024-08-06-non-strict-tool (Wide JSON Schema)
Cost: 0.1313
Average time: 3149.5288 ms
Success rate: 100.0000%
6. gpt-4o-2024-08-06-non-strict-json (Wide JSON Schema)
Cost: 0.1964
Average time: 4065.5266 ms
Success rate: 100.0000%
Report for schema: Super Complex JSON Schema
Methods sorted by performance (fastest to slowest):
1. gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)
Average time: 10743.0250 ms
Success rate: 100.0000%
Cost: 0.3004
2. gpt-4o-mini-non-strict-json (Super Complex JSON Schema)
Average time: 12884.3888 ms
Success rate: 100.0000%
Cost: 0.0393
3. gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)
Average time: 13041.2693 ms
Success rate: 100.0000%
Cost: 0.6619
4. gpt-4o-mini-strict-json (Super Complex JSON Schema)
Average time: 13289.3869 ms
Success rate: 100.0000%
Cost: 0.0241
5. gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)
Average time: 0.0000 ms
Success rate: 0.0000%
Cost: 0.0000
6. gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)
Average time: 0.0000 ms
Success rate: 0.0000%
Cost: 0.0000
Methods sorted by cost (cheapest to most expensive):
1. gpt-4o-mini-strict-json (Super Complex JSON Schema)
Cost: 0.0241
Average time: 13289.3869 ms
Success rate: 100.0000%
2. gpt-4o-mini-non-strict-json (Super Complex JSON Schema)
Cost: 0.0393
Average time: 12884.3888 ms
Success rate: 100.0000%
3. gpt-4o-2024-08-06-strict-json (Super Complex JSON Schema)
Cost: 0.3004
Average time: 10743.0250 ms
Success rate: 100.0000%
4. gpt-4o-2024-08-06-non-strict-json (Super Complex JSON Schema)
Cost: 0.6619
Average time: 13041.2693 ms
Success rate: 100.0000%
5. gpt-4o-2024-08-06-non-strict-tool (Super Complex JSON Schema)
Cost: 0.0000
Average time: 0.0000 ms
Success rate: 0.0000%
6. gpt-4o-mini-non-strict-tool (Super Complex JSON Schema)
Cost: 0.0000
Average time: 0.0000 ms
Success rate: 0.0000%
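Note on the ordering in the report above: configurations with a 0% success rate report an average time and cost of 0.0000, yet they are listed last in both rankings. A plain ascending sort would put them first, so they must be ordered separately. The sketch below shows one way to get that ordering; it is an assumption about the behaviour, not the benchmark's actual code, and `MethodSummary`, `rank_by_speed`, and `rank_by_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodSummary:
    name: str
    average_ms: float
    success_rate: float  # percent, 0 to 100
    cost: float


def rank_by_speed(rows):
    # Sort key: failed configurations (0% success) last, then ascending time.
    return sorted(rows, key=lambda r: (r.success_rate == 0.0, r.average_ms))


def rank_by_cost(rows):
    # Same idea for cost: a cost of 0.0000 on a failed run is not "cheapest".
    return sorted(rows, key=lambda r: (r.success_rate == 0.0, r.cost))


# Values copied from the "Super Complex JSON Schema" report above.
rows = [
    MethodSummary("gpt-4o-2024-08-06-non-strict-tool", 0.0, 0.0, 0.0),
    MethodSummary("gpt-4o-mini-non-strict-tool", 0.0, 0.0, 0.0),
    MethodSummary("gpt-4o-mini-strict-json", 13289.3869, 100.0, 0.0241),
    MethodSummary("gpt-4o-2024-08-06-non-strict-json", 13041.2693, 100.0, 0.6619),
    MethodSummary("gpt-4o-mini-non-strict-json", 12884.3888, 100.0, 0.0393),
    MethodSummary("gpt-4o-2024-08-06-strict-json", 10743.0250, 100.0, 0.3004),
]

for position, row in enumerate(rank_by_speed(rows), start=1):
    print(f"{position}. {row.name}: {row.average_ms:.4f} ms, {row.success_rate:.4f}%")
```

Because Python's `sorted` is stable, the two failed configurations keep their input order at the bottom of each ranking.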