Something really strange is going on with Gemini 2.5 Pro.
On one hand, it's supposedly the smartest coding model ever made. On the other, I ask it to add one single parameter, and instead of a simple 2-line diff it generates a 35-line one where it randomly changes logic, removes a time.sleep() from an API pagination loop, and is generally just totally "drunk" about what I asked it to do. It's somehow pedantic and drunk at the same time.
Every other model, even much smaller ones, can easily make the 2-line change and leave everything else alone.
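For context, the kind of pagination loop I'm talking about looks roughly like this (all names here are hypothetical, just a sketch of the pattern):

```python
import time

def fetch_page(cursor):
    # Hypothetical stand-in for a real paginated API call:
    # returns a batch of items plus the next cursor (or None at the end).
    data = list(range(cursor, min(cursor + 3, 7)))
    next_cursor = cursor + 3 if cursor + 3 < 7 else None
    return data, next_cursor

def fetch_all(delay=0.01):
    items, cursor = [], 0
    while cursor is not None:
        page, cursor = fetch_page(cursor)
        items.extend(page)
        # This sleep is load-bearing: it spaces out requests so we stay
        # under the API's rate limit. Deleting it "simplifies" the code
        # but can get you throttled or banned in production.
        time.sleep(delay)
    return items
```

The sleep looks like dead code if you only read the loop locally, which is exactly why a model shouldn't touch it when asked for an unrelated one-parameter change.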
I'm wondering how this thing topped the Aider leaderboard. Did something change since launch?