Oh wow that is fun. Also if the writeup isn’t misrepresenting the situation, then I feel like it’s actually a good point - if there’s an easy drop-in speed-up, why does it matter whether it’s suggest by a human or an LLM agent?
Oh wow that is fun. Also if the writeup isn’t misrepresenting the situation, then I feel like it’s actually a good point - if there’s an easy drop-in speed-up, why does it matter whether it’s suggest by a human or an LLM agent?
Not everything is about being 100% efficient.
LLM didn't discover this issue, developers found it. Instead of fixing it themselves, they intentionally turned the problem into an issue, left it open for a new human contributor to pick up, and tagged it as such.
If everything was about efficiency, the issue wouldn't have been open to begin with, as writing it (https://github.com/matplotlib/matplotlib/issues/31130) and fending off LLM attempts at fixing them absolutely took more effort than if they were to fix it themselves (https://github.com/matplotlib/matplotlib/pull/31132/changes).
And then there's the actual discussion in #31130 which came to the conclusion that the performance increase had uncertain gains and wasn't worth it.
In this case, the bot explicitly ignored that by only operating off the initial issue.
Good first issues are curated to help humans onboard.
I think this is what worries me the most about coding agents- I'm not convinced they'll be able to do my job anytime soon but most of the things I use it for are the types of tasks I would have previously set aside for an intern at my old company. Hard to imagine myself getting into coding without those easy problems that teach a newbie a lot but are trivial for a mid-level engineer.
The other side of the coin is half the time you do set aside that simple task for a newbie, they paste it into an LLM and learn nothing now.
It doesn’t represent the situation accurately. There’s a whole thread where humans debate the performance optimization and come to the conclusion that it’s a wash but a good project for an amateur human to look into.
The issue is misrepresenting the situation.
One of those operations makes a row-major array, the other makes a col-major array. Downstream functions will have different performance based on which is passed.
It matters because if the code is illegal, stolen, contains a backdoor, or whatever, you can jail a human author after the fact to disincentivize such naughty behavior.