Let's come back in 12 months and discuss your singularity then. Meanwhile, I spent like $30 on a few models as a test yesterday, and none of them could tell me why my goroutine system was failing, even though it was painfully obvious (I purposefully added one too many wg.Done). Gemini, Codex, MiniMax 2.5: they all shat the bed on a very obvious problem, yet I'm supposed to believe they're 98% conscious and better at logic and math than 99% of the population.
Every new model release, neckbeards come out of their basements to tell us the singularity will be here in two more weeks.
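The class of failure being described is a sync.WaitGroup whose counter gets decremented more times than it was incremented. A minimal sketch of that shape (illustrative only, not the actual code from that test):

    package main

    import "sync"

    func main() {
        var wg sync.WaitGroup
        results := make(chan int, 3)

        for i := 0; i < 3; i++ {
            wg.Add(1)
            go func(n int) {
                defer wg.Done()
                results <- n * n
                wg.Done() // one Done too many: the deferred call above already decrements the counter
            }(i)
        }

        wg.Wait()
        close(results)
    }

Run it and the runtime will typically catch the counter going negative and crash with "panic: sync: negative WaitGroup counter", which is presumably the useful error the original poster mentions further down.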
Who said they’re godlike today?
And yes, you are probably using them wrong if you don’t find them useful or don’t see the rapid improvement.
On the flip side, twice I put about 800K tokens of code into Gemini and asked it to find why my code was misbehaving, and it found it.
The logic related to the bug wasn't all contained in one file; it was spread across several files.
This was Gemini 2.5 Pro. A whole generation old.
You are fighting straw men here. Any further discussion would be pointless.
Of course, n-1 wasn't good enough, but n+1 will be the singularity, just two more weeks my dudes, two more weeks... rinse and repeat ad infinitum.
Like I said, pointless strawmanning.
You’ve once again made up a claim of “two more weeks” to argue against even though it’s not something anybody here has claimed.
If you feel the need to make an argument against claims that exist only in your head, maybe you can also keep the argument only in your head too?
It's presumably a reference to this saying: https://www.urbandictionary.com/define.php?term=2%20more%20w...
Mind sharing the file?
Also, did you use Codex 5.3 Xhigh through the Codex CLI or Codex App?
Post the file here
Meanwhile, I've been using Kimi K2T and K2.5 to work in Go with a fair amount of concurrency, and they've been able to write concurrent Go code and debug goroutine issues equal to, and much more complex than, yours (involving race conditions and more) just fine.
Projects:
https://github.com/alexispurslane/oxen
https://github.com/alexispurslane/org-lsp
(Note that org-lsp has a much improved version of the same indexer as oxen; the first was purely my design, while for the second I decided to listen to K2.5 more, and it found a bunch of potential race conditions and fixed them.)
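For context, the classic race in an indexer like that is several goroutines mutating shared state (typically a map) at once. A rough sketch of the pattern and the standard mutex fix, with invented identifiers rather than anything from oxen or org-lsp:

    package main

    import "sync"

    // index maps a symbol name to the files it appears in. Guarding the map
    // with a mutex is the usual fix for concurrent map writes, which the race
    // detector (go run -race) flags and which can otherwise crash with
    // "fatal error: concurrent map writes".
    type index struct {
        mu      sync.Mutex
        symbols map[string][]string
    }

    func (ix *index) add(symbol, file string) {
        ix.mu.Lock()
        defer ix.mu.Unlock()
        ix.symbols[symbol] = append(ix.symbols[symbol], file)
    }

    func main() {
        ix := &index{symbols: make(map[string][]string)}

        var wg sync.WaitGroup
        for _, f := range []string{"a.org", "b.org", "c.org"} {
            wg.Add(1)
            go func(file string) {
                defer wg.Done()
                ix.add("TODO", file) // safe: the mutex serializes the map writes
            }(f)
        }
        wg.Wait()
    }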
shrug
Out of curiosity, did you give them a test to validate the code?
I had a test failing because I introduced a silly comparison bug (> instead of <), and Claude 4.6 Opus figured out the problem wasn't the test but the code, and fixed the bug (which I had missed).
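Roughly this shape, as a hypothetical Go version (the helper name and test are made up): the test encodes the intended behaviour, the code has the flipped comparison, and the right call is to fix the function rather than the test.

    // clamp.go
    package clamp

    // capAt should cap v at limit, but the comparison is inverted,
    // so values above the limit pass through untouched.
    func capAt(v, limit int) int {
        if v < limit { // bug: should be v > limit
            return limit
        }
        return v
    }

    // clamp_test.go
    package clamp

    import "testing"

    func TestCapAt(t *testing.T) {
        // The test is correct: capAt(7, 5) should return 5 but returns 7,
        // so the fix belongs in capAt, not here.
        if got := capAt(7, 5); got != 5 {
            t.Errorf("capAt(7, 5) = %d, want 5", got)
        }
    }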
There was a test and a very useful golang error that literally explained what was wrong. The models tried implementing a solution, failed, and when I pointed out the error, most of them just rolled back the "solution".
What exact models were you using? And with what settings? 4.6 / 5.3 codex both with thinking / high modes?
Ok, thanks for the info
> I purposefully added one too many wg.Done
What do you believe this shows? Sometimes I have difficulty finding bugs in other people's code when they do things in ways I would never use. I can rewrite their code so it works, but I can't necessarily quickly identify the specific bug.
Expecting a model to be perfect on every problem isn't reasonable. No known entity is able to do that. AIs aren't supposed to be gods.
(Well not yet anyway - there is as yet insufficient data for a meaningful answer.)
It's hard to evaluate "logic" and "math", since they're made up of many largely disparate things. But I think modern AI models are clearly better at coding, for example, than 99% of the population. If you asked 100 people at your local grocery store why your goroutine system was failing, do you think multiple of them would know the answer?