Google is absolutely running away with it. The greatest trick they ever pulled was letting people think they were behind.

Their models might be impressive, but their products absolutely suck donkey balls. I gave Gemini web/cli two months and ran back to ChatGPT. Seriously, it would just COMPLETELY forget context mid dialog. When asked about improving air quality it just gave me a list of (mediocre) air purifiers without asking for any context whatsoever, and I can list thousands of conversations like that. Shopping or comparing options is just nonexistent. It uses Russian propaganda sources for answers and switches to Chinese mid sentence (!), while explaining some generic Python functionality. It’s an embarrassment and I don’t know how they justify the 20 euro price tag on it.

I agree. On top of that, in true Google style, basic things just don't work.

Any time I upload an attachment, it just fails with something vague like "couldn't process file". Whether that's a simple .MD or .txt with less than 100 lines or a PDF. I tried making a gem today. It just wouldn't let me save it, with some vague error too.

I also tried having it read and write stuff to "my stuff" and Google drive. But it would consistently write but not be able to read from it again. Or would read one file from Google drive and ignore everything else.

Their models are seriously impressive. But as usual Google sucks at making them work well in real products.

I don't find that at all. At work, we've no access to the API, so we have to force feed a dozen (or more) documents, code and instruction prompts through the web interface's upload feature. The only failures I've ever had in well over 300 sessions were due to connectivity issues, not interface failures.

Context window blowouts? All the time, but never document upload failures.

I'm talking about Gemini in the app and on the web. As well as AI studio. At work we go through Copilot, but there the agentic mode with Gemini isn't the best either.

Honestly this is as Google a product as you can get. Prizes for some, beatings for others.

Antigravity is an embarrassment.

The models feel terrible, somehow, like they're being fed terrible system prompts.

Plus the damn thing kept crashing and asking me to "restart it". What?!

At least Kiro does what it says on the tin.

My experience with Antigravity is the opposite. It's the first time in over 10 years that an IDE has managed to pull me a bit away from the JetBrains suite. I did not think that was possible, as I am a hardcore JetBrains user/lover.

It's literally just vscode? I tried it the other day and I couldn't tell it apart from windsurf besides the icon in my dock

How can the models be impressive if they switch to Chinese mid-sentence? I've observed those bizarre bugs too. Even GPT-3 didn't have those. Maybe GPT-2 did. It's actually impressive that they managed to botch it so badly.

Google is great at some things, but this isn't it.

It's so capable at some things, and others are garbage. I uploaded a photo of some words for a spelling bee and asked it to quiz my kid on the words. The first word it asked wasn't on the list. After multiple attempts to get it to ask only the words in the uploaded pic, it did, and then it would get the spellings wrong in the Q&A. I gave up.

I had it process a photo of my D&D character sheet and help me debug it as I'm a n00b at the game. Also did a decent, although not perfect, job of adding up a handwritten bowling score sheet.

100x agree. It gives inconsistent edits, and would regularly try to do things I explicitly told it not to.

And it gives incorrect answers about itself and google’s services all the time. It kept pointing me to nonexistent ui elements. At least it apologizes profusely! ffs

Agreed on the product. I can't make Gemini read my emails on Gmail. One day it says it doesn't have access, the other day it says Query unsuccessful. Claude Desktop has no problem reaching Gmail, on the other hand :)

Sadly true.

It is also one of the worst models to have a sort of ongoing conversation with.

Their models are absolutely not impressive.

Not a single person is using it for coding (outside of Google itself).

Maybe some people on a very generous free plan.

Their model is a fine mid 2025 model, backed by enormous compute resources and an army of GDM engineers to help the “researchers” keep the model on task as it traverses the “tree of thoughts”.

But that isn’t “the model”; that’s an old model backed by massive money.

Uhh, just false.

I don't have any of these issues with Gemini. I use it heavily every day. A few glitches here and there, but it's been enormously productive for me. Far more so than ChatGPT, which I find mostly useless.

Peacetime Google is not like wartime Google.

Peacetime Google is slow, bumbling, bureaucratic. Wartime Google gets shit done.

OpenAI is the best thing that happened to Google apparently.

Just not search. The search product has pretty much become useless over the past 3 years and the AI answers often will get just to the level of 5 years ago. This creates a sense that things are better - but really it’s just become impossible to get reliable information from an avenue that used to work very well.

I don’t think this is intentional, but I think they stopped fighting SEO entirely to focus on AI. Recipes are the best example - completely gutted and almost all recipe sites (therefore the entire search page) run by the same company. I didn’t realize how utterly consolidated huge portions of information on the internet were until every recipe site about 3 months ago simultaneously implemented the same anti-Adblock.

The search product became useless on a particular day in 2019, as discussed on HN some time ago:

https://news.ycombinator.com/item?id=40133976

Competition always is. I think there was a real fear that their core product was going to be replaced. They're already cannibalizing it internally so it was THE wake up call.

Next they compete on ads...

Wartime Google gave us Google+. Wartime Google is still bumbling, and despite OpenAI's numerous missteps, I don't think it has to worry about Google hurting its business yet.

Google+ was fun. Failed in the market though.

Apple made a social network called Ping. Disaster. MobileMe was silly.

Microsoft made Zune and the Kin 1 and Kin 2 devices and Windows phone and all sorts of other disasters.

These things happen.

I do miss Google+. For my brain / use case, it was by far the best social network out there, and the Circle friends and interest management system is still unparalleled :)

But wait two hours for what OpenAI has! I love the competition and how someone just a few days ago was telling how ARC-AGI-2 was proof that LLMs can't reason. The goalposts will shift again. I feel like most of human endeavor will soon be just about trying to continuously show that AI's don't have AGI.

> I feel like most of human endeavor will soon be just about trying to continuously show that AI's don't have AGI.

I think you overestimate how much your average person-on-the-street cares about LLM benchmarks. They already treat ChatGPT or whichever as generally intelligent (including to their own detriment), are frustrated about their social media feeds filling up with slop and, maybe, if they're white-collar, worry about their jobs disappearing due to AI. Apart from a tiny minority in some specific field, people already know themselves to be less intelligent along any measurable axis than someone somewhere.

"AGI" doesn't mean anything concrete, so it's all a bunch of non-sequiturs. Your goalposts don't exist.

Anyone with any sense is interested in how well these tools work and how they can be harnessed, not some imaginary milestone that is not defined and cannot be measured.

I agree. I think the emergence of LLMs has shown that AGI really has no teeth. I think for decades the Turing test was viewed as the gold standard, but it's clear that there doesn't appear to be any good metric.

The Turing test was passed in the 80s. Somehow it has remained relevant in pop culture despite the fact that it's not a particularly difficult technical achievement.

It wasn’t passed in the 80s. Not the general Turing test.

c. 2022 for me.

Soon they can drop the bioweapon to welcome our replacement.

It was obvious to me that they were top contender 2 years ago ... https://www.reddit.com/r/LocalLLaMA/comments/1c0je6h/google_...

Not in my experience with Gemini Pro and coding. It hallucinates APIs that aren't there. Claude does not do that.

Gemini has flashes of brilliance, but I regard it as unpolished: some things work amazingly, some basics don't work.

It's very hard to tell the difference between bad models and stinginess with compute.

I subscribe to both Gemini ($20/mo) and ChatGPT Pro ($200/mo).

If I give the same question to "Gemini 3.0 Pro" and "ChatGPT 5.2 Thinking + Heavy thinking", the latter is 4x slower and it gives smarter answers.

I shouldn't have to enumerate all the different plausible explanations for this observation. Anything from Gemini deciding to nerf the reasoning effort to save compute, versus TPUs being faster, to Gemini being worse, to this being my idiosyncratic experience, all fit the same data, and are all plausible.

You nailed it. Gemini 3 Pro seems very "lazy" and seems to never reason for more than 30 seconds, which significantly impacts the quality of its outputs.

Don't let the benchmarks fool you. Gemini models are completely useless no matter how smart they are. Google still hasn't figured out tool calling and making the model follow instructions. They seem to only care about benchmarking and being the most intelligent model on paper. This has been a problem with Gemini since 1.0 and they still haven't fixed it.

Also the worst model in terms of hallucinations.

Disagree.

Claude Code is great for coding, Gemini is better than everything else for everything else.

What is "everything else" in your view? Just curious -- I really only seriously use models for coding, so I am curious what I am missing.

Are you using Gemini model itself or using the Gemini App? They are different.

Both

They seem to be optimizing for benchmarks instead of real world use

Those black nazis in the first image model were a cause of insider trading.

I'm leery of using a Google product in light of their history of discontinuing services. It'd have to be significantly better than a similar product from a committed competitor.

Gemini's UX (and of course privacy cred as with anything Google) is the worst of all the AI apps. In the eyes of the Common Man, it's UI that will win out, and ChatGPT's is still the best.

Google privacy cred is ... excellent? The worst data breach I know of them having was a flaw that allowed access to names and emails of 500k users.

Link? Are you conflating "500k Gmail accounts leaked [by a third party]" with Gmail having a breach?

Afaik, Google has had no breaches ever.

If you consider "privacy" to be 'a giant corporation tracks every bit of possible information about you and everyone else'?

OpenAI is running ads. Do you think they'll track less?

Their SECURITY cred is fantastic.

Privacy, not so much. How many hundreds of millions have they been fined for “incognito mode” in chrome being a blatant lie?

> Their SECURITY cred is fantastic.

In a world where Android vulnerabilities and exploits don't exist

They don't even let you have multiple chats if you disable their "App Activity" or whatever (wtf is with that ass naming? they don't even have a "Privacy" section in their settings the last time I checked)

and when I swap back into the Gemini app on my iPhone after a minute or so the chat disappears. and other weird passive-aggressive take-my-toys-away behavior if you don't bare your body and soul to Googlezebub.

ChatGPT and Grok work so much better without accounts or with high privacy settings.

> Gemini's UX ... is the worst of all the AI apps

Been using Gemini + OpenCode for the past couple weeks.

Suddenly, I get a "you need a Gemini Access Code license" error but when you go to the project page there is no mention of this or how to get the license.

You really feel the "We're the phone company and we don't care. Why? Because we don't have to." [0] when you use these Google products.

PS for those that don't get the reference: US phone companies in the 1970s had a monopoly on local and long distance phone service. Similar to Google for search/ads (really a "near" monopoly but close enough).

0 - https://vimeo.com/355556831

I find Gemini's web page much snappier to use than ChatGPT - I've largely swapped to it for most things except more agentic tasks.

You mean AI Studio or something like that, right? Because I can't see a problem with Google's standard chat interface. All other AI offerings are confusing both regarding their intended use and their UX, though, I have to concur with that.

The lack of "projects" alone makes their chat interface really unpleasant compared to ChatGPT and Claude.

AI Studio is also significantly improved as of yesterday.

No projects, completely forgets context mid dialog, mediocre responses even on thinking, research got kneecapped somehow and is completely useless now, uses Russian propaganda videos as the search material (what’s wrong with you, Google?), janky on mobile, consumes GIGABYTES of RAM on web (seriously, what the fuck?). Left a couple of tabs open overnight, Mac is almost completely frozen because 10 tabs consumed 8 GB of RAM doing nothing. It’s a complete joke.

Fair enough. I'm always astonished how different experiences are because mine is the complete opposite. I almost solely use it for help with Go and Javascript programming and found Gemini Pro to be more useful than any other model. ChatGPT was the worst offender so far, completely useless, but Claude has also been suboptimal for my use cases.

I guess it depends a lot on what you use LLMs for and how they are prompted. For example, Gemini fails the simple "count from 1 to 200 in words" test whereas Claude does it without further questions.

Another possible explanation would be that processing time is distributed unevenly across the globe and companies stay silent about this. Maybe depending on time zones?

Gemini is completely unusable in VS Code. It's rated 2/5 stars, pathetic: https://marketplace.visualstudio.com/items?itemName=Google.g...

Requests regularly time out, the whole window freezes, it gets stuck in schizophrenic loops, edits cannot be reverted and more.

It doesn't even come close to Claude or ChatGPT.

Once Google launched Antigravity, I stopped using VS Code.

Smart idea to say anything against Google here from a throwaway account, I'm sitting in negative karma for that :')

Anti Google comments do pretty well on average. It's a popular sentiment. However, low effort comments don't.

Trick? Lol not a chance. Alphabet is a pure play tech firm that has to produce products to make the tech accessible. They really lack in the latter and this is visible when you see the interactions of their VPs. Luckily for them, if you start to create enough of a lead with the tech, you get many chances to sort out the product stuff.

You sound like Russ Hanneman from SV

It's not about how much you earn. It's about what you're worth.

Google is still behind the largest models I'd say, in real world utility. Gemini 3 Pro still has many issues.

They were behind. Way behind. But they caught up.