As an English-as-a-second-language speaker and writer, one thing Grok really shines at is capturing the tone and level of "formality" of a piece of text and then replicating it correctly. It seems to understand the little human subtleties of language in a way the other major providers don't. ChatGPT goes overly stiff and formal-sounding, or ends up in a weird "aye guvnor" type of informal language (Claude is sometimes better, but not always).
Grok seems in general better at being "human" in ways that are hard to define: e.g. if I ask it "does this message roughly convey things correctly, to the level it can given this length", it will likely answer like a human would (either a yes or a change suggestion that sticks to the tone and length), while ChatGPT would write a dissertation on the message that still doesn't clear anything up.
Recently I've noticed that Grok seems to have gotten really good at dictation too (the feature where you click the mic to ask it something). ChatGPT has maybe 90-95% accuracy with my accent, the speech input on Android's Gboard something like 75%; Grok surprisingly gets something like 98% of my words correct.
I did a quick eval comparing Grok 4.3, Opus 4.7 and GPT 4.1 and they actually seem pretty similar:
https://ofw640g9re.evvl.io/
They all did pretty well at a more "formal" tone, but GPT-4.1 was the only one that didn't make me cringe with a "casual" tone.
[edit] fwiw, grok was also the fastest+cheapest model, claude was slowest and priciest.
This is the most basic level of eval: whether they can produce output that someone somewhere (usually a young, urban US American) would consider informal in tone. Real human communication is far more nuanced than this; different groups are used to different linguistic registers, and things outside those registers sound odd even if people can't articulate why. You could also want to be informal but not over-familiar with the other person (e.g. in a Discord chat with a new acquaintance) - actually, looking at the outputs here, the Claude output seems a better fit for that (in my subjective view anyway) than for the prompt you gave it - or want many other little variations.
What makes one output cringeworthy and another feel familiar and comfortable is also pretty subtle and hard to define. These things need nuanced descriptions and examples to actually get right, and it's in understanding those nuances and figuring out the register of the examples that Grok outshines the others.
That's Grok 4.2 not 4.3 right?
And why are you comparing to GPT-4.1? (As opposed to one of the six-odd model releases since then - I would have expected GPT 5.5.)
All of these were frankly terrible. I guess Grok’s “informal” version sounded the most like a real human, but only because it reads exactly like an Elon tweet (including his favorite emoji!). It’s obvious what they’ve been training on.
I know it's just an evaluation, but seeing an informal message, and a prompt asking to rewrite it to the tone of an "informal message" when the original already sounds just fine, makes me sad... Not because of this evaluation, but because it reminds me that this is how some people use LLMs: basically asking them to remove your own voice from texts that are generally fine already.
My sister-in-law is a pharmacist and the heaviest non-dev ChatGPT user I know, and her main use case is writing professionally polite messages to doctors about how the drugs they prescribed would have killed a patient had she not caught a particular interaction or common side effect.
There's a lot of "tone" in it: she's not trying to anger these folks, but it's also quite serious, and then there's just everything else happening in medicine on top of that.
Feels like a great use.
All three did well, and while I'm a Claude user, I found the Opus reply here added some unnecessary detail, like "Impact: Minimal; no downstream dependencies are currently at risk". Downstream dependencies weren't mentioned in the original message; for all we know, downstream could be relying on a poorly performing API and is impacted by waiting another week for a replacement.
I've also noticed that when I communicate with Grok in my native language, its tone is more natural than other models. I think this is due to the advantage of being trained on a large amount of Twitter data. However, as Twitter contains more and more AI-generated content now, I'm afraid continued training will make it less natural.
The causation could also be the other way round.
Twitter language has started to seem like normal casual language to us, rather than us using normal casual language on Twitter.
Did you try meta? I was into grok but now meta works well for me
I'm sure Twitter knows which are the bot accounts and is surely excluding them from their model training. Twitter bots aren't a new phenomenon after all.
I don't think Twitter/X know for sure who the bots are, since Elon has been pretty vocal about trying to stop them for ages, yet I still get lots of spam DMs (as do others with far fewer followers/reach).
Even if 95% of the spam gets actively reported and dealt with, that still leaves a ton of nonsense on the platform, getting fed into the LLM. And spam has only gotten worse over the years, as the barrier to entry has lowered and lowered.
Are the spam DMs advertisements or more generally something linked to a product or service? I wouldn't be surprised if X is more lenient towards bots that pay them for adverts.
Most of what I get seems to be advertisements, or automated messages if you follow large(r) accounts.
One of the most interesting things I've noticed is that these advertisements will be triggered if you follow accounts positioned as influencers. I followed one out of curiosity and received a DM from that account advertising some cryptocurrency service.
It's a good way to filter out and block accounts that have almost certainly not grown organically.
I'd have guessed that at least some of the bots are Twitter itself, trying to draw you in with some sense of engagement. Given that Musk is the owner, and everything we know about him and have seen him do, I'd not be surprised if some of the MAGA bots are his too.
>Elon has been pretty vocal about trying to stop them for ages
You know people lie, right? Especially when the lie casts them in a better light and/or makes them more money.
Elon lied on record many times, admitting to the lies only when forced, under oath.
Highly doubtful, seeing as my 14-year-old Twitter account got caught in a recent bot-ban wave with no means of contacting a human for recovery.
There are bots everywhere; it has nothing to do with the platform. It has to do with attackers having an incentive to do mass account farming, and no platform is secure against it.
Super easy, just make a web-of-trust type of thing: messages are only visible to those who have already vouched for you. Otherwise, you pay $0.01 per message per user reached.
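To make the proposed rule concrete, here's a toy sketch (all names and types are hypothetical, just illustrating the pricing logic described above):

```python
# Toy model of the web-of-trust delivery rule: delivery is free to users
# who already vouched for the sender; every other recipient costs a flat
# fee per message. Nothing here reflects any real platform's API.

PRICE_PER_MESSAGE_PER_USER = 0.01  # dollars, the figure from the comment above

def delivery_cost(sender, recipients, vouches):
    """Cost in dollars to deliver one message from `sender`.

    `vouches` maps each user to the set of senders they have vouched for.
    Recipients who vouched for `sender` are free; the rest are billed.
    """
    unvouched = [r for r in recipients if sender not in vouches.get(r, set())]
    return len(unvouched) * PRICE_PER_MESSAGE_PER_USER
```

Under this scheme a spammer blasting 10,000 strangers pays roughly $100 per message, while messaging people who already trust you stays free, which is the economic asymmetry the comment is after.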
With banning and deboosting they need to be very accurate, but with filtering they can be more liberal in what they exclude.
Not really. There are easy heuristics to filter out bots with good confidence. FWIW, I don't see any bots posting anything in my feed.
Yes, your individual feed isn't really relevant if we're talking about the masses. Reddit accounts are for sale quite cheaply, HN accounts as well, X too, and so on; it's literally just a matter of means/methodology. If I wanted to make 1,000 random posts today talking about a certain thing, I could.
My individual feed does matter, because it shows that it is possible to curate something without bots, which is obviously what xAI would do.
Sadly, it's more likely that people will just start talking like bots
I've seen this expressed as a concern even from one of my colleagues. My retort was:
"English is not my native language and LLMs taught me quite a few very useful formalisms that do land well for people and they change their attitude towards you to be more respectful afterwards. It also showed me how to frame and reframe certain arguments. I agree sounding like an LLM is kind of sad but I am getting a lot of educational value -- and with time I'll sneak my own voice back in these newly learned idioms and ways to talk."
You're absolutely right!
[dead]
There was already evidence last year[1] that ChatGPT-specific words like "meticulous," "delve," etc. are becoming more frequently used than they were previously. The linked study used audio of academic talks and podcasts to determine this.
[1] https://arxiv.org/abs/2409.01754
Part of me wanted to object to those two examples, which I've used frequently since reaching adulthood in the '80s. Another part of me has been triggered by an apparent uptick in the word "crisp", which my gut takes as a coding-LLM tell.
[flagged]
Hitler grok probably loves me.
I'm a blond, blue-eyed Swedish man.
But English is not my main language of course.
But I assume you mean brown people, yes, same sentiment.
The "refugees welcome" period ended after the 2015 crisis in Europe.
Isn't it exhausting to view everything through an ideological lens instead of reviewing technical achievements on their merits?
From the richest person on the whole planet? Who literally, proactively injects himself directly into global politics? Which affects you and me and everyone else?
You don't think fighting child porn is worthwhile? Fighting fascism? Standing up for democracy?
Isn't it a cop-out, and ignorant of you, to not care a single bit about anything at all?
When do you even start thinking about drawing a line? Let me guess: only once it affects you personally, right?
There are limits to being willing to overlook ideology.
It's very exhausting! But Elon Musk chose to leverage his fortune from Tesla and SpaceX into an ideological project to destroy a lot of things I care about, so he's left me no choice. If he'd like people to review his work on its technical merits, shouldn't he at the bare minimum apologize and promise not to do it again?
The hitler Grok? What? I genuinely don't understand what you're trying to say in this comment.
https://www.forbes.com/sites/tylerroush/2025/07/09/elon-musk...
Close enough—Grok called itself "MechaHitler" (a link was posted).
Elon Musk didn't like how Grok would contradict his opinion on Twitter/X.
So he started to work against this by tinkering with the model.
For example, Grok started to pull in Musk's tweets before responding, Musk introduced Grokipedia as a new data source, and Grok got trained/adjusted differently.
These mechanisms led to Grok, at one point, becoming very racist.
He's equating Grok to Hitler, which is absurd. If you want to speak with the führer you need to visit https://hitler.ai