I've seen them all over the place.
The best are shockingly good… so long as their context doesn't expire and they forget that e.g. the Vector class they just created has methods `.mul(…)` rather than `.multiply(…)` or similar. Even the longer context windows are still too short for them to really take over our jobs (for now); the needle-in-a-haystack tests seem to over-estimate their quality in this regard.
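To make that failure mode concrete, here's a minimal, entirely hypothetical sketch (the class and names are mine for illustration, not saved model output): the model defines `mul` early in the conversation, then starts calling `multiply` once the original definition has fallen out of its context window.

```typescript
// Hypothetical example of the drift I mean: the model wrote this class...
class Vector {
  constructor(public x: number, public y: number) {}

  // ...and named the scaling method `mul`.
  mul(s: number): Vector {
    return new Vector(this.x * s, this.y * s);
  }
}

const v = new Vector(1, 2);
v.mul(3);          // what it writes while the definition is still in context
// v.multiply(3);  // what it starts emitting later: no such method exists
```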
The worst LLM I've seen (one of the downloadable run-locally models, but I forget which): one of my standard tests is to ask them to "write Tetris as a web app". It started off doing something a little bit wrong (a square grid), then gave up on that task entirely, switched from JavaScript to Python, and carried on by writing a script to train a new machine learning model (and people still ask how these things will "get out of the box" :P)
People who see more of the latter? I can empathise with them dismissing the whole thing as "just autocomplete on steroids".