Microsoft rewriting TypeScript tools in Go and getting a 10x speedup? It's wild that they would choose Go for that. And a surprising level of speedup.

https://devblogs.microsoft.com/typescript/typescript-native-...

In my experience, Go is one of the best LLM targets due to simplicity of the language (no complex reasoning in the type system or borrow checker), a high quality, unified, and language-integrated dependency ecosystem[1] for which source is available, and vast training data.

[1]: Specifically, the Go community was trained for the longest time not to make backward-incompatible API updates, which helps quite a bit with consistency of dependencies across time.

I have never understood why people want to use LLMs for programming outside of learning. I have written Perl, C, C#, Rust, and Ruby professionally and to this day I feel like they would slow me down.

I have used golang in the past and I was not, and am still not, a fan. But I recently had to break it out for a new project. LLMs actually make golang not a totally miserable experience to write, to the point I’m honestly astonished that people have found it pleasant to work with before they were available. There is so much boilerplate and unnecessary toil. And the LLMs thankfully can do most of that work for you, because most of the time you’re hand-crafting artisanal reimplementations of things that would be a single function call in every other language. An LLM can recognize that pattern before you’ve even finished the first line of it.

I’m not sure that speaks well of the language.

> I have never understood why people want to use LLMs for programming outside of learning

"I have never understood why people want to use C for programming outside of learning m. I have written PDP11, Motorola 6800, 8086 assembly professionally and to this day I feel like they would slow me down. I have used C in the past and I was not am still not a fan. But I recently had to break it out for a new project. Turbo C actually make C not a totally miserable experience to write, to the point I’m honestly astonished that people have found it pleasant to work with before they were available. There is so much boilerplate and unnecessary toil. And Turbo C with a macro library thankfully can do most of that work for you, because most of the time you’re hand-crafting artisanal reimplementations of things that would be a single function call in every other language. A macro can recognize that pattern before you’ve even finished the first line of it. I’m not sure that speaks well of the language."

They are enormously powerful tools. I cannot imagine LLMs not being one of the primary tools in a programmer's toolbox, well... for as long as coding exists.

Right now they are fancy autocompletes. That is enormously useful for a language where 90% of the typing is boilerplate in desperate need of autocompletion.

Most of the “interesting” logic I write is nowhere close to autocompleted successfully and most of it needs to be thrown out. If you’re spending most of your days writing glue that translates one set of JSON documents or HTTP requests into another I’m sure they’re wildly useful.

I don't know which models you are using, but in my experience they have been way more than fancy autocomplete today. I have had thousand-line programs written and refined with just a few prompts. On the analysis and code review side, they have been even more impressive, finding issues and potential impacts of changes and describing the intent behind the code. I implore you to revisit good models like Gemini 2.5 Pro. To wit, an actual Linux kernel vulnerability in the SMB protocol stack was discovered with an LLM a few days ago.

Even if we take the narrow use case of boilerplate glue code that transforms data from one place to another, that encompasses almost all programs people write, statistically. There was a running joke at Google: "we are just moving protobufs." I would not call this "fancy autocomplete."

It comes back to the nature of the work; I've got a hobby project which is basically an emulator of CP/M, a system from the 70s, and there is a bug in it.

My emulator runs BBC Basic, Zork, Turbo Pascal, etc, etc, but when it is used to run a vintage C compiler from the 80s it gives the wrong results.

Can an LLM help me identify the source of this bug? No. Can I say "fix it"? No. In the past I said "Write a test-case for this CP/M BDOS function, in the same style as the existing tests" and it said "Nope" and hallucinated functions in my codebase which it tried to call.

Basically if I use an LLM as an auto-completer it works slightly better than my Emacs setup already did, but anything more than that, for me, fails and worse still fails in a way that eats my time.

> Can an LLM help me identify the source of this bug? No. Can I say "fix it"? No. In the past I said "Write a test-case for this CP/M BDOS function, in the same style as the existing tests"

These are all things I've done successfully with ChatGPT o1 and o3 in a 7.5kloc Rust codebase.

I find the key is to include all information which may be necessary to solve the problem in the prompt. That simple.

I wrote a summary of my issue in a GitHub comment, and I guess I will try again

https://github.com/skx/cpmulator/issues/234#issuecomment-291...

But I'm not optimistic; all previous attempts at "identify the bug", "fix the bug", "highlight the area where the bug occurs" just turn into timesinks and failures.

It seems like your problem may be related to asking it to analyze the whole emulator _and_ compiler to find the bug. I'd recommend working first to pare the bug down to a minimal test case which triggers the issue - the LLM can help with this task - and then feed the LLM the minimal test case along with the emulator source and a description of the bug and any state you can exfiltrate from the system as it experiences the issue.

Indeed, running a vintage, closed-source binary under an emulator, it's hard to see what it is trying to do, short of decompiling it and understanding it. Then I can use that knowledge to improve the emulation until it successfully runs.

I suggested in my initial comment I'd had essentially zero success in using LLMs for these kind of tasks, and your initial reply was "I've done it, just give all the information in the prompt", and I guess here we are! LLMs clearly work for some people, and some tasks, but for these kind of issues I'd say we're not ready and my attempts just waste my time, and give me a poor impression of the state of the art.

Even "Looking at this project which areas of the CP/M 2.2 BIOS or BDOS implementations look sucpicious?", "Identify bugs in the current codebase?", "Improve test-coverage to 99% of the BIOS functionality" - prompts like these feel like they should cut the job in half, because they don't relate to running specific binaries also do nothing useful. Asking for test-coverage is an exercise in hallucination, and asking for omissions against the well-known CP/M "spec" results in noise. It's all rather disheartening.

> Indeed, running a vintage, closed-source binary under an emulator, it's hard to see what it is trying to do, short of decompiling it and understanding it.

Break it down. Tell the LLM you're having trouble figuring out what the compiler running under the emulator is doing to trigger the issue, tell it what you've done already, and ask for its help using a debugger and other tools to inspect the system. When I did this, o1 taught me some new LLDB tricks I'd never seen before. That helped me track down the cause of a particularly pernicious infinite recursion in the geometry processing code of a CAD kernel.

> Even "Looking at this project which areas of the CP/M 2.2 BIOS or BDOS implementations look sucpicious?", "Identify bugs in the current codebase?", "Improve test-coverage to 99% of the BIOS functionality" - prompts like these feel like they should cut the job in half, because they don't relate to running specific binaries also do nothing useful.

These prompts seem very vague. I always include a full copy of the codebase I'm working on in the prompt, along with a full copy of whatever references are needed, and rarely ask it questions as general as "find all the bugs". That is quite open ended and provides little context for it to work with. Asking it to "find all the buffer overflows" will yield better results. As it would with a human. The more specific you can get the better your results will be. It's also a good idea to ask the LLM to help you make better prompts for the LLM.

> Asking for test-coverage is an exercise in hallucination, and asking for omissions against the well-known CP/M "spec" results in noise.

In my experience hallucinations are a symptom of not including the necessary relevant information in the prompt. LLM memories, like human memories, are lossy and if you force it to recall something from memory you are much more likely to get a hallucination as a result. I have never experienced a hallucination from a reasoning model when prompted with a full codebase and all relevant references. It just reads the references and uses them.

It seems like you've chosen a particularly extreme example - a vintage, closed-source, binary under an emulator - didn't immediately succeed, and have written off the whole thing as a result.

A friend of mine only had an ancient compiled java app as a reference, he uploaded the binary right in the prompt, and the LLM one-shotted a rewrite in javascript that worked first time. Sometimes it just takes a little creativity and willingness to experiment.

7.5 kloc is pretty tiny, sounds like you may be able to get the entire thing into the context.

Lots of Rust libraries are relatively small since Cargo makes using many libraries in a single project relatively easy. I think that works in favor of both humans and LLMs. Treating the context window as an indication that splitting code up into smaller chunks might be a good idea is an interesting practice.

I generally have to maintain the code I write, often by myself; thousands of lines of uninspired slop code is the last thing I need in my life.

Friction is the birthplace of evolution.

Some people go camping now and then to hunt their own food, feel connected to nature, and feel that friction. They just don't want it every day. Just like they don't tend to generate the underlying uninspired assembly themselves. FWIW, if your premise is that the code they generate is necessarily unmaintainable compared to an average CS college graduate human baseline, I'd argue against that premise.

I've always found it fascinating how frequently I've seen the complaint about Go re: boilerplate and unnecessary toil, while in the same breath Rust is mentioned uncritically. I agree with the complaint about Go, but I have the same problem with Rust. LLMs have made Rust much more joyful for me to write, and I am sure much of this is obviously subjective.

I do like automating all the endless `Result<T, E>` plumbing, `?` operator chains, custom error enums, and `From` conversions. Manual trait impls for simple wrappers like `Deref`, `AsRef`, `Display`, etc. 90% of this is structural too, so it feels like busy work. You know exactly what to write, but the compiler can't/won’t do it for you. The LLM fills that gap pretty well a significant percentage of the time.

But to your original point, the LLM is very good at autocompleting this type of code zero-shot. I just don't think it speaks ill of Rust as a consequence.

This is akin to saying you prefer a horse to a car because you don't have to buy gas for a horse; it can eat for free, so why use a car?

The first cars were probably much less useful than horses. They didn’t go very far, gas pumping infrastructure wasn’t widely available, and you needed specialized knowledge to operate them.

Sure, they got better. But at the outset they were a pretty poor value proposition.

Well, it certainly makes error handling easy. No need to reason about complex global exception handlers and non-linear control structures. If you see an error, return it as a value and eventually it will bubble up. `if err != nil` is verbose, but it makes LLMs and type checkers happy.

I have never seen any AI system that could correctly explain the following Go code:

    package main

    func alwaysFalse() bool {
     return false
    }

    func main() {
     switch alwaysFalse() // don't format the code
     {
     case true:
      println("true")
     case false:
      println("false")
     }
    }

> Go community was trained for the longest time not to make backward-incompatible API updates so that helps quite a bit in consistency of dependencies across time

Not true for the Go 1.22 toolchain. When you use Go 1.21 (and earlier), 1.22, and 1.23+ toolchains to build the following Go code, the outputs are not consistent:

    //go:build go1.21

    package main

    import "fmt"

    func main() {
        for counter, n := 0, 2; n >= 0; n-- {
            defer func(v int) {
                fmt.Print("#", counter, ": ", v, "\n")
                counter++
            }(n)
        }
    }

You're bringing up exceptions rather than the rule. Sure, you can find things they mess up. The whole premise of a lot of the "AI" stuff is approximately solving hard problems rather than precisely solving easy ones.

The opposite is true: they sometimes guess correctly. Even a broken watch is right twice a day.

I believe future AI systems will be able to give the correct answer. The rule is clearly specified in the Go specification.

BTW, I haven't found an AI system that can get the correct output for the following Go code:

    package main

    import "fmt"

    func main() {
        for counter, n := 0, 2; n >= 0; n-- {
            defer func(v int) {
                fmt.Print("#", counter, ": ", v, "\n")
                counter++
            }(n)
        }
    }

What do you base that prediction on? Without a fundamental shift in the underlying technology, they will still just be guessing.

Because I am indeed seeing AI systems do better and better.

It can easily explain it with a little nudge.

Not sure why you feel smug about knowing such a small piece of trivia; ‘gofmt’ would rewrite it with a semicolon anyway.

I write code in Notepad++ and never format my code. :D

Go is a great target for LLM because it needs so much boilerplate and LLMs are good at generating that.

AFAIK the borrow checker is not strictly needed to compile Rust. I think one of the GCC Rust projects started with only a compiler and deferred adding borrow checking later.

The borrow checker does not change behavior, so any correct program will be fine without borrow checking. The job of borrow checking is to reject programs only.

mrustc also does not implement a borrow checker.

Not that much different than a type checker in any language (arguably it is the same thing).

I have been using various LLMs extensively with Rust. It's not just the borrow checker; the dependencies are ever-changing too. Go and Python seem to be the RISC of LLM targets. Comparatively, the most problematic thing about generated Go code is the requirement of using every imported package and declared variable.

[deleted]

> And a surprising level of speedup.

Not surprising at all; I keep pointing out that the language benchmarking game is rarely, if at all, reflective of real-world usage.

Any time you point out how slow JS is someone always jumps up with a link to some benchmark showing that it is only 2x slower than Go (or Java, or whatever).

The benchmarks game, especially for GC'ed languages, is not at all indicative of real-world usage of the language. Real-world usage (i.e. idiomatic usage) of language $FOO is substantially different from the code written for the benchmarks game.

Perhaps "real-world usage" is "… rarely, if at all, reflective of [other] real-world usage …".

Perhaps when you write "idiomatic usage" you mean un-optimized.

It doesn't surprise me at all.

Idiomatic Go leans on value types and simple loops and conditionals, gives you just enough tools to avoid unnecessary allocations, doesn't default to passing around pointers for everything, and gives you more control over memory layout.

JS runtimes have to do a lot of work in order to spit out efficient code. It also requires more specialized knowledge from programmers to write fast JS.

I think esbuild and hugo are two programs that showcase this pretty well. Esbuild specifically made a splash in the JS world.

A tooling team selects language widely used in tooling circles - wild, shocking.

Here's the FAQ, where they explain the decision to go with Go and not, say, Rust.

https://github.com/microsoft/typescript-go/discussions/categ...

Hejlsberg also says in this video that about 3.3x of the performance is from going native and the other 2-3x is from using multithreading. https://www.youtube.com/watch?v=pNlq-EVld70&t=51s

My surprise is that TypeScript is so slow. I have never used it, and now I think I never will.

At the risk of feeling silly for not knowing this ... why is TypeScript considered a programming language, and how can you make it "faster"?

I have used it since it came out, so I do know what it is, but I have people ask if they should write their new program in TypeScript, thinking it is something they can write in and then run.

My usage of it is limited to JavaScript, so I see it as adding a static typing layer to JavaScript, so that development is easier and more logical, and this typing information is stripped out when transpiled, resulting in pure JavaScript which is all the browser understands.

The industry calls it a programming language, so I do too just because this is not some semantic battle I want to get into. But in my mind it's not.

There's probably a word for what it is, I just can't think of it.

Type system?

And I don't understand a "10x speedup" on TypeScript, because it doesn't execute.

I can understand language services for things like VS Code that handle the TypeScript types getting 10x faster, but not TypeScript itself. I assume that is what they are talking about in most cases. But if this logic isn't right, let me know.

The "10x speedup" is for the compilation step from TS to JS, eg how much faster the new Typescript compiler is, not the runtime performance of the JS output.

Theoretically(!) using TS over JS may indirectly result in slightly better perf, though, because it nudges you towards not mutating the shape of runtime objects (which may cause the JS engine to re-JIT your code). The extra compilation step might also allow for some source-level optimizations, like constant folding. But I think TS doesn't do this (other JS minifiers/optimizers do, though).

I suspect the particular use-case of parsing/compiling is pathologically bad for JavaScript runtimes. That said, they are still leaps faster than reference Python and Ruby interpreters.

[deleted]

Depends what you mean by slow. The Typescript code was 3x slower than the Go code, and a 3x overhead is pretty much the best you can do for a dynamically typed language.

Languages like Python and Ruby are much much slower than that (CPython is easily 10x slower than V8) and people don't seem to care.

Technically Typescript can't really be slow, since it's just a preprocessor for Javascript, and the speed of its programs will depend on which Javascript implementation you use.

Typescript's sweet spot is making existing Javascript codebases more manageable.

It can also be fine in I/O-heavy scenarios where Javascript's async model allows it to perform well despite not having the raw execution performance of lower-level languages.

I thought that (for example) deno executed typescript natively?

It executes TypeScript without you compiling it to JavaScript first; it doesn't make code execution any faster.