These may also be of interest on the history of ARM, up to 1997:

https://thechipletter.substack.com/p/the-arm-story-part-1-fr...

https://thechipletter.substack.com/p/the-arm-story-part-2-ar...

https://thechipletter.substack.com/p/the-arm-story-part-3-cr...

This interview with ARM's first CEO Robin Saxby is also really entertaining and informative. His energy is infectious:

https://www.youtube.com/watch?v=FO5PsAY5aaI&t=1823s

I think there's a case that Saxby himself was more responsible for ARM's success than the technical merits their early designs may or may not have had.

What those early Archimedes systems demonstrated was that the whole thing did actually work, but their design was not what the market needed. Saxby was the right guy to lead them, and his energy just seems like something else.

Sophie (a major computing hero for many in the UK in my era) went on to do esoteric VLIW work at Broadcom for ADSL iirc.

Agree 100%. I think that without Saxby there would be no Arm as we know it today.

IIRC Steve Furber and colleagues considered the licensing model and decided it would never work. Saxby made it work. All credit to him too for standing down before he overstayed his welcome and for keeping out of the limelight since.

Yes, re Sophie it was Firepath I think as per this presentation (2014).

https://old.hotchips.org/wp-content/uploads/hc_archives/hc14...

BTW, there are also other interesting, low-power RISC architectures that are used in millions of devices, but most people have never heard of them. For example:

  * SuperH [0], 32-bit only, now basically dead, but microcontrollers are still available
  * AVR32 [1], 32-bit only, also quite dead
  * ARC [2], 32/64-bit, still quite popular in automotive
[0]: https://en.wikipedia.org/wiki/SuperH

[1]: https://en.wikipedia.org/wiki/AVR32

[2]: <https://en.wikipedia.org/wiki/ARC_(processor)>

IIRC Arm licensed the 'short instruction' encoding from Hitachi, where it had been used in SuperH, to create Thumb, which was key in getting Arm into phones.

> The ARM chip was also designed to run at very low power. Wilson explained that this was entirely a cost-saving measure—the team wanted to use a plastic case for the chip instead of a ceramic one, so they set a maximum target of 1 watt of power usage.

With my (limited) understanding of how ARM conquered the market, I guess this turned out to be a very consequential cost-saving measure.

I love the Acorn story and British computing in general. BBC made a good movie on the subject called Micro Men. Definitely worth a watch despite some inaccuracies. Note that Sophie Wilson was previously known as Roger Wilson and has a cameo at the end of the film.

Nice article. Just a couple of comments:

Don't x86 chips also use microcode? There are several differences between RISC and CISC not mentioned here.

(also, Sophie was called Roger at this point in time, so the article has been retconned)

Yes, the 80286 used microcode and so instructions took several cycles. That's what the article was gesturing at when it said "But another reason was that more complex instructions took longer for a chip to execute. By keeping them simple, you could make every instruction execute in a single clock cycle."

EDIT: Though later, in the Pentium era, x86 started to do simple instructions like `ADD AX, [BX]` without microcode.

Eh, that's just an internal design decision. The ARM1 used microcode as well.

ARM1 had multicycle instructions (LDM and STM) and the ARM2 added more (MUL and MLA) but as far as I know these were controlled by hardwired finite state machines, not microcode.

https://www.righto.com/2016/02/reverse-engineering-arm1-proc...

Thanks for the link. As Ken points out at the end, "Probably the biggest objection to calling the ARM1 microcoded is that the designers of the ARM chip didn't consider it that way.[4] Furber mentions that some commercial RISC processors use microcode, but doesn't apply that term to the ARM1". My opinion is the same as Steve Furber's, though I can certainly see Ken Shirriff's viewpoint.

In theory PLAs and ROMs are fully equivalent. In practice, while the ROM can accept any possible "microcode", a PLA might have to be enlarged if you want to change some of the "micro instruction". This need to change the hardware to change the functionality of an instruction is what makes me consider this design hardwired instead of microcoded.
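
To make that concrete, here is a loose sketch in Python (made-up 3-bit addresses and 4-bit control words, nothing to do with the ARM1's actual layout) of the theoretical equivalence and the practical difference:

    # Loose sketch only: made-up 3-bit addresses and 4-bit control words.
    # A ROM fully decodes its address, so each input pattern selects exactly
    # one row, and any row's contents can be rewritten freely.
    MICROCODE_ROM = {
        0b000: 0b1010,
        0b001: 0b0110,
        0b010: 0b0011,
        # ... one entry per possible address
    }

    def rom_lookup(addr):
        return MICROCODE_ROM[addr]

    # A PLA instead matches product terms (AND plane), possibly with
    # don't-cares, and ORs together the outputs of every matching term
    # (OR plane). Making one instruction behave differently can require
    # adding a product term, i.e. enlarging the array, which is why it
    # feels more "hardwired".
    PLA_TERMS = [
        # (mask, value, output): a term fires when (addr & mask) == value
        (0b110, 0b000, 0b1000),  # covers addresses 000 and 001
        (0b001, 0b001, 0b0010),  # covers any odd address
    ]

    def pla_lookup(addr):
        out = 0
        for mask, value, output in PLA_TERMS:
            if (addr & mask) == value:
                out |= output  # several terms can contribute to one output
        return out

    print(bin(rom_lookup(0b001)), bin(pla_lookup(0b001)))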

[EDIT] Another issue is that the ARM1 has three pipeline stages. The "microcode" here is not used for the fetch and decode stages, only the execute one. So though register to register operations take 3 clock cycles to execute, only one "micro instruction" is needed (the second line in the table).

> Thanks for the link. As Ken points out at the end, "Probably the biggest objection to calling the ARM1 microcoded is that the designers of the ARM chip didn't consider it that way.[4] Furber mentions that some commercial RISC processors use microcode, but doesn't apply that term to the ARM1". My opinion is the same as Steve Furber's, though I can certainly see Ken Shirriff's viewpoint.

So, reading the citation fully, it seems that Furber doesn't really dive deep into the ARM1, instead saving the deep dive for the ARM2, plus an additional chapter about the changes then made for the ARM3. I think kens might be steelmanning the position.

> In theory PLAs and ROMs are fully equivalent. In practice, while the ROM can accept any possible "microcode", a PLA might have to be enlarged if you want to change some of the "micro instruction". This need to change the hardware to change the functionality of an instruction is what makes me consider this design hardwired instead of microcoded.

Traditionally, the difference would be defined as whether or not it's structured as an address being decoded to create a one-hot enable signal for a single row of the array at a time. When you take the FSM signals as a sub-address, and the init, interrupt, and decoded instruction bits as a segment address, this is what you see here. And that matches the structure seen in traditional CISC microcode ROMs.

Additionally, there is extra space on the die for an additional few rows.
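
As a loose illustration of that segment-plus-sub-address framing (made-up field widths and control bits, not the ARM1's real control store):

    # Made-up field widths, purely to illustrate the addressing structure
    # described above, not the ARM1's actual control store contents.
    def control_store_address(decoded_instr_bits, fsm_state):
        # Concatenate the "segment" (decoded instruction class / init /
        # interrupt bits) with the FSM cycle counter as a "sub-address",
        # so each combination enables exactly one row of the array.
        return (decoded_instr_bits << 2) | fsm_state  # 2-bit sub-address

    # One row of control signals per (instruction class, cycle) pair,
    # which is the same shape as a traditional CISC microcode ROM.
    CONTROL_STORE = {
        control_store_address(0b0001, 0): 0b10110,  # e.g. a multi-cycle op, cycle 0
        control_store_address(0b0001, 1): 0b01101,  # same op, cycle 1
        control_store_address(0b0010, 0): 0b00111,  # a single-cycle op
    }

    print(bin(CONTROL_STORE[control_store_address(0b0001, 1)]))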

> [EDIT] Another issue is that the ARM1 has three pipeline stages. The "microcode" here is not used for the fetch and decode stages, only the execute one. So though register to register operations take 3 clock cycles to execute, only one "micro instruction" is needed (the second line in the table).

The pipelining doesn't really matter here. The 486, for instance, more or less completed one (simple) instruction per clock, but had a rather deep pipeline for the time, so those instructions still had several cycles of latency. Those simple instructions were also a single micro-op each, despite being processed in several pipeline stages. And the micro decode was not the first stage of the pipeline either: the 486 had fetch, decode 1, decode 2, execute, and writeback stages, but didn't start emitting micro-instructions except as the output of the third stage.

I felt this article didn't really explain why a RISC chip, which needs more instructions to do the same work, could be as fast as a CISC chip that needs fewer.

I think the actual explanation is that the CISC ops get decoded into more or less the same kinds of simple, RISC-like ops internally, but that requires more physical hardware to do the decode, correct?

The tradeoff here being lower memory usage for instructions, but more silicon and transistors needed for the decode hardware.
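
Something like this toy sketch of what I mean (a made-up micro-op format, not any real x86 decoder), using the `ADD AX, [BX]` example from upthread:

    # Toy sketch: a made-up micro-op format, not any real x86 decoder.
    # The complex memory-operand instruction is cracked into simpler,
    # RISC-like steps; the register-only form is already one micro-op.
    def decode(instruction):
        if instruction == "ADD AX, [BX]":
            return [
                ("LOAD", "tmp", "[BX]"),  # read memory at the address in BX
                ("ADD",  "AX",  "tmp"),   # plain register-to-register add
            ]
        if instruction == "ADD AX, CX":
            return [("ADD", "AX", "CX")]  # already simple: one micro-op
        raise NotImplementedError(instruction)

    for insn in ("ADD AX, [BX]", "ADD AX, CX"):
        print(insn, "->", decode(insn))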

The link to the IBM RISC paper didn't seem to work, and I'm not sure if this is the paper they're linking to since it is from 1990, but I found one of IBM's papers on RISC: https://sci-hub.se/10.1147/rd.341.0004

What I didn't realise is that RISC existed as an initiative by IBM prior to Dave Patterson's research and coining of the term.

The joke goes something like: "I programmed RISC in assembly when you were still in kindergarten" haha

I remember the quote but not the source :(

For anyone interested in this era: I have a couple of sets of RISC OS 2 ROM chips from 1988 sitting on my desk. I don't know if they still work. If there's a good home in the UK, I'd be happy to post them.

The Centre for Computing History may be interested

Thanks, I'll get in touch. But I imagine that there are lots of them floating around due to OS upgrades.

> In fact, one of the first test boards the team plugged the ARM into had a broken connection and was not attached to any power at all. It was a big surprise when they found the fault because the CPU had been working the whole time. It had turned on just from electrical leakage coming from the support chips.

So this is not an urban legend after all, and it's about the first-ever ARM CPU! Very cool story indeed

There's even an interview with Steve Furber, who co-designed it, where he talks about it. https://www.youtube.com/watch?v=1jOJl8gRPyQ&t=508s

(2022)

Now they are trying to extract every penny from their licensees and are competing with their customers.