I don't really understand the purpose of hyping up a launch announcement and then not making any effort whatsoever to make the progress comprehensible to anyone without advanced expertise in the field.

That's the intention. Fill it up with enough jargon and gobbledegook that it looks impressive to investors, while hiding the fact that there's no real technology underneath.

You not comprehending a technology does not automatically make it vaporware.

>jargon and gobbledegook

>no real technology underneath

They're literally shipping real hardware. They also put out a paper and posted their code.

Flippant insults will not cut it.

Nice try. It's smoke and mirrors. Tell me one thing it does better than a 20-year-old CPU.

This hardware is an analog simulator for Gibbs sampling, an idealized physical process that describes random systems with large-scale structure. The energy-efficiency gains come from the fact that it's analog. It may sound like jargon, but Gibbs sampling is an extremely well-known concept, with decades of work connecting it to many areas of statistics, probability theory, and machine learning. The algorithmic problem they need to solve is how to harness Gibbs sampling for large scale ML tasks, but arguably this isn't really a huge leap: it's very similar to EBM learning/sampling, with the advantage of being able to sample larger systems for the same energy.
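For anyone who hasn't run into it, here's roughly what "Gibbs sampling" means in practice: a minimal Python sketch of a Gibbs sampler for a toy 1D Ising model. Everything here is illustrative textbook material, not Extropic's actual implementation; the point is just that you repeatedly resample each variable from its conditional distribution given its neighbors.

```python
import math
import random

def gibbs_ising(n=8, beta=1.0, steps=2000, seed=0):
    """Toy Gibbs sampler for a 1D Ising chain: each sweep resamples
    every spin from its conditional distribution given its neighbors."""
    rng = random.Random(seed)
    spins = [rng.choice([-1, 1]) for _ in range(n)]
    for _ in range(steps):
        for i in range(n):
            # Local field from the nearest neighbors (free boundaries)
            h = (spins[i - 1] if i > 0 else 0) + (spins[i + 1] if i < n - 1 else 0)
            # Conditional probability that spin i is +1, given its neighbors
            p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * h))
            spins[i] = 1 if rng.random() < p_up else -1
    return spins

sample = gibbs_ising()
print(sample)  # one approximate sample from the chain's Boltzmann distribution
```

The pitch, as I understand it, is that an analog device can do the inner loop physically instead of burning digital FLOPs on it.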

> The algorithmic problem they need to solve is how to harness Gibbs sampling for large scale ML tasks, but arguably this isn't really a huge leap,

Is it?

The paper is pretty dense, but Figure 1 is Fashion-MNIST, which is "28x28 grayscale images" and does not seem very real-life to me. Can they work on bigger data? I assume not yet, otherwise they'd have put something more impressive in Figure 1.

In the same way, it is totally unclear what kind of energy they are talking about in absolute terms. If you say "we've saved 0.1J on training jobs", that is simply not impressive enough. And how much overhead is there? Amdahl's law is a thing: if you super-optimize a step that takes 1% of the time, the overall improvement will be negligible even if the savings for that step are enormous.
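To make the Amdahl's law point concrete, a quick back-of-the-envelope sketch (the numbers are hypothetical, picked to match the 1% example above):

```python
def amdahl_speedup(fraction_accelerated, local_speedup):
    """Overall speedup when only `fraction_accelerated` of the runtime
    benefits from a `local_speedup`-times faster implementation."""
    return 1.0 / ((1.0 - fraction_accelerated) + fraction_accelerated / local_speedup)

# Super-optimize a step that is 1% of total runtime, even by a factor of a billion:
print(amdahl_speedup(0.01, 1e9))  # ~1.0101, i.e. about a 1% overall gain
```

So unless random-number generation (or sampling) is a large fraction of the total workload, the headline efficiency numbers don't translate into end-to-end wins.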

I've written a few CS papers myself back in the day, and the general idea was always to put the best results up front. So they are either bad communicators, or they don't highlight answers to my questions because they don't have many impressive results (yet?). Their website is nifty, so I suspect the latter.

More insults and a blanket refusal to engage with the material. Ok.

If you think comparing hardware performance is an insult, then you have some emotional issues or are a troll.

Ah, more insults. This will be my final reply to you.

I'll say it again. The hardware exists. The paper and code are there. If someone wants to insist that it's fake or whatever, they need to come up with something better than permutations of "u r stoopid" (your response to their paper: https://news.ycombinator.com/item?id=45753471). Just engage with the actual material. If there's a solid criticism, I'd like to hear it too.

I've noticed recently that HN is resembling slashdot more. I wonder what's causing it.

The fact that there's real hardware and a paper doesn't mean the product is actually worth anything. It's very possible to make something (especially an extremely simplified 'proof of concept' that is not actually useful at all) and massively oversell it. Looking at the paper, it may have some very niche applications, but it's really not obvious those would justify the investment needed to make it better than existing general-purpose hardware, and the amount of effort put into 'sizzle' aimed at investors makes it look disingenuous.

>The fact that there's real hardware and a paper doesn't mean the product is actually worth anything.

I said you can't dismiss someone's hardware + paper + code solely based on insults. That's what I said. That was my argument. Speaking of which:

>disingenuous

>sizzle

>oversell

>dubious niche value

>window dressing

>suspicious

For the life of me I can't understand how any of this is an appropriate response when the other guy is showing you math and circuits.

No, they're not showing just math and circuits; they're also showing a very splashy, snazzy front page that makes all kinds of vague, exciting-sounding claims which aren't really backed up by the very boring (though sometimes useful) application of that math and those circuits (neat as the design of those circuits may be).

If this was just the paper, I'd say 'cool area of research, dunno if it'll find application though'. I'm criticizing the business case and the messaging around it, not the implementation.

Two important questions I think illustrate my point:

1) The paper shows an FPGA implementation which has a 10x speedup compared to a CPU or GPU implementation. Extropic's first customer would have leapt up and started trying to use the FPGA version immediately. Has anyone done this?

2) The paper shows the projected real implementation being ~10x faster than the FPGA version. This is similar to the speedup going from an FPGA to an ASIC implementation of a digital circuit, which is a standard process which requires some notable up-front cost but much less than developing and debugging custom analog chips. Why not go this route, at least initially?

The fact that they show a comparison with an FPGA is a red flag, because large scale generative AI is their biggest weakness.

FPGAs are superior in every respect for models of up to a few megabytes in size, and scale all the way down to zero. If they were going for generative AI, they wouldn't even have bothered with FPGAs, because only the highest-end FPGAs with HBM are even viable, and even those come with dedicated AI accelerators.

One thing seems pretty clear from the papers and technical information: the product is not really aimed at the approach used by current mainstream AI models in the first place. There, random numbers are far from the bottleneck, and sampling from a distribution is generally done either by picking a random starting point and having a neural net move towards the 'closest' point, or by having a neural net spit out a simplified distribution for part of the result and picking randomly from that. In this approach the neural net computation is completely deterministic and takes the bulk of the compute time.
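To illustrate that second pattern, here's a sketch of the mainstream pipeline in Python (shapes and the vocabulary size are made up; the matmul stands in for an entire network). The deterministic part dominates compute; the random draw at the end is trivially cheap:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the deterministic, compute-heavy part: a matrix multiply.
# (Shapes are illustrative; a real model is many orders of magnitude larger.)
hidden = rng.standard_normal(256)
weights = rng.standard_normal((256, 1000))  # hypothetical 1000-token vocabulary

logits = hidden @ weights                   # deterministic; dominates compute
probs = np.exp(logits - logits.max())
probs = probs / probs.sum()                 # softmax over the vocabulary
token = rng.choice(probs.size, p=probs)     # the only random step: one cheap draw
```

Making that last line more energy-efficient does nothing for the matmul above it, which is where the power goes.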

The stuff they talk about in the paper is mainly about things that were in vogue back when AI was called Machine Learning: constructing and sampling from very complicated distributions to represent your problem in a Bayesian way, i.e. to calculate 'what's the most probable answer given this problem'. In that approach it's often useful to have a relatively small 'model' but to be able to feed random numbers predicated on it back into itself, so you can sample from a distribution that would otherwise be essentially intractable. This kind of thing was very successful for some tasks, but AFAIK those tasks would generally be considered quite small today (and I don't know how many of them have since been taken over by neural nets anyway).

This is why I say it looks very niche, and it feels like the website tries to ride the AI hype train by association with the term.

I don't know about your earlier point, but those questions are perfectly reasonable and a springboard for further discussion. Yes, that's where the rubber hits the road. That's the way to go.

If Extropic (or any similar competitor) can unlock these hypothetical gains, I'd like to see it sorted out asap.

If they could answer those questions _before_ making a fancy website with claims like "our breakthrough AI algorithms and hardware, which can run generative AI workloads using radically less energy than deep learning algorithms running on GPUs", they would be much better received. But they jumped to bombastic claims right away, so now they give off that scammy feeling. Hence the comments here.

"no really technology underneath" zzzzzzzzzzz

What's not comprehensible?

It's just miniaturized lava lamps.

A lava lamp that just produces raw randomness, e.g. for cryptography purposes, is different from the benefit here, which is producing specific randomness (samples from a chosen distribution) at low energy cost.