> What's the innovation here?
> Having a distribution to pick from?

As I understand it, it's exactly this. Specifically: each neuron in the network is represented by a probability distribution over logic gates, the parameters of that distribution are learned with gradient descent (using real-valued relaxations of the gates so everything stays differentiable), and afterwards the distribution is collapsed to the most likely logic gate for each neuron. The author has a few more details in their thesis:

https://arxiv.org/abs/2209.00616
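
For intuition, here is a minimal sketch of that idea in plain PyTorch. It is not the author's difflogic library, just an illustration under the assumptions above: a neuron holds a learnable softmax over the 16 two-input gates, its training-time output is the expectation over relaxed gates, and at inference the distribution is hardened to its argmax gate.

```python
# Minimal sketch (not the author's difflogic library) of a differentiable
# logic gate neuron: a learnable distribution over the 16 two-input gates.
import torch
import torch.nn as nn

# Real-valued relaxations of all 16 two-input Boolean functions,
# assuming inputs a, b are probabilities in [0, 1].
GATES = [
    lambda a, b: torch.zeros_like(a),      # FALSE
    lambda a, b: a * b,                    # AND
    lambda a, b: a - a * b,                # a AND NOT b
    lambda a, b: a,                        # a
    lambda a, b: b - a * b,                # NOT a AND b
    lambda a, b: b,                        # b
    lambda a, b: a + b - 2 * a * b,        # XOR
    lambda a, b: a + b - a * b,            # OR
    lambda a, b: 1 - (a + b - a * b),      # NOR
    lambda a, b: 1 - (a + b - 2 * a * b),  # XNOR
    lambda a, b: 1 - b,                    # NOT b
    lambda a, b: 1 - b + a * b,            # a OR NOT b
    lambda a, b: 1 - a,                    # NOT a
    lambda a, b: 1 - a + a * b,            # NOT a OR b
    lambda a, b: 1 - a * b,                # NAND
    lambda a, b: torch.ones_like(a),       # TRUE
]

class SoftLogicGate(nn.Module):
    """One neuron: a learnable softmax distribution over the 16 gates."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(16))  # one logit per gate

    def forward(self, a, b, hard=False):
        if hard:
            # Inference: collapse the distribution to its most likely gate.
            return GATES[int(self.logits.argmax())](a, b)
        # Training: output the expectation over all gates, so the gate
        # choice stays differentiable and gradient descent can shift
        # probability mass toward the best-performing gate.
        probs = torch.softmax(self.logits, dim=0)
        outs = torch.stack([g(a, b) for g in GATES], dim=0)
        return (probs.view(-1, *([1] * a.dim())) * outs).sum(dim=0)

# Toy usage: let gradient descent discover that this neuron should be XOR.
if __name__ == "__main__":
    gate = SoftLogicGate()
    opt = torch.optim.Adam(gate.parameters(), lr=0.1)
    a = torch.tensor([0., 0., 1., 1.])
    b = torch.tensor([0., 1., 0., 1.])
    target = torch.tensor([0., 1., 1., 0.])  # XOR truth table
    for _ in range(200):
        loss = ((gate(a, b) - target) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(int(gate.logits.argmax()))  # most likely gate index; 6 is XOR above
```

After collapsing, the whole network is just a wired-up set of Boolean gates, which is where the efficiency claims come from.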

Specifically, it's the training approach that's patented. I'm glad to see that people are trying to improve on the method, so the patent will likely become irrelevant as better approaches emerge.

The author also published an approach for applying their idea to convolutional kernels in CNNs:

https://arxiv.org/abs/2411.04732
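
Since the code isn't available (see below), here is a toy, self-contained sketch of the core idea as I understand it, not the code promised in the paper: instead of a dot product, a kernel is a learnable soft logic gate applied to pairs of neighbouring pixel values. It is simplified to 1-D inputs and four gates for brevity; the papers use all 16 two-input gates and trees of gates over each receptive field.

```python
# Toy sketch of a "logic gate convolution": slide one learnable soft gate
# over horizontally adjacent pixel pairs (stride 1). Illustration only.
import torch
import torch.nn as nn

# Four relaxed gates are enough to show the mechanism.
GATES = [
    lambda a, b: a * b,              # AND
    lambda a, b: a + b - a * b,      # OR
    lambda a, b: a + b - 2 * a * b,  # XOR
    lambda a, b: 1 - a * b,          # NAND
]

class LogicConv1d(nn.Module):
    """One kernel = one learnable distribution over gates, shared across positions."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(len(GATES)))

    def forward(self, x):              # x: (batch, width), values in [0, 1]
        a, b = x[:, :-1], x[:, 1:]     # left / right neighbour of each pair
        probs = torch.softmax(self.logits, dim=0)
        outs = torch.stack([g(a, b) for g in GATES], dim=0)  # (gates, batch, width-1)
        return (probs.view(-1, 1, 1) * outs).sum(dim=0)

x = torch.rand(2, 8)            # a batch of two 8-pixel "images"
print(LogicConv1d()(x).shape)   # torch.Size([2, 7])
```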

In the paper they promise to update their difflogic library with the resulting code, but they appear to have conveniently forgotten to do so.

I also think their patent is too broad, but I guess it speaks well of the ML community that we haven't seen more patents in this area. I could also imagine that, given the very impressive performance improvements the approach promises, they're somewhat afraid it will end up in embedded military applications.

Liquid NNs are also able to generate decision trees.