Just using a scaled up and cleverly tweaked version of linear regression analysis...

That is, the probability distribution that the network should learn is defined by which probability distribution the network has learned. Brilliant!