So: distribute a full copy of the model in RAM to each of several machines, make each machine responsible for updating a different part of the model weights, and sync those updates over the network
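A minimal single-process sketch of that scheme, with plain in-memory copies standing in for the network transfer (worker count, shard assignment, and the fake gradient are all illustrative assumptions, not a real implementation):

```python
import random

# every worker holds a full replica of a toy model (a flat weight vector);
# each worker "owns" one contiguous shard of the parameters
N_WORKERS = 4
N_PARAMS = 8
LR = 0.1

replicas = [[0.0] * N_PARAMS for _ in range(N_WORKERS)]
shard = N_PARAMS // N_WORKERS
owned = {w: range(w * shard, (w + 1) * shard) for w in range(N_WORKERS)}

def fake_grad(replica, rng):
    # stand-in for a real backward pass on that worker's local data
    return [rng.uniform(-1, 1) for _ in replica]

rng = random.Random(0)
for step in range(3):
    # 1) each worker updates only the shard it owns
    for w, replica in enumerate(replicas):
        g = fake_grad(replica, rng)
        for i in owned[w]:
            replica[i] -= LR * g[i]
    # 2) "network" sync: every worker pulls each shard from its owner
    #    (the owner's own shard is untouched by the copy, so order is safe)
    for replica in replicas:
        for owner, idxs in owned.items():
            for i in idxs:
                replica[i] = replicas[owner][i]

# after the sync step, all replicas agree again
assert all(r == replicas[0] for r in replicas)
```

Real systems replace the copy loop with collective ops (all-gather/all-reduce) and batch the shard exchange to amortize network latency, which is the expensive part when the machines are geographically spread out.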

decentralized training makes a lot more sense when the required hardware isn't a $40K GPU...