training inspired by nanochat for diffusion models: https://github.com/ZHZisZZ/dllm
now someone needs to make it work with vLLM or something