Thanks!

To be clear, I use AI for editing all the time. Actually, diagrams are nice.

Just some pieces like this look like copy-paste (I mean, empty lines before, code gets no special typography, etc.):

  If we write the boundary information for a packed batch as:
  
  B = { lengths, cu_seqlens, max_seqlen, mask structure }
  
  then every transformer layer in that forward pass consumes the same B.
  
  If the model has L layers, rebuilding or re-synchronizing on B once per layer is not new work. It is the same information being reconstructed again and again.
  
  In other words, the useful work is:
  
  build B once, use it L times.
  
  The wasteful version is:
  
  build B + build B + ⋯ + build B (L times)
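
For what it's worth, the point of that passage fits in a few lines of code. This is only a minimal sketch under my own assumptions: `Boundaries`, `build_boundaries`, `forward_wasteful` and `forward_hoisted` are hypothetical names, not anything from the article, the mask structure is left out, and the layers themselves are stubbed out with comments.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass(frozen=True)
class Boundaries:
    """Hypothetical stand-in for B (mask structure omitted for brevity)."""
    lengths: tuple
    cu_seqlens: tuple  # cumulative sequence offsets, starting at 0
    max_seqlen: int


def build_boundaries(lengths):
    # Depends only on the packed batch, not on which layer is running.
    return Boundaries(
        lengths=tuple(lengths),
        cu_seqlens=(0, *accumulate(lengths)),
        max_seqlen=max(lengths),
    )


def forward_wasteful(lengths, num_layers):
    builds = 0
    b = None
    for _ in range(num_layers):
        b = build_boundaries(lengths)  # same result on every iteration
        builds += 1
        # a real layer would consume b here
    return b, builds


def forward_hoisted(lengths, num_layers):
    b = build_boundaries(lengths)  # build B once ...
    builds = 1
    for _ in range(num_layers):
        pass  # ... and let every layer reuse the same b
    return b, builds
```

Both versions hand every layer an identical `Boundaries` value; the only difference is how many times it gets built.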

> Actually, diagrams are nice.

I especially use AI to generate code for things like Mermaid[0]. It's just easier to describe the flow I want to outline than to remember all the nuances of Mermaid or similar code -> graph / diagram tooling. The output still looks nice too.

[0]: https://mermaid.js.org/