Wow, awesome project! I took a quick look at the code (before realizing this was made with coding agents, as mentioned further down on the website) and it seems really well made.

Still, I would really like to know how you approached this from an architecture perspective. I'm also curious how much the coding agents structured the code this way by themselves, versus how heavily you had to steer them (I've only briefly tried Gemini Flash from the Antigravity CLI, so I'm a bit behind on this).

Also, how did you approach matching the rendering to actual TikZ output? Do you have tests that render the TikZ with both LaTeX and your JS pipeline and compare the results for differences?

The first part I implemented was the basic parser -> SVG renderer (restricted to the simplest TikZ constructs) and then put in a basic drag-and-drop interface to validate whether the architecture was promising. Code structure was decided pretty much entirely by Codex -- it asks my opinion with multiple choice questions during plan mode, which I like. I tend to alternate between feature expansion and code quality passes (e.g. making sure no files are too big, folder structure makes sense, test coverage is good, etc).

Indeed, I have scripts for compiling a given TikZ figure using LaTeX (in particular dvisvgm, so I get an SVG instead of a PDF) as well as with my JS-based renderer. I run those scripts on various corpora, mostly particular pages from the TikZ manual (see https://tikz.dev), but there are also a few books about TikZ that have downloadable zips of all the examples they use. I then inspect the correspondence between the two renderers by eye and give Codex a list of which figures are wrong and why, and it then goes and fixes the underlying issues.
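
For anyone who wants to try the LaTeX half of that comparison, here is a rough sketch of what such a script could look like. It is not the actual script from the project: the wrapper preamble, the file name `figure.tex`, and the helper names `wrap_figure`, `build_commands`, and `compile_to_svg` are all made up for illustration, and it assumes `latex` and `dvisvgm` are on your PATH.

```python
import subprocess
from pathlib import Path

# Hypothetical wrapper document; the real scripts may use a different
# preamble (extra TikZ libraries, an explicit dvisvgm driver option, etc.).
HEADER = "\\documentclass[tikz]{standalone}\n\\begin{document}\n"
FOOTER = "\n\\end{document}\n"


def wrap_figure(tikz_source: str) -> str:
    """Wrap a bare tikzpicture in a minimal standalone document."""
    return HEADER + tikz_source.strip() + FOOTER


def build_commands(tex_path: Path) -> list[list[str]]:
    """Return the two commands to run: latex (TeX -> DVI), then dvisvgm (DVI -> SVG)."""
    dvi_path = tex_path.with_suffix(".dvi")
    svg_path = tex_path.with_suffix(".svg")
    return [
        ["latex", "-interaction=nonstopmode", "-halt-on-error",
         f"-output-directory={tex_path.parent}", str(tex_path)],
        # --no-fonts emits glyphs as paths so the SVG does not depend on fonts.
        ["dvisvgm", "--no-fonts", f"--output={svg_path}", str(dvi_path)],
    ]


def compile_to_svg(tikz_source: str, workdir: Path) -> Path:
    """Write the wrapped figure into workdir, compile it, and return the SVG path."""
    tex_path = workdir / "figure.tex"
    tex_path.write_text(wrap_figure(tikz_source), encoding="utf-8")
    for cmd in build_commands(tex_path):
        # check=True raises CalledProcessError if either tool fails.
        subprocess.run(cmd, check=True, cwd=workdir, capture_output=True)
    return tex_path.with_suffix(".svg")
```

You would call `compile_to_svg(source, some_directory)` once per figure in a corpus and put the resulting SVG next to the output of the JS renderer for side-by-side inspection.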

You'd think that finding discrepancies between the renderers could be done automatically, but it hasn't worked well in my experience. The models are multimodal but still kinda blind; they think two pictures are the same even if they are very much not the same. But once you tell them what's wrong, they're then pretty good at iterating until it is fixed. (One could also try to do a pixel diff of rasterized images, but that's super noisy, and text rendering isn't going to be pixel perfect anyway.)
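
To make the pixel-diff noise concrete, here is a toy sketch, not something from the project. It assumes both renders have already been rasterized to same-sized grayscale grids (lists of rows with values 0 to 255); the function name `diff_fraction` and the `tolerance` parameter are made up for illustration.

```python
def diff_fraction(a, b, tolerance=16):
    """Fraction of pixels whose grayscale values differ by more than `tolerance`."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    differing = sum(
        1
        for ra, rb in zip(a, b)
        for pa, pb in zip(ra, rb)
        if abs(pa - pb) > tolerance
    )
    return differing / total


# A thin vertical black line on white, and the same line shifted right by one pixel.
line = [[255, 0, 255, 255] for _ in range(4)]
shifted = [[255, 255, 0, 255] for _ in range(4)]

print(diff_fraction(line, line))     # 0.0
print(diff_fraction(line, shifted))  # 0.5
```

The two images in the example are visually the same figure, yet half of the pixels count as different after a one-pixel shift, which is the kind of false alarm that makes a naive threshold hard to choose.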