Data realism tends to be the quiet differentiator in generation systems. Once base model capability becomes commoditized, the biggest performance gap often comes from dataset curation, labeling quality, and how closely the training data reflects real deployment conditions. In visual generation workflows, even small improvements in dataset realism can significantly reduce the “synthetic look” that usually breaks usability in production contexts.
Agree that “data realism” is the quiet differentiator in mature visual generation domains.
Floor plans / technical drawings feel a lot less mature though — we don’t really have generators that are “good” in the sense that they preserve the constraints that matter (scale, closure, topology, entrances, unit stats, cross-floor consistency, etc.). A lot of outputs can look plausible but fall apart the moment you treat them as geometry for downstream tasks.
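To make the "looks plausible but fails as geometry" point concrete, here's a minimal sketch of the kind of sanity checks a generated plan would need to pass before any downstream use. Rooms are assumed to be simple (x, y) vertex polygons in metres; the function names and area thresholds are hypothetical, not from any particular dataset or paper.

```python
# Minimal geometric sanity checks for a generated floor plan.
# Assumption: each room is a list of (x, y) vertices in metres.

def shoelace_area(poly):
    """Signed polygon area via the shoelace formula."""
    n = len(poly)
    return 0.5 * sum(
        poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
        for i in range(n)
    )

def validate_room(poly, min_area=1.0, max_area=200.0):
    """Reject rooms that are open, degenerate, or implausibly scaled.

    Returns a list of issue strings; empty list means the room passes.
    The area bounds are illustrative placeholders.
    """
    issues = []
    if len(poly) < 3:
        issues.append("fewer than 3 vertices")
        return issues
    area = abs(shoelace_area(poly))
    if area < 1e-9:
        issues.append("degenerate (zero area)")
    elif not (min_area <= area <= max_area):
        issues.append(f"implausible area {area:.1f} m^2")
    return issues

room = [(0, 0), (4, 0), (4, 3), (0, 3)]  # a 12 m^2 rectangle
print(validate_room(room))               # expect: []
```

Even checks this crude (closure, non-degeneracy, plausible scale) catch a surprising fraction of "pretty picture" outputs; topology and cross-floor consistency need heavier machinery on top.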
That’s why I’ve been pushing the idea that simplistic generators are kind of doomed without a context graph (spatial topology + semantics + building/unit/site constraints, ideally with environmental context). Otherwise you’re generating pretty pictures, not usable plans.
Also: I’m a bit surprised how few researchers have used these datasets for basic EDA. Even before training anything, there’s a ton of value in just mapping distributions, correlations, biases, and failure modes. Feels like we’re skipping the “understand the data” step far too often.
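The kind of pre-training EDA meant here can be very simple. A sketch, assuming plans are stored as lists of (room_kind, area_m2) pairs (a made-up schema for illustration; the two inline plans are toy data, not from a real dataset):

```python
# "Understand the data first": basic distribution checks on a
# floor-plan dataset. Schema and values are illustrative only.

from collections import Counter
from statistics import mean, median

plans = [
    [("bedroom", 12.0), ("kitchen", 8.0), ("bathroom", 4.5)],
    [("bedroom", 11.0), ("bedroom", 9.0), ("kitchen", 7.5)],
]

room_counts = Counter(kind for plan in plans for kind, _ in plan)
areas = [a for plan in plans for _, a in plan]
rooms_per_plan = [len(p) for p in plans]

print("room-type frequencies:", room_counts)
print("area mean/median:", mean(areas), median(areas))
print("rooms per plan:", rooms_per_plan)
```

Even this level of counting surfaces dataset biases (over-represented room types, truncated area distributions, suspiciously uniform plan sizes) before a single model is trained.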