Hacker News

re: git repositories

That's partly because repositories rarely need to be cloned in their entirety. As such, even when you need to do it and it's a couple hundred MB taking a few minutes, it's tolerated.

In situations where a document needs to be cold loaded often, the size of the document is felt more acutely. Figma has a notion of GC-ing tombstones. But the tombstones in question aren't even something that gets created in regular editing; they come up in a much narrower case having to do with local copies of shared components. Even that caused problems for a small portion of files -- but if a file got that large, it was also likely to be important.

josephg an hour ago

> even when you need to do it and it's a couple hundred MB taking a few minutes, it's tolerated.

Well-written CRDTs should grow more slowly than git repositories.

> In situations where a document needs to be cold loaded often, the size of the document is felt more acutely.

With eg-walker (and similar algorithms) you can usually just load and store the native document snapshot at the current point in time, and work with that. And by that I mean just the actual JSON or text or whatever the application is interacting with.

Many current apps will first download an entire automerge document (with full history), then use the automerge API to fetch the data. Instead, using eg-walker and similar approaches, you can just download 2 things:

- Current state (eg a single string for a text file, or raw JSON for other kinds of data) alongside the current version

- History. This can be as detailed & go as far back as you want. If you download the full history (with data), you can reconstruct the document at any point in time. If you only download the metadata, you can't go back in time. But you can merge changes. You need history to merge changes. But you only need access to history metadata, and only as far back as the fork point with what you're merging.
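
To make "only as far back as the fork point" concrete, here is a minimal sketch in Python. It is not eg-walker's actual implementation; `EventMeta`, `ancestors` and `events_needed_for_merge` are hypothetical names, and the history is assumed to be a DAG of events that each list their parents. Only metadata (IDs and parents) is stored, no content.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EventMeta:
    """History metadata only: an event ID and its parent IDs, no content."""
    id: str
    parents: tuple[str, ...] = ()


def ancestors(graph: dict[str, EventMeta], heads: set[str]) -> set[str]:
    """Return every event reachable from `heads`, including the heads."""
    seen: set[str] = set()
    stack = list(heads)
    while stack:
        event_id = stack.pop()
        if event_id in seen:
            continue
        seen.add(event_id)
        stack.extend(graph[event_id].parents)
    return seen


def events_needed_for_merge(
    graph: dict[str, EventMeta], local_heads: set[str], remote_heads: set[str]
) -> set[str]:
    """Events on only one side of the fork. Shared history is excluded."""
    return ancestors(graph, local_heads) ^ ancestors(graph, remote_heads)


# a <- b <- c (local branch), and b <- d (remote branch)
graph = {
    "a": EventMeta("a"),
    "b": EventMeta("b", ("a",)),
    "c": EventMeta("c", ("b",)),
    "d": EventMeta("d", ("b",)),
}
needed = events_needed_for_merge(graph, {"c"}, {"d"})
```

Here `needed` contains only `c` and `d`; `a` and `b` sit before the fork point and are not required for the merge. This naive version walks the whole graph for clarity. A real implementation would presumably stop as soon as it reaches the common ancestors, so the older history never has to be loaded.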

If you're working online (eg figma), you can just download history lazily.

For client/server editing, in most cases you don't need any history at all. You can just fetch the current document snapshot (eg a text or JSON object). And users can start editing immediately. It only gets fancy when you try to merge concurrent changes. But that's quite rare in practice.
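
A rough sketch of that client/server fast path, in Python. Everything here is an assumption made for illustration: `Server`, `NeedsMerge` and the single integer version are hypothetical, and a real system would track versions in a richer way. The point is only that an edit made against the current version needs no history at all.

```python
from dataclasses import dataclass


class NeedsMerge(Exception):
    """Raised when an edit was made against a stale version."""


@dataclass
class Server:
    text: str = ""
    version: int = 0  # simplification: one linear counter

    def snapshot(self) -> tuple[str, int]:
        """What a client downloads on load: current state plus its version."""
        return self.text, self.version

    def submit(self, base_version: int, pos: int, inserted: str) -> int:
        """Apply an insert that was made against `base_version`."""
        if base_version != self.version:
            # Concurrent change: this is the only path that needs history.
            raise NeedsMerge(f"edit based on {base_version}, server at {self.version}")
        self.text = self.text[:pos] + inserted + self.text[pos:]
        self.version += 1
        return self.version


server = Server(text="hello")
text, version = server.snapshot()          # no history downloaded
server.submit(version, len(text), " world")  # fast path, applied directly
```

A second edit submitted against the old `version` would raise `NeedsMerge`, which is where history metadata back to the fork point would be fetched.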

rudi-c an hour ago

> If you're working online (eg figma), you can just download history lazily.

You can download the history lazily; that's a special case of incrementally loading the document.

If the history is only history, then sure. But my understanding was that we were talking about something that may need to be referenced but can eventually be GCed (e.g. tombstone-style). Then lazy loading makes user operations that need to reference that data go from synchronous operations to asynchronous operations. This is a massive jump in complexity. Incrementally loading data is not the hard part. The hard part is how product features need to be built on the assumption that the data is incrementally loaded.
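
The sync-to-async jump can be shown with a small Python sketch. `EagerStore`, `LazyStore` and the `resolve_*` helpers are hypothetical, and `asyncio.sleep(0)` merely stands in for a network round trip. Note how the `async` requirement spreads from the store to every caller.

```python
import asyncio


class EagerStore:
    """All tombstones are in memory, so lookups are plain function calls."""

    def __init__(self, tombstones: dict[str, str]):
        self._tombstones = dict(tombstones)

    def lookup(self, node_id: str):
        return self._tombstones.get(node_id)


class LazyStore:
    """Tombstones are fetched on demand, so lookups become coroutines."""

    def __init__(self, remote: dict[str, str]):
        self._remote = dict(remote)  # pretend this lives on a server
        self._cache: dict[str, str | None] = {}

    async def lookup(self, node_id: str):
        if node_id not in self._cache:
            await asyncio.sleep(0)  # placeholder for a network fetch
            self._cache[node_id] = self._remote.get(node_id)
        return self._cache[node_id]


def is_deleted_sync(store: EagerStore, node_id: str) -> bool:
    return store.lookup(node_id) is not None


async def is_deleted_async(store: LazyStore, node_id: str) -> bool:
    # Every feature that calls this must now be async as well.
    return (await store.lookup(node_id)) is not None


data = {"n1": "deleted"}
eager_result = is_deleted_sync(EagerStore(data), "n1")
lazy_result = asyncio.run(is_deleted_async(LazyStore(data), "n1"))
```

Both calls give the same answer, but the lazy variant can suspend partway through a user operation, which is where the extra product complexity comes from: loading states, cancellation, and state that changes while a lookup is in flight.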

Something that not all collaborative apps will care about, but Figma certainly did, is the concept that the client may go offline with unsynced changes for an indefinite amount of time. So the tombstones may need to be referenced by new-looking old changes, which increases the likelihood of hitting "almost-never" edge cases by quite a bit.
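
A toy illustration of that edge case, in Python. This is not Figma's design; `TombstoneStore`, its integer versions, and the "minimum acknowledged version" GC rule are all assumptions made for the sketch. It shows what happens when GC runs without accounting for a client that is still offline.

```python
class TombstoneStore:
    def __init__(self):
        # tombstone id -> version at which the node was deleted
        self.tombstones: dict[str, int] = {}

    def delete(self, node_id: str, version: int) -> None:
        self.tombstones[node_id] = version

    def gc(self, min_acked_version: int) -> None:
        """Drop tombstones that every *known* client has already seen."""
        self.tombstones = {
            node_id: version
            for node_id, version in self.tombstones.items()
            if version > min_acked_version
        }

    def apply_change(self, target_id: str, live_nodes: set[str]) -> str:
        if target_id in live_nodes:
            return "applied"
        if target_id in self.tombstones:
            return "ignored: target was deleted"
        return "unknown target"  # the tombstone is gone, nothing to resolve against


store = TombstoneStore()
store.delete("n1", version=5)

# An old offline change to n1 arrives while the tombstone still exists.
before_gc = store.apply_change("n1", live_nodes=set())

# GC runs because every online client has acknowledged version 5.
store.gc(min_acked_version=5)

# The same old change arrives after GC and can no longer be resolved.
after_gc = store.apply_change("n1", live_nodes=set())
```

Before GC the stale change is recognised and safely ignored; after GC the server cannot tell a deleted node from one that never existed. Under these assumptions, a client that stays offline indefinitely either blocks GC forever or forces the system to handle the "unknown target" case.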