This approach is pretty much like the TED approach from a few years back. As far as I remember there wasn’t a ridiculous amount of fold diversity there either. It turns out evolution isn’t averse to a bit of liberal protein plagiarism.

https://www.science.org/doi/10.1126/science.adq4946

> Natural selection has no analogy with any aspect of human behavior. However, if one wanted to play with a comparison, one would have to say natural selection does not work as an engineer works. It works like a tinkerer - a tinkerer who does not know exactly what he is going to produce but uses whatever he finds around him whether it be pieces of string, fragments of wood, or old cardboards; in short it works like a tinkerer who uses everything at his disposal to produce some kind of workable object.

―François Jacob, “Evolution and Tinkering” (https://web.mit.edu/~tkonkle/www/BrainEvolution/Meeting9/Jac...)

Tinker tailor fold or die?

They found "several thousand" novel folds? I had remembered that there were around 1000:

https://pmc.ncbi.nlm.nih.gov/articles/PMC7072414/

Oh ok, I misremembered:

"This review has focused only on small fragments of fold space with examples given for folds generated from a single secondary structure string consisting of around ten SSEs. Even in this small corner, the number of possible folds, under the current constraints, is of the order of 1000"

I think there was a Twitter/Bluesky thread on the results from adding all the predicted folds from metagenomics too, and not ending up with many new clusters. If this continues to hold true as we keep looking at more data, I will be relieved that at least natural protein folds and domains have a limited (tractable) solution space. All we need to do now is annotate the variation within these couple of thousand folds. Challenging, but at least a bounded problem.

What does plagiarism even mean in the context of proteins? That one protein steals the fold of another protein without giving it proper credit?

I understood it as a metaphor - just that evolutionarily distant sequences can adopt the same (or very similar) folds because there are only a limited number of stable, accessible folds that are possible.

Yes, that is exactly what I meant! Here’s an experiment to try: Frances Arnold got a Nobel Prize for work related to directed evolution. However, we know evolution is limited by the tools available to it, as you mention. If we add random chaperones and co-factors that we know other organisms use to bacteria, can we push evolution outside of the known fold space? Is the limited fold space an absolute limit or the “accessible” limit?

I see. I meant 'energetically accessible', but you mean more like 'affordably accessible' (in the sense that the molecular toolkit of a cell is what can 'afford' certain structures, due to chaperones available and so on).

Who knows what might be possible if you designed a cell from scratch - perhaps you could rework all the machinery to access other parts of fold space. After all, there are some weird and wonderful machines out there like the 'Vault' (https://en.wikipedia.org/wiki/Vault_(organelle)) that can fit whole proteins inside them. Possibly a different cage-like structure could help fold designed proteins into previously unseen structures.

It could also mean "evolutionarily accessible". The basin of attraction in sequence space has to be sufficiently large that evolution could stumble across it.
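A back-of-envelope way to see why basin size matters is sketched below. It assumes, very unrealistically, that evolution samples sequences independently and uniformly at random, which real mutation and selection do not do. Under that toy assumption, if a fold's basin covers a fraction f of sequence space, the chance that at least one of N samples lands in it is 1 - (1 - f)^N. The function name and the numbers in the usage lines are hypothetical, chosen only for illustration.

```python
import math


def hit_probability(basin_fraction: float, trials: float) -> float:
    """Chance that at least one of `trials` independent, uniform samples
    lands in a basin covering `basin_fraction` of sequence space.

    Toy model only: real evolution is a biased, correlated walk,
    not independent uniform sampling.
    """
    if not 0.0 <= basin_fraction <= 1.0:
        raise ValueError("basin_fraction must be between 0 and 1")
    if trials < 0:
        raise ValueError("trials must be non-negative")
    if basin_fraction == 1.0:
        # The basin is all of sequence space, so any sample hits it.
        return 1.0 if trials > 0 else 0.0
    # 1 - (1 - f)^N, written with log1p/expm1 so it stays accurate
    # when f is astronomically small and N is astronomically large.
    return -math.expm1(trials * math.log1p(-basin_fraction))


# Hypothetical numbers, for illustration only.
for fraction in (1e-30, 1e-20, 1e-10):
    print(f"f={fraction:.0e}, N=1e20 -> P(hit)={hit_probability(fraction, 1e20):.3g}")
```

The takeaway in this toy picture is that once N times f is well below 1, the fold is effectively invisible to evolution however stable it might be, which is the "evolutionarily accessible" distinction made above.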