When you look at how monstrously large the components are that get ablated (weights set to zero) in his "Ablation Strategies" section — and how obviously un-thought-through the choices are, if you understand even the most minimal basics of the linear algebra behind a transformer LLM — it is no surprise.

    Strategy            What it does
    ----------------------------------------------------------
    layer_removal       Zero out entire transformer layers
    head_pruning        Zero out individual attention heads
    ffn_ablation        Zero out feed-forward blocks
    embedding_ablation  Zero out embedding dimension ranges
https://github.com/elder-plinius/OBLITERATUS?tab=readme-ov-f...
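For a sense of scale, here is a minimal NumPy sketch (not the repo's code; dimensions loosely modeled on GPT-2 small) of what even the *finest-grained* of these strategies, head_pruning, amounts to — zeroing one head's slab of a query projection already kills about 8% of that matrix, and the coarser strategies wipe out vastly more:

```python
import numpy as np

# Hypothetical toy dimensions, roughly GPT-2 small scale
d_model, n_heads = 768, 12
d_head = d_model // n_heads  # 64

def prune_head(w_proj: np.ndarray, head: int) -> np.ndarray:
    """Zero one attention head's column slab of a (d_model, d_model) projection."""
    out = w_proj.copy()
    out[:, head * d_head:(head + 1) * d_head] = 0.0
    return out

rng = np.random.default_rng(0)
w_q = rng.standard_normal((d_model, d_model))
pruned = prune_head(w_q, head=0)

zeroed = int((pruned == 0).sum())
print(zeroed, w_q.size)  # 49152 589824 -> one head is 1/12 of the projection
```

And that is per projection matrix, per layer. ffn_ablation and layer_removal zero out blocks that are larger still (the FFN alone is typically 4x d_model wide), so it is unsurprising that the model degrades catastrophically rather than "forgetting" anything specific.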