It's the bitter-lesson-to-feature-engineering lifecycle.

When a technique or technology is new, people make massive gains just by applying it to some use case, gathering more training data, or giving it more resources.

As time goes on, those "bitter lesson" gains hit the flat upper part of the logistic curve, and companies have to invest more and more engineering effort for each small, incremental gain.
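To make the diminishing-returns point concrete, here is a rough sketch. It assumes progress follows a plain logistic curve in "effort"; the function names and the parameters (`ceiling`, `rate`, `midpoint`) are made up for illustration and are not fitted to any real data.

```python
import math

def logistic(effort, ceiling=1.0, rate=1.0, midpoint=0.0):
    """Hypothetical 'capability' reached after a given amount of effort."""
    return ceiling / (1.0 + math.exp(-rate * (effort - midpoint)))

def marginal_gain(effort, step=1.0, **params):
    """Extra capability bought by one more `step` of effort at this point."""
    return logistic(effort + step, **params) - logistic(effort, **params)

# Same amount of extra effort, applied near the midpoint vs. far up the curve.
early = marginal_gain(0.0)
late = marginal_gain(5.0)

print(f"gain near the midpoint: {early:.4f}")
print(f"gain far up the curve:  {late:.4f}")
print(f"ratio early/late:       {early / late:.1f}x")
```

Under these assumed parameters, the same unit of effort buys far less once you are well past the midpoint, which is the regime where careful engineering starts to look worthwhile again.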

I got a very different message from this, one much closer to the problem of incumbent advantage.

The known-good thing has been heavily optimized for performance, making it much harder for new technologies to prove that they are better. This is similar to the problem of gas engines vs. electric motors: we had a century of optimization and ecosystem development around gas engines, which creates an uphill battle for electric motors even though they are (eventually) superior in every way /except/ having that massive ecosystem.

The problem isn't as bad here, because software is much more flexible than hardware, and scaling laws give a reasonable way to try things out at smaller scale before going whole hog.

I assume the choice of phrase "bitter lesson" is intentional irony (since the original concept is that you get better results by just scaling up and not trying to be clever with domain-specific knowledge)?

I assume that the bitterness of the bitter lesson is not for engineers but for subject matter experts. I can only imagine how it would feel to discover that your decades of hard-earned expertise don't amount to a whole lot when it comes to domain-specific ML modeling, compared to simply throwing more compute at the problem.

Maybe the ultimate bitter lesson is that entropy always wins in the end.

Well put, thanks.
