For the "only few levels" claim, I think this one is fairly evident from the way LLMs work. Solving a logical problem can take an arbitrary number of steps, and in a single pass there are only so many connections within an LLM to do the "work".

As mentioned, there are good ways to counter this problem, e.g. writing a plan and then iteratively working through its smaller, less complex steps, or simply using the proper tool for the problem: hand it to e.g. a SAT solver and just "translate" the problem to and from the appropriate format.
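To make the "translate to and from" idea concrete, here is a minimal sketch in Python. The puzzle, the variable names and the `solve`/`decode` helpers are all made up for illustration, and the brute-force loop is only a stand-in for a real SAT solver. The point is the shape of the workflow: encode the problem as CNF clauses, let a dumb but exact procedure search, then map the model back to readable names.

```python
from itertools import product

# Hypothetical toy puzzle (not from any real benchmark):
#   - at least one of Alice or Bob is on the team
#   - Alice and Bob are not both on the team
#   - if Bob is on the team, Carol is too
#   - Carol is not on the team
# Step 1: "translate" into CNF, using DIMACS-style signed integers
# (positive = variable is true, negative = variable is false).
VARS = {"alice": 1, "bob": 2, "carol": 3}
CLAUSES = [
    [1, 2],    # alice OR bob
    [-1, -2],  # NOT alice OR NOT bob
    [-2, 3],   # NOT bob OR carol
    [-3],      # NOT carol
]


def solve(clauses, num_vars):
    """Brute-force stand-in for a SAT solver: try every assignment."""
    for bits in product([False, True], repeat=num_vars):
        # A clause holds if at least one literal matches the assignment.
        if all(
            any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return bits
    return None  # unsatisfiable


def decode(model):
    """Step 3: translate the model back into the puzzle's own terms."""
    return {name: model[idx - 1] for name, idx in VARS.items()}


# Step 2: solve, then decode.
model = solve(CLAUSES, len(VARS))
result = decode(model) if model is not None else None
print(result)
```

In this setup the LLM would only be responsible for the encoding and decoding steps; the search itself, which is the part with an unbounded number of steps, is done by the solver. Brute force is exponential in the number of variables, so for anything beyond a toy you would swap `solve` for an actual SAT solver.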

Nonetheless, I'm always open to new information/evidence, and LLMs will surely improve a lot within a year. For reference, this is my favorite description of LLMs to date: https://news.ycombinator.com/item?id=46561537