You can really see the limitations of qwen3.5:9b in its reasoning traces; it’s fascinating. When a question “goes bad”, sometimes the thinking tokens are WILD. It’s like watching Poirot after a head injury.

Example: “what is the air speed velocity of a swallow?” - qwen knew it was a Monty Python gag, but couldn’t figure out which one.

As a person who also knows there's a connection between that phrase and Monty Python and not much more information beyond that, I'm not sure how to feel.

Could that be some of the RL trying to get it not to regurgitate?

the gag is giving in detail which one

https://gist.github.com/mikewaters/7ebfbc73eb8624f917c5b4167...

It thinks like its memory is broken and it’s unaware of it; over 100 lines like this:

    - Wait, no, that's not right either.
    - Let's recall the specific line. It goes like this:
        - Knight A: "How can you have a swallow?"
        - Knight B: "It is the air speed velocity of a swallow."
        - Actually, the most common citation is from the movie where they ask an expert on swallows? No.

African or European?

My favourite colour is blue. Oh, no, it is...