Thanks! That's some interesting evidence right there. Like you suggest, not enough to outright verify the WaPo article, but it should make us not dismiss it outright either. I really wonder how they are defining the difference, because if it's just like your demonstration then it's probably not meaningfully different.

Can you repeat with libraries? I'm not sure I'll be able to analyze that security though.

And are you able you share the prompts and output?