They have added a lot of optimizations focused on the KV-cache, so they can offer a much larger context window without eating all the VRAM.
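
To get a feel for why the KV-cache is the thing to optimize, here is a rough back-of-the-envelope sketch. The model dimensions are made up for illustration (not taken from any specific model), and sharing KV heads across query heads (grouped-query attention) is just one well-known way to shrink the cache, not necessarily what they did.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Approximate KV-cache size for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes fp16/bf16 storage.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


GIB = 1024 ** 3

# Hypothetical model dimensions, chosen only for illustration.
N_LAYERS = 32
HEAD_DIM = 128
SEQ_LEN = 1_000_000  # a 1M-token window

# One KV head per query head (32 of each in this made-up model).
full = kv_cache_bytes(SEQ_LEN, N_LAYERS, n_kv_heads=32, head_dim=HEAD_DIM)

# Same model, but with 8 KV heads shared across the query heads.
shared = kv_cache_bytes(SEQ_LEN, N_LAYERS, n_kv_heads=8, head_dim=HEAD_DIM)

print(f"32 KV heads: {full / GIB:.1f} GiB")
print(f" 8 KV heads: {shared / GIB:.1f} GiB")
```

The cache grows linearly with the window length, so at 1M tokens even a mid-sized model needs hundreds of GiB unless something cuts it down (fewer KV heads, lower precision, and so on).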

The 1M window might be usable but, as you'd expect, it will probably underperform compared to a smaller window.