> I understand a big gap is that any LLM based ai agent isn't aware of the consequences of its actions because it barely understands the future state its actions will have, hence this model that can.

These are probably equivalent, i.e. awareness of consequences is the same as understanding the future state. And the present state, for that matter: I don't see how someone could be said to understand something if they can't predict the consequences of interacting with it. Having to predict those consequences is what forces the model to develop a more complex internal world model.