It probably works similarly to how Gemini has worked on Android for a while now.

You can point at or select anything on the screen and it understands and searches the context. If you select a block of text, even text inside an image, it lets you copy the text or search for it online. Otherwise it can search the image itself.

I use it often. It's intuitive and fast even on non-flagship phones.

I'd wager their A/B tests went well enough to warrant a port from phones to their new "Chromebook".

Their video is completely different from what Gemini does now. It analyses mouse movements: circling things, underlining things with the cursor, pointing at things to indicate where they should go. It's a lot like the interfaces you see in sci-fi movies, where loose gestures are understood in context in a way modern computers can't handle.