Hacker News

octember a day ago

Cool idea, but keyframes are not videos. Motion and object permanence are not things Claude can infer from a set of images. Nice demo though!

fzysingularity 21 hours ago

Exactly! We experimented with a whole bunch of video encoding techniques for LLMs here: https://vlm-run.github.io/mm/encoders/#video

sawjet 21 hours ago

I have been going through this with Claude and qwenvl3:8b this week. Both are pretty decent at inferring context and analyzing contact sheets. I'm finding high-visual-interest moments with a mixture of coarse and fine keyframes.
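
A minimal sketch of what a coarse-plus-fine keyframe pass might look like. This is not the exact pipeline described above; it only computes timestamps. The function names, the step sizes, and the `scores` list (one interest score per coarse frame, e.g. from a model rating a contact sheet) are all assumptions for illustration.

```python
def coarse_timestamps(duration_s, step_s):
    """Evenly spaced sample times (seconds) across the whole clip."""
    count = int(duration_s // step_s) + 1
    return [i * step_s for i in range(count)]


def fine_timestamps(center_s, window_s, step_s, duration_s):
    """Denser sample times in a window around one coarse frame."""
    start = max(0.0, center_s - window_s)
    end = min(duration_s, center_s + window_s)
    count = int((end - start) // step_s) + 1
    return [start + i * step_s for i in range(count)]


def coarse_to_fine(duration_s, scores, coarse_step_s=10.0, fine_step_s=1.0, top_k=2):
    """Pick the top_k highest-scoring coarse frames and resample around each.

    `scores` is hypothetical: one interest score per coarse timestamp,
    produced by whatever model looked at the coarse contact sheet.
    """
    coarse = coarse_timestamps(duration_s, coarse_step_s)
    ranked = sorted(zip(scores, coarse), reverse=True)[:top_k]
    picks = sorted(t for _, t in ranked)
    # Each fine window spans half a coarse step on either side of the pick.
    return {
        t: fine_timestamps(t, coarse_step_s / 2, fine_step_s, duration_s)
        for t in picks
    }
```

The returned timestamps would then be handed to whatever frame extractor is in use to build the second, finer contact sheet.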

octember 7 hours ago

Might be time to check Gemma :)
