V-JEPA based prediction
I’m pretty enthusiastic about world models, especially JEPA (and V-JEPA), so I started a small prediction engine that predicts the next few frames from the input stream’s current frame, visualizes the latent space (and how it moves), and makes a guess about motion. I’m planning to use it in a car to build up a driving world model.

A world model is an AI system that learns to predict what will happen next. Unlike traditional computer vision models that just classify images, world models understand dynamics: how things move, change, and evolve over time.
This is exactly what you need for robotics, autonomous vehicles, or any AI that needs to act in the real world. Before taking an action, you want to ask: “What will happen if I do X?”
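
To make the "What will happen if I do X?" question concrete, here is a minimal sketch in Python. It is not V-JEPA and not the prediction engine itself: `ToyWorldModel`, `predict`, `rollout`, and `pick_action` are hypothetical names, and the dynamics are a hand-written rule (the action is treated as a velocity) standing in for a learned predictor. It only illustrates the loop of imagining outcomes before acting.

```python
from dataclasses import dataclass
from typing import List, Sequence

Vector = List[float]


@dataclass
class ToyWorldModel:
    """Hypothetical stand-in for a learned predictor (assumption, not V-JEPA)."""

    step_size: float = 1.0

    def predict(self, state: Vector, action: Vector) -> Vector:
        # Toy dynamics: the action acts as a velocity added to the state.
        return [s + self.step_size * a for s, a in zip(state, action)]

    def rollout(self, state: Vector, actions: Sequence[Vector]) -> List[Vector]:
        # Feed each prediction back in as the next state to look several steps ahead.
        trajectory = []
        for action in actions:
            state = self.predict(state, action)
            trajectory.append(state)
        return trajectory


def pick_action(
    model: ToyWorldModel,
    state: Vector,
    candidates: Sequence[Vector],
    goal: Vector,
    horizon: int = 3,
) -> Vector:
    """Return the candidate whose imagined rollout ends closest to the goal."""

    def final_distance(action: Vector) -> float:
        final = model.rollout(state, [action] * horizon)[-1]
        return sum((f - g) ** 2 for f, g in zip(final, goal))

    return min(candidates, key=final_distance)


if __name__ == "__main__":
    model = ToyWorldModel()
    candidates = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    best = pick_action(model, [0.0, 0.0], candidates, goal=[3.0, 0.0])
    print("chosen action:", best)
```

In a real system the state would be a latent embedding of the current frame and `predict` would be a trained network, but the planning loop would keep the same shape: imagine each candidate action, compare the predicted outcomes, then act.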