World models (or models of the world) represent one of the most promising directions in current artificial intelligence research.
They aim to equip AI systems with a rich and dynamic internal representation of the environment, capable of predicting the evolution of events, simulating scenarios, and planning actions autonomously.
Unlike purely reactive models, a world model allows AI to “understand” the physical laws, causalities, and uncertainties of the real world. This article, written by the Yiaho team, offers a balanced overview of their history, definition, and operation.
History of World Models in AI
The foundational ideas of world models appeared at the very beginning of AI.
In the 1950s and 1960s, researchers like Alan Turing and Herbert Simon discussed the need for a machine to have an internal representation of its environment to reason effectively.
The 1970s marked a first concrete realization with symbolic AI: Terry Winograd’s SHRDLU system (1972) manipulated a simplified virtual world of blocks by relying on an explicit modeling of objects and their relationships.
In the 1980s-1990s, the introduction of probabilistic models (Bayesian networks, hidden Markov models) made it possible to manage uncertainty. These approaches were applied in robotics and planning, particularly for space exploration robots.
The rise of deep learning in the 2010s changed the game. Systems like AlphaGo (DeepMind, 2016) and especially its successor MuZero (2020) mastered complex environments, with MuZero learning the environment's dynamics implicitly, without manually coded rules.
The term “world model” became popular in 2018 thanks to the influential work of David Ha and Jürgen Schmidhuber. Their article “World Models” demonstrated that a neural network can learn to compress observations and generate internal simulations, allowing an agent to learn in a “dream” without real interaction.
Since then, several laboratories have accelerated progress:
- OpenAI with Sora (2024), a model capable of generating coherent videos by learning physical dynamics from visual data.
- Google DeepMind with Genie (2024) and other virtual world simulation projects.
- Researchers like Yann LeCun (formerly at Meta, now independent) are developing the JEPA (Joint Embedding Predictive Architecture) family, which favors prediction in abstract latent spaces over pixel-by-pixel generative reconstruction.
- Other initiatives, such as World Labs or academic work, are exploring multimodal world models (vision, action, language).
In 2025, world models are at the heart of debates on the path towards more general and reliable artificial intelligence.
Definition of a World Model in AI
A world model is an internal representation learned by an AI system, which models the properties and dynamics of the environment in a predictive and probabilistic manner.
It generally includes:
- An encoder that transforms raw observations (images, sounds, sensors) into a compact and semantic representation (latent space).
- A dynamics model that predicts the future evolution of this state based on possible actions and uncertainties.
- Mechanisms for managing abstraction hierarchies (from sensory detail to high-level concepts).
Unlike a simple predictive model (like LLMs that predict the next token), a world model aims to capture the invariant laws of the world (physics, causality, geometry) to enable simulation, planning, and knowledge transfer between tasks.
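The components above can be sketched as two functions: an encoder mapping raw observations to a latent state, and a dynamics model predicting the next latent state from the current state and an action. This is a minimal, illustrative skeleton only; the dimensions, the random linear maps, and the function names are hypothetical stand-ins for learned neural networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen for illustration only.
OBS_DIM, LATENT_DIM, ACTION_DIM = 64, 8, 2

# Encoder: a fixed random linear map standing in for a learned
# network that compresses raw observations into a compact latent state.
W_enc = rng.normal(scale=0.1, size=(LATENT_DIM, OBS_DIM))

def encode(obs):
    """Map a raw observation vector to a latent state."""
    return np.tanh(W_enc @ obs)

# Dynamics model: predicts the next latent state from the current
# latent state and a chosen action (again a stand-in linear map).
W_dyn = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM + ACTION_DIM))

def predict_next(latent, action):
    """One step of the learned dynamics, entirely in latent space."""
    return np.tanh(W_dyn @ np.concatenate([latent, action]))

obs = rng.normal(size=OBS_DIM)
z = encode(obs)
z_next = predict_next(z, np.zeros(ACTION_DIM))
print(z.shape, z_next.shape)
```

Note that planning never needs to decode back to observations: once the dynamics run in latent space, the model can chain `predict_next` calls cheaply.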
Explanation of How World Models Work
The operation of a world model breaks down into several key stages.
Self-supervised learning
The AI observes large amounts of data (mainly videos or sensory sequences from the real world) without explicit labels. The goal is to predict masked or future parts of observations.
Examples of techniques:
- Variational autoencoders (VAEs) for compression into latent space.
- Spatio-temporal masking (as in some video models).
- Prediction in latent space rather than exact reconstruction (to avoid wasting capacity on unnecessary details).
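The core idea of self-supervision, prediction targets drawn from the data itself, can be shown on a toy sequence. In this sketch the "video" is a vector sequence generated by a hidden linear rule, and the objective is simply to predict frame t+1 from frame t; closed-form least squares stands in for gradient training. All names and dimensions here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "video": a sequence whose true dynamics are a fixed linear map
# plus small noise. There are no labels: the next frame is the target.
A_true = rng.normal(scale=0.3, size=(6, 6))
frames = [rng.normal(size=6)]
for _ in range(500):
    frames.append(A_true @ frames[-1] + 0.01 * rng.normal(size=6))
frames = np.array(frames)

# Self-supervised objective: predict frame t+1 from frame t.
# Closed-form least squares replaces gradient descent for brevity.
X, Y = frames[:-1], frames[1:]
A_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

err = np.mean((X @ A_hat - Y) ** 2)
print(f"mean prediction error: {err:.4f}")
```

The fitted map recovers the hidden dynamics down to the noise floor, which is the whole point: structure is learned from raw sequences, with no human annotation.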
Modeling dynamics and uncertainty
The model learns that some transitions are deterministic (law of gravity) while others are stochastic (human behavior). This allows for the representation of multiple possible futures.
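The deterministic-versus-stochastic distinction can be made concrete with a one-line transition function: with zero noise every rollout lands on the same future, while with noise each sample is a different possible future. The drift, noise scale, and starting state below are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(2)

def step(state, noise_scale):
    """One transition: deterministic drift plus optional noise.
    noise_scale=0 mimics a physical law (one future);
    noise_scale>0 mimics uncertain dynamics (many futures)."""
    return 0.9 * state + 1.0 + noise_scale * rng.normal()

# Deterministic rollouts: every sampled future is identical.
det_samples = [step(5.0, 0.0) for _ in range(100)]

# Stochastic rollouts: sampling yields a spread of possible futures.
sto_samples = [step(5.0, 0.5) for _ in range(100)]

print(len(set(det_samples)), len(set(sto_samples)))
```

A practical world model outputs a distribution over next states rather than a point estimate, so that both regimes are covered by the same machinery.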
Use for planning
Once trained, the world model becomes a fast internal simulator. The AI can:
- Mentally test thousands of action sequences (“dreaming” or “model-based planning”).
- Use algorithms like Model Predictive Control (MPC) or evolutionary methods to select the best trajectory.
- Learn much more efficiently, as errors occur in simulation rather than in the real world (important in robotics).
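The planning loop above can be sketched with the simplest model-based planner, random-shooting MPC: sample many candidate action sequences, roll each out inside the internal model, and keep the one with the best simulated outcome. The 1-D dynamics, goal, and horizon below are toy assumptions standing in for a learned model.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy world: a 1-D point nudged by actions in [-1, 1];
# the task is to reach position 3.0 within 5 steps.
GOAL, HORIZON, N_CANDIDATES = 3.0, 5, 200

def rollout(state, actions):
    """Simulate an action sequence inside the internal model."""
    for a in actions:
        state = state + a  # stand-in for learned latent dynamics
    return state

def plan(state):
    """Random-shooting MPC: sample candidate action sequences,
    score each by its simulated outcome, return the best one."""
    candidates = rng.uniform(-1.0, 1.0, size=(N_CANDIDATES, HORIZON))
    costs = [abs(rollout(state, seq) - GOAL) for seq in candidates]
    return candidates[int(np.argmin(costs))]

best = plan(0.0)
print(abs(rollout(0.0, best) - GOAL))  # small with enough candidates
```

Every candidate is evaluated in simulation, so a bad action sequence costs only compute, not a broken robot; in real systems the crude `rollout` is replaced by the learned latent dynamics.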
Major advantages:
- Drastic reduction in the need for real-world interactions.
- Better generalization and robustness.
- Potential basis for intelligence closer to that of humans or animals.
Current limitations:
- High computational cost.
- Risk of hallucinations or biased predictions if training data is incomplete.
- Difficulty in guaranteeing perfect fidelity to real physical laws.
World models: a turning point in artificial intelligence design
World models mark a shift from AI that memorizes and reproduces patterns to AI capable of simulating, anticipating, and understanding the world autonomously. Driven by advances across many laboratories (DeepMind, OpenAI, former Meta teams, emerging startups), they open exciting prospects in robotics, autonomous driving, games, scientific simulation, and medicine.
As data, algorithms, and computing power progress, world models could become a central component of future generations of AI, bringing machines closer to a truly comprehensive understanding of the world.