Deep Dive Papers

Comprehensive guides covering theory, code examples, mental models, and interview preparation for mastering autonomous driving simulation.

For the best learning experience, we recommend reading the papers in order. Each paper builds upon concepts from the previous ones.

1. Waymax Deep Dive
   Core simulator architecture, data-driven simulation, metrics system, and evaluation framework for autonomous driving.
2. WOSAC Challenge Deep Dive
   Waymo Open Sim Agents Challenge evaluation framework, realism metrics, and winning strategies from 2023-2025.
3. JAX Scaling RL Deep Dive
   How to scale RL training across GPUs/TPUs using JAX primitives: jit, vmap, pmap, scan, and distributed PPO.
4. V-Max Framework Deep Dive
   Complete RL training pipeline on top of Waymax including ScenarioMax, observation design, and reward hierarchy.
5. BehaviorGPT Deep Dive
   State-of-the-art sim agent modeling with transformers, Next-Patch Prediction, and the 2024 WOSAC winner approach.
6. Sim-to-Real Gap Deep Dive
   Bridging virtual and physical worlds: perception, actuation, and behavioral gaps with neural rendering and world models.
7. Long-Tail Scenarios Deep Dive
   Safety-critical testing at scale: adversarial generation, scenario mining, and coverage metrics for AV validation.
8. Distributed Training Deep Dive
   Scaling RL to billions of steps: PureJaxRL, actor-learner architectures, and GPU-accelerated simulation infrastructure.