RIFT

Achieving both realism and controllability in closed-loop traffic simulation remains a key challenge in autonomous driving. Dataset-based methods reproduce realistic trajectories but suffer from covariate shift in closed-loop deployment, compounded by simplified dynamics models that further reduce reliability. Conversely, physics-based simulation methods enable reliable and controllable closed-loop interactions but often lack expert demonstrations, compromising realism. To address these challenges, we introduce a dual-stage AV-centric simulation framework: imitation learning (IL) pre-training in a data-driven simulator captures trajectory-level realism and route-level controllability, and reinforcement learning (RL) fine-tuning in a physics-based simulator then enhances style-level controllability and mitigates covariate shift. For the fine-tuning stage, we propose RIFT, a novel group-relative RL fine-tuning strategy that evaluates all candidate modalities through a group-relative formulation and employs a surrogate objective for stable optimization, while preserving the trajectory-level realism and route-level controllability inherited from IL pre-training. Extensive experiments demonstrate that RIFT improves realism and controllability in traffic simulation while simultaneously exposing the limitations of modern AV systems in closed-loop evaluation.

Overview of RIFT: Building on the IL pre-trained model, RIFT first performs route-level interaction analysis to identify critical background vehicles and their associated reference lines, which enables the generation of realistic, multimodal trajectories. To isolate style-level controllability from the trajectory-level realism and route-level controllability established during pre-training, only the scoring head is fine-tuned via RIFT; all remaining components are kept frozen. Specifically, RIFT computes group-relative advantages over all candidate rollouts, promoting alignment with user-preferred styles and mitigating covariate shift through RL fine-tuning.
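The group-relative idea described above can be illustrated with a minimal Python sketch. It is not the RIFT implementation: the mean/standard-deviation normalization, the PPO-style clipped ratio, the clipping range, the reward values, and all function names (`group_relative_advantages`, `clipped_surrogate`) are assumptions chosen for illustration. The scoring head is represented only by a list of logits, one per candidate modality.

```python
import math
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each candidate rollout relative to its own group.

    Assumption: advantage = (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def softmax(logits):
    """Turn scoring-head logits into a distribution over candidates."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def clipped_surrogate(new_logits, old_logits, advantages, clip=0.2):
    """PPO-style clipped surrogate, averaged over ALL candidates.

    Assumption: the ratio is taken between the new and old scoring-head
    probabilities of each candidate modality.
    """
    new_p = softmax(new_logits)
    old_p = softmax(old_logits)
    terms = []
    for pn, po, adv in zip(new_p, old_p, advantages):
        ratio = pn / po
        clipped = min(max(ratio, 1.0 - clip), 1.0 + clip)
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


# Hypothetical rewards for four candidate rollouts of one CBV.
rewards = [1.0, 0.2, -0.5, 0.8]
adv = group_relative_advantages(rewards)

old_logits = [0.0, 0.0, 0.0, 0.0]
# Unchanged scoring head: every ratio is 1, so the objective is the mean advantage.
objective_same = clipped_surrogate(old_logits, old_logits, adv)
# Slightly favoring the highest-reward candidate raises the objective.
objective_better = clipped_surrogate([0.1, 0.0, 0.0, 0.0], old_logits, adv)
```

Because the advantages are centered within the group, an unchanged scoring head yields an objective of approximately zero, and only shifts in probability toward above-average candidates increase it. In this sketch only the logits change, mirroring the idea that the trajectory generator stays frozen while the scoring head is tuned.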

Scenarios Demo

The AV-centric traffic simulation consists of the autonomous vehicle (AV, implemented as PDM-Lite), critical background vehicles (CBVs), and background vehicles (BVs). The AV follows a predefined global route, and the CBVs may interact with it at the route level.

Curved-lane Following

Intersection Navigation

Straight-lane Following

Lane Merging

Intersection Navigation

Lane Merging

End-to-End AV Evaluation (RIFT as CBVs)

SparseDrive (AV View)

SparseDrive (Third person view)

SparseDrive (AV View)

SparseDrive (Third person view)

SparseDrive (AV View)

SparseDrive (Third person view)

UniAD (AV View)

UniAD (Third person view)

UniAD (AV View)

UniAD (Third person view)

VAD (AV View)

VAD (Third person view)

VAD (AV View)

VAD (Third person view)

Realism and Controllability Quantitative Results

RIFT outperforms all baselines in both realism and controllability across most settings. Although supervised learning methods achieve slightly lower CPK and ORR, this advantage stems primarily from their inherently conservative behavior, inherited from the expert PDM-Lite, which prioritizes safety by avoiding risky maneuvers.

AV Evaluation Results

RIFT produces realistic and well-structured scenarios that are effective at exposing the limitations of modern AV systems.

Speed and Acceleration Realism Results

RIFT demonstrates higher average speed and acceleration, indicating more interactive behavior, while maintaining realistic motion profiles.

BibTeX

If you find the project helpful for your research, please consider citing our paper:

@article{chen2025riftclosedlooprlfinetuning,
  title={RIFT: Closed-Loop RL Fine-Tuning for Realistic and Controllable Traffic Simulation},
  author={Keyu Chen and Wenchao Sun and Hao Cheng and Sifa Zheng},
  journal={arXiv preprint arXiv:2505.03344},
  year={2025}
}

Group-Relative RL Fine-Tuning for Realistic and Controllable Traffic Simulation
