Overview of RIFT: The upper section illustrates the overall architecture of RIFT. To enhance controllability, only the trajectory scoring head is fine-tuned, while the rest of the pre-trained network is kept frozen to preserve trajectory-level realism. (a) The CBV identification mechanism introduces route-level interactions between the AV and CBVs. (b) Closed-loop fine-tuning improves user-aligned controllability and mitigates covariate shift.
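The freezing scheme described above can be illustrated with a minimal, framework-free sketch. It is not RIFT's actual code: the group names (`encoder`, `trajectory_decoder`, `scoring_head`), the `ParamGroup` class, and the plain SGD update are all hypothetical stand-ins chosen for illustration. The only point it shows is that gradients may exist for every group, but only the scoring head is updated.

```python
from dataclasses import dataclass


@dataclass
class ParamGroup:
    """A named group of weights plus a flag saying whether it may be updated."""
    weights: list[float]
    trainable: bool = False


def freeze_all_but(params: dict[str, ParamGroup], head_name: str) -> None:
    """Mark only `head_name` as trainable; every other group is frozen."""
    for name, group in params.items():
        group.trainable = (name == head_name)


def sgd_step(params: dict[str, ParamGroup],
             grads: dict[str, list[float]],
             lr: float = 0.1) -> None:
    """Apply one plain SGD update, skipping frozen groups entirely."""
    for name, group in params.items():
        if not group.trainable:
            continue  # frozen: pre-trained weights stay exactly as they were
        group.weights = [w - lr * g for w, g in zip(group.weights, grads[name])]


# Hypothetical parameter groups standing in for a pre-trained planner.
params = {
    "encoder": ParamGroup([0.5, -0.2]),
    "trajectory_decoder": ParamGroup([1.0, 0.3]),
    "scoring_head": ParamGroup([0.1, 0.4]),
}

freeze_all_but(params, "scoring_head")

# Assume a fine-tuning loss produced these gradients for every group.
grads = {name: [1.0, 1.0] for name in params}
sgd_step(params, grads, lr=0.1)
```

After the step, `encoder` and `trajectory_decoder` are unchanged, so the candidate trajectories they produce stay the same; only the weights that rank those candidates have moved.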
The AV-centered traffic simulation environment consists of the autonomous vehicle (AV, implemented as PDM-Lite), Critical Background Vehicles (CBVs), and background vehicles (BVs), where the AV follows a predefined global route and the CBVs may interact with it at the route level.
RIFT consistently outperforms all baselines in both aspects across most settings. While supervised learning methods achieve slightly lower infraction rates, this improvement is primarily due to their inherently conservative behavior, derived from the expert PDM-Lite, which prioritizes safety by avoiding risky maneuvers.
Under the RIFT-generated scenarios, AVs achieve near-optimal Driving Score (DS), Route Completion (RC), and Infraction Score (IS), along with the lowest BR. These findings highlight RIFT’s effectiveness in minimizing blocking while maintaining realism and interactivity, underscoring its superiority in closed-loop AV evaluation.
RIFT demonstrates higher average speed and acceleration, indicating more interactive behavior, while maintaining realistic motion profiles.
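For readers who want to reproduce this kind of kinematic summary on their own logs, the sketch below derives average speed and acceleration from sampled 2D positions using finite differences. How these statistics were actually computed for RIFT is not stated here, so the function names, the fixed time step `dt`, and the sample trajectory are all assumptions made for illustration.

```python
import math


def speeds(positions: list[tuple[float, float]], dt: float) -> list[float]:
    """Finite-difference speed between consecutive (x, y) samples, in m/s."""
    return [math.dist(p, q) / dt for p, q in zip(positions, positions[1:])]


def accelerations(speed_values: list[float], dt: float) -> list[float]:
    """Finite-difference longitudinal acceleration, in m/s^2."""
    return [(v1 - v0) / dt for v0, v1 in zip(speed_values, speed_values[1:])]


def mean(values: list[float]) -> float:
    """Arithmetic mean; returns 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical trajectory sampled every 0.5 s along a straight line.
positions = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
dt = 0.5

v = speeds(positions, dt)
a = accelerations(v, dt)
avg_speed = mean(v)
avg_accel = mean(a)
```

Finite differences amplify position noise, so real logs are usually smoothed before differentiating; that step is omitted here to keep the example short.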
@article{chen2025riftclosedlooprlfinetuning,
title={RIFT: Closed-Loop RL Fine-Tuning for Realistic and Controllable Traffic Simulation},
author={Keyu Chen and Wenchao Sun and Hao Cheng and Sifa Zheng},
journal={arXiv preprint arXiv:2505.03344},
year={2025}
}