PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map

Chenfeng Xu, Tian Li, Chen Tang, Lingfeng Sun, Kurt Keutzer, Masayoshi Tomizuka, Alireza Fathi, Wei Zhan

Abstract


"Deep learning has recently achieved significant progress in trajectory forecasting. However, the scarcity of trajectory data inhibits the data-hungry deep-learning models from learning good representations. While pre-training methods for representation learning exist in computer vision and natural language processing, they still require large-scale data. It is hard to replicate their success in trajectory forecasting due to the inadequate trajectory data (e.g., 34K samples in the nuScenes dataset). To work around the scarcity of trajectory data, we resort to another data modality closely related to trajectories: HD maps, which are abundantly provided in existing datasets. In this paper, we propose PreTraM, a self-supervised Pre-training scheme via connecting Trajectories and Maps for trajectory forecasting. PreTraM consists of two parts: 1) Trajectory-Map Contrastive Learning, where we project trajectories and maps to a shared embedding space with cross-modal contrastive learning, and 2) Map Contrastive Learning, where we enhance map representation with contrastive learning on large quantities of HD maps. On top of popular baselines such as AgentFormer and Trajectron++, PreTraM reduces their errors by 5.5% and 6.9% relatively on the nuScenes dataset. We show that PreTraM improves data efficiency and scales well with model size."
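Cross-modal contrastive learning of the kind the abstract describes is commonly implemented as a symmetric InfoNCE objective: paired trajectory and map embeddings are pulled together while mismatched pairs within a batch are pushed apart. The sketch below is illustrative only; the function name, temperature value, and embedding dimensions are assumptions, not details taken from the paper.

```python
import numpy as np

def trajectory_map_infonce(traj_emb, map_emb, temperature=0.1):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    traj_emb, map_emb: (N, d) arrays where row i of each forms a
    positive trajectory-map pair; all other rows act as negatives.
    NOTE: a generic sketch of cross-modal contrastive learning, not
    the paper's exact loss or hyperparameters.
    """
    # L2-normalize so dot products are cosine similarities.
    t = traj_emb / np.linalg.norm(traj_emb, axis=1, keepdims=True)
    m = map_emb / np.linalg.norm(map_emb, axis=1, keepdims=True)
    logits = (t @ m.T) / temperature  # (N, N); positives on the diagonal
    n = logits.shape[0]

    def cross_entropy_diag(l):
        # Log-softmax over each row, numerically stabilized.
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        # Negative log-likelihood of the matching (diagonal) pair.
        return -logp[np.arange(n), np.arange(n)].mean()

    # Average the trajectory->map and map->trajectory directions.
    return 0.5 * (cross_entropy_diag(logits)
                  + cross_entropy_diag(logits.T))
```

A loss of this shape rewards embeddings where each trajectory is most similar to its own map crop, which is the "shared embedding space" behavior the abstract attributes to Trajectory-Map Contrastive Learning; the Map Contrastive Learning part would apply the same idea between two augmented views of the same map.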
