S2F2: Single-Stage Flow Forecasting for Future Multiple Trajectories Prediction
"In this work, we present a single-stage framework, named S2F2, for forecasting multiple human trajectories from raw video images by predicting future optical flows. S2F2 differs from the previous two-stage approaches in that it performs detection, Re-ID, and forecasting of multiple pedestrians at the same time. The architecture of S2F2 consists of two primary parts: (1) a context feature extractor responsible for extracting a shared latent feature embedding for performing detection and Re-ID, and (2) a forecasting module responsible for extracting a shared latent feature embedding for forecasting. The outputs of the two parts are then processed to generate the final predicted trajectories of pedestrians. Unlike previous approaches, the computational burden of S2F2 remains consistent even if the number of pedestrians grows. In order to fairly compare S2F2 against the other approaches, we designed a StaticMOT dataset that excludes video sequences involving egocentric motions. The experimental results demonstrate that S2F2 is able to outperform two conventional trajectory forecasting algorithms and a recent learning-based two-stage model, while maintaining tracking performance on par with the contemporary MOT models."