MINet: A Pedestrian Trajectory Forecasting Method with Multi-Information Feature Fusion
Keywords:
Trajectory forecasting, scene feature encoding, destination encoding, walkable area
Abstract
Pedestrian trajectory prediction plays a vital role in autonomous driving, enabling advanced analysis and decision-making to ensure driving safety. Predicting pedestrian trajectories is a highly complex task encompassing static scenes, dynamic scenes, and subjective intent. To improve prediction accuracy, these factors must be modeled, their features extracted, and the features fused effectively. However, existing methods consider only some of these factors, and they extract static scene features through manual annotation of road key points, which fails to meet the demands of autonomous driving in complex traffic scenarios. To overcome these limitations, this paper introduces MINet, a network based on multi-information feature fusion. Unlike previous approaches, MINet extracts static scene elements, including sidewalks and lawns, in a more automated manner. The network also models pedestrian destinations to improve prediction accuracy. Furthermore, to address collision avoidance in crowded spaces, it captures dynamic scene changes by modeling the relative velocities of surrounding objects. The proposed network improves the ADE metric by 47.7% and the FDE metric by 62.6% on the ETH/UCY dataset, and the ADE metric by 18.4% and the FDE metric by 35.2% on the SDD dataset.