Modeling Pedestrian Dynamics in Crowds

Author:
Sekhon, Jasmine, Computer Engineering - School of Engineering and Applied Science, University of Virginia
Fleming, Cody, Engineering Systems and Environment, University of Virginia

Autonomous agents are increasingly deployed in applications that require them to navigate safely through human crowds. Humans can anticipate the future trajectories of their neighbors while navigating through a crowd. Safe navigation of autonomous agents in human-centric environments similarly requires the ability to anticipate the future motion of neighboring pedestrians. Accurate pedestrian intent prediction is therefore crucial to developing safe autonomous systems. However, predicting pedestrian intent is difficult: pedestrian navigation is governed by complex social norms, depends on neighborhood influences, and is multimodal in nature. While navigating through crowds, people engage in several forms of socially compliant behavior, such as taking a detour to avoid colliding with a neighbor, respecting personal space, and exhibiting "group behavior" such as walking in pairs or groups or following a leader.

Deep learning-based frameworks are increasingly adopted across diverse domains owing to their ability to match or even surpass human-level performance on various tasks. Such methods have previously been proposed for intent prediction of pedestrians navigating in crowds, but they suffer from several limitations: a lack of interpretability, reliance on strict assumptions regarding spatial influence, and a general inability to capture socially compliant navigation behavior.

In this dissertation, we address each of these limitations. First, we propose SCAN, a Spatial Context Attentive Network that jointly predicts trajectories for all pedestrians in a scene. SCAN encodes the influence of spatially close neighbors using a spatial attention mechanism in a manner that relies on fewer assumptions, is parameter-efficient, and is more interpretable than prior spatial interaction modeling approaches.
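To make the idea of attending over spatially close neighbors concrete, the following is a minimal sketch and not the SCAN architecture itself. It assumes scaled dot-product attention, a hard distance cutoff (`radius`), and per-pedestrian hidden-state vectors; the function name, the cutoff, and the scoring rule are all illustrative assumptions that do not come from the dissertation.

```python
import math

def spatial_attention(positions, hidden, i, radius=2.0):
    """Attention-weighted summary of the neighbors of pedestrian i.

    positions: list of (x, y) tuples, one per pedestrian.
    hidden:    list of equal-length feature vectors, one per pedestrian.
    radius:    hypothetical cutoff; pedestrians farther away are ignored.
    Returns (context_vector, {neighbor_index: attention_weight}).
    """
    query = hidden[i]
    dim = len(query)

    # Score each nearby neighbor with a scaled dot product (an assumption).
    scores = {}
    for j, pos in enumerate(positions):
        if j != i and math.dist(positions[i], pos) <= radius:
            dot = sum(q * k for q, k in zip(query, hidden[j]))
            scores[j] = dot / math.sqrt(dim)

    # A pedestrian with no neighbors in range gets an all-zero context.
    if not scores:
        return [0.0] * dim, {}

    # Numerically stable softmax over the neighbor scores.
    top = max(scores.values())
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    weights = {j: e / total for j, e in exps.items()}

    # Context vector: weighted sum of the neighbors' hidden states.
    context = [sum(w * hidden[j][d] for j, w in weights.items())
               for d in range(dim)]
    return context, weights
```

Returning the weights alongside the context vector is what makes a mechanism of this kind inspectable: for any pedestrian, one can read off how much each neighbor contributed to the prediction.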

Second, we propose Social UDE, a novel framework based on universal differential equations that models pedestrian intent while accounting for neighborhood influences as well as dynamical constraints. Through a qualitative analysis of the predictions of our proposed framework and of prior trajectory prediction frameworks, we conclude that displacement errors are misleading as performance metrics for pedestrian intent prediction and that existing approaches are in fact incapable of modeling agent-agent interactions.
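For reference, the displacement errors commonly reported in trajectory prediction are the average displacement error (ADE, the mean Euclidean distance between predicted and ground-truth positions over the prediction horizon) and the final displacement error (FDE, the distance at the last predicted step). The sketch below computes both for a single trajectory. The helper `min_separation` is a hypothetical addition for illustration and is not a metric taken from the dissertation.

```python
import math

def ade_fde(pred, truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y)."""
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and equally long")
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]

def min_separation(traj_a, traj_b):
    """Smallest same-timestep distance between two trajectories.

    Hypothetical helper: unlike ADE/FDE, it looks at a second agent.
    """
    return min(math.dist(a, b) for a, b in zip(traj_a, traj_b))
```

Neither ADE nor FDE takes any neighbor's position as input, so a prediction can score well on both while passing straight through another pedestrian. A check such as `min_separation` between a predicted trajectory and a neighbor's trajectory is one simple way to surface that case.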

Consequently, we explore several contrastive learning approaches and negative sampling strategies in the social setting to explicitly encourage the model to capture the socially compliant behavior characteristic of pedestrian navigation in crowds. We propose SimCLR-Social, a simple framework for contrastive learning of social representations, with which a simple encoder can be trained to learn similar representations for similar social behavior exhibited by a pedestrian across samples. Through a qualitative analysis and a quantitative evaluation on the downstream task of agent-agent interaction classification, we validate the ability of our proposed framework to capture spatial interactions in crowd navigation.
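As background, the original SimCLR framework trains an encoder with the normalized temperature-scaled cross-entropy (NT-Xent) loss, which pulls the embeddings of two views of the same sample together and pushes them away from all other embeddings in the batch. The sketch below implements that standard loss in plain Python. Whether SimCLR-Social uses exactly this objective, and how it defines views and negatives in the social setting, is not stated here, so the pairing of `views_a[i]` with `views_b[i]` and the default temperature are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nt_xent(views_a, views_b, temperature=0.5):
    """NT-Xent loss for N positive pairs (views_a[i], views_b[i]).

    Every other embedding in the batch acts as a negative for embedding i.
    Assumes all embeddings are non-zero and N >= 1.
    """
    z = list(views_a) + list(views_b)
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the positive partner of i
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        # Log-sum-exp over all candidates except i itself, computed stably.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - pos_logit
    return total / (2 * n)
```

With embeddings of this kind, the loss is lower when each pair of views lies close together in the embedding space than when the pairs are mismatched, which is the property a contrastively trained encoder is optimized for.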

PHD (Doctor of Philosophy)
All rights reserved (no additional license for public reuse)
Issued Date: