Non-Linear Associations Between the Urban Built Environment and Commuting Modal Split: A Random Forest Approach and SHAP Evaluation

Bibliographic Details
Main Authors: Faizeh Hatami, Md. Mokhlesur Rahman, Behnam Nikparvar, Jean-Claude Thill
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10035376/
Description
Summary: The study of commuting mode choice is crucial because driving, with all its associated environmental and economic consequences, is the most popular mode of transportation in the United States, owing to urban sprawl, the priority given to road construction, and America’s love affair with the automobile. More attention needs to be paid to sustainable modes such as public transit and walking. The built environment is expected to have an impact on commuting mode choice. Built environments with higher density, diversity, intentional design, destination accessibility, and shorter distance to transit (collectively known as the 5 Ds of the built environment) are hypothesized to lead to more sustainable mode choices, including public transit and walking. In this paper, we evaluate the impact of built environment variables on commuting modal split across four modes: bus transit, rail transit, walking, and driving. The study is conducted in Mecklenburg County, North Carolina, at the census block group level for the year 2015. Given the complexity of the relationships between the built environment and travel behavior, the random forest method is used to predict aggregated commuting mode choice. Random forest is employed because it can capture nonlinear relationships and is not constrained by the limitations of other widely used methods, such as multinomial logistic regression. After predicting the commuting mode shares, SHAP (SHapley Additive exPlanations) values are used to evaluate the impact of the built environment on commuting mode choices. As an advanced machine learning interpretation method, SHAP adds explainability to the model. It addresses the known limitation of machine learning methods as “black boxes” and converts them to “white boxes” by providing interpretability. SHAP values provide insights into both the direction and magnitude of the relationships.
Thanks to its rigorous ML-based design, our study helps to solidify the state of knowledge with strong evidence that block groups with higher levels of the 5 Ds are associated with greater use of public transit and walking. We discuss the urban policy implications of this study.
ISSN:2169-3536
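
Illustrative note: the summary states that SHAP values give both the direction and the magnitude of each built-environment variable's effect on a predicted mode share. The sketch below is only a toy illustration of that idea, not the authors' pipeline. It computes exact Shapley values by enumerating feature coalitions for a small hand-written nonlinear function that stands in for a trained random forest. The feature names, the `predict` function, the `BACKGROUND` rows, and the `instance` values are all hypothetical and chosen for illustration; in practice one would more likely use a fitted model and a dedicated SHAP implementation.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names (assumptions, not the paper's variables).
FEATURES = ["density", "land_use_mix", "dist_to_transit_km"]


def predict(x):
    """Toy stand-in for a trained model: predicted transit share of a block group.

    The threshold and interaction terms mimic the kind of nonlinear
    structure a random forest could learn; the numbers are made up.
    """
    density, mix, dist = x
    share = 0.02
    if density > 5.0:
        share += 0.04
    if mix > 0.5 and dist < 1.0:
        share += 0.06
    share -= 0.01 * min(dist, 3.0)
    return share


# Hypothetical background sample used to "switch off" features.
BACKGROUND = [
    (2.0, 0.3, 2.5),
    (4.0, 0.6, 1.5),
    (6.0, 0.4, 0.8),
    (8.0, 0.7, 0.5),
]


def coalition_value(x, subset):
    """Average prediction when features in `subset` come from x
    and all other features come from each background row in turn."""
    total = 0.0
    for row in BACKGROUND:
        mixed = tuple(x[i] if i in subset else row[i] for i in range(len(x)))
        total += predict(mixed)
    return total / len(BACKGROUND)


def shapley_values(x):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = coalition_value(x, s | {i}) - coalition_value(x, s)
                phi[i] += weight * gain
    return phi


instance = (7.0, 0.8, 0.4)  # hypothetical dense, mixed-use block group near transit
phi = shapley_values(instance)
baseline = coalition_value(instance, set())  # mean prediction over the background

for name, value in zip(FEATURES, phi):
    # The sign gives the direction of the effect, the absolute value its magnitude.
    print(f"{name}: {value:+.4f}")
print(f"baseline + contributions: {baseline + sum(phi):.4f}")
print(f"model prediction:         {predict(instance):.4f}")
```

The last two printed lines agree because Shapley values are additive: the baseline plus all feature contributions reproduces the model's prediction for that block group. Exhaustive enumeration is only practical for a handful of features, which is one reason tree-specific SHAP algorithms are normally used with random forests.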