A Deep-Learning-Based Multimodal Data Fusion Framework for Urban Region Function Recognition

Accurate and efficient classification maps of urban functional zones (UFZs) are crucial to urban planning, management, and decision making. Because UFZs have complex socioeconomic properties, it is increasingly challenging to identify them using remote-sensing images (RSIs) alone....

Bibliographic Details
Main Authors:	Mingyang Yu, Haiqing Xu, Fangliang Zhou, Shuai Xu, Hongling Yin
Format:	Article
Language:	English
Published:	MDPI AG 2023-11-01
Series:	ISPRS International Journal of Geo-Information
Subjects:	multimodal data fusion, UFZ map, spatial relationship modeling, vision transformer
Online Access:	https://www.mdpi.com/2220-9964/12/12/468

author	Mingyang Yu, Haiqing Xu, Fangliang Zhou, Shuai Xu, Hongling Yin
collection	DOAJ
description	Accurate and efficient classification maps of urban functional zones (UFZs) are crucial to urban planning, management, and decision making. Because UFZs have complex socioeconomic properties, it is increasingly challenging to identify them using remote-sensing images (RSIs) alone. Point-of-interest (POI) data and RSI data both play important roles in UFZ extraction. However, many existing methods use only a single type of data or simply combine the two, failing to exploit their complementary strengths. Therefore, we designed a deep-learning framework that integrates these two types of data to identify UFZs. The first part is the complementary feature-learning and fusion module, in which we use a convolutional neural network (CNN) to extract visual features and social features. Specifically, we extract visual features from RSI data, while POI data are converted into a distance heatmap tensor that is input into a CNN with gated attention mechanisms to extract social features. Then, we use a feature fusion module (FFM) with adaptive weights to fuse the two types of features. The second part is the spatial-relationship-modeling module. We designed a new spatial-relationship-learning network based on a vision transformer model with long- and short-distance attention, which can simultaneously learn the global and local spatial relationships of the UFZs. Finally, a feature aggregation module (FGM) utilizes the two spatial relationships efficiently. The experimental results show that the proposed model can fully extract visual features, social features, and spatial relationship features from RSIs and POIs for more accurate UFZ recognition.
first_indexed	2024-03-08T20:43:22Z
format	Article
id	doaj.art-fd15c13110064c48994a6595f604b632
institution	Directory Open Access Journal
issn	2220-9964
language	English
last_indexed	2024-03-08T20:43:22Z
publishDate	2023-11-01
publisher	MDPI AG
record_format	Article
series	ISPRS International Journal of Geo-Information
spelling	doaj.art-fd15c13110064c48994a6595f604b632; 2023-12-22T14:13:10Z; eng; MDPI AG; ISPRS International Journal of Geo-Information; ISSN 2220-9964; 2023-11-01; vol. 12, no. 12, art. 468; doi:10.3390/ijgi12120468; A Deep-Learning-Based Multimodal Data Fusion Framework for Urban Region Function Recognition; Mingyang Yu (0), Haiqing Xu (1), Fangliang Zhou (2), Shuai Xu (3), Hongling Yin (4); affiliations in record order: (0-3) School of Surveying and Geo-Informatics, Shandong Jianzhu University, Jinan 250101, China; (4) School of Architecture and Urban Planning, Shandong Jianzhu University, Jinan 250101, China; https://www.mdpi.com/2220-9964/12/12/468; multimodal data fusion, UFZ map, spatial relationship modeling, vision transformer
title	A Deep-Learning-Based Multimodal Data Fusion Framework for Urban Region Function Recognition
topic	multimodal data fusion, UFZ map, spatial relationship modeling, vision transformer
url	https://www.mdpi.com/2220-9964/12/12/468

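The description in this record says that POI data are converted into a distance heatmap tensor before being passed to a CNN, but the record gives no formula. The following is a minimal sketch of one possible reading, not the authors' implementation: each POI category gets its own grid channel, and each cell holds a value that decays with the distance to the nearest POI of that category. The function name `poi_distance_heatmap`, the exponential decay, the `sigma` parameter, and the sample POIs are all illustrative assumptions.

```python
import math


def poi_distance_heatmap(pois, categories, height, width, sigma=2.0):
    """Build a [category][row][col] nested list of heat values in [0, 1].

    pois: iterable of (row, col, category) tuples in grid coordinates.
    Heat is exp(-d / sigma), where d is the Euclidean distance from the
    cell to the nearest POI of that category (assumed decay, not from
    the paper). A category with no POIs yields an all-zero channel.
    """
    tensor = []
    for cat in categories:
        points = [(r, c) for r, c, k in pois if k == cat]
        channel = []
        for row in range(height):
            line = []
            for col in range(width):
                if not points:
                    line.append(0.0)
                    continue
                d = min(math.hypot(row - r, col - c) for r, c in points)
                line.append(math.exp(-d / sigma))
            channel.append(line)
        tensor.append(channel)
    return tensor


# Hypothetical sample data: two POIs on a 4 x 4 grid, three categories.
pois = [(0, 0, "food"), (3, 3, "school")]
heat = poi_distance_heatmap(pois, ["food", "school", "park"], height=4, width=4)
```

In this sketch, a cell that contains a POI gets the maximum value for that category's channel, and values fall off smoothly with distance, which gives a CNN a dense input even where POIs are sparse.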
Similar Items