Extracting Objects’ Spatial–Temporal Information Based on Surveillance Videos and the Digital Surface Model

Surveillance systems focus on the image itself, mainly from the perspective of computer vision, which lacks integration with geographic information. It is difficult to obtain the location, size, and other spatial information of moving objects from surveillance systems, which lack any ability to coup...


Bibliographic Details
Main Authors: Shijing Han, Xiaorui Dong, Xiangyang Hao, Shufeng Miao
Format: Article
Language: English
Published: MDPI AG 2022-02-01
Series: ISPRS International Journal of Geo-Information
Subjects: fusion; surveillance video; 3D geographic information; mapping; moving object; DSM
Online Access: https://www.mdpi.com/2220-9964/11/2/103
author Shijing Han
Xiaorui Dong
Xiangyang Hao
Shufeng Miao
collection DOAJ
description Surveillance systems focus on the image itself, mainly from a computer vision perspective, and lack integration with geographic information. Because they are not coupled with the geographical environment, it is difficult to obtain the location, size, and other spatial information of moving objects from them. To overcome these limitations, we propose a general framework that fuses 3D geographic information with moving objects in surveillance video; it extracts objects’ spatial–temporal information and visualizes object trajectories in a 3D model. The framework does not rely on specific algorithms for determining the camera model, extracting objects, or building the mapping model. In our experiment, we used Zhang Zhengyou’s calibration method and the EPnP method to determine the camera model, YOLOv5 and DeepSORT to extract objects from video, and the intersection of the imaging ray with the digital surface model (DSM) to locate objects in the 3D geographical scene. The experimental results show that, when the bounding box fully outlines the object, the maximum error and root mean square error are within 31 cm and 10 cm, respectively, for planar position and within 10 cm and 3 cm, respectively, for elevation. The errors of the average width and height of moving objects are within 5 cm and 2 cm, respectively, which is consistent with reality. To our knowledge, this is the first general fusion framework of this kind. This paper offers a solution for integrating 3D geographic information with surveillance video, which not only provides a spatial perspective for intelligent video analysis but also offers a new approach to the multi-dimensional expression of geographic information, object statistics, and object measurement.
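The locating step named in the description (intersecting an imaging ray with the DSM) can be illustrated with a minimal sketch. It is not the authors' implementation: the grid layout, the nearest-cell height lookup, the fixed-step ray march with bisection refinement, and every function and parameter name below are assumptions made for illustration. The camera rotation is assumed to map world coordinates to camera coordinates, and the intrinsics are assumed to have no skew or lens distortion.

```python
import math


def pixel_to_ray(u, v, fx, fy, cx, cy, rotation):
    """Back-project pixel (u, v) to a unit direction in world coordinates.

    Assumes a pinhole camera without skew or distortion, and that
    `rotation` is a 3x3 world-to-camera matrix (hypothetical convention).
    """
    cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Multiply by the transpose of `rotation` to go from camera to world.
    world = [sum(rotation[j][i] * cam[j] for j in range(3)) for i in range(3)]
    norm = math.sqrt(sum(c * c for c in world))
    return tuple(c / norm for c in world)


def dsm_height(dsm, x, y, origin=(0.0, 0.0), cell=1.0):
    """Nearest-cell height lookup; returns None outside the grid.

    `dsm[row][col]` holds the surface height; rows follow y, columns follow x.
    """
    col = int(math.floor((x - origin[0]) / cell))
    row = int(math.floor((y - origin[1]) / cell))
    if 0 <= row < len(dsm) and 0 <= col < len(dsm[0]):
        return dsm[row][col]
    return None


def intersect_ray_with_dsm(camera, direction, dsm, origin=(0.0, 0.0),
                           cell=1.0, step=0.25, max_range=500.0, tol=1e-3):
    """Return the first point where the ray reaches the DSM, or None."""
    norm = math.sqrt(sum(c * c for c in direction))
    d = [c / norm for c in direction]

    def point(t):
        return tuple(camera[i] + t * d[i] for i in range(3))

    def below_surface(t):
        x, y, z = point(t)
        h = dsm_height(dsm, x, y, origin, cell)
        return h is not None and z <= h

    t_prev, t = 0.0, step
    while t <= max_range:
        if below_surface(t):
            # The crossing lies in (t_prev, t]; refine it by bisection.
            lo, hi = t_prev, t
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if below_surface(mid):
                    hi = mid
                else:
                    lo = mid
            return point(hi)
        t_prev, t = t, t + step
    return None


# Usage: a flat 20 x 20 m surface with one 2 m high cell in the ray's path.
surface = [[0.0] * 20 for _ in range(20)]
surface[10][13] = 2.0
hit = intersect_ray_with_dsm((10.0, 10.5, 5.0), (1.0, 0.0, -1.0), surface)
print(hit)
```

A coarse step can skip over objects thinner than the step length, so in practice the step would be chosen relative to the DSM cell size. In a pipeline like the one described, the pixel passed to `pixel_to_ray` would plausibly be a point on the detected bounding box, but that choice is also an assumption here.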
first_indexed 2024-03-09T21:46:18Z
format Article
id doaj.art-b423f933bc374ac68d8d2b3b1916f3d5
institution Directory Open Access Journal
issn 2220-9964
language English
last_indexed 2024-03-09T21:46:18Z
publishDate 2022-02-01
publisher MDPI AG
record_format Article
series ISPRS International Journal of Geo-Information
spelling doaj.art-b423f933bc374ac68d8d2b3b1916f3d5
2023-11-23T20:15:50Z
eng
MDPI AG
ISPRS International Journal of Geo-Information
2220-9964
2022-02-01
Volume 11, Issue 2, Article 103
10.3390/ijgi11020103
Shijing Han (Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China)
Xiaorui Dong (Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China)
Xiangyang Hao (Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China)
Shufeng Miao (Wuhan Kedao Geographical Information Engineering Co., Ltd., Wuhan 430081, China)
title Extracting Objects’ Spatial–Temporal Information Based on Surveillance Videos and the Digital Surface Model
topic fusion
surveillance video
3D geographic information
mapping
moving object
DSM
url https://www.mdpi.com/2220-9964/11/2/103