A Spatio-Temporal Encoding Neural Network for Semantic Segmentation of Satellite Image Time Series

Remote sensing image semantic segmentation plays a crucial role in various fields, such as environmental monitoring, urban planning, and agricultural land classification. However, most current research primarily focuses on utilizing the spatial and spectral information of single-temporal remote sens...

Bibliographic Details
Main Authors: Feifei Zhang, Yong Wang, Yawen Du, Yijia Zhu
Format: Article
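Language: English
Published: MDPI AG 2023-11-01
Series: Applied Sciences
Subjects: semantic segmentation; phenology; spatial encoding; temporal encoding; satellite image time series
Online Access: https://www.mdpi.com/2076-3417/13/23/12658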
author Feifei Zhang
Yong Wang
Yawen Du
Yijia Zhu
collection DOAJ
description Semantic segmentation of remote sensing images plays a crucial role in fields such as environmental monitoring, urban planning, and agricultural land classification. However, most current research focuses on the spatial and spectral information of single-temporal remote sensing images and neglects the temporal information present in historical image sequences. Historical images often capture phenological variations in land features, which exhibit diverse patterns and can significantly benefit semantic segmentation tasks. This paper introduces a semantic segmentation framework for satellite image time series (SITS) based on dilated convolution and a Transformer encoder. The framework consists of spatial encoding and temporal encoding. Spatial encoding uses dilated convolutions exclusively, which avoids the loss of spatial accuracy and the need for up-sampling, while extracting rich multi-scale features through a combination of different dilation rates and dense connections. Temporal encoding uses a Transformer encoder to extract temporal features for each pixel in the image. To better capture the annual periodic patterns of phenological phenomena in land features, the position encoding is calculated from the image’s acquisition date within the year. To assess the framework, comparative and ablation experiments were conducted on the PASTIS dataset. The experiments indicate that the framework achieves highly competitive performance with relatively few trainable parameters, improving the mean Intersection over Union (mIoU) by 8 percentage points.
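The date-based position encoding mentioned in the description can be illustrated with a minimal sketch. This record does not give the paper's formula, so the code below assumes the standard sinusoidal encoding from the original Transformer, with the day of the year used as the position. The function name `day_of_year_encoding`, the width `d_model`, and the constant `base = 10000` are assumptions for illustration, not values taken from the paper.

```python
import math
from datetime import date


def day_of_year_encoding(acquired: date, d_model: int = 8,
                         base: float = 10000.0) -> list[float]:
    """Sinusoidal position encoding keyed on the acquisition day of the year.

    Assumption: the standard Transformer formula is used, with the
    position set to the day of the year (1..366) rather than the index
    of the image in the sequence.
    """
    if d_model <= 0 or d_model % 2 != 0:
        raise ValueError("d_model must be a positive even number")

    # Day within the year, so the same calendar date in different
    # years maps to the same position (leap years shift dates after Feb 28).
    day = acquired.timetuple().tm_yday

    encoding: list[float] = []
    for i in range(0, d_model, 2):
        # Each sin/cos pair shares one frequency: base ** (-i / d_model).
        angle = day / (base ** (i / d_model))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding


if __name__ == "__main__":
    # Two acquisitions on the same calendar date, one year apart.
    print(day_of_year_encoding(date(2018, 3, 1)))
    print(day_of_year_encoding(date(2019, 3, 1)))
```

Under this assumption, images acquired on the same calendar date in different non-leap years receive identical encodings, which is one way the annual periodicity of phenology could be exposed to the temporal encoder.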
first_indexed 2024-03-09T01:55:30Z
format Article
id doaj.art-2962fbc686284cdab5914b985c91dde8
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T01:55:30Z
publishDate 2023-11-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-2962fbc686284cdab5914b985c91dde8 2023-12-08T15:11:18Z eng MDPI AG Applied Sciences 2076-3417 2023-11-01 13(23), 12658 doi:10.3390/app132312658
A Spatio-Temporal Encoding Neural Network for Semantic Segmentation of Satellite Image Time Series
Feifei Zhang, Yong Wang, Yawen Du, Yijia Zhu (all: School of Computer Science, China University of Geosciences, Wuhan 430074, China)
https://www.mdpi.com/2076-3417/13/23/12658
semantic segmentation; phenology; spatial encoding; temporal encoding; satellite image time series
title A Spatio-Temporal Encoding Neural Network for Semantic Segmentation of Satellite Image Time Series
topic semantic segmentation
phenology
spatial encoding
temporal encoding
satellite image time series
url https://www.mdpi.com/2076-3417/13/23/12658