AFGL-Net: Attentive Fusion of Global and Local Deep Features for Building Façades Parsing

In this paper, we propose AFGL-Net, a deep learning framework for building façade parsing, i.e., obtaining the semantics of small façade components such as windows and doors. To this end, we present an autoencoder embedding position and direction encoding for local feature...

Bibliographic Details
Main Authors: Dong Chen, Guiqiu Xiang, Jiju Peethambaran, Liqiang Zhang, Jing Li, Fan Hu
Format: Article
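Language: English
Published: MDPI AG 2021-12-01
Series: Remote Sensing
Subjects: façade parsing; semantic segmentation; MLP; autoencoder; global transformer; attentive feature fusion
Online Access: https://www.mdpi.com/2072-4292/13/24/5039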
collection DOAJ
description In this paper, we propose AFGL-Net, a deep learning framework for building façade parsing, i.e., obtaining the semantics of small façade components such as windows and doors. To this end, we present an autoencoder that embeds position and direction encoding for local feature encoding. The autoencoder enhances local feature aggregation and augments the representation of the skeleton features of windows and doors. We also integrate a Transformer into AFGL-Net to infer the geometric shapes and structural arrangements of façade components and to capture global contextual features. These global features help recognize inconspicuous windows and doors in façade points corrupted by noise, outliers, occlusions, and irregularities. Finally, an attention-based feature fusion mechanism yields more informative features by jointly considering local geometric details and global context. AFGL-Net is evaluated on the Dublin and RueMonge2014 benchmarks, achieving 67.02% and 59.80% mIoU, respectively. We also demonstrate its advantage through comparisons with state-of-the-art methods and through ablation studies.
first_indexed 2024-03-10T03:11:23Z
format Article
id doaj.art-30dfdbf62bb240bb9768f659803f62ab
institution Directory of Open Access Journals
issn 2072-4292
language English
last_indexed 2024-03-10T03:11:23Z
publishDate 2021-12-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-30dfdbf62bb240bb9768f659803f62ab
2023-11-23T10:23:59Z
eng
MDPI AG
Remote Sensing, ISSN 2072-4292
2021-12-01, vol. 13, no. 24, article 5039
doi:10.3390/rs13245039
AFGL-Net: Attentive Fusion of Global and Local Deep Features for Building Façades Parsing
Dong Chen: College of Civil Engineering, Nanjing Forestry University, Nanjing 210037, China
Guiqiu Xiang: College of Civil Engineering, Nanjing Forestry University, Nanjing 210037, China
Jiju Peethambaran: Department of Mathematics and Computing Science, Saint Mary’s University, Halifax, NS B3P 2M6, Canada
Liqiang Zhang: The State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Jing Li: College of Civil Engineering, Nanjing Forestry University, Nanjing 210037, China
Fan Hu: College of Civil Engineering, Nanjing Forestry University, Nanjing 210037, China
title AFGL-Net: Attentive Fusion of Global and Local Deep Features for Building Façades Parsing
topic façade parsing
semantic segmentation
MLP
autoencoder
global transformer
attentive feature fusion
url https://www.mdpi.com/2072-4292/13/24/5039