AFGL-Net: Attentive Fusion of Global and Local Deep Features for Building Façades Parsing
In this paper, we propose AFGL-Net, a deep learning framework for building façade parsing, i.e., obtaining the semantics of small façade components such as windows and doors. To this end, we present an autoencoder embedding position and direction encoding for local feature...
Main Authors: | Dong Chen, Guiqiu Xiang, Jiju Peethambaran, Liqiang Zhang, Jing Li, Fan Hu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-12-01 |
Series: | Remote Sensing |
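Subjects: | façade parsing, semantic segmentation, MLP, autoencoder, global transformer, attentive feature fusion |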
Online Access: | https://www.mdpi.com/2072-4292/13/24/5039 |
_version_ | 1797500960170311680 |
---|---|
author | Dong Chen, Guiqiu Xiang, Jiju Peethambaran, Liqiang Zhang, Jing Li, Fan Hu |
author_sort | Dong Chen |
collection | DOAJ |
description | In this paper, we propose AFGL-Net, a deep learning framework for building façade parsing, i.e., obtaining the semantics of small façade components such as windows and doors. To this end, we present an autoencoder that embeds position and direction encoding for local feature encoding. The autoencoder enhances local feature aggregation and augments the representation of the skeleton features of windows and doors. We also integrate a Transformer into AFGL-Net to infer the geometric shapes and structural arrangements of façade components and to capture global contextual features. These global features help recognize inconspicuous windows and doors in façade points corrupted by noise, outliers, occlusions, and irregularities. Finally, an attention-based feature fusion mechanism produces more informative features by jointly considering local geometric details and global context. AFGL-Net is evaluated on the Dublin and RueMonge2014 benchmarks, achieving 67.02% and 59.80% mIoU, respectively. We also demonstrate its advantages through comparisons with state-of-the-art methods and ablation studies. |
first_indexed | 2024-03-10T03:11:23Z |
format | Article |
id | doaj.art-30dfdbf62bb240bb9768f659803f62ab |
institution | Directory Open Access Journal |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-10T03:11:23Z |
publishDate | 2021-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-30dfdbf62bb240bb9768f659803f62ab; 2023-11-23T10:23:59Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2021-12-01; vol. 13, no. 24, art. 5039; DOI 10.3390/rs13245039; AFGL-Net: Attentive Fusion of Global and Local Deep Features for Building Façades Parsing; Dong Chen (0), Guiqiu Xiang (1), Jiju Peethambaran (2), Liqiang Zhang (3), Jing Li (4), Fan Hu (5); (0) College of Civil Engineering, Nanjing Forestry University, Nanjing 210037, China; (1) College of Civil Engineering, Nanjing Forestry University, Nanjing 210037, China; (2) Department of Mathematics and Computing Science, Saint Mary’s University, Halifax, NS B3P 2M6, Canada; (3) The State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China; (4) College of Civil Engineering, Nanjing Forestry University, Nanjing 210037, China; (5) College of Civil Engineering, Nanjing Forestry University, Nanjing 210037, China; https://www.mdpi.com/2072-4292/13/24/5039; façade parsing; semantic segmentation; MLP; autoencoder; global transformer; attentive feature fusion |
title | AFGL-Net: Attentive Fusion of Global and Local Deep Features for Building Façades Parsing |
title_sort | afgl net attentive fusion of global and local deep features for building facades parsing |
topic | façade parsing; semantic segmentation; MLP; autoencoder; global transformer; attentive feature fusion |
url | https://www.mdpi.com/2072-4292/13/24/5039 |
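
The description reports results as mIoU (mean intersection-over-union) on the Dublin and RueMonge2014 benchmarks. As a reading aid, the following is a minimal sketch of how such a score is commonly computed for per-point class labels. It is not the authors' evaluation code: the function name `mean_iou`, the toy labels, and the class meanings (wall, window, door) are hypothetical, and the choice to skip classes that appear in neither sequence is an assumption that benchmarks may handle differently.

```python
from collections import Counter


def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union over classes present in y_true or y_pred."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have equal length")
    inter = Counter()   # points where prediction and ground truth agree, per class
    true_n = Counter()  # ground-truth points per class
    pred_n = Counter()  # predicted points per class
    for t, p in zip(y_true, y_pred):
        true_n[t] += 1
        pred_n[p] += 1
        if t == p:
            inter[t] += 1
    ious = []
    for c in range(num_classes):
        union = true_n[c] + pred_n[c] - inter[c]
        if union > 0:  # assumption: ignore classes absent from both sequences
            ious.append(inter[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy labels (hypothetical): 0 = wall, 1 = window, 2 = door
truth = [0, 0, 0, 1, 1, 2]
preds = [0, 0, 1, 1, 1, 2]
print(mean_iou(truth, preds, num_classes=3))
```

Here the per-class IoU is the number of correctly labelled points of a class divided by the number of points that carry that class in either the ground truth or the prediction, and the mean is taken over classes.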