Machine-Learning Classification of SAR Remotely-Sensed Sea-Surface Petroleum Signatures—Part 1: Training and Testing Cross Validation
Sea-surface petroleum pollution is observed as “oil slicks” (i.e., “oil spills” or “oil seeps”) and can be confused with “look-alike slicks” (i.e., environmental phenomena, such as low-wind speed, upwelling conditions, chlorophyll, etc.) in synthetic aperture radar (SAR) measurements, the most profi...
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2022-06-01
|
Series: | Remote Sensing |
Subjects: | |
Online Access: | https://www.mdpi.com/2072-4292/14/13/3027 |
_version_ | 1797434043430600704 |
---|---|
author | Gustavo de Araújo Carvalho Peter J. Minnett Nelson F. F. Ebecken Luiz Landau |
author_facet | Gustavo de Araújo Carvalho Peter J. Minnett Nelson F. F. Ebecken Luiz Landau |
author_sort | Gustavo de Araújo Carvalho |
collection | DOAJ |
description | Sea-surface petroleum pollution is observed as “oil slicks” (i.e., “oil spills” or “oil seeps”) and can be confused with “look-alike slicks” (i.e., environmental phenomena, such as low-wind speed, upwelling conditions, chlorophyll, etc.) in synthetic aperture radar (SAR) measurements, the most proficient satellite sensor to detect mineral oil on the sea surface. Even though machine learning (ML) has become widely used to classify remotely-sensed petroleum signatures, few papers have been published comparing various ML methods to distinguish spills from look-alikes. Our research fills this gap by comparing and evaluating six traditional techniques: simple (naive Bayes (NB), K-nearest neighbor (KNN), decision trees (DT)) and advanced (random forest (RF), support vector machine (SVM), artificial neural network (ANN)) applied to different combinations of satellite-retrieved attributes. 36 ML algorithms were used to discriminate “ocean-slick signatures” (spills versus look-alikes) with ten-times repeated random subsampling cross validation (70-30 train-test partition). Our results found that the best algorithm (ANN: 90%) was >20% more effective than the least accurate one (DT: ~68%). Our empirical ML observations contribute to both scientific ocean remote-sensing research and to oil and gas industry activities, in that: (i) most techniques were superior when morphological information and Meteorological and Oceanographic (MetOc) parameters were included together, and less accurate when these variables were used separately; (ii) the algorithms with the better performance used more variables (without feature selection), while lower accuracy algorithms were those that used fewer variables (with feature selection); (iii) we created algorithms more effective than those of benchmark-past studies that used linear discriminant analysis (LDA: ~85%) on the same dataset; and (iv) accurate algorithms can assist in finding new offshore fossil fuel discoveries (i.e., misclassification reduction). |
first_indexed | 2024-03-09T10:26:08Z |
format | Article |
id | doaj.art-cff0909d2e5449d2897678edb405f19a |
institution | Directory Open Access Journal |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-09T10:26:08Z |
publishDate | 2022-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-cff0909d2e5449d2897678edb405f19a2023-12-01T21:40:22ZengMDPI AGRemote Sensing2072-42922022-06-011413302710.3390/rs14133027Machine-Learning Classification of SAR Remotely-Sensed Sea-Surface Petroleum Signatures—Part 1: Training and Testing Cross ValidationGustavo de Araújo Carvalho0Peter J. Minnett1Nelson F. F. Ebecken2Luiz Landau3Laboratório de Sensoriamento Remoto por Radar Aplicado à Indústria do Petróleo (LabSAR), Laboratório de Métodos Computacionais em Engenharia (LAMCE), Programa de Engenharia Civil (PEC), Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia (COPPE), Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro 21941-859, RJ, BrazilDepartment of Ocean Sciences (OCE), Rosenstiel School of Marine and Atmospheric Science (RSMAS), University of Miami (UM), Miami, FL 33149, USANúcleo de Transferência de Tecnologia (NTT), Programa de Engenharia Civil (PEC), Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia (COPPE), Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro 21941-901, RJ, BrazilLaboratório de Sensoriamento Remoto por Radar Aplicado à Indústria do Petróleo (LabSAR), Laboratório de Métodos Computacionais em Engenharia (LAMCE), Programa de Engenharia Civil (PEC), Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia (COPPE), Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro 21941-859, RJ, BrazilSea-surface petroleum pollution is observed as “oil slicks” (i.e., “oil spills” or “oil seeps”) and can be confused with “look-alike slicks” (i.e., environmental phenomena, such as low-wind speed, upwelling conditions, chlorophyll, etc.) in synthetic aperture radar (SAR) measurements, the most proficient satellite sensor to detect mineral oil on the sea surface. Even though machine learning (ML) has become widely used to classify remotely-sensed petroleum signatures, few papers have been published comparing various ML methods to distinguish spills from look-alikes. Our research fills this gap by comparing and evaluating six traditional techniques: simple (naive Bayes (NB), K-nearest neighbor (KNN), decision trees (DT)) and advanced (random forest (RF), support vector machine (SVM), artificial neural network (ANN)) applied to different combinations of satellite-retrieved attributes. 36 ML algorithms were used to discriminate “ocean-slick signatures” (spills versus look-alikes) with ten-times repeated random subsampling cross validation (70-30 train-test partition). Our results found that the best algorithm (ANN: 90%) was >20% more effective than the least accurate one (DT: ~68%). Our empirical ML observations contribute to both scientific ocean remote-sensing research and to oil and gas industry activities, in that: (i) most techniques were superior when morphological information and Meteorological and Oceanographic (MetOc) parameters were included together, and less accurate when these variables were used separately; (ii) the algorithms with the better performance used more variables (without feature selection), while lower accuracy algorithms were those that used fewer variables (with feature selection); (iii) we created algorithms more effective than those of benchmark-past studies that used linear discriminant analysis (LDA: ~85%) on the same dataset; and (iv) accurate algorithms can assist in finding new offshore fossil fuel discoveries (i.e., misclassification reduction).https://www.mdpi.com/2072-4292/14/13/3027oil slicksoil spillsoil seepslook-alike slicksocean remote sensingsatellite |
spellingShingle | Gustavo de Araújo Carvalho Peter J. Minnett Nelson F. F. Ebecken Luiz Landau Machine-Learning Classification of SAR Remotely-Sensed Sea-Surface Petroleum Signatures—Part 1: Training and Testing Cross Validation Remote Sensing oil slicks oil spills oil seeps look-alike slicks ocean remote sensing satellite |
title | Machine-Learning Classification of SAR Remotely-Sensed Sea-Surface Petroleum Signatures—Part 1: Training and Testing Cross Validation |
title_full | Machine-Learning Classification of SAR Remotely-Sensed Sea-Surface Petroleum Signatures—Part 1: Training and Testing Cross Validation |
title_fullStr | Machine-Learning Classification of SAR Remotely-Sensed Sea-Surface Petroleum Signatures—Part 1: Training and Testing Cross Validation |
title_full_unstemmed | Machine-Learning Classification of SAR Remotely-Sensed Sea-Surface Petroleum Signatures—Part 1: Training and Testing Cross Validation |
title_short | Machine-Learning Classification of SAR Remotely-Sensed Sea-Surface Petroleum Signatures—Part 1: Training and Testing Cross Validation |
title_sort | machine learning classification of sar remotely sensed sea surface petroleum signatures part 1 training and testing cross validation |
topic | oil slicks oil spills oil seeps look-alike slicks ocean remote sensing satellite |
url | https://www.mdpi.com/2072-4292/14/13/3027 |
work_keys_str_mv | AT gustavodearaujocarvalho machinelearningclassificationofsarremotelysensedseasurfacepetroleumsignaturespart1trainingandtestingcrossvalidation AT peterjminnett machinelearningclassificationofsarremotelysensedseasurfacepetroleumsignaturespart1trainingandtestingcrossvalidation AT nelsonffebecken machinelearningclassificationofsarremotelysensedseasurfacepetroleumsignaturespart1trainingandtestingcrossvalidation AT luizlandau machinelearningclassificationofsarremotelysensedseasurfacepetroleumsignaturespart1trainingandtestingcrossvalidation |