Machine-Learning Classification of SAR Remotely-Sensed Sea-Surface Petroleum Signatures—Part 1: Training and Testing Cross Validation

Bibliographic Details
Main Authors: Gustavo de Araújo Carvalho, Peter J. Minnett, Nelson F. F. Ebecken, Luiz Landau
Format: Article
Language: English
Published: MDPI AG 2022-06-01
Series: Remote Sensing
Subjects: oil slicks; oil spills; oil seeps; look-alike slicks; ocean remote sensing; satellite
Online Access: https://www.mdpi.com/2072-4292/14/13/3027
collection DOAJ
description Sea-surface petroleum pollution is observed as “oil slicks” (i.e., “oil spills” or “oil seeps”) and can be confused with “look-alike slicks” (i.e., environmental phenomena, such as low wind speed, upwelling conditions, chlorophyll, etc.) in synthetic aperture radar (SAR) measurements, the most proficient satellite sensor for detecting mineral oil on the sea surface. Even though machine learning (ML) has become widely used to classify remotely-sensed petroleum signatures, few papers have been published comparing various ML methods to distinguish spills from look-alikes. Our research fills this gap by comparing and evaluating six traditional techniques: simple (naive Bayes (NB), K-nearest neighbor (KNN), decision trees (DT)) and advanced (random forest (RF), support vector machine (SVM), artificial neural network (ANN)) applied to different combinations of satellite-retrieved attributes. Thirty-six ML algorithms were used to discriminate “ocean-slick signatures” (spills versus look-alikes) with ten-times repeated random subsampling cross validation (70-30 train-test partition). Our results show that the best algorithm (ANN: 90%) was more effective than the least accurate one (DT: ~68%) by more than 20 percentage points. Our empirical ML observations contribute to both scientific ocean remote-sensing research and to oil and gas industry activities, in that: (i) most techniques were superior when morphological information and Meteorological and Oceanographic (MetOc) parameters were included together, and less accurate when these variables were used separately; (ii) the algorithms with the better performance used more variables (without feature selection), while lower-accuracy algorithms were those that used fewer variables (with feature selection); (iii) we created algorithms more effective than those of past benchmark studies that used linear discriminant analysis (LDA: ~85%) on the same dataset; and (iv) accurate algorithms can assist in making new offshore fossil fuel discoveries (i.e., by reducing misclassification).
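The validation scheme named in the description (ten repetitions of a random 70-30 train-test split) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic two-feature Gaussian clusters, the class coding (0 = look-alike, 1 = spill) is only illustrative, and the helper names `make_synthetic_slicks`, `knn_predict`, and `repeated_random_subsampling`, as well as k = 5 and the random seeds, are assumptions made for this example. KNN is used only because it is one of the six techniques listed and is short to write without external libraries.

```python
import random
import statistics


def make_synthetic_slicks(n_per_class=100, seed=0):
    """Return a list of (features, label) pairs for two overlapping synthetic classes.

    Hypothetical stand-in data: 0 = look-alike, 1 = spill (illustrative only).
    """
    rng = random.Random(seed)
    data = []
    for label, center in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            features = [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]
            data.append((features, label))
    return data


def knn_predict(train, point, k=5):
    """Majority vote among the k nearest training samples (squared Euclidean distance)."""
    nearest = sorted(
        train,
        key=lambda sample: sum((a - b) ** 2 for a, b in zip(sample[0], point)),
    )[:k]
    votes_for_one = sum(label for _, label in nearest)
    return 1 if votes_for_one * 2 > k else 0


def repeated_random_subsampling(data, n_repeats=10, train_fraction=0.7, seed=42):
    """Shuffle, split, and score n_repeats times; return one test accuracy per split."""
    rng = random.Random(seed)
    n_train = int(round(train_fraction * len(data)))
    accuracies = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)  # a fresh random partition on every repetition
        train, test = shuffled[:n_train], shuffled[n_train:]
        correct = sum(knn_predict(train, x) == y for x, y in test)
        accuracies.append(correct / len(test))
    return accuracies


data = make_synthetic_slicks()
accuracies = repeated_random_subsampling(data)
print(f"mean accuracy over {len(accuracies)} splits: {statistics.mean(accuracies):.3f}")
print(f"standard deviation: {statistics.stdev(accuracies):.3f}")
```

Unlike k-fold cross validation, the test sets drawn this way may overlap between repetitions, so the spread of the ten accuracies is a rough indication of variability rather than an independent estimate.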
first_indexed 2024-03-09T10:26:08Z
id doaj.art-cff0909d2e5449d2897678edb405f19a
institution Directory of Open Access Journals
issn 2072-4292
last_indexed 2024-03-09T10:26:08Z
spelling doaj.art-cff0909d2e5449d2897678edb405f19a
2023-12-01T21:40:22Z
eng
MDPI AG
Remote Sensing
2072-4292
2022-06-01
Volume 14, Issue 13, Article 3027
doi:10.3390/rs14133027
Gustavo de Araújo Carvalho [0]
Peter J. Minnett [1]
Nelson F. F. Ebecken [2]
Luiz Landau [3]
[0] Laboratório de Sensoriamento Remoto por Radar Aplicado à Indústria do Petróleo (LabSAR), Laboratório de Métodos Computacionais em Engenharia (LAMCE), Programa de Engenharia Civil (PEC), Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia (COPPE), Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro 21941-859, RJ, Brazil
[1] Department of Ocean Sciences (OCE), Rosenstiel School of Marine and Atmospheric Science (RSMAS), University of Miami (UM), Miami, FL 33149, USA
[2] Núcleo de Transferência de Tecnologia (NTT), Programa de Engenharia Civil (PEC), Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia (COPPE), Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro 21941-901, RJ, Brazil
[3] Laboratório de Sensoriamento Remoto por Radar Aplicado à Indústria do Petróleo (LabSAR), Laboratório de Métodos Computacionais em Engenharia (LAMCE), Programa de Engenharia Civil (PEC), Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia (COPPE), Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro 21941-859, RJ, Brazil
https://www.mdpi.com/2072-4292/14/13/3027
oil slicks
oil spills
oil seeps
look-alike slicks
ocean remote sensing
satellite
topic oil slicks
oil spills
oil seeps
look-alike slicks
ocean remote sensing
satellite