A Novel Framework of Detecting Convective Initiation Combining Automated Sampling, Machine Learning, and Repeated Model Tuning from Geostationary Satellite Data

Bibliographic Details
Main Authors: Daehyeon Han, Juhyun Lee, Jungho Im, Seongmun Sim, Sanggyun Lee, Hyangsun Han
Format: Article
Language: English
Published: MDPI AG 2019-06-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/11/12/1454
Description
Summary: This paper proposes a complete framework for a machine learning-based model that detects convective initiation (CI) from geostationary meteorological satellite data. The framework consists of three main processes: (1) an automated sampling tool; (2) machine learning-based CI detection modelling; and (3) repeated model tuning through validation. In this study, the automated sampling tool was able to track CI objects iteratively, even without ancillary data such as atmospheric motion vectors (AMVs). The collected samples were used to train the machine learning model for CI detection, with random forest (RF) used to classify CI and non-CI cases. To enhance the advantages of the machine learning approach, we adopted model tuning that iteratively updates the training dataset from each validation result by adding hits and misses to the CI samples, and false alarms and correct negatives to the non-CI samples. Using 12 interest fields from the Himawari-8 Advanced Himawari Imager (AHI) over the Korean Peninsula, this simple and intuitive tuning process increased the overall probability of detection (POD) from 0.79 to 0.82 and decreased the overall false alarm rate (FAR) from 0.46 to 0.37, with a lead time of around 40 min. Amongst the 12 interest fields, $T_b$(11.2) µm was identified as the most significant predictor in the RF model, followed by $T_b$(8.6−11.2) µm and $T_b$(6.2−7.3) µm. The effect of model tuning on CI detection performance was also analyzed using spatiotemporal validation maps. By automatically collecting and updating the machine learning training dataset, the suggested framework is expected to ease maintenance of the CI detection model from an operational perspective.
ISSN: 2072-4292
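
The validation scores and the tuning step described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function and class names (`outcome`, `contingency`, `pod`, `far`, `TrainingSet`, `tune`) are hypothetical, and the score formulas are assumptions: the record does not define them, so the sketch uses the common verification definitions POD = hits / (hits + misses) and FAR = false alarms / (hits + false alarms).

```python
from dataclasses import dataclass, field
from math import nan


def outcome(predicted: bool, observed: bool) -> str:
    """Classify one validation case into a contingency-table category."""
    if observed:
        return "hit" if predicted else "miss"
    return "false_alarm" if predicted else "correct_negative"


def contingency(predicted, observed) -> dict:
    """Count the four categories over paired prediction/observation flags."""
    counts = {"hit": 0, "miss": 0, "false_alarm": 0, "correct_negative": 0}
    for p, o in zip(predicted, observed):
        counts[outcome(p, o)] += 1
    return counts


def pod(c: dict) -> float:
    """Probability of detection (assumed definition: hits / observed CI)."""
    total = c["hit"] + c["miss"]
    return c["hit"] / total if total else nan


def far(c: dict) -> float:
    """False alarm score (assumed definition: false alarms / predicted CI)."""
    total = c["hit"] + c["false_alarm"]
    return c["false_alarm"] / total if total else nan


@dataclass
class TrainingSet:
    """Hypothetical container for CI and non-CI training samples."""
    ci: list = field(default_factory=list)
    non_ci: list = field(default_factory=list)


def tune(training: TrainingSet, features, predicted, observed) -> None:
    """Add validation cases back to the training set, as the summary describes:
    hits and misses go to the CI samples, false alarms and correct negatives
    go to the non-CI samples."""
    for x, p, o in zip(features, predicted, observed):
        if outcome(p, o) in ("hit", "miss"):
            training.ci.append(x)
        else:
            training.non_ci.append(x)


# Toy validation round with made-up flags and placeholder feature vectors.
predicted = [True, True, False, True, False]
observed = [True, False, True, True, False]
features = [[0.1], [0.2], [0.3], [0.4], [0.5]]

counts = contingency(predicted, observed)
training = TrainingSet()
tune(training, features, predicted, observed)
print(counts, pod(counts), far(counts))
```

Because hits and misses together cover every observed CI case, the tuning step in this sketch amounts to relabelling each validation sample by its observed class before retraining. How the retrained model is fitted (for example, which random forest settings are used) is outside the scope of the sketch.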