Cost-Sensitive Laplacian Logistic Regression for Ship Detention Prediction
Port state control (PSC) is the last line of defense for substandard ships. During a PSC inspection, ship detention is the most severe result if the inspected ship is identified with critical deficiencies. Regarding the development of ship detention prediction models, this paper identifies two chall...
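Main Authors: | Xuecheng Tian, Shuaian Wang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-12-01 |
Series: | Mathematics |
Subjects: | cost-sensitive learning, semi-supervised learning, logistic regression, port state control |
Online Access: | https://www.mdpi.com/2227-7390/11/1/119 |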
_version_ | 1797431421002842112 |
---|---|
author | Xuecheng Tian; Shuaian Wang |
author_sort | Xuecheng Tian |
collection | DOAJ |
description | Port state control (PSC) is the last line of defense against substandard ships. During a PSC inspection, detention is the most severe outcome and is imposed when critical deficiencies are identified on the inspected ship. In developing ship detention prediction models, this paper identifies two challenges: learning from imbalanced data and learning from unlabeled data. The first challenge arises because only a minority of inspected ships are detained. The second arises because, in practice, not all foreign visiting ships receive a formal PSC inspection, which leads to a missing data problem. To address these two challenges, this paper adopts two machine learning paradigms: cost-sensitive learning and semi-supervised learning. Accordingly, we extend the traditional logistic regression (LR) model by introducing a cost parameter that accounts for the different misclassification costs of the imbalanced classes and by incorporating a graph regularization term that accounts for unlabeled data. Finally, we conduct extensive computational experiments to verify the superiority of the developed cost-sensitive semi-supervised learning framework. Computational results show that introducing a cost parameter into LR can improve the classification rate for substandard ships by almost 10%. In addition, the results show that considering unlabeled data in classification models can increase the classification rate for the minority and majority classes by 1.33% and 5.93%, respectively. |
first_indexed | 2024-03-09T09:44:46Z |
format | Article |
id | doaj.art-730a6c7eec104738b6478ef19036d3f0 |
institution | Directory Open Access Journal |
issn | 2227-7390 |
language | English |
last_indexed | 2024-03-09T09:44:46Z |
publishDate | 2022-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Mathematics |
spelling | doaj.art-730a6c7eec104738b6478ef19036d3f0; 2023-12-02T00:38:45Z; eng; MDPI AG; Mathematics; ISSN 2227-7390; 2022-12-01; vol. 11, no. 1, art. 119; doi:10.3390/math11010119; Cost-Sensitive Laplacian Logistic Regression for Ship Detention Prediction; Xuecheng Tian (Department of Logistics & Maritime Studies, The Hong Kong Polytechnic University, Hung Hom, Hong Kong 999077, China); Shuaian Wang (Faculty of Business, The Hong Kong Polytechnic University, Hung Hom, Hong Kong 999077, China); https://www.mdpi.com/2227-7390/11/1/119; cost-sensitive learning; semi-supervised learning; logistic regression; port state control |
title | Cost-Sensitive Laplacian Logistic Regression for Ship Detention Prediction |
title_sort | cost sensitive laplacian logistic regression for ship detention prediction |
topic | cost-sensitive learning; semi-supervised learning; logistic regression; port state control |
url | https://www.mdpi.com/2227-7390/11/1/119 |