An Improved Self-Training Method for Positive Unlabeled Time Series Classification Using DTW Barycenter Averaging

Traditional supervised time series classification (TSC) tasks assume that all training data are labeled. However, in practice, manually labelling all unlabeled data could be very time-consuming and often requires the participation of skilled domain experts. In this paper, we concern with the positiv...

Bibliographic Details
Main Authors: Jing Li, Haowen Zhang, Yabo Dong, Tongbin Zuo, Duanqing Xu
Format: Article
Language: English
Published: MDPI AG 2021-11-01
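Series: Sensors
Subjects: positive unlabeled time series classification; self-training; dynamic time warping; DTW barycenter averaging
Online Access: https://www.mdpi.com/1424-8220/21/21/7414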
collection DOAJ
description Traditional supervised time series classification (TSC) assumes that all training data are labeled. In practice, however, manually labeling every unlabeled instance can be very time-consuming and often requires skilled domain experts. This paper addresses the positive unlabeled time series classification (PUTSC) problem, which refers to automatically labeling a large unlabeled set U based on a small positive labeled set PL. Self-training (ST) is the most widely used method for solving the PUTSC problem and has attracted increased attention due to its simplicity and effectiveness. Existing ST methods simply employ the one-nearest-neighbor (1NN) formula to determine which unlabeled time series should be labeled next. However, the 1NN formula might not be optimal for PUTSC tasks because it may be sensitive to initial labeled data located near the boundary between the positive and negative classes. To overcome this issue, we propose an exploratory methodology called ST-average. Unlike conventional ST-based approaches, ST-average uses the average sequence calculated by the DTW barycenter averaging technique to label the data. Compared with any individual series in the PL set, the average sequence is more representative. The proposed method is insensitive to the initial labeled data and is more reliable than existing ST-based methods. In addition, we demonstrate that ST-average can naturally be implemented along with many existing techniques used in the original ST. Experimental results on public datasets show that ST-average performs better than related popular methods.
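The labeling idea in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`dtw_path`, `dba`, `st_average`), the squared-difference local cost, the choice to initialize the average with the first labeled series, and the fixed `n_rounds` budget in place of a stopping criterion are all assumptions made for illustration. In each round the sketch averages the current PL set with a basic DTW barycenter averaging loop and moves the unlabeled series closest to that average into PL.

```python
import numpy as np

def dtw_path(a, b):
    """DTW distance and warping path between two 1-D sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2  # assumed local cost
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the end of both sequences to recover the alignment.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return float(np.sqrt(cost[n, m])), path[::-1]

def dba(seqs, n_iter=10):
    """Basic DTW barycenter averaging; starts from the first sequence (assumption)."""
    avg = np.array(seqs[0], dtype=float)
    for _ in range(n_iter):
        sums = np.zeros(len(avg))
        counts = np.zeros(len(avg))
        for s in seqs:
            _, path = dtw_path(avg, s)
            for i, j in path:  # every point of s votes for the average point it aligns to
                sums[i] += s[j]
                counts[i] += 1
        avg = sums / counts
    return avg

def st_average(pl, u, n_rounds):
    """Self-training that labels the series nearest to the average of PL.

    n_rounds is a hypothetical labeling budget, not a stopping criterion
    taken from the paper. Returns indices into u in labeling order.
    """
    pl = [np.asarray(s, dtype=float) for s in pl]
    u = [np.asarray(s, dtype=float) for s in u]
    remaining = list(range(len(u)))
    order = []
    for _ in range(min(n_rounds, len(u))):
        center = dba(pl)
        dists = [dtw_path(center, u[k])[0] for k in remaining]
        best = remaining[int(np.argmin(dists))]
        order.append(best)
        pl.append(u[best])  # the newly labeled series joins PL
        remaining.remove(best)
    return order
```

A conventional 1NN-based ST round would instead take the minimum distance from each unlabeled series to any single member of PL; replacing that with the distance to `center` is the only change this sketch makes to the loop.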
first_indexed 2024-03-10T05:52:06Z
id doaj.art-8e7f01c330cb45f9aa4fc325ef7b54b0
institution Directory Open Access Journal
issn 1424-8220
last_indexed 2024-03-10T05:52:06Z
spelling doaj.art-8e7f01c330cb45f9aa4fc325ef7b54b0; 2023-11-22T21:41:16Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2021-11-01; vol. 21, no. 21, article 7414; DOI 10.3390/s21217414
affiliation (all five authors) College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
topic positive unlabeled time series classification
self-training
dynamic time warping
DTW barycenter averaging