Stochastic gradient descent based fuzzy clustering for large data

Data is growing at an unprecedented rate in commercial and scientific areas. Clustering algorithms for large data that are scalable and consume little memory are therefore becoming increasingly important. In this paper, we propose a new clustering approach called stochastic gradie...


Bibliographic Details
Main Authors: Chen, Lihui, Wang, Yangtao, Mei, Jian-Ping
Other Authors: School of Electrical and Electronic Engineering
Format: Conference Paper
Language: English
Published: 2015
Subjects: DRNTU::Engineering::Electrical and electronic engineering::Electronic systems
Online Access: https://hdl.handle.net/10356/104522
http://hdl.handle.net/10220/25889
_version_ 1811682330628063232
author Chen, Lihui
Wang, Yangtao
Mei, Jian-Ping
author2 School of Electrical and Electronic Engineering
author_facet School of Electrical and Electronic Engineering
Chen, Lihui
Wang, Yangtao
Mei, Jian-Ping
author_sort Chen, Lihui
collection NTU
description Data is growing at an unprecedented rate in commercial and scientific areas. Clustering algorithms for large data that are scalable and consume little memory are therefore becoming increasingly important. In this paper, we propose a new clustering approach called stochastic gradient based fuzzy clustering (SGFC), which performs its optimization by stochastic approximation in order to handle such large data. We derive an adaptive learning rate that can be updated incrementally and maintained automatically in the gradient descent approach employed in SGFC. Moreover, SGFC is extended to a mini-batch SGFC to reduce the stochastic noise. Additionally, a multi-pass SGFC is proposed to improve the clustering performance. Experiments have been conducted on synthetic data to show the effectiveness of the derived adaptive learning rate. Experimental studies have also been conducted on several large benchmark datasets, including real-world image and document datasets. Compared with existing fuzzy clustering approaches for large data, the mini-batch SGFC shows comparable or better accuracy with significantly less time consumption. These results demonstrate the great potential of SGFC for large data analysis.
first_indexed 2024-10-01T03:55:08Z
format Conference Paper
id ntu-10356/104522
institution Nanyang Technological University
language English
last_indexed 2024-10-01T03:55:08Z
publishDate 2015
record_format dspace
spelling ntu-10356/104522
2020-03-07T13:24:51Z
Stochastic gradient descent based fuzzy clustering for large data
Chen, Lihui
Wang, Yangtao
Mei, Jian-Ping
School of Electrical and Electronic Engineering
2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)
DRNTU::Engineering::Electrical and electronic engineering::Electronic systems
Accepted version
2015-06-12T03:53:14Z
2019-12-06T21:34:27Z
2014
Conference Paper
Wang, Y., Chen, L., & Mei, J.-P. (2014). Stochastic gradient descent based fuzzy clustering for large data. 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2511-2518.
https://hdl.handle.net/10356/104522
http://hdl.handle.net/10220/25889
10.1109/FUZZ-IEEE.2014.6891755
en
© 2015 Institute of Electrical and Electronics Engineers (IEEE).
application/pdf
spellingShingle DRNTU::Engineering::Electrical and electronic engineering::Electronic systems
Chen, Lihui
Wang, Yangtao
Mei, Jian-Ping
Stochastic gradient descent based fuzzy clustering for large data
title Stochastic gradient descent based fuzzy clustering for large data
topic DRNTU::Engineering::Electrical and electronic engineering::Electronic systems
url https://hdl.handle.net/10356/104522
http://hdl.handle.net/10220/25889
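
The description above names three ingredients of SGFC: stochastic gradient updates, an incrementally maintained adaptive learning rate, and mini-batch and multi-pass variants. This record does not reproduce the paper's update equations, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the standard fuzzy c-means objective with fuzzifier m, and it substitutes a simple learning rate equal to the reciprocal of the accumulated membership weight of each cluster; that rate is our assumption and stands in for the adaptive rate derived in the paper. The function names and parameters are hypothetical.

```python
import random


def memberships(x, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means memberships of one point to every center."""
    # Squared Euclidean distance to each center; eps avoids division by zero.
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) + eps for c in centers]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def sgd_fuzzy_cmeans(data, centers, m=2.0, batch_size=1, passes=1, seed=0):
    """Mini-batch stochastic gradient updates of fuzzy cluster centers.

    Assumption: the step size for cluster k is 1 / (accumulated u^m weight),
    which turns each center into a running weighted mean. This is NOT the
    learning rate derived in the paper.
    """
    rng = random.Random(seed)
    centers = [list(c) for c in centers]
    # Each initial center counts as one prior observation, so the first
    # sample cannot pull every center onto the same point.
    weight = [1.0] * len(centers)
    order = list(range(len(data)))
    for _ in range(passes):  # multi-pass: revisit the data in a new order
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = [data[i] for i in order[start:start + batch_size]]
            # Memberships use the centers as they were before this batch.
            us = [[u ** m for u in memberships(x, centers, m)] for x in batch]
            for k, c in enumerate(centers):
                w = sum(u[k] for u in us)
                if w == 0.0:
                    continue  # no sample in this batch supports cluster k
                weight[k] += w
                eta = 1.0 / weight[k]
                for j in range(len(c)):
                    # Descent direction of the weighted squared error
                    # (the constant factor 2 is absorbed into eta).
                    step = sum(u[k] * (x[j] - c[j]) for u, x in zip(us, batch))
                    c[j] += eta * step
    return centers
```

With `batch_size=1` this reduces to a purely stochastic update; larger batches average the step over several samples, which is the noise-reduction effect the description attributes to mini-batch SGFC. Only the centers and one weight per cluster are kept in memory, so the cost per step does not depend on the dataset size.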