Fast Storage System for Time-series Big Data Streams Based on Waterwheel Model

With the rapid development of the Internet of Things, the scale of sensor deployment has been growing in recent years. Large-scale sensors generate massive streaming data every second, and the value of the data decreases over time. Therefore, the storage system needs to be able to withstand the write pre...

Bibliographic Details
Main Authors: LU Mingchen, LYU Yanqi, LIU Ruicheng, JIN Peiquan
Format: Article
Language: zho
Published: Editorial office of Computer Science 2023-01-01
Series: Jisuanji kexue
Subjects: time-series big data; streaming data; fast storage; waterwheel model; middleware
Online Access: https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2023-50-1-25.pdf
_version_ 1797845153758576640
author LU Mingchen, LYU Yanqi, LIU Ruicheng, JIN Peiquan
collection DOAJ
description With the rapid development of the Internet of Things, the scale of sensor deployment has grown steadily in recent years. Large-scale sensor deployments generate massive amounts of streaming data every second, and the value of that data decreases over time. The storage system therefore needs to withstand the write pressure of high-speed incoming streams and persist the data as quickly as possible for subsequent query and analysis, which poses a considerable challenge to its write performance. The fast storage system based on the waterwheel model is designed to meet the fast-storage requirements of high-speed time-series data streams in big data application scenarios. The proposed system is deployed between the high-speed data streams and the underlying storage nodes. It uses multiple data buckets to build a logically rotating storage model (similar to the ancient Chinese waterwheel) and coordinates data writing and persistence by controlling the state of each data bucket. Waterwheel sends data buckets to different underlying storage nodes, so that the instantaneous write pressure is evenly distributed across those nodes and write throughput is improved through multi-node parallel writing. In the experiments, the waterwheel model is deployed on stand-alone MongoDB instances and compared with distributed MongoDB. The results show that the proposed system effectively improves write throughput, reduces write latency, and has good horizontal scalability.
first_indexed 2024-04-09T17:33:58Z
format Article
id doaj.art-b8bcac7b7fa64cb9bfa82d94870b9395
institution Directory Open Access Journal
issn 1002-137X
language zho
last_indexed 2024-04-09T17:33:58Z
publishDate 2023-01-01
publisher Editorial office of Computer Science
record_format Article
series Jisuanji kexue
spelling doaj.art-b8bcac7b7fa64cb9bfa82d94870b9395 | 2023-04-18T02:33:09Z | zho | Editorial office of Computer Science | Jisuanji kexue | ISSN 1002-137X | 2023-01-01 | vol. 50, no. 1, pp. 25-33 | DOI 10.11896/jsjkx.220900045 | Fast Storage System for Time-series Big Data Streams Based on Waterwheel Model | LU Mingchen, LYU Yanqi, LIU Ruicheng, JIN Peiquan (School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China) | https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2023-50-1-25.pdf | time-series big data|streaming data|fast storage|waterwheel model|middleware
title Fast Storage System for Time-series Big Data Streams Based on Waterwheel Model
title_sort fast storage system for time series big data streams based on waterwheel model
topic time-series big data|streaming data|fast storage|waterwheel model|middleware
url https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2023-50-1-25.pdf
work_keys_str_mv AT lumingchenlyuyanqiliuruichengjinpeiquan faststoragesystemfortimeseriesbigdatastreamsbasedonwaterwheelmodel
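
The description above outlines the waterwheel mechanism: several data buckets form a logical ring, each bucket carries a state, and full buckets are handed to different underlying storage nodes so that write pressure is spread across them. The following Python sketch is only an illustration of that idea and is not the authors' implementation. The state names (EMPTY, FILLING, FLUSHING), the round-robin choice of node, the synchronous flush, and the class names Bucket, InMemoryNode, and Waterwheel are all assumptions made for this example; the in-memory node stands in for a real storage node such as a MongoDB instance.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, List


class BucketState(Enum):
    # Hypothetical states; the record only says each bucket has a state.
    EMPTY = "empty"        # ready to receive data
    FILLING = "filling"    # currently receiving writes
    FLUSHING = "flushing"  # full, being persisted to a storage node


@dataclass
class Bucket:
    capacity: int
    state: BucketState = BucketState.EMPTY
    records: List[Any] = field(default_factory=list)


class InMemoryNode:
    """Stand-in for an underlying storage node (assumption: a list of batches)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.batches: List[List[Any]] = []

    def persist(self, records: List[Any]) -> None:
        # Copy the batch so later bucket reuse cannot alter stored data.
        self.batches.append(list(records))


class Waterwheel:
    """Ring of buckets; full buckets are flushed to nodes in round-robin order."""

    def __init__(self, nodes: List[InMemoryNode],
                 num_buckets: int = 4, bucket_capacity: int = 3) -> None:
        if num_buckets < 2 or bucket_capacity < 1 or not nodes:
            raise ValueError("need >= 2 buckets, capacity >= 1, and >= 1 node")
        self.nodes = nodes
        self.buckets = [Bucket(bucket_capacity) for _ in range(num_buckets)]
        self.current = 0       # index of the bucket receiving writes
        self.next_node = 0     # index of the node that gets the next full bucket
        self.buckets[0].state = BucketState.FILLING

    def write(self, record: Any) -> None:
        bucket = self.buckets[self.current]
        bucket.records.append(record)
        if len(bucket.records) >= bucket.capacity:
            self._rotate()

    def _rotate(self) -> None:
        full = self.buckets[self.current]
        full.state = BucketState.FLUSHING
        # Turn the wheel: the next bucket in the ring starts receiving writes.
        self.current = (self.current + 1) % len(self.buckets)
        nxt = self.buckets[self.current]
        if nxt.state is not BucketState.EMPTY:
            raise RuntimeError("no empty bucket available")
        nxt.state = BucketState.FILLING
        self._flush(full)

    def _flush(self, bucket: Bucket) -> None:
        # Synchronous here for simplicity; a real system would flush in parallel.
        node = self.nodes[self.next_node]
        self.next_node = (self.next_node + 1) % len(self.nodes)
        node.persist(bucket.records)
        bucket.records.clear()
        bucket.state = BucketState.EMPTY


if __name__ == "__main__":
    demo_nodes = [InMemoryNode("node-a"), InMemoryNode("node-b")]
    wheel = Waterwheel(demo_nodes, num_buckets=3, bucket_capacity=2)
    for value in range(8):
        wheel.write(value)
    for n in demo_nodes:
        print(n.name, n.batches)
```

In this sketch consecutive full buckets alternate between the two nodes, which mirrors the even distribution of write pressure that the description attributes to the waterwheel model. The parallel, multi-node flushing that the description credits for the throughput gain is deliberately left out to keep the example short and deterministic.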