Off-line Deduplication Method for Solid-State Disk Based on Hot and Cold Data

Solid-state disk (SSD) deduplication refers to the identification and deletion of duplicate data stored in an SSD. The reliability of SSDs is improved by deduplication. At present, the common data deduplication of SSDs is based on online data deduplication with Field Programmable Gate Array (FPGA) a...

Bibliographic Details
Main Authors: Xin Ye, Zhengjun Zhai, Xiaochang Li
Format: Article
Language: English
Published: Faculty of Mechanical Engineering in Slavonski Brod, Faculty of Electrical Engineering in Osijek, Faculty of Civil Engineering in Osijek 2020-01-01
Series: Tehnički Vjesnik
Subjects: cold data and hot data; deduplication; fingerprint; off-line; solid-state disk
Online Access: https://hrcak.srce.hr/file/343898
_version_ 1797207316083245056
author Xin Ye
Zhengjun Zhai
Xiaochang Li
collection DOAJ
description Solid-state disk (SSD) deduplication refers to the identification and deletion of duplicate data stored in an SSD. Deduplication improves the reliability of SSDs. At present, SSD deduplication is commonly performed online with Field Programmable Gate Array (FPGA) acceleration. The disadvantage is that the FPGA has a complex structure. An off-line deduplication method for SSDs based on hot and cold data was proposed in this study to simplify the structure of SSD deduplication, reduce the cost, and improve the deduplication efficiency and access performance of SSDs. First, the wear-leveling algorithm in the SSD was employed to divide the data into cold and hot data. Second, a fingerprint was generated for each piece of cold data. Third, the fingerprints were compared, and duplicate cold data with the same fingerprint were deleted. Finally, the cold and hot data were exchanged after deduplication. Results demonstrate that the duplicate recognition rate of the proposed method is 5% to 38%, which is close to that of the online deduplication method. In terms of access performance, SSDs using the proposed method perform 20% better than traditional SSDs and come close to SSDs using online deduplication. This study provides a reference for improving the reliability of existing SSDs.
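The steps named in the description (hot/cold split, fingerprinting of cold data, fingerprint comparison, removal of duplicates) can be illustrated with a minimal sketch. This is not the authors' implementation. The write-count threshold used to separate hot from cold pages, the use of SHA-1 as the fingerprint, and all function and variable names are assumptions made only for illustration. The final exchange of cold and hot data is not modelled.

```python
import hashlib


def split_hot_cold(pages, write_counts, hot_threshold):
    """Classify logical pages as hot or cold.

    Assumption: a page is "hot" when its write count reaches hot_threshold.
    The record only says a wear-leveling algorithm makes this decision.
    """
    hot, cold = {}, {}
    for lpn, data in pages.items():
        target = hot if write_counts.get(lpn, 0) >= hot_threshold else cold
        target[lpn] = data
    return hot, cold


def deduplicate_cold(cold):
    """Keep one stored copy per fingerprint and map each page to it.

    Assumption: SHA-1 stands in for the unspecified fingerprint function.
    """
    store = {}    # fingerprint -> single retained copy of the data
    mapping = {}  # logical page number -> fingerprint of its content
    for lpn, data in cold.items():
        fingerprint = hashlib.sha1(data).hexdigest()
        store.setdefault(fingerprint, data)  # later duplicates are dropped
        mapping[lpn] = fingerprint
    return store, mapping


def duplicate_rate(cold, store):
    """Fraction of cold pages that were removed as duplicates."""
    return 1 - len(store) / len(cold) if cold else 0.0


# Hypothetical 4 KiB pages; page 1 is written often and therefore hot.
pages = {0: b"A" * 4096, 1: b"B" * 4096, 2: b"A" * 4096, 3: b"C" * 4096}
write_counts = {0: 1, 1: 50, 2: 2, 3: 0}

hot, cold = split_hot_cold(pages, write_counts, hot_threshold=10)
store, mapping = deduplicate_cold(cold)
rate = duplicate_rate(cold, store)
```

Because only cold pages are fingerprinted, frequently rewritten pages never pay the hashing cost, which is one plausible reading of why the off-line approach can do without FPGA acceleration.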
first_indexed 2024-04-24T09:20:58Z
format Article
id doaj.art-fa6435999dcd4f81928fb90a4ac5730b
institution Directory Open Access Journal
issn 1330-3651
1848-6339
language English
last_indexed 2024-04-24T09:20:58Z
publishDate 2020-01-01
publisher Faculty of Mechanical Engineering in Slavonski Brod, Faculty of Electrical Engineering in Osijek, Faculty of Civil Engineering in Osijek
record_format Article
series Tehnički Vjesnik
spelling doaj.art-fa6435999dcd4f81928fb90a4ac5730b
2024-04-15T16:07:54Z
eng
Faculty of Mechanical Engineering in Slavonski Brod, Faculty of Electrical Engineering in Osijek, Faculty of Civil Engineering in Osijek
Tehnički Vjesnik
1330-3651
1848-6339
2020-01-01
27
2
368
373
10.17559/TV-20191219154709
Off-line Deduplication Method for Solid-State Disk Based on Hot and Cold Data
Xin Ye (0)
Zhengjun Zhai (1)
Xiaochang Li (2)
(0) School of Computer Science and Engineering, Northwestern Polytechnical University Xi’an, 127 West Youyi Road, Beilin District, Xi'an Shaanxi, 710072, P. R. China
(1) School of Computer Science and Engineering, Northwestern Polytechnical University Xi’an, 127 West Youyi Road, Beilin District, Xi'an Shaanxi, 710072, P. R. China
(2) School of Computer Science and Engineering, Northwestern Polytechnical University Xi’an, 127 West Youyi Road, Beilin District, Xi'an Shaanxi, 710072, P. R. China
https://hrcak.srce.hr/file/343898
cold data and hot data
deduplication
fingerprint
off-line
solid-state disk
title Off-line Deduplication Method for Solid-State Disk Based on Hot and Cold Data
topic cold data and hot data
deduplication
fingerprint
off-line
solid-state disk
url https://hrcak.srce.hr/file/343898