Summary: | Efficient storage management is a primary concern in all systems dealing with Big Data. Flash-based solid-state drives (SSDs) are now widely adopted in computer systems and are gradually replacing hard disk drives. Because much of the data generated and collected today is not well structured, the key-value store (KVS) has become one of the most important building blocks in datacenters, thanks to its simple interface. Key-value stores are also often used as an internal engine for other databases.
This thesis explores whether a modern flash-based SSD augmented with near-storage computation can be redesigned to provide a cheaper and more power-efficient way to maintain various key-value services in the cloud. It examines a new type of storage device, called a key-value SSD (KV-SSD), that exposes a key-value interface to the host machine instead of the legacy block interface.
The thesis presents two power- and cost-efficient alternatives to existing KVS components, both based on KV-SSDs: LightStore and PinK. LightStore is a new storage architecture built from a group of network-attached KV-SSDs, with no storage host servers. It primarily targets large objects and emulates other types of data stores using application-side adapters. Compared to existing solutions based on storage servers, LightStore is up to 2.3X more space-efficient and up to 7.4X more energy-efficient. PinK is a novel LSM-tree design for KV-SSDs that combines software and hardware techniques to provide bounded tail latency and design flexibility. The PinK prototype reduces read and 99th-percentile latency by 22% and improves read throughput by 44% compared to the LightStore prototype. It also shows 42-73% better latency and 37% better throughput than a commercial hash-based prototype. Finally, a proposed future design based on smart SSDs (block-based SSDs with an accelerator) shows how such devices can help existing software KVSs running on hosts. We believe these alternatives for running various types of key-value stores in datacenters would drastically reduce storage management cost.
|