Workload-based generation of administrator hints for optimizing database storage utilization Article

Dutta, K, Rangaswami, R, Kundu, S. (2008). Workload-based generation of administrator hints for optimizing database storage utilization. ACM Transactions on Storage, 3(4), 10.1145/1326542.1326545

cited authors

  • Dutta, K; Rangaswami, R; Kundu, S

abstract

  • Database storage management at data centers is a manual, time-consuming, and error-prone task. Such management involves regular movement of database objects across storage nodes in an attempt to balance the I/O bandwidth utilization across disk drives. Achieving such balance is critical for avoiding I/O bottlenecks and thereby maximizing the utilization of the storage system. However, manual management of the aforesaid task, apart from increasing administrative costs, encumbers the greater risks of untimely and erroneous operations. We address the preceding concerns with STORM, an automated approach that combines low-overhead information gathering of database access and storage usage patterns with efficient analysis to generate accurate and timely hints for the administrator regarding data movement operations. STORM's primary objective is minimizing the volume of data movement required (to minimize potential down-time or reduction in performance) during the reconfiguration operation, with the secondary constraints of space and balanced I/O-bandwidth-utilization across the storage devices. We analyze and evaluate STORM theoretically, using a simulation framework, as well as experimentally. We show that the dynamic data layout reconfiguration problem is NP-hard and we present a heuristic that provides an approximate solution in O(N log(N/M) + (N/M)^2) time, where M is the number of storage devices and N is the total number of database objects residing in the storage devices. A simulation study shows that the heuristic converges to an acceptable solution that is successful in balancing storage utilization with an accuracy that lies within 7% of the ideal solution. Finally, an experimental study demonstrates that the STORM approach can improve the overall performance of the TPC-C benchmark by as much as 22%, by reconfiguring an initial random, but evenly distributed, placement of database objects. © 2008 ACM.

publication date

  • February 1, 2008

published in

  • ACM Transactions on Storage

Digital Object Identifier (DOI)

  • 10.1145/1326542.1326545

volume

  • 3

issue

  • 4