An Instant and Accurate Size Estimation Method for Joins and Selections in a Retrieval-Intensive Environment Article

Sun, W, Ling, Y, Rishe, N et al. (1993). An Instant and Accurate Size Estimation Method for Joins and Selections in a Retrieval-Intensive Environment. 22(2), 79-88. 10.1145/170036.170055

cited authors

  • Sun, W; Ling, Y; Rishe, N; Deng, Y

abstract

  • This paper proposes a novel strategy for estimating the size of the resulting relation after an equi-join and selection using a regression model. An approximating series representing the underlying data distribution and dependency is derived from the actual data. The proposed method provides an instant and accurate size estimation by performing an evaluation of the series, with no run-time overheads in page faults and space, and with negligible CPU overhead. In contrast, the popular sampling methods incur run-time overheads in page faults, CPU time and space. These overheads of sampling methods increase the response time of processing a query. The results of a comprehensive experimental study are also reported, which demonstrate that the estimation accuracy by the proposed method is comparable with that of the sampling methods which are believed to provide the most accurate estimation. The proposed method seems ideal for retrieval-intensive database and information systems. Since the overheads involved in deriving the approximating series are fairly moderate, we believe that this method is also an extremely competent method when moderate or periodical updates are present. © 1993, ACM. All rights reserved.

publication date

  • January 6, 1993

Digital Object Identifier (DOI)

  • 10.1145/170036.170055

start page

  • 79

end page

  • 88

volume

  • 22

issue

  • 2