Automatic Extraction of Facets for User Queries [AEFUQ] Conference

Ramya, RS, Raju, N, Sejal, N et al. (2019). Automatic Extraction of Facets for User Queries [AEFUQ]. 10.1109/ICInPro47689.2019.9092063

cited authors

  • Ramya, RS; Raju, N; Sejal, N; Venugopal, KR; Iyengar, SS; Patnaik, LM

abstract

  • A user query facet is a collection of items that summarizes the content covered by a query. In general, the most significant information for a user query is present in the top retrieved documents, in the form of lists. In this work, we propose a framework, Automatic Extraction of Facets for User Queries [AEFUQ], that extracts user query facets automatically by grouping lists according to three categories: HTML tags, free text patterns, and repeat regions. Lists are grouped based on the domain sites in which they appear. We observe that some of the lists are not relevant for extracting facets. To prune these lists, the importance of each item in the lists of a group G is evaluated, and the Cosine Similarity (CS) between pairs of items is calculated. Based on the CS scores obtained, a High Quality Clustering (HQC) algorithm is proposed that, in each iteration, selects the cluster of items containing the largest number of points, in order to obtain a larger number of facets. Finally, the top items from each cluster are provided as the best facets for the user query. Experiments are conducted on the User Q and Random Q datasets. The proposed method, AEFUQ, outperforms the QDMiner method [1] by providing a larger number of useful query facets.
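
The similarity and clustering steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' HQC algorithm, whose details are not given here. It assumes a bag-of-words cosine similarity between list items and a simple greedy loop that, in each iteration, keeps the candidate cluster with the most items. The function names `cosine_similarity` and `greedy_cluster` and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two items, using word counts as vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def greedy_cluster(items, threshold=0.5):
    """Assumed stand-in for HQC: repeatedly take the largest candidate cluster.

    For every remaining item (the seed), collect the items whose similarity
    to the seed is at least `threshold`; keep the largest such group, remove
    its members, and repeat until no items remain.
    """
    remaining = list(items)
    clusters = []
    while remaining:
        best = []
        for seed in remaining:
            # The seed always belongs to its own candidate cluster.
            candidate = [x for x in remaining
                         if x == seed or cosine_similarity(seed, x) >= threshold]
            if len(candidate) > len(best):
                best = candidate
        clusters.append(best)
        remaining = [x for x in remaining if x not in best]
    return clusters


items = [
    "apple watch series",
    "apple watch band",
    "samsung galaxy phone",
    "samsung galaxy tablet",
    "google pixel",
]
clusters = greedy_cluster(items)
for cluster in clusters:
    print(cluster)
```

In this sketch, the first item of each cluster could serve as a candidate facet. The item weighting and ranking used in the paper are not reproduced.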

publication date

  • December 1, 2019

Digital Object Identifier (DOI)

  • 10.1109/ICInPro47689.2019.9092063

International Standard Book Number (ISBN) 13