Florida International University
FIU Discovery
Affinity-based similarity measure for Web document clustering
Conference
Shyu, ML, Chen, SC, Chen, M, et al. (2004). Affinity-based similarity measure for Web document clustering. 247-252.
Overview
cited authors
Shyu, ML; Chen, SC; Chen, M; Rubin, SH
authors
Chen, Shu-Ching
abstract
Compared to regular documents, the major distinguishing characteristic of Web documents is their dynamic hyper-structure. Thus, in addition to the terms or keywords used for regular document clustering, Web document clustering can incorporate dynamic information such as hyperlinks and the access patterns extracted from user query logs. In this paper, we extend the concept of document clustering to Web document clustering by introducing an affinity-based similarity measure, which uses user access patterns to determine the similarities among Web documents via a probabilistic model. Several comparison experiments are conducted on a real data set, and the experimental results demonstrate that the proposed similarity measure outperforms the Cosine coefficient and the Euclidean distance method under different document clustering algorithms. © 2004 IEEE.
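The abstract does not give the formula for the affinity-based measure, so the following is only a rough sketch of the kinds of quantities being compared. The cosine coefficient and Euclidean distance are the two baselines named above, computed on term vectors. The `co_access_affinity` function is a hypothetical stand-in (a Jaccard-style ratio over user sessions) showing how access patterns could yield a document similarity; it is not the probabilistic model proposed in the paper, and the session format is an assumption.

```python
import math
from collections import defaultdict
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine coefficient between two equal-length term vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length term vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def co_access_affinity(sessions):
    """Hypothetical access-pattern similarity (NOT the paper's model).

    `sessions` is assumed to be a list of lists of document ids, one list
    per user query session. For each pair of documents seen together, the
    score is: sessions containing both / sessions containing either.
    """
    doc_count = defaultdict(int)
    pair_count = defaultdict(int)
    for session in sessions:
        docs = sorted(set(session))  # ignore repeat accesses in a session
        for d in docs:
            doc_count[d] += 1
        for a, b in combinations(docs, 2):
            pair_count[(a, b)] += 1
    return {
        (a, b): n / (doc_count[a] + doc_count[b] - n)
        for (a, b), n in pair_count.items()
    }
```

Unlike the two term-vector baselines, a measure of this kind depends only on which documents users access together, so two pages with little shared vocabulary can still score as similar.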
publication date
December 1, 2004
Additional Document Info
start page
247
end page
252