Feature Relevance in Ward’s Hierarchical Clustering Using the Lp Norm

Cordeiro De Amorim, Renato (2015) Feature Relevance in Ward’s Hierarchical Clustering Using the Lp Norm. Journal of Classification, 32 (1). pp. 46-62. ISSN 0176-4268
Copy

In this paper we introduce a new hierarchical clustering algorithm called Ward p . Unlike the original Ward, Ward p generates feature weights, which can be seen as feature rescaling factors thanks to the use of the L p norm. The feature weights are cluster dependent, allowing a feature to have different degrees of relevance at different clusters. We validate our method by performing experiments on a total of 75 real-world and synthetic datasets, with and without added features made of uniformly random noise. Our experiments show that: (i) the use of our feature weighting method produces results that are superior to those produced by the original Ward method on datasets containing noise features; (ii) it is indeed possible to estimate a good exponent p under a totally unsupervised framework. The clusterings produced by Ward p are dependent on p. This makes the estimation of a good value for this exponent a requirement for this algorithm, and indeed for any other also based on the Lp norm.


picture_as_pdf
MW_Ward.pdf
subject
Submitted Version
Available under Creative Commons: BY-NC-ND 4.0

View Download

Atom BibTeX OpenURL ContextObject in Span OpenURL ContextObject Dublin Core MPEG-21 DIDL Data Cite XML EndNote HTML Citation METS MODS RIOXX2 XML Reference Manager Refer ASCII Citation
Export

Downloads