Kernel Conditional Clustering

Xiao He, Thomas Gumbsch, Damian Roqueiro and Karsten Borgwardt

Summary

Clustering results are often affected by covariates that are independent of the clusters one would like to discover. Traditionally, alternative clustering algorithms are used to address this problem. However, these algorithms suffer from at least one of the following limitations: i) they cannot handle continuous covariates or non-linearly separable clusters; ii) they make assumptions about the distribution of the data; iii) they require one or more hyper-parameters to be set.

In this project, we propose a novel algorithm, named Kernel Conditional Clustering (KCC), whose objective is derived from a kernel-based conditional dependence measure. We show that KCC is parameter-light and makes no assumptions about the cluster structure, the covariates, or the distribution of the data. On both simulated and real-world datasets, the proposed KCC algorithm detects the ground truth cluster structures more accurately than state-of-the-art alternative clustering methods.
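To give a rough sense of what a kernel-based dependence measure computes, the sketch below implements the biased empirical Hilbert-Schmidt Independence Criterion (HSIC) with Gaussian kernels in plain Python. This is only an illustration: HSIC is an unconditional measure, and it is not the conditional measure or the objective used by KCC. The function names, the kernel width `sigma` and the toy data are assumptions made for this example.

```python
import math
import random


def gaussian_gram(values, sigma=1.0):
    """Gram matrix K[i][j] = exp(-(v_i - v_j)^2 / (2 sigma^2)) for 1-D data."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in values]
            for a in values]


def center(gram):
    """Double-center a symmetric Gram matrix: H K H with H = I - 11^T / n."""
    n = len(gram)
    row_mean = [sum(row) / n for row in gram]
    total_mean = sum(row_mean) / n
    return [[gram[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    k_centered = center(gaussian_gram(x, sigma))
    l_gram = gaussian_gram(y, sigma)
    total = sum(k_centered[i][j] * l_gram[i][j]
                for i in range(n) for j in range(n))
    return total / (n - 1) ** 2


# Toy data (hypothetical): y_dep depends non-linearly on x, y_ind does not.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(100)]
y_dep = [v * v + 0.1 * rng.gauss(0.0, 1.0) for v in x]
y_ind = [rng.gauss(0.0, 1.0) for _ in range(100)]

dep = hsic(x, y_dep)
ind = hsic(x, y_ind)
print("HSIC(x, y_dep) =", dep)
print("HSIC(x, y_ind) =", ind)
```

On this toy data, the score for the dependent pair should be clearly larger than the score for the independent pair, even though the relationship is non-linear. A conditional measure, as used in KCC, additionally accounts for the covariates; see the paper for the actual objective.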

Code

A beta version of the code is available in our GitHub repository.

Reference

Kernel Conditional Clustering

Xiao He, Thomas Gumbsch, Damian Roqueiro and Karsten Borgwardt
IEEE International Conference on Data Mining (ICDM) 2017, 157-166
Online | ETH Research Collection | Project page | GitHub

Contact Xiao He with questions about usage or to report bugs.