Highly Likely Clusterable Data With No Cluster
Data generated as part of a real-life experiment is often quite organized. So much so that, in many cases, projecting the data onto a random line has a high probability of uncovering a clear division of the data into two well-separated groups. In other words, the data can be clustered with a high probability of success using a hyperplane whose normal vector direction is picked at random. We call such data "highly likely clusterable." The clusters obtained in this fashion often do not seem compatible with a cluster structure in the original space. In fact, the data in the original space may not contain any cluster at all. This talk is about this surprising phenomenon. We will discuss empirical ways to detect it as well as how to exploit it to cluster datasets, especially datasets consisting of a small number of points in a high-dimensional space. We will also present a possible mathematical model that would explain this observed phenomenon. This is joint work with Alden Bradford (Purdue Math), Sangchun Han (Purdue ECE, now at Google) and Tarun Yellamraju (Purdue ECE, now at Qualcomm).
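To make the idea of clustering by projection onto a random line concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method presented in the talk: the function name random_projection_split, the "largest gap divided by spread" separation score, and the parameters n_trials and min_size are all hypothetical choices made for this example.

```python
import numpy as np

def random_projection_split(X, n_trials=100, min_size=2, seed=0):
    """Split the rows of X into two groups using the best of several random 1D projections.

    Hypothetical heuristic for illustration only: the score of a projection is its
    largest gap between consecutive projected values divided by the total spread.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    if n < 2 * min_size:
        raise ValueError("need at least 2 * min_size points")
    best_score, best_w, best_t = -1.0, None, None
    for _ in range(n_trials):
        # Draw a direction uniformly at random on the unit sphere.
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)
        p = np.sort(X @ w)            # projected data, sorted along the line
        gaps = np.diff(p)             # gaps[i] separates the first i+1 points from the rest
        # Only consider splits that leave at least min_size points on each side.
        lo, hi = min_size - 1, n - min_size
        k = lo + int(np.argmax(gaps[lo:hi]))
        spread = p[-1] - p[0]
        score = gaps[k] / spread if spread > 0 else 0.0
        if score > best_score:
            best_score, best_w, best_t = score, w, 0.5 * (p[k] + p[k + 1])
    # The separating hyperplane is {x : best_w . x = best_t}.
    labels = (X @ best_w > best_t).astype(int)
    return labels, best_w, best_t, best_score
```

Calling random_projection_split(X) on an array X of shape (number of points, dimension) returns the two-group labels, the unit normal vector of the separating hyperplane, the threshold along that direction, and the separation score. A score close to 1 means one gap accounts for most of the projected spread, which is one simple way to flag a clear division; other separation measures could be substituted.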
Mireille (Mimi) Boutin graduated with a bachelor's degree in Physics-Mathematics from the University of Montreal. She received the Ph.D. degree in Mathematics from the University of Minnesota under the direction of Peter J. Olver. She joined Purdue University after a post-doctorate with David Mumford, David Cooper, and Ben Kimia at Brown University, Rhode Island, followed by a post-doctorate with Stefan Müller at the Max Planck Institute for Mathematics in the Sciences in Leipzig, Germany. She is currently an Associate Professor in the School of Electrical and Computer Engineering, with a courtesy appointment in the Department of Mathematics. Her research is in the areas of signal processing, machine learning, and applied mathematics. She is a three-time recipient of Purdue's Seed for Success Award. She is also a recipient of the Eta Kappa Nu Outstanding Faculty Award, the Eta Kappa Nu Outstanding Teaching Award and the Wilfred Duke Hesselberth Award for Teaching Excellence.