From Clustering with Graph Cuts to Isoperimetric Inequalities: Quantitative Convergence Rates of Cheeger Cuts on Data Clouds
Graph cuts have been studied for decades in the mathematics and computer science communities. For example, a celebrated result in optimization relates the cut minimization problem (under some membership constraints) with a maximum flow problem via the well known max flow-min cut duality theorem. Another very important problem formulated in the computer science community that uses graph cuts is motivated by data clustering: while direct minimization of a graph cut is reasonable as it penalizes the size of interfaces, the optimization is not able to rule out partitions of data into groups that are highly asymmetric in terms of size. In order to avoid trivial partitions, and provide a more reasonable clustering approach, the original optimization of graph cuts is modified by adding an extra balancing term to the objective function in either additive or multiplicative form. A canonical example, with historical motivation, is the so called Cheeger cut problem. Minimization of Cheeger cuts for data clustering is on the one hand intuitively motivated, but on the other, is a highly non-convex optimization problem with a pessimistic NP hard label tamped on it (at least in a worst case scenario setting). Nevertheless, in the past decade or so, several algorithmic improvements made the minimization of Cheeger cuts more feasible, and at the same time there was a renewed interest in studying statistical properties of Cheeger cuts. New analytical ideas have provided new tools to attack problems that were elusive using classical approaches from statistics and statistical learning theory. Despite the advances, several questions remain unanswered.
The purpose of this talk is to present some of these theoretical developments, with emphasis on new results where, for the first time, high probability converge rates of Cheeger cuts of proximity graphs over data clouds are deduced. These quantitative convergence rates are obtained by building bridges between the original clustering problem and another field within the mathematical analysis community that has seen enormous advancements in the past few years: quantitative isoperimetric inequalities. This connection serves as a metaphor for how the mathematical analyst may be able to contribute to answer theoretical questions in machine learning, and how one may be able to deduce statistical properties of solutions to learning optimization problems that have a continuum counterpart.