The study of Gaussian mixture distributions goes back to the late 19th century, when Pearson introduced the method of moments to analyze the statistics of a crab population. Gaussian mixtures have since become one of the most popular tools for modeling and data analysis, extensively used in speech recognition and other fields. Yet their properties are still not well understood. Widely used algorithms, such as Expectation Maximization (EM), often fail even on simple synthetic data, and their theoretical properties are often unclear.
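To make the setting concrete, here is a minimal sketch (not from the talk) of EM fitting a two-component one-dimensional Gaussian mixture with plain NumPy; the data, initialization, and iteration count are illustrative assumptions. On well-separated components EM typically recovers the means, while poor initialization or overlapping components can cause the failures mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two well-separated 1-D Gaussian components.
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 700)])

# Illustrative starting parameters for a two-component mixture.
mu = np.array([-1.0, 1.0])   # means
var = np.array([1.0, 1.0])   # variances
w = np.array([0.5, 0.5])     # mixing weights

for _ in range(200):
    # E-step: posterior responsibility of each component for each point.
    dens = w * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and variances from responsibilities.
    n_k = r.sum(axis=0)
    w = n_k / len(x)
    mu = (r * x[:, None]).sum(axis=0) / n_k
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k

print(np.sort(mu))  # estimated means, roughly -2 and 3 for this easy instance
```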
In my talk I will discuss some theoretical aspects of the problem of learning Gaussian mixtures. In particular, I will discuss our recent result, which, in a certain sense, completes an active recent line of work in theoretical computer science by establishing quite general conditions for polynomial learnability of mixtures, using techniques from semi-algebraic geometry.
If time permits, I will also discuss a set of practical algorithms based on spectral methods, which can potentially overcome some of the shortcomings of EM.