Homework Assignment #5  (Due Monday, October 27, 2008)

Estimating a Probability Density by using a Mixture of Gaussians.

The data you will be using is the Peterson & Barney vowel data. The data file verified_pb.data contains the vowel formant frequencies. The Matlab file pb.m will take the data and plot the following figure:

Your job is to estimate the probability density of each of the 10 vowel clusters using a Mixture of Gaussians. The paper "Using EM to Estimate A Probability Density With a Mixture of Gaussians" gives a good description of how to use the Expectation Maximization (EM) algorithm to estimate the parameters (mean, covariance, and mixing weight pi) of the multivariate Gaussians.
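For orientation, the standard textbook form of the mixture density and the EM updates is written out below. The notation (gamma for the responsibilities, N_m for the effective cluster counts) is assumed here and may differ from the notation used in the paper.

```latex
% Mixture density with M components
p(x) = \sum_{m=1}^{M} \pi_m \, \mathcal{N}(x \mid \mu_m, \Sigma_m),
\qquad \sum_{m=1}^{M} \pi_m = 1

% E-step: responsibility of component m for observation x_n
\gamma_{nm} = \frac{\pi_m \, \mathcal{N}(x_n \mid \mu_m, \Sigma_m)}
                   {\sum_{j=1}^{M} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}

% M-step: re-estimate the parameters from the responsibilities
N_m = \sum_{n=1}^{N} \gamma_{nm}, \qquad
\mu_m = \frac{1}{N_m} \sum_{n=1}^{N} \gamma_{nm} \, x_n, \qquad
\Sigma_m = \frac{1}{N_m} \sum_{n=1}^{N} \gamma_{nm} \, (x_n - \mu_m)(x_n - \mu_m)^{T}, \qquad
\pi_m = \frac{N_m}{N}
```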

Note: If you want to use the IPA vowel chart option in pb.m, you will need the file IPA_Vowel_chart_2005.png.

Create a Matlab function mog.m that has the following I/O:

function [muv,covm,piv]=mog(x,M)
%---------------------------------------------------------
% This function implements the mixture of Gaussians to
% model a probability density function. The parameters are
% estimated using the EM algorithm
% Based on the paper "Using EM to Estimate A Probability Density
% With A Mixture of Gaussians" by Aaron A. D'Souza (adsouza@usc.edu)
%
% Input:
% x - data where each row is an observation and the number of
% columns is the dimensionality of the data
% M - the model order (number of Gaussians in the mixture)
%
% Output:
% muv - cell array of the mean vectors (indexed by model number)
% covm - cell array of the covariance matrices (indexed by model
% number)
% piv - array of pi values (indexed by model number)
%
%--------------------------------------------------------------------------

The function mog.m should implement Algorithm 1 found on page 6 of the paper.
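As a rough illustration of what one pass of EM looks like in Matlab, here is a minimal sketch of a single iteration. It follows the standard textbook updates, not necessarily the exact steps of Algorithm 1 in the paper. The helper name em_step is hypothetical, and storing each mean as a 1-by-d row vector is an assumption.

```matlab
function [muv,covm,piv] = em_step(x,muv,covm,piv)
% One EM iteration for a mixture of Gaussians (illustrative sketch only).
% em_step is a hypothetical helper name, not part of the assignment.
% x is N-by-d; muv and covm are cell arrays; piv is a 1-by-M array.
[N,d] = size(x);
M = numel(piv);
gam = zeros(N,M);                        % responsibilities, one column per model

% E-step: weighted Gaussian density of every observation under each model
for m = 1:M
    xc = x - repmat(muv{m}(:)', N, 1);   % center the data on the model mean
    q  = sum((xc / covm{m}) .* xc, 2);   % squared Mahalanobis distances
    gam(:,m) = piv(m) * exp(-0.5*q) / sqrt((2*pi)^d * det(covm{m}));
end
gam = gam ./ repmat(sum(gam,2), 1, M);   % normalize each row to sum to 1

% M-step: re-estimate mean, covariance and pi from the responsibilities
for m = 1:M
    Nm = sum(gam(:,m));                  % effective number of points in model m
    mu = (gam(:,m)' * x) / Nm;           % weighted mean (1-by-d row vector)
    xc = x - repmat(mu, N, 1);
    covm{m} = (xc' * (xc .* repmat(gam(:,m), 1, d))) / Nm;
    muv{m}  = mu;
    piv(m)  = Nm / N;
end
end
```

A caller would repeat this step until the parameters stop changing. One possible initialization (an assumption, not taken from the paper) is to pick M random rows of x as the means, use cov(x) for every covariance, and set every pi value to 1/M.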

Create another function mog_contour.m that will plot the contours of the Mixture of Gaussians that accepts the following inputs:

function mog_contour(muv,covm,piv,span,step,c)
%--------------------------------------------------------------------------
% Plots Mixture of Gaussians contours on the current figure (2D)
%
% Input:
% muv - cell array of the mean vectors (indexed by model number)
% covm - cell array of the covariance matrices (indexed by model
% number)
% piv - array of pi values (indexed by model number)
% span - usually axis from current figure, i.e. [XMIN XMAX YMIN YMAX]
% step - density of meshgrid
% c - color of contour
% Output:
% None (just plots)
%---------------------------------------------------
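The core of such a plotting routine can be sketched as follows: evaluate the mixture density on a grid and pass it to contour. The parameter values below are made up for illustration (they are not the vowel data), and treating step as the grid spacing is an assumption about what "density of meshgrid" means.

```matlab
% Toy 2-D mixture with two Gaussians (made-up values, not the vowel data)
muv  = {[0 0], [3 2]};
covm = {eye(2), [1 0.5; 0.5 2]};
piv  = [0.4 0.6];
span = [-4 7 -4 7];                      % [XMIN XMAX YMIN YMAX]
step = 0.1;                              % grid spacing (assumed meaning of step)

% Build the grid and list its points as rows
[X,Y] = meshgrid(span(1):step:span(2), span(3):step:span(4));
pts = [X(:) Y(:)];

% Sum the weighted 2-D Gaussian densities at every grid point
p = zeros(size(pts,1),1);
for m = 1:numel(piv)
    xc = pts - repmat(muv{m}(:)', size(pts,1), 1);
    q  = sum((xc / covm{m}) .* xc, 2);   % squared Mahalanobis distances
    p  = p + piv(m) * exp(-0.5*q) / (2*pi*sqrt(det(covm{m})));
end

% Draw the contours on top of whatever is already in the current figure
hold on
contour(X, Y, reshape(p, size(X)), 'r');
hold off
```

Reshaping p back to the size of X is what lets contour line the density values up with the grid coordinates.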

Notice that these two functions are commented out in pb.m. You just need to uncomment them and run the analysis, and you will get something similar to the following figures.

Result with Mixture of Gaussians with model order M = 1 (single Gaussian).

Result with Mixture of Gaussians with model order M = 2. Note: your result may be different depending on your choice of initialization.

Submit your Matlab code as well as the two figures (M=1,2) with a brief write-up describing the EM algorithm. You can do it for any of the speaker types (male, female, child). The examples above were based on the male speakers.

Note 1: Matlab functions you will probably want to use: mean, cov, det, inv, meshgrid, contour.

Note 2: When submitting work, please name each file starting with your initials (for example, I would name a file RKS_....). This way it will be easier for me to keep track of your submissions.