BEARdocs

PG-means: learning the number of clusters in data.

dc.contributor.advisor Hamerly, Gregory James, 1977-
dc.contributor.author Feng, Yu.
dc.contributor.other Baylor University. Dept. of Computer Science. en
dc.date.copyright 2006-12
dc.identifier.uri http://hdl.handle.net/2104/5021
dc.description.abstract This thesis presents PG-means, a novel algorithm that automatically determines the number of clusters in a classical Gaussian mixture model. PG-means applies efficient statistical hypothesis tests to one-dimensional projections of the data and the model to determine whether the examples are well represented by the model. The statistical test therefore applies to the entire model at once, not just to individual clusters. We show that this method works well in difficult cases such as overlapping clusters, eccentric clusters, and high-dimensional clusters. PG-means also works well on non-Gaussian clusters and on data with many true clusters. Further, the new approach provides a much more stable estimate of the number of clusters than current methods. en
dc.rights Baylor University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. Contact librarywebmaster@baylor.edu for inquiries about permission. en
dc.subject Algorithms. en
dc.subject Computer network architecture. en
dc.title PG-means: learning the number of clusters in data. en
dc.type Thesis en
dc.description.degree M.S. en
dc.rights.accessrights Worldwide access en
dc.contributor.department Computer Science. en
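
The abstract above says only that PG-means applies hypothesis tests to one-dimensional projections of both the data and the fitted model. As a rough illustration of that idea, and not the thesis's actual implementation, the Python sketch below projects a Gaussian mixture and a data set onto random unit directions and compares them with a Kolmogorov-Smirnov statistic. The choice of the Kolmogorov-Smirnov test, the random directions, the asymptotic critical value, and every function name and parameter (`ks_statistic`, `model_fits`, `n_projections`, `alpha`) are assumptions made for this example. Fitting the mixture and deciding how to change the number of clusters after a rejection are left out.

```python
import math
import random


def normal_cdf(x, mean, std):
    """CDF of a one-dimensional Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project_mixture(weights, means, covs, direction):
    """Project each component onto a unit direction p: N(p.mu, p^T Sigma p)."""
    projected = []
    for w, mu, cov in zip(weights, means, covs):
        var = dot(direction, [dot(row, direction) for row in cov])
        projected.append((w, dot(direction, mu), math.sqrt(var)))
    return projected


def ks_statistic(points, weights, means, covs, direction):
    """Largest gap between the empirical CDF of the projected data
    and the CDF of the projected mixture."""
    comps = project_mixture(weights, means, covs, direction)
    xs = sorted(dot(direction, p) for p in points)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        model = sum(w * normal_cdf(x, m, s) for w, m, s in comps)
        worst = max(worst, abs(model - i / n), abs((i + 1) / n - model))
    return worst


def random_direction(dim, rng):
    """Uniformly random unit vector (normalized Gaussian sample)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v))
    return [c / norm for c in v]


def model_fits(points, weights, means, covs, n_projections=12, alpha=0.05, seed=0):
    """Return False as soon as any projection rejects the whole model.

    Assumption: uses the asymptotic one-sample KS critical value, which
    ignores the fact that model parameters are usually estimated from
    the same data.
    """
    rng = random.Random(seed)
    critical = math.sqrt(-0.5 * math.log(alpha / 2.0) / len(points))
    dim = len(points[0])
    for _ in range(n_projections):
        direction = random_direction(dim, rng)
        if ks_statistic(points, weights, means, covs, direction) > critical:
            return False
    return True


# Toy data (hypothetical): two unit-variance clusters centered at x = -5 and x = +5.
rng = random.Random(1)
points = [[rng.gauss(c, 1.0), rng.gauss(0.0, 1.0)]
          for c in (-5.0, 5.0) for _ in range(200)]
identity = [[1.0, 0.0], [0.0, 1.0]]
# Candidate models as (weights, means, covariances).
one_cluster = ([1.0], [[0.0, 0.0]], [[[26.0, 0.0], [0.0, 1.0]]])
two_clusters = ([0.5, 0.5], [[-5.0, 0.0], [5.0, 0.0]], [identity, identity])
```

On this toy data, `model_fits(points, *one_cluster)` would be expected to reject the single-Gaussian model, while `ks_statistic` along the x axis should be much smaller for `two_clusters` than for `one_cluster`. Because each projection tests the full mixture rather than one component, a single rejection counts against the model as a whole, which matches the abstract's point about testing the entire model at once.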

