Mixture models for high-dimensional clustering with applications to tumour classification, network intrusion, and text classification (2008–2010)

Cluster analysis is used primarily to find groups in data whose structure is unknown. Its key applications can now involve very high-dimensional data measured on only a limited number of experimental units. This project aims to develop a mixture model-based framework for clustering high-dimensional data that also handles feature selection, choosing the number of clusters and validating them, detecting outliers, and using labelled data in a semi-supervised context. Key applications in medicine and technology will be studied, with the aim of improving both clustering performance and understanding of the underlying process.
Grant type: ARC Discovery Projects
Funded by: Australian Research Council