THE MVV METHOD FOR DETECTING OUTLIERS IN MULTIVARIATE DATA AND A COMPARISON WITH THE OTHER METHODS


Thesis Type: Postgraduate

Institution Of The Thesis: Gazi Üniversitesi, Fen Bilimleri Enstitüsü, Turkey

Approval Date: 2010

Student: KÜBRA TURGUT

Supervisor: OSMAN UFUK EKİZ

Abstract:

Outliers in multivariate data sets can be hard to detect, especially when the number of variables exceeds two. Various methods have therefore been proposed that are based on robust estimation of the location vector and the covariance matrix. Although these methods are effective, they are cumbersome for large, high-dimensional data sets. Their computational complexity increases as the dimension of the data set grows. The aim of this study is to present the Minimum Vector Variance method, which was developed as an alternative to methods such as the Minimum Volume Ellipsoid, Minimum Covariance Determinant and Fast Minimum Covariance Determinant, which are used to identify multiple outliers in multivariate data. In this thesis, the concepts of outlier and breakdown point are presented first; the above-mentioned methods are then introduced; finally, the Minimum Vector Variance method is compared with the other methods in a simulation study with respect to the outlier detection rate and the computation speed. The results show that the Minimum Vector Variance method is applicable to large, high-dimensional data sets and that its computational complexity is significantly lower than that of the other methods.
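
The abstract does not spell out the algorithm, so the following Python sketch is only an illustration of the general idea, not the procedure used in the thesis. It assumes that "vector variance" means Tr(S²) for a covariance matrix S, and that the search proceeds by concentration steps similar to those of Fast Minimum Covariance Determinant, with the determinant criterion replaced by Tr(S²). The function names (vector_variance, squared_mahalanobis, mvv_sketch), the subset size h, the number of random starts and the synthetic data are all hypothetical choices made for this example. No consistency correction is applied to S, so the chi-square cutoff is stricter than its nominal level.

```python
import numpy as np


def vector_variance(S):
    """Tr(S @ S); for a symmetric S this equals the sum of its squared entries."""
    return float(np.sum(S * S))


def squared_mahalanobis(X, mu, S):
    """Squared Mahalanobis distance of every row of X from mu under S."""
    diff = X - mu
    return np.einsum("ij,ij->i", diff, np.linalg.solve(S, diff.T).T)


def mvv_sketch(X, h, n_starts=20, max_iter=50, seed=0):
    """Search for an h-subset whose covariance has small vector variance.

    Assumption: concentration steps (keep the h points closest to the
    current estimate) are repeated from several random starts, and the
    start with the smallest Tr(S^2) is kept.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    best = None
    for _ in range(n_starts):
        idx = np.sort(rng.choice(n, size=h, replace=False))
        for _ in range(max_iter):
            mu = X[idx].mean(axis=0)
            S = np.cov(X[idx], rowvar=False)
            new_idx = np.sort(np.argsort(squared_mahalanobis(X, mu, S))[:h])
            if np.array_equal(new_idx, idx):
                break  # the subset no longer changes
            idx = new_idx
        # Recompute from the final subset so mu and S always match idx.
        mu = X[idx].mean(axis=0)
        S = np.cov(X[idx], rowvar=False)
        vv = vector_variance(S)
        if best is None or vv < best[0]:
            best = (vv, mu, S)
    return best[1], best[2]


# Synthetic example: 90 standard-normal points plus 10 planted outliers.
rng = np.random.default_rng(42)
clean = rng.normal(size=(90, 2))
planted = rng.normal(loc=8.0, scale=0.1, size=(10, 2))
X = np.vstack([clean, planted])

mu, S = mvv_sketch(X, h=75)
# 0.975 quantile of the chi-square distribution with 2 degrees of
# freedom, available in closed form as -2 * ln(1 - 0.975).
cutoff = -2.0 * np.log(0.025)
flagged = np.flatnonzero(squared_mahalanobis(X, mu, S) > cutoff)
```

Computing Tr(S²) requires only the sum of squared entries of S, whereas the determinant criterion needs a matrix factorisation at every step; this is one plausible reading of why the abstract reports lower computational cost, though the sketch itself makes no timing claim.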