Kd-Tree, Adaptive Radius And Feature Selection (KD-ARFS Stream) Based Real Time Data Stream Clustering


Thesis Type: Doctorate

Institution Of The Thesis: Gazi Üniversitesi, Fen Bilimleri Enstitüsü, Turkey

Approval Date: 2019

Student: ALİ ŞENOL

Supervisor: HACER KARACAN

Open Archive Collection: AVESIS Open Access Collection

Abstract:

In classical data clustering approaches, the data is static: it can be stored and processed again and again. With today's technology, however, data arrives very quickly, so it must be processed while it is being streamed, and results should be available whenever the user wants them. Demand for data stream clustering approaches that meet these needs is therefore increasing day by day, because such approaches are fast, read each data point only once, and can adapt to new data. In other words, results can be shown to the user while the data is still streaming. In this thesis, the KD-ARFS Stream algorithm, which clusters streaming data in real time, is proposed. The proposed approach draws its strength from a kd-tree, which supports multidimensional data, from standard deviation based feature selection, and from an adaptive radius. To demonstrate its effectiveness, the KD-ARFS Stream algorithm is compared with the SE-Stream, pcStream, CEDAS and DPStream algorithms in terms of running time and clustering quality. Experimental results show that the KD-ARFS Stream algorithm provides better clustering quality in a reasonable time.
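
The two building blocks named in the abstract, standard deviation based feature selection and kd-tree lookup within a radius, can be illustrated with a minimal sketch. This is not the KD-ARFS Stream algorithm itself: the function names (select_features, build_kdtree, radius_search), the choice of keeping the k highest-variance dimensions, and the sample data are all assumptions made for illustration, and the radius is a fixed parameter here because the abstract does not say how it is adapted.

```python
import math
import statistics


def select_features(window, k):
    """Hypothetical helper: indices of the k dimensions with the largest
    population standard deviation over a window of points."""
    dims = range(len(window[0]))
    spread = [statistics.pstdev([p[d] for p in window]) for d in dims]
    ranked = sorted(dims, key=lambda d: spread[d], reverse=True)
    return sorted(ranked[:k])


def build_kdtree(points, depth=0):
    """Build a kd-tree as nested dicts, splitting on the median of one axis
    per level and cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def radius_search(node, target, radius, found=None):
    """Collect all stored points within `radius` (Euclidean) of `target`."""
    if found is None:
        found = []
    if node is None:
        return found
    if math.dist(node["point"], target) <= radius:
        found.append(node["point"])
    diff = target[node["axis"]] - node["point"][node["axis"]]
    near, far = ("left", "right") if diff < 0 else ("right", "left")
    radius_search(node[near], target, radius, found)
    # The far side can only hold matches if the splitting plane is in range.
    if abs(diff) <= radius:
        radius_search(node[far], target, radius, found)
    return found


# Made-up sample window: dimension 1 barely varies, so it is dropped.
window = [(1.0, 5.0, 0.1), (2.0, 5.1, 9.0), (1.5, 5.0, 4.0), (8.0, 5.1, 0.2)]
selected = select_features(window, 2)
projected = [tuple(p[d] for d in selected) for p in window]
tree = build_kdtree(projected)
neighbours = radius_search(tree, (1.2, 0.0), 1.0)
```

In a streaming setting, a lookup like this would presumably decide whether an incoming point falls inside an existing cluster's radius; how KD-ARFS Stream actually makes that decision and updates the radius is described in the thesis, not here.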