Data analysis through social media according to the classified crime

Savas, Serkan; Topaloğlu, Nurettin

doi:10.3906/elk-1712-17

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, vol.27, no.1, pp.407-420, 2019 (SCI-Expanded, Scopus, TRDizin)

Publication Type: Article / Full Article
Volume: 27 Issue: 1
Publication Date: 2019
DOI: 10.3906/elk-1712-17
Journal Name: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, TR DİZİN (ULAKBİM)
Pages: pp.407-420
Open Archive Collection: AVESİS Open Access Collection
Affiliated with Gazi University: Yes

Abstract

The amount and variety of data generated through social media sites have increased along with their widespread use, and the rate of data production has increased in the same way. Because these data include personal information, it is important to process them and extract meaningful information. This process can be called intelligence, and the resulting information may serve commercial, academic, or security purposes. In this study, an example application is developed for intelligence on Twitter. Crimes in Turkey are classified according to Turkish Statistical Institute criminal data, and keywords are defined according to this classification. A total of 150,000 Turkish-language tweets are collected from Twitter between specified dates and processed with Zemberek, a natural language processing library for Turkish. It is seen that 56% of the people are talking about terrorist attacks and bombing attacks on the study dates. The words "bomb," "terror," "attack," "organization," and "explode" have percentages of 24%, 12%, 8%, 6%, and 6%, respectively. Moreover, associations between words and situations are found. Correlations are useful for forming new subclusters; in this study, for example, "terror" and "rape" have a correlation of 0.90. Expanding the keyword groups would make it possible to reach larger populations and obtain a clearer picture of the real situation.
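
The kind of keyword statistics reported in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the tokenization is plain whitespace splitting rather than Zemberek, the function names are hypothetical, and the correlation is assumed to be a Pearson correlation over per-tweet presence indicators, which the abstract does not specify.

```python
from collections import Counter
from math import sqrt


def keyword_shares(tweets, keywords):
    """Return each keyword's share of all keyword hits, as a fraction."""
    counts = Counter()
    for text in tweets:
        # Assumption: whitespace tokenization stands in for real NLP processing.
        tokens = text.lower().split()
        for kw in keywords:
            counts[kw] += tokens.count(kw)
    total = sum(counts.values())
    return {kw: (counts[kw] / total if total else 0.0) for kw in keywords}


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # Undefined for a constant series; reported as 0 here.
    return cov / sqrt(var_x * var_y)


def keyword_correlation(tweets, kw_a, kw_b):
    """Correlate per-tweet presence (1/0) of two keywords (assumed method)."""
    a = [1 if kw_a in t.lower().split() else 0 for t in tweets]
    b = [1 if kw_b in t.lower().split() else 0 for t in tweets]
    return pearson(a, b)


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study.
    sample = ["bomb attack", "terror attack", "bomb bomb", "hello"]
    print(keyword_shares(sample, ["bomb", "terror", "attack"]))
    print(keyword_correlation(sample, "bomb", "attack"))
```

A high correlation between two keywords under this scheme means they tend to appear in the same tweets, which is one way such pairs could be grouped into a shared subcluster.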