High-Precision Person Name Extraction from Turkish Texts Using Wikipedia

Kucuk D., Kucuk D.

20th International Conference on Applications of Natural Language to Information Systems (NLDB), Passau, Germany, 17 - 19 June 2015, vol.9103, pp.347-354, (Full-Text Paper)

Publication Type: Conference Paper / Full-Text Paper
Volume: 9103
DOI: 10.1007/978-3-319-19581-0_31
City of Publication: Passau
Country of Publication: Germany
Pages: pp.347-354
Gazi University Affiliated: Yes

Abstract

In this paper, we focus on person name extraction from diverse text types in Turkish and have compiled a large set of person names from Turkish Wikipedia. After automated post-processing to clean and extend it, we have performed extraction experiments using this resource on data sets of considerable sizes and achieved high precision rates. Next, we have shown that the use of non-local dependencies together with this Wikipedia resource improves recall, and hence F-Measure, considerably. Finally, we have tested the contribution of the resource and the scheme based on non-local dependencies to the person name extraction performance of a full-fledged named entity recognizer.
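
To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a tiny hypothetical name list (PERSON_NAMES) in place of the Wikipedia-derived resource. It also assumes one common reading of "non-local dependencies": once a full name is matched in a document, later standalone mentions of its parts in the same document are tagged as well. The function names and the tokenization rule are illustrative choices.

```python
import re

# Hypothetical stand-in for the Wikipedia-derived name list described in the abstract.
PERSON_NAMES = {"Orhan Pamuk", "Sabiha Gökçen"}

# \w matches Unicode letters for str patterns, so Turkish characters are covered.
TOKEN_RE = re.compile(r"\w+")


def tokenize(text):
    """Return (token, start, end) triples for each word-like token."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN_RE.finditer(text)]


def gazetteer_matches(text, names, max_len=4):
    """Greedy longest-match lookup of token sequences against the name list."""
    tokens = tokenize(text)
    spans = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            start, end = tokens[i][1], tokens[i + n - 1][2]
            if text[start:end] in names:
                spans.append((start, end))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return spans


def extend_with_nonlocal(text, spans):
    """Assumed non-local step: tag standalone mentions of parts of matched full names."""
    known_parts = set()
    for start, end in spans:
        parts = text[start:end].split()
        if len(parts) > 1:
            known_parts.update(parts)
    extra = []
    for token, start, end in tokenize(text):
        inside_existing = any(s <= start and end <= e for s, e in spans)
        if token in known_parts and not inside_existing:
            extra.append((start, end))
    return sorted(spans + extra)


if __name__ == "__main__":
    sample = "Orhan Pamuk yeni romanını tanıttı. Pamuk'un kitabı çok satıyor."
    found = extend_with_nonlocal(sample, gazetteer_matches(sample, PERSON_NAMES))
    print([sample[s:e] for s, e in found])
```

In this sketch the plain lookup finds only the full name. The second pass picks up the later surname-only mention, which illustrates how such a step could raise recall while precision stays tied to the quality of the name list.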