High-Precision Person Name Extraction from Turkish Texts Using Wikipedia


Kucuk D., Kucuk D.

20th International Conference on Applications of Natural Language to Information Systems (NLDB), Passau, Germany, 17 - 19 June 2015, vol.9103, pp.347-354

  • Publication Type: Conference Paper / Full Text
  • Volume: 9103
  • Doi Number: 10.1007/978-3-319-19581-0_31
  • City: Passau
  • Country: Germany
  • Page Numbers: pp.347-354
  • Gazi University Affiliated: Yes

Abstract

In this paper, we focus on person name extraction from diverse text types in Turkish. We compile a large set of person names from Turkish Wikipedia and apply automated post-processing to clean and extend it. Extraction experiments using this resource on sizable data sets achieve high precision. We then show that combining non-local dependencies with this Wikipedia resource considerably improves recall, and hence F-Measure. Finally, we test the contribution of the resource and the scheme based on non-local dependencies to the person name extraction performance of a full-fledged named entity recognizer.
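The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the Wikipedia-derived resource behaves like a plain set of name strings (`gazetteer`), and that "non-local dependencies" means re-tagging later standalone mentions of a token that already appeared inside a matched multi-token name. That reading is only one plausible interpretation. The function names, the `max_len` parameter, and the sample sentence are all hypothetical.

```python
import re


def tokenize(text):
    # \w is Unicode-aware in Python 3, so Turkish letters such as ı, ş, ğ are kept.
    return re.findall(r"\w+", text)


def match_gazetteer(tokens, gazetteer, max_len=3):
    """First pass: greedy longest-match lookup of token n-grams in the name set."""
    spans = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i:i + n]) in gazetteer:
                spans.append((i, i + n))
                i += n
                break
        else:
            # No n-gram starting at position i is in the gazetteer.
            i += 1
    return spans


def propagate_non_local(tokens, spans):
    """Second pass (assumed reading of 'non-local dependencies'):
    tag capitalized tokens that were part of a multi-token name matched elsewhere."""
    known_parts = set()
    covered = set()
    for start, end in spans:
        covered.update(range(start, end))
        if end - start > 1:
            known_parts.update(tokens[start:end])
    extra = [
        (i, i + 1)
        for i, tok in enumerate(tokens)
        if i not in covered and tok in known_parts and tok[0].isupper()
    ]
    return sorted(spans + extra)


def extract_person_names(text, gazetteer):
    tokens = tokenize(text)
    spans = propagate_non_local(tokens, match_gazetteer(tokens, gazetteer))
    return [" ".join(tokens[s:e]) for s, e in spans]


# Hypothetical example: the surname alone is not in the gazetteer,
# but it is recovered because the full name was matched earlier in the text.
sample_gazetteer = {"Orhan Pamuk"}
sample_text = "Orhan Pamuk yeni romanını tanıttı. Pamuk daha sonra soruları yanıtladı."
names = extract_person_names(sample_text, sample_gazetteer)
```

In this sketch the first pass alone would find only the full name, while the second pass also picks up the later surname-only mention. This mirrors the kind of recall gain the abstract attributes to non-local dependencies, without reproducing the paper's actual procedure or results.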