Web based machine learning for language identification and translation


SAĞIROĞLU Ş., YAVANOĞLU U., Guven E. N.

6th International Conference on Machine Learning and Applications, Ohio, United States of America, 13 - 15 December 2007, pp.280-285

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume Number:
  • DOI: 10.1109/icmla.2007.115
  • City of Publication: Ohio
  • Country of Publication: United States of America
  • Pages: pp.280-285
  • Affiliated with Gazi University: Yes

Abstract

Language identification is an important task for web information retrieval services. This paper presents the implementation of a platform for language identification in multilingual documents on the web. The platform consists of five modules that carry out the tasks automatically. Artificial neural networks (ANNs) were used to identify the languages in multilingual documents. Results for six languages, including Turkish, French, Italian, Danish and German, are presented. The major benefit of the approach is that the ANN-based language identification system, supported by the developed platform, can meet expectations for real-time language identification accuracy. Experiments have shown that the system performs these tasks with high accuracy, discriminating between different languages and translating them into other languages on web pages.