Web based machine learning for language identification and translation

Creative Commons License


6th International Conference on Machine Learning and Applications, Ohio, United States of America, 13 - 15 December 2007, pp.280-285

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/icmla.2007.115
  • City: Ohio
  • Country: United States of America
  • Page Numbers: pp.280-285
  • Gazi University Affiliated: Yes


Language identification is an important task for web information retrieval services. This paper presents the implementation of a platform for language identification in multilingual documents on the web. The platform consists of five modules that carry out the tasks automatically. Artificial neural networks (ANNs) were used to identify the languages in multilingual documents. Results are presented for six languages, including Turkish, French, Italian, Danish and German. The major benefit of the approach is that the ANN-based language identification system, with the help of the developed platform, can meet expectations for real-time language identification accuracy. Experiments have shown that the system performs these tasks with high accuracy, both in discriminating between different languages and in converting them into other languages on web pages.
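
To make the idea of ANN-based language identification concrete, the following is a minimal sketch only. The abstract does not state the network architecture, the input features, or the training data used in the paper, so everything here is an assumption chosen for illustration: character bigram counts as features, a single-layer softmax network trained by gradient descent, and a handful of made-up sentences in three of the languages named above. The names `SAMPLES`, `featurize`, `train` and `identify` are hypothetical.

```python
import math
from collections import Counter

# Toy training sentences, invented for this sketch (not data from the paper).
SAMPLES = [
    ("Turkish", "annem bana bir elma verdi"),
    ("Turkish", "ben okula gidiyorum ve kitap okuyorum"),
    ("French", "nous mangeons du pain avec du fromage"),
    ("French", "le chat dort sur le lit"),
    ("German", "ich gehe heute mit meinem freund ins kino"),
    ("German", "wir essen brot mit butter und wurst"),
]
LABELS = sorted({label for label, _ in SAMPLES})


def bigrams(text):
    """Count character bigrams, padding with spaces to mark word edges."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


# Assumed feature space: every bigram seen in the toy training sentences.
VOCAB = sorted({gram for _, text in SAMPLES for gram in bigrams(text)})


def featurize(text):
    """Map text to a unit-length vector of bigram counts over VOCAB."""
    counts = bigrams(text)
    vec = [float(counts[gram]) for gram in VOCAB]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def scores(weights, x):
    """One linear output unit per language."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(z):
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def train(epochs=200, lr=0.5):
    """Fit the single-layer network with per-sample gradient descent."""
    weights = [[0.0] * len(VOCAB) for _ in LABELS]
    data = [(featurize(text), LABELS.index(label)) for label, text in SAMPLES]
    for _ in range(epochs):
        for x, target in data:
            probs = softmax(scores(weights, x))
            for k, row in enumerate(weights):
                # Cross-entropy gradient for output unit k.
                grad = probs[k] - (1.0 if k == target else 0.0)
                for j, v in enumerate(x):
                    row[j] -= lr * grad * v
    return weights


def identify(text, weights):
    """Return the label whose output unit scores highest."""
    s = scores(weights, featurize(text))
    return LABELS[s.index(max(s))]


weights = train()
print(identify("le chat dort sur le lit", weights))
```

A real system would need far more training text per language and a held-out test set; this sketch only shows how text can be turned into a numeric vector and passed through a small network to pick a language.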