JOURNAL OF THE FACULTY OF ENGINEERING AND ARCHITECTURE OF GAZI UNIVERSITY, vol. 25, no. 3, pp. 483-494, 2010 (SCI-Expanded)
This study presents new methods for identifying the language of web content, including MS Word and HTML documents written in different languages, and for translating that content into specified languages. The identification problem can be seen as a specific instance of the more general problem of classifying an item through its attributes. The novel approach is based on an artificial neural network model that recognizes the language of a document. Documents in 15 languages were used in testing. The results show that the approach meets expectations for real-time language identification accuracy and, with the help of the developed platform, translates documents into 40 different languages. It is expected that this study will help users use the internet more effectively.