An automatic intelligent language classifier

Verma, Brijesh; Lee, Hong Suk; Zakos, J

conference contribution

posted on 2017-12-06, 00:00 authored by Brijesh Verma, Hong Suk Lee, J Zakos

The paper presents a novel sentence-based language classifier that accepts a sentence as input and produces a confidence value for each target language. The proposed classifier combines Unicode-based features with a neural network. Three features (Unicode, exclusive Unicode, and word matching score) are extracted and fed to a neural network to obtain a final confidence value. The word matching score is calculated by matching the words of an input sentence against a common word list for each target language. Each common word list is a database of the most frequently used words in that language, collected statistically. Preliminary experiments were performed using test samples from web documents in English, German, Polish, French, Spanish, Chinese, Japanese and Korean. A classification accuracy of 98.88% was achieved on a small database.
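
The word matching score described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny word lists, the function names `word_matching_score` and `score_all`, and the normalisation by sentence length are all assumptions made for illustration. The paper builds its lists statistically and feeds the score, together with the two Unicode features, to a neural network, which is not shown here.

```python
import re

# Hypothetical, hand-picked common-word lists (illustration only).
# The paper collects the most frequent words per language statistically.
COMMON_WORDS = {
    "English": {"the", "and", "is", "of", "to", "in"},
    "German": {"der", "die", "und", "ist", "das", "nicht"},
    "French": {"le", "la", "et", "est", "les", "des"},
}


def word_matching_score(sentence, common_words):
    """Return the fraction of words in `sentence` found in `common_words`.

    Dividing by the word count is an assumption; the abstract only says
    that words are matched against each language's common word list.
    """
    words = re.findall(r"\w+", sentence.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in common_words)
    return hits / len(words)


def score_all(sentence):
    """Compute one word matching score per target language."""
    return {
        language: word_matching_score(sentence, words)
        for language, words in COMMON_WORDS.items()
    }


if __name__ == "__main__":
    for language, score in score_all("The cat is in the garden").items():
        print(f"{language}: {score:.2f}")
```

Under these assumptions, a sentence scores highest for the language whose common words it contains most densely. On its own the score cannot separate languages that share many short words, which is presumably one reason the paper combines it with Unicode-based features.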

History

Parent Title

Advances in neuro-information processing : 15th International Conference, ICONIP 2008, Auckland, New Zealand, November 2008, revised selected papers, Part II

Start Page

639

End Page

646

Number of Pages

8

Start Date

2009-01-01

ISBN-13

9783642030390

Location

Auckland, New Zealand

Publisher

Springer-Verlag

Place of Publication

Berlin Heidelberg

Full Text URL

http://dx.doi.org/10.1007/978-3-642-03040-6_78

Peer Reviewed

Yes

Open Access

No

External Author Affiliations

Faculty of Arts, Business, Informatics and Education; Institute for Resource Industries and Sustainability (IRIS); MyCyberTwin

Era Eligible

Yes

Name of Conference

ICONIP (Conference)
