Title (eng)

Comparison of the influence of different normalization methods on tweet sentiment analysis in the Serbian language

Authors

Ljajić, Adela
Marovac, Ulfeta
Stanković, Milena

Description (eng)

Abstract: Given the growing need to process texts quickly and extract information from data for various purposes, correct normalization that contributes to better and faster processing is of great importance. This paper compares different methods of normalizing short texts (tweets), using sentiment analysis as the example task. The results of applying the different normalizations are presented in terms of time complexity and the classification accuracy of the sentiment algorithm. Normalization by cutting words to n-grams is shown to give better or similar results compared with language-dependent normalizations. When time complexity is also taken into account, the paper concludes that this language-independent normalization gives optimal results in the classification of short informal texts.
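
The abstract does not spell out how "cutting to n-gram" works. One plausible reading, assumed here, is that each word is truncated to its first n characters, so that inflected forms of the same word collapse to a shared prefix without any language-specific stemmer. The sketch below illustrates that reading only; the function names, the regex tokenizer, and the default n = 4 are hypothetical choices, not taken from the paper.

```python
import re
from typing import List


def cut_to_ngram(token: str, n: int = 4) -> str:
    """Keep only the first n characters of a token.

    Assumed reading of "cutting to n-gram"; tokens shorter than n
    are returned unchanged.
    """
    return token[:n]


def normalize_tweet(text: str, n: int = 4) -> List[str]:
    """Lowercase a tweet, split it into word tokens, and truncate each one.

    The tokenizer (runs of Unicode word characters) is an assumption
    made for this sketch, not the paper's preprocessing.
    """
    tokens = re.findall(r"\w+", text.lower())
    return [cut_to_ngram(t, n) for t in tokens]


if __name__ == "__main__":
    # Two inflected forms of the Serbian word "film" map to the same prefix.
    print(normalize_tweet("Filmovi, filmove"))
```

Because the truncation is a plain string slice, its cost per token is constant for a fixed n, which is consistent with the abstract's point that a language-independent normalization can be cheap compared with language-dependent ones.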

Language

English

Date

2018

License

© All rights reserved

Subject

Keywords: sentiment analysis, normalization, Serbian language

Part of collection (1)

o:28516 Radovi nastavnika i saradnika Državnog univerziteta u Novom Pazaru (Papers by teaching staff and associates of the State University of Novi Pazar)