University of Belgrade Phaidra

Title (English)

Creating Domain Dictionaries for Serbian Language

Authors

Kajan, Ejub
Ljajić, Adela
Marovac, Ulfeta
Avdić, Aldina

Description (English)

Abstract: Automatically created thesauri are used to improve methods for clustering, mining, and determining the sentiment of a specific data corpus. There are different methods for automatically discovering similar words. Some are based on text corpora and mathematical similarity measures, while others use graphs and monolingual dictionaries. The Serbian language is richer than English in vocabulary and grammar. Known methods for automatic thesaurus generation may neglect some of these language-specific issues. This paper presents a method for automatically generating a thesaurus from repositories of documents in the Serbian language, based on mathematical methods such as the chi-square test, cosine similarity, and the Jaccard similarity coefficient. The proposed method can be applied to either normalized or non-normalized documents.

Language

English

Date

2016

License

Part of collection (1)

o:28516

Papers by teaching staff and associates of the State University of Novi Pazar

Identifiers

https://phaidrabg.bg.ac.rs/o:33327

Object type

PDF DOCUMENT (PDF)
