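Title (Serbian)

Jedan prilaz informatičkom modeliranju teksta i algoritmi njegove transformacije : Doktorska disertacija
[English: An Approach to the Computational Modeling of Text and Algorithms for Its Transformation: Doctoral Dissertation]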


Krstev, Cvetana


Pavlović-Lažetić, Gordana
Parezanović, Nedeljko
Stanojčić, Živojin

Description (English)

The thesis presents two approaches to text modeling. The first concerns the logical structure of a text, which can be described using mark-up languages. The standard mark-up language, SGML, is described in detail, and its formal structure and the structure of SGML parsers are thoroughly discussed. Two applications of SGML are given as examples: a simple one, HTML, which is the basis of the Web, and a more complex one, TEI, which is becoming the de facto standard for annotating diverse texts (primarily literary) for different purposes. The second approach aims at modeling the content of e-texts, and a model based on e-dictionaries is described in detail. E-dictionaries are dictionaries designed for the automatic processing of text, in which the "words" of a language are described in detail: their morphological, syntactic, semantic, dialectal and other properties. The model of an e-dictionary for the Serbian language is presented in full detail, particularly the segment covering verbs, pronouns and numerals, and their paradigms. The thesis also describes how vocabulary variations, which stem primarily from differing pronunciations, could be systematically incorporated into an e-dictionary of Serbian through the use of so-called "super-lemmas". Finally, examples illustrate applications of the enriched e-text, such as the production of richer concordances in which the concept of a keyword is redefined through the use of e-dictionary (super-)lemmas.
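The idea of a concordance keyed by super-lemmas can be illustrated with a minimal sketch. It is not the thesis's implementation: the dictionary entries, the function names `super_lemma` and `concordance`, and the sample sentences are all hypothetical, chosen only to show how pronunciation variants (here ekavian "mleko" and ijekavian "mlijeko") might be gathered under one keyword.

```python
import re
from collections import defaultdict

# Toy e-dictionary (hypothetical entries, not taken from the thesis):
# inflected form -> lemma
FORM_TO_LEMMA = {
    "mleko": "mleko", "mleka": "mleko",
    "mlijeko": "mlijeko", "mlijeka": "mlijeko",
}
# lemma -> super-lemma, grouping pronunciation variants of one word
LEMMA_TO_SUPER = {"mleko": "mleko", "mlijeko": "mleko"}


def super_lemma(token):
    """Map a token to its super-lemma; unknown words map to themselves."""
    form = token.lower()
    lemma = FORM_TO_LEMMA.get(form, form)
    return LEMMA_TO_SUPER.get(lemma, lemma)


def concordance(text, width=2):
    """Index every token under its super-lemma, with `width` words of context.

    Context is taken from the flat token stream, so it may cross
    sentence boundaries.
    """
    tokens = re.findall(r"\w+", text)
    index = defaultdict(list)
    for i, tok in enumerate(tokens):
        left = " ".join(tokens[max(0, i - width):i])
        right = " ".join(tokens[i + 1:i + 1 + width])
        index[super_lemma(tok)].append((left, tok, right))
    return index


text = "Kupio je mleko. Nema mlijeka u kući."
lines = concordance(text)["mleko"]
for left, key, right in lines:
    print(f"{left:>12} [{key}] {right}")
```

Looking up the single keyword "mleko" returns both the ekavian occurrence "mleko" and the ijekavian genitive "mlijeka", which a concordance keyed by surface forms would list separately.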

Description (translated from Serbian)

The thesis presents two approaches to text modeling. The first concerns the logical structure of a text, which can be described computationally using text mark-up languages. The standard text mark-up language, SGML, is presented in detail, together with its formal structure and the structure of the parsers that process it. Two applications of SGML are then given as examples: the first, a simple one, HTML, which is the basis of the Web, and the second, considerably more complex, TEI, which is becoming the de facto standard for marking up the most diverse texts (above all literary ones) for different purposes. The second approach concerns modeling the content of e-texts, and a model based on e-dictionaries is described in detail. E-dictionaries are dictionaries intended for the automatic processing of text, in which the "words" of a language are described in detail: their morphological, syntactic, semantic, dialectal and other properties. The thesis sets out in detail a model of an e-dictionary for the Serbian language, in particular the segment covering verbs, pronouns and numerals, and their paradigms. It also describes how vocabulary variations, which stem above all from differing pronunciations, could be systematically built into an e-dictionary of Serbian through the use of a "super-lemma". Finally, examples are given of applications of e-text enriched in this way, for instance the production of richer concordances in which the notion of a keyword is redefined through the use of e-dictionary (super-)lemmas.

Description (translated from Serbian)

Computer science - Natural Language Processing
Defense date: 15 September 1997






Creative Commons license
This work is licensed under the terms of the
Creative Commons CC BY-NC-ND 2.0 AT - Creative Commons Attribution - NonCommercial - NoDerivs 2.0 Austria License.