Integrisanje heterogenih leksičkih resursa
The core activity of the Natural Language Processing Group at the Faculty of Mathematics, University of Belgrade, is directed at developing various resources for processing the Serbian language. Particularly important among them are the system of morphological dictionaries of Serbian developed within the RELEX network [1] and the semantic network (of the wordnet type) for Serbian developed within the international Balkanet project. These are two heterogeneous lexical resources, developed on the basis of entirely different models, which therefore also contain different kinds of lexical information. By integrating these resources, the information ...
... Integrisanje heterogenih leksičkih resursa Ranka Stanković, Rudarsko-geološki fakultet, Beograd Cvetana Krstev, Filološki fakultet, Beograd Duško Vitas, Matematički fakultet, Beograd Ivan Obradović, Rudarsko-geološki fakultet, Beograd Gordana Pavlović-Lažetić, Matematički fakultet, Beograd ...
... (2002). BALKANET: A Multilingual Semantic Network for Balkan Languages. Proceedings of 1st International Wordnet Conference, Mysore, India. [4] Vitas, D. et al. (2003). Resources and Basic Tools for the Processing of Serbian Written Texts. Proc. of the Workshop on Balkan Language Resources, 1st ...
Ranka Stanković, Cvetana Krstev, Duško Vitas, Ivan Obradović, Gordana Pavlović-Lažetić. "Integrisanje heterogenih leksičkih resursa" in Festivalski katalog 11. Festivala informatičkih dostignuća INFOFEST 2004, 26th September - 2nd October, 2004, Budva, Montenegro, INFOFEST (2004)
Production of morphological dictionaries of multi-word units using a multipurpose tool
The development of a comprehensive morphological dictionary of multi-word units for Serbian is a very demanding task, due to the complexity of Serbian morphology. Manual production of such a dictionary proved to be extremely time-consuming. In this paper we present a procedure that automatically produces dictionary lemmas for a given list of multi-word units. To accomplish this task the procedure relies on data in e-dictionaries of Serbian simple words, which are already well developed. We also offer an evaluation ...
electronic dictionary, Serbian, morphology, inflection, multi-word units, noun phrases, query expansion
... instance, if the initial query is ‘marka’ and a user chooses to semantically expand his/her query with Serbian wordnet then the system will find, among others, two synsets with appropriate literals: {marka, zaštitni znak, brend} ‘trade name’ and {marka, poštanska marka} ‘postage stamp’. If MWU synset ...
Obradović, Cvetana Krstev, Duško Vitas
... of Philology, Studentski trg 3, 11000 Belgrade, Serbia Email: cvetana@matf.bg.ac.rs Duško Vitas, University of Belgrade, Faculty of Mathematics, Studentski trg 16, 11000 Belgrade, Serbia Email: vitas@matf.bg.ac.rs Abstract: In this paper we outline the use of the multipurpose software tool LeXimir ...
Ranka Stanković, Ivan Obradović, Cvetana Krstev, Duško Vitas. "Production of morphological dictionaries of multi-word units using a multipurpose tool" in Proceedings of the Computational Linguistics-Applications Conference, October 2011, Jachranka, Poland, Jachranka, Poland : PTI - Polish Information Processing Society (2011)
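The semantic query expansion quoted in the snippet of this entry (the query ‘marka’ matched against two wordnet synsets) can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the data structure, the function name `expand_query` and the in-memory list of synsets are assumptions; only the two synsets themselves come from the snippet.

```python
# Toy wordnet fragment: the two synsets are those quoted in the snippet;
# the dictionary layout is an assumption made for this sketch.
SYNSETS = [
    {"gloss": "trade name", "literals": ["marka", "zaštitni znak", "brend"]},
    {"gloss": "postage stamp", "literals": ["marka", "poštanska marka"]},
]


def expand_query(term, synsets=SYNSETS):
    """Return the term followed by all literals of synsets that contain it.

    Order is preserved and duplicates are dropped.
    """
    expanded = [term]
    for synset in synsets:
        if term in synset["literals"]:
            for literal in synset["literals"]:
                if literal not in expanded:
                    expanded.append(literal)
    return expanded


if __name__ == "__main__":
    print(expand_query("marka"))
```

A term that appears in no synset is returned unchanged, so the expansion step never loses the original query. Note that two of the added literals are multi-word units, which is why their inflected forms are also needed for searching.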
WS4LR - a Workstation for Lexical Resources
Resources Cvetana Krstev, Ranka Stanković, Duško Vitas, Ivan Obradović
... Stanković², Duško Vitas³ and Ivan Obradović² ¹Faculty of Philology, Studentski trg 3, CS-11000 Belgrade, ²Faculty of Mining and Geology, Đušina 7, CS-11000 Belgrade, ³Faculty of Mathematics, Studentski trg 16, CS-11000 Belgrade E-mail: cvetana@matf.bg.ac.yu, ranka@rgf.bg.ac.yu, vitas@matf.bg.ac.yu ...
Cvetana Krstev, Ranka Stanković, Duško Vitas, Ivan Obradović. "WS4LR - a Workstation for Lexical Resources" in Proceedings of the Fifth International Conference on Language Resources and Evaluation, Genoa, Italy, May 2006, ELRA - European Language Resources Association (2006)
Bilingual lexical extraction based on word alignment for improving corpus search
Jelena Andonovski, Branislava Šandrih, Olivera Kitanović. "Bilingual lexical extraction based on word alignment for improving corpus search" in The Electronic Library, Emerald (2019). https://doi.org/10.1108/EL-03-2019-0056
Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis
This paper presents a model that enables the collection, preparation, metadata description, management and exploitation, including full-text search, of documents from the domain of criminalistics written in Serbian. The proposed approach is applied on a web portal that gathers various texts originating from the journals of the Academy of Criminalistic and Police Studies, the Criminal Code of Serbia, the "Tara" and "Reiss" conferences, as well as from some doctoral dissertations related to this research area. After text processing, a corpus containing over 5500 pages of plain text was created and ...
... Duško Vitas, “Corpus and Lexicon - Mutual Incompletness”, in Proceedings of the Corpus Linguistics Conference, 14-17 July 2005, Birmingham, eds. Pernilla Danielsson and Martijn Wagenmakers, ISSN 1747-9398, http://www.corpus.bham.ac.uk/PCLC/, 2005 10 Cvetana Krstev, Ranka Stanković, Duško Vitas, Ivan ...
... WordNet Tool. In G. Pavlović Lažetić, C. Krstev, I. Obradović & D. Vitas Natural Language Processing for Serbian – Resources and Application, 1-11. Matematički fakultet, Beograd. 21 Mladenović, M., Mitrović, J., Krstev, C., & Vitas, D. (2015). Hybrid Sentiment Analysis Framework For A Morphologically ...
... library. 15 Cvetana Krstev. Processing of Serbian – Automata, Text and Electronic Dictionaries, Faculty of Philology, Belgrade, 2008 16 Duško Vitas, Cvetana Krstev, Ivan Obradović, Ljubomir Popović, Gordana Pavlović-Lažetić, “Processing Serbian Written Texts: An Overview of Resources and Basic ...
Dalibor Vorkapić, Aleksandra Tomašević, Miljana Mladenović, Ranka Stanković, Nikola Vulović. "Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis" in International Scientific Conference “Archibald Reiss Days” Thematic Conference Proceedings Of International Significance, Belgrade, 7-9 November 2017, Academy Of Criminalistic And Police Studies Belgrade (2017)
Keyword-Based Search on Bilingual Digital Libraries
This paper outlines the main features of Biblisha, a tool that offers various possibilities of enhancing queries submitted to large collections of aligned parallel text residing in a bilingual digital library. Biblisha supports keyword queries as an intuitive way of specifying information needs. The keyword queries, initiated in Serbian or English, can be expanded semantically, morphologically and into the other language, using different supporting monolingual and bilingual resources. Terminological and lexical resources are of various types, such as wordnets, electronic ...
Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović. "Keyword-Based Search on Bilingual Digital Libraries" in Semantic Keyword-Based Search on Structured Data Sources - Second COST Action IC1302 International KEYSTONE Conference, IKC 2016, Springer (2017). https://doi.org/10.1007/978-3-319-53640-8_10
Towards Semantic Interoperability: Parallel Corpora as Linked Data Incorporating Named Entity Linking
The paper presents the results of research on the preparation of parallel corpora, focusing on their transformation into RDF graphs using the NLP Interchange Format (NIF) for linguistic annotation. We give an overview of the parallel corpus used in this case study, as well as of the process of part-of-speech tagging, lemmatization and named entity recognition (NER). We then describe named entity linking (NEL), the conversion of the data into RDF, and the inclusion of NIF annotations. The produced NIF files are evaluated by exploring the triplestore with SPARQL queries. Finally, we discuss the linking of Linked ...
parallel corpora, named entity linking, named entity recognition, NER, NEL, linked data, NIF, Wikidata
Ranka Stanković, Milica Ikonić Nešić, Olja Perisic, Mihailo Škorić, Olivera Kitanović. "Towards Semantic Interoperability: Parallel Corpora as Linked Data Incorporating Named Entity Linking" in Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, Turin, 20-25 May 2024, ELRA and ICCL (2024)
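To make the NIF annotation step in this abstract more concrete, the sketch below serializes one sentence and one linked named entity as Turtle, using offset-based URIs and the `nif:` and `itsrdf:` vocabularies. It is an illustration under assumptions, not the authors' pipeline: the function name `entity_to_nif`, the document URI, the sample sentence and the Wikidata identifier passed in are all hypothetical example values, and the text is assumed to contain no double quotes.

```python
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
ITSRDF = "http://www.w3.org/2005/11/its/rdf#"


def entity_to_nif(doc_uri, text, mention, entity_uri):
    """Build a small Turtle document with a nif:Context and one linked mention.

    Offsets are character offsets of the first occurrence of `mention`.
    Assumes `text` contains no double quotes (no escaping is done here).
    """
    begin = text.find(mention)
    if begin < 0:
        raise ValueError(f"mention {mention!r} not found in text")
    end = begin + len(mention)
    context = f"<{doc_uri}#char=0,{len(text)}>"
    phrase = f"<{doc_uri}#char={begin},{end}>"
    lines = [
        f"@prefix nif: <{NIF}> .",
        f"@prefix itsrdf: <{ITSRDF}> .",
        "",
        f"{context} a nif:Context ;",
        f'    nif:isString "{text}" .',
        "",
        f"{phrase} a nif:Phrase ;",
        f"    nif:referenceContext {context} ;",
        f"    nif:beginIndex {begin} ;",
        f"    nif:endIndex {end} ;",
        f'    nif:anchorOf "{mention}" ;',
        f"    itsrdf:taIdentRef <{entity_uri}> .",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example values only; the entity URI stands in for whatever NEL returns.
    print(entity_to_nif(
        "http://example.org/doc1",
        "Beograd je glavni grad Srbije.",
        "Beograd",
        "http://www.wikidata.org/entity/Q3711",
    ))
```

Once such files are loaded into a triplestore, a SPARQL query over `itsrdf:taIdentRef` can list every mention linked to a given entity, which is the kind of exploration the abstract refers to.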
Automatic construction of a morphological dictionary of multi-word units
The development of a comprehensive morphological dictionary of multi-word units for Serbian is a very demanding task, due to the complexity of Serbian morphology. Manual production of such a dictionary proved to be extremely time-consuming. In this paper we present a procedure that automatically produces dictionary lemmas for a given list of multi-word units. To accomplish this task the procedure relies on data in e-dictionaries of Serbian simple words, which are already well developed. We also offer an evaluation
electronic dictionary, Serbian, morphology, inflection, multi-word units, noun phrases, query expansion
... al Inflection of Multi-Word Units - A Contrastive Study of Lexical Approaches. Linguistic Issues in Language Technologies 1 (2008) 4. Krstev, C., Vitas, D.: Finite State Transducers for Recognition and Generation of Compound Words. In Erjavec, T., Žganec Gros, J., eds.: IS-LTC 2006, Ljubljana, Slovenia ...
... 192–197 5. Savary, A.: Multiflex: A Multilingual Finite-State Tool for Multi-Word Units. In: CIAA. (2009) 237–240 6. Krstev, C., Stanković, R., Vitas, D., Obradović, I.: The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines. In: 6th LREC, Marrakech, Morocco ...
Cvetana Krstev, Ranka Stanković, Ivan Obradović, Duško Vitas, Miloš Utvić. "Automatic construction of a morphological dictionary of multi-word units" in Lecture Notes in Computer Science 6233, Advances in Natural Language Processing, Proceedings of the 7th International Conference on NLP, IceTAL 2010, Reykjavik, Iceland, August 2010, Springer (2010): 226-237. https://doi.org/10.1007/978-3-642-14770-8_26
Proširivanje upita zasnovano na leksičkim resursima
The paper describes how lexical resources for Serbian and software tools developed within the Language Technology Group of the University of Belgrade can be used to improve query formulation. Search results can be significantly improved by using various lexical resources, such as morphological dictionaries and semantic networks. The presented approach can also be applied in the System of Scientific, Technological and Business Information (SNTPI), because efficient searching of this valuable resource, given its heterogeneity and size, as well as its predominantly textual content, ...
... Lexical Database, The MIT Press. [5] Maurel D., Vitas D., Krstev S., Koeva S., (2007) „Prolex: a lexical model for translation of proper names. Application to French, Serbian and Bulgarian“, BULAG n°32, 2007. [6] Krstev C., Stanković R., Vitas D., Obradović I., “WS4LR: A Workstation for Lexical ...
... Faculty of the University of Belgrade for many years now, so that a large number of different resources, developed to a considerable extent, are available today (Vitas et al., 2003). Besides the corpus of Serbian, as well as multilingual parallel corpora, of particular importance are the system of morphological dictionaries of Serbian ...
... its successful adaptation to different purposes, and thereby also open up possibilities for its use within SNTPI. REFERENCES [1] Vitas D., Pavlović-Lažetić G., Krstev C., Popović Lj., Obradović I. (2003): „Processing Serbian Written Texts: An Overview of Resources and Basic Tools“ ...
Ranka Stanković, Ivan Obradović, Cvetana Krstev. "Proširivanje upita zasnovano na leksičkim resursima" in SNTPI 09 - Naučno-stručni skup Sistem naučnih, tehnoloških i poslovnih informacija, Beograd 19. i 20. jun 2009, Beograd : Fakultet informacionih tehnologija (2009)
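The abstract of this entry names morphological dictionaries as one of the resources that improve search. A minimal sketch of that idea, morphological query expansion, is given below. It assumes a DELAF-style text format (`inflected form,lemma.POS:codes`), which this entry itself does not specify; the function names and the grammatical codes after the colon are illustrative only, and the sample covers just a few forms of one noun.

```python
from collections import defaultdict

# Hypothetical DELAF-style lines: "inflected form,lemma.POS:codes".
# The inflected forms of "marka" are genuine; the codes are placeholders.
DELAF_LINES = [
    "marka,marka.N:fs1",
    "marke,marka.N:fs2",
    "marki,marka.N:fs3",
    "marku,marka.N:fs4",
    "markom,marka.N:fs6",
]


def load_forms(lines):
    """Map each lemma to the list of its inflected forms, without duplicates."""
    forms = defaultdict(list)
    for line in lines:
        form, rest = line.split(",", 1)
        lemma = rest.split(".", 1)[0]
        if form not in forms[lemma]:
            forms[lemma].append(form)
    return dict(forms)


def expand_morphologically(term, forms):
    """Return all known inflected forms of the term, or the term itself."""
    return forms.get(term, [term])


if __name__ == "__main__":
    forms = load_forms(DELAF_LINES)
    print(expand_morphologically("marka", forms))
```

Searching for every form in the returned list, rather than only the lemma, is what lets a plain keyword search find inflected occurrences in a highly inflected language.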
Electronic Dictionaries - from File System to lemon Based Lexical Database
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...
... al electronic dictionaries Morphological electronic dictionaries of Serbian for NLP are being developed for many years now (Vitas et al., 1993) (Krstev, Cvetana and Vitas, Duško, 2015). They cover general lexica, proper names (persons and toponyms), general knowledge (famous or fictitious persons ...
... udžbenike. Koeva, S., Krstev, C., and Vitas, D. (2008). Morpho-semantic relations in wordnet–a case study for two slavic languages. In Proceedings of Global WordNet Conference 2008, pages 239–253. University of Szeged, Department of Informatics. Krstev, C. and Vitas, D. (2007). Extending the Serbian ...
... C., Vitas, D., and Erjavec, T. (2004). MULTEXT-East resources for Serbian. In Zbornik 7. mednarodne multikonference Informacijska druzba IS 2004 Jezikovne tehnologije 9-15 Oktober 2004, Ljubljana, Slovenija, 2004. Erjavec, Tomaž and Zganec Gros, Jerneja. Krstev, C., Stanković, R., Vitas, D., ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić. "Electronic Dictionaries - from File System to lemon Based Lexical Database" in Proceedings of the 11th International Conference on Language Resources and Evaluation - W23 6th Workshop on Linked Data in Linguistics : Towards Linguistic Data Science (LDL-2018), LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
Improvement of geodatabase queries within GeolISS
Ranka Stanković (2008)
... Krstev, C., Stanković, R., Vitas, D., Obradović, I. (2006). “WS4LR: A Workstation for Lexical Resources”. In Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006, Genoa, Italy, May 2006, pp. 1692–1697 [10] Krstev, C., Vitas D., Stanković R., Obradović ...
... [11] Krstev C., Pavlović-Lažetić G., Vitas D., Obradović I.: “Using Textual and Lexical Resources in Developing Serbian Wordnet”, Romanian J. Information Science and Technology, Romanian Academy, vol. 7, No. 1–2, pp. 147–161, (2004) [12] Krstev, C., Vitas, D., Maurel, D., Tran, M. (2005). “Mu ...
... Serbia” in the journal Zapisnici Srpskog geološkog društva, Srpsko geološko društvo, Beograd. [7] ESRI Developer network (http://edn.esri.com) [8] Vitas D., G. Pavlović-Lažetić, C. Krstev, Lj. Popović, I. Obradović (2003): „Processing Serbian Written Texts: An Overview of Resources and Basic Tools“ ...
Ranka Stanković. "Improvement of geodatabase queries within GeolISS" in Review of the National Center for Digitization, Beograd : Faculty of Mathematics, Belgrade (2008)
From DELA Based Dictionary to Leximirka Lexical Database
Biljana Lazić, Mihailo Škorić (2020)
In this paper, we present an approach to transforming Serbian morphological dictionaries from a DELA text format to a lexical database dubbed Leximirka. Considering the benefits of storing data in a database compared to storing them in textual documents, we outline some of the functionality that the database has made possible. We also show how hand-made rules, which use the category labels that mark lexical entries, can be used to link lexical entries. ...
... Mining and Geology Belgrade, Serbia 1 Introduction Prof. Dr. Dusko Vitas and Prof. Dr. Cvetana Krstev started working on the development of Serbian morphological dictionaries more than 25 years ago (Vitas, 1993; Krstev, 1997; Vitas et al., 1993). Morphological dictionaries represent a significant linguistic ...
... no. 6 (2018): 993–1009, URL https://doi.org/10.1108/EL-11-2017-0239 Vitas, Duško. “Matematički model morfologije srpskohrvatskog jezika (imenska fleksija)”. PhD thesis, Univerzitet u Beogradu, Matematički fakultet, 1993 Vitas, Duško, Gordana Pavlovic-Lažetić and Cvetana Krstev. “Electronic ...
... “bibliotekar” is among the 10,000 most frequent words in the Serbian Corpus of the Serbian Language SrbCorp (version of 122 million words by Duško Vitas and Miloš Utvić)6. Information about the Corpus is stored in the KorpusMeta table. The LexicalRelation table stores information 6 Corpus of the Serbian ...
Biljana Lazić, Mihailo Škorić. "From DELA Based Dictionary to Leximirka Lexical Database" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.4
Rule-based Automatic Multi-word Term Extraction and Lemmatization
In this paper we present a rule-based method for multi-word term extraction that relies on extensive lexical resources in the form of electronic dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. The same technology is used for lemmatization of extracted multi-word terms, which is unavoidable for highly inflected languages in order to pass extracted data to evaluators and subsequently to terminological e-dictionaries and databases. The approach is illustrated on a corpus of Serbian texts from ...
... evaluation. Terminology, 16(2), pp.141--158. Vitas, D., Popović, Lj., Krstev, C., Obradović, I., Pavlović-Lažetić, G. and Stanojević, M. (2012). The Serbian Language in the Digital Age. Berlin; Springer-Verlag. 8. Language Resource References Vitas D., Utvić M. (2015). SrpKor22M, Serbian au ...
... language resources such as morphological e-dictionaries and grammars developed within the University of Belgrade Human Language Technology Group (Vitas et al., 2012). For our approach, production of lemmas for various forms of MWTs extracted from a corpus is necessary for two main reasons. Firstly ...
... In Proc. of the Workshop on BSNLP: Information Extraction and Enabling Technologies, pp. 59--66. Krstev, C., Obradović, I., Stanković, R., and Vitas, D. (2013). An Approach to Efficient Processing of Multi-Word Units. In: Przepiórkowski, A., Piasecki, M., Jassem, K., Fuglewicz, P. (Eds.) Computational ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Biljana Lazić, Aleksandra Trtovac. "Rule-based Automatic Multi-word Term Extraction and Lemmatization" in Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia, 23--28 May 2016, European Language Resources Association (2016)
Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian
The training of new tagger models for Serbian is primarily motivated by the enhancement of the existing tagset with the grammatical category of gender. The harmonization of resources that were manually annotated within different projects over a long period of time was an important task, enabled by the development of tools that support partial automation. The supporting tools take into account different taggers and tagsets. This paper focuses on TreeTagger and spaCy taggers, and the annotation schema alignment ...
... production of the new tagger model for Serbian are: (a) Serbian morphological dictionaries (Cvetana Krstev, Duško Vitas, 2015) (SMD); (b) pre-annotated texts (Duško Vitas, Cvetana Krstev, Ranka Stanković, Miloš Utvić, 2019). 2.1. Serbian morphological dictionaries Serbian morphological ...
... Bidirectional LSTM-CRF Models for Sequence Tagging. Krstev, C., Vitas, D., and Erjavec, T. (2004). Morpho-Syntactic Descriptions in MULTEXT-East-the Case of Serbian. Informatica, 28(4):431–436. Krstev, C., Obradović, I., Utvić, M., and Vitas, D. (2014). A system for named entity recognition based on ...
... 12(2):36a–47a, December. 8. Language Resource References Cvetana Krstev, Duško Vitas. (2015). Serbian Morphological Dictionary - SMD. University of Belgrade, HLT Group and Jerteh, Lexical resource, 2.0. Duško Vitas, Cvetana Krstev, Ranka Stanković, Miloš Utvić. (2019). Sr-Basic: Annotated corpus ...
Ranka Stanković, Branislava Šandrih, Cvetana Krstev, Miloš Utvić, Mihailo Škorić. "Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian" in Proceedings of the 12th Language Resources and Evaluation Conference, May 2020, Marseille, France, European Language Resources Association (2020)
Knowledge and Rule-Based Diacritic Restoration in Serbian
In this paper we present a procedure for the restoration of diacritics in Serbian texts written using the degraded Latin alphabet. The procedure relies on the comprehensive lexical resources for Serbian: the morphological electronic dictionaries, the Corpus of Contemporary Serbian and local grammars. Dictionaries are used to identify possible candidates for the restoration, while the data obtained from SrpKor and local grammars assists in making a decision between several candidates in cases of ambiguity. The evaluation results reveal that, depending on the text, accuracy ranges from 95.03% to 99.36%, while the precision (average 98.93%) is always higher than the recall (average 94.94%).
Cvetana Krstev, Ranka Stanković, Duško Vitas. "Knowledge and Rule-Based Diacritic Restoration in Serbian" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018): 41-51
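The two-stage idea in this abstract (a dictionary proposes candidates, corpus data helps choose among them) can be sketched as follows. This is a deliberately simplified illustration and not the procedure from the paper: it stands in a plain frequency table for the corpus data, leaves out local grammars entirely, handles lowercase only, and all counts and function names are hypothetical.

```python
# Mapping from Serbian Latin letters with diacritics to their degraded spelling.
DEGRADE = str.maketrans({"š": "s", "č": "c", "ć": "c", "ž": "z", "đ": "dj"})

# Hypothetical word frequencies standing in for dictionary plus corpus data.
FREQS = {"kuća": 120, "kuca": 15, "škola": 80, "čaša": 30}


def degrade(word):
    """Strip diacritics the way degraded Latin text does (lowercase only)."""
    return word.translate(DEGRADE)


def build_index(freqs):
    """Map each degraded spelling to every dictionary word that produces it."""
    index = {}
    for word in freqs:
        index.setdefault(degrade(word), []).append(word)
    return index


def restore(token, index, freqs):
    """Pick the most frequent candidate; unknown tokens are left unchanged."""
    candidates = index.get(token, [token])
    return max(candidates, key=lambda w: freqs.get(w, 0))


def restore_text(text, index, freqs):
    """Restore a whitespace-separated, lowercase text token by token."""
    return " ".join(restore(tok, index, freqs) for tok in text.split())


if __name__ == "__main__":
    index = build_index(FREQS)
    print(restore_text("skola i kuca", index, FREQS))
```

The token `kuca` is the ambiguous case: both `kuća` and `kuca` are valid words, so a frequency-only choice will sometimes be wrong, which is where context-sensitive resources such as the local grammars mentioned in the abstract come in.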
Managing mining project documentation using human language technology
Purpose: This paper aims to develop a system, which would enable efficient management and exploitation of documentation in electronic form, related to mining projects, with information retrieval and information extraction (IE) features, using various language resources and natural language processing. Design/methodology/approach: The system is designed to integrate textual, lexical, semantic and terminological resources, enabling advanced document search and extraction of information. These resources are integrated with a set of Web services and applications, for different user profiles and use-cases. Findings: The ...
Digital libraries, Information retrieval, Data mining, Human language technologies, Project documentation
Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja. "Managing mining project documentation using human language technology" in The Electronic Library (2018). https://doi.org/10.1108/EL-11-2017-0239
Речник САНУ као база терминолошких речника (на примеру речника кулинарства)
... Krstev, Cvetana, Duško Vitas and Gordana Pavlović-Lažetić. „Resources and methods in the morphosyntactic processing of Serbo-Croatian.” In Gerhild Zybatow et al. (eds.) Formal Description of Slavic Languages: The Fifth Conference, Leipzig 2003, pp. 3-17. Frankfurt am Main. 2. Vitas, D., Popović, Lj. ...
... are used in language research and in the creation of language tools. The morphological dictionaries of Serbian were developed by Prof. Dr. Cvetana Krstev and Prof. Dr. Duško Vitas with the help of the Language Technology Group of the University of Belgrade. The analysis of the processed corpus included the extraction of words and phrases based on domain ...
... of the Language Resources and Evaluation Conference (LREC), 23-28 May 2016, Portorož. 7. Ranka Stanković, Ivan Obradović, Cvetana Krstev, Duško Vitas, “Production of morphological dictionaries of multi-word units using a multipurpose tool”, In: Proceedings of the Computational Linguistics-Applications ...
Рада Стијовић, Олга Сабо, Ранка Станковић. "Речник САНУ као база терминолошких речника (на примеру речника кулинарства)" in Словенска терминологија данас, Београд : Српска академија наука и уметности (2017)
Towards Automatic Definition Extraction for Serbian
The paper presents preliminary results of the automatic extraction of candidates for dictionary definitions from unstructured texts in Serbian, with the aim of speeding up dictionary development. Definitions in the dictionary of the Serbian Academy of Sciences and Arts (SANU) were used to model different types of definitions (descriptive, grammatical, referential and synonymous), which have different syntactic and lexical characteristics. The research corpus consists of 61,213 noun definitions, which were analysed using morphological e-dictionaries and local grammars implemented as finite-state transducers in a corpus processing package of open ...
... nitexgramlab.org/) 2 A part of this lexicon is publicly available for use within the Unitex system words or a recognized syntactic structure (Vitas & Krstev 2012). Finite state transducers are visualized by graphs for easier development and use. A local grammar and its corresponding graph that ...
... Learning to define word embeddings in natural language. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 31, No. 1). Krstev, C., Vitas, D. & Stanković, R. (2015). A Lexical Approach to Acronyms and their Definitions. In Proceedings of 7th Language & Technology Conference, November ...
... extraction in free-and semi-structured text. In Proceedings of the 13th Linguistic Annotation Workshop, 2019, pp. 124–131. Stanković, R., Stijović, R., Vitas, D., Krstev, C. & Sabo O. (2018). The Dictionary of the Serbian Academy: from the Text to the Lexical Database. In: Proceedings of the XVIII EURALEX ...
Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić. "Towards Automatic Definition Extraction for Serbian" in Proceedings of the XIX EURALEX Congress of the European Association for Lexicography: Lexicography for Inclusion (Volume 2). 7-9 September (virtual), Democritus University of Thrace (2021)
An Italian-Serbian Sentence Aligned Parallel Literary Corpus
This article presents the construction and relevance of an Italian-Serbian sentence-aligned parallel corpus, delving into the aligned sentences in order to facilitate effective translation between the two languages. The parallel corpus serves as a valuable resource for language experts, researchers, and language enthusiasts, fostering a deeper understanding of linguistic nuances and cultural expressions. By bridging the gap between Serbian and Italian, this corpus opens new avenues for cross-cultural communication and collaboration, and ultimately contributes to the improvement of language-related ...
Saša Moderc, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić. "An Italian-Serbian Sentence Aligned Parallel Literary Corpus" in Review of the National Center for Digitization, Belgrade : Faculty of Mathematics, University of Belgrade (2023). https://doi.org/10.5281/zenodo.11203388
