Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian
The training of new tagger models for Serbian is primarily motivated by the enhancement of the existing tagset with the grammatical category of gender. The harmonization of resources that were manually annotated within different projects over a long period of time was an important task, enabled by the development of tools that support partial automation. The supporting tools take into account different taggers and tagsets. This paper focuses on TreeTagger and spaCy taggers, and the annotation schema alignment ...
... Constant, M., Krstev, C., and Vitas, D. (2018). Lexical analysis of Serbian with conditional random fields and large-coverage finite-state resources. In Zygmunt Vetulani, et al., editors, Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2015. Lecture Notes in ...
... computational linguistics: system demonstrations, pages 55–60. Petrov, S., Das, D., and McDonald, R. (2012). A Universal Part-of-Speech Tagset. In Nicoletta Calzolari (Conference Chair), et al., editors, Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC’12) ...
... Pacific Asia Conference on Language, Information and Computation, pages 389–398. Honnibal, M. and Montani, I. (2017). spaCy 2: Natural Language Understanding with Bloom Embeddings, Convolutional Neural Networks and Incremental Parsing. To appear. Huang, Z., Xu, W., and Yu, K. (2015). Bidirectional ...
Ranka Stanković, Branislava Šandrih, Cvetana Krstev, Miloš Utvić, Mihailo Škorić. "Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian" in Proceedings of the 12th Language Resources and Evaluation Conference, May 2020, Marseille, France, European Language Resources Association (2020)
English for Geology Students 1 – Dyslexia friendly
Lidija Beko (2023)
Lidija Beko. English for Geology Students 1 – Dyslexia friendly, Belgrade : The Faculty of Mining and Geology, 2023
English for Geology Students 2 - Dyslexia friendly
Lidija Beko (2023)
Lidija Beko. English for Geology Students 2 - Dyslexia friendly, Belgrade : The Faculty of Mining and Geology, 2023
Development and Evaluation of Three Named Entity Recognition Systems for Serbian - The Case of Personal Names
In this paper we present a rule- and lexicon-based system for the recognition of Named Entities (NE) in Serbian newspaper texts that was used to prepare a gold standard annotated with personal names. It was further used to prepare training sets for four different levels of annotation, which were further used to train two Named Entity Recognition (NER) systems: Stanford and spaCy. All obtained models, together with a rule- and lexicon-based system, were evaluated on ...
... 313(1):93–104. Ralph Grishman and Beth Sundheim. 1996. Message Understanding Conference-6: A Brief History. In Proceedings of the 16th International Conference on Computational Linguistics (COLING 1996), volume 1. Matthew Honnibal and Ines Montani. 2017. spaCy 2: Natural Language Understanding with Bloom ...
... Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1-Volume 1. Association for Computational Linguistics, pages 141–150. Nathalie Friburger and Denis Maurel. 2004. Finite-state Transducer Cascades to Extract ...
... Task: Language-independent Named Entity Recognition. In COLING-02: The 6th Conference on Natural Language Learning 2002 (CoNLL-2002). Satoshi Sekine, Kiyoshi Sudo, and Chikashi Nobata. 2002. Extended Named Entity Hierarchy. In Proceedings of the Third International Conference on Language Resources ...
Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Development and Evaluation of Three Named Entity Recognition Systems for Serbian - The Case of Personal Names" in Proceedings - Natural Language Processing in a Deep Learning World, Incoma Ltd., Shoumen, Bulgaria (2019). https://doi.org/10.26615/978-954-452-056-4_122
Keyword-Based Search on Bilingual Digital Libraries
This paper outlines the main features of Biblisha, a tool that offers various possibilities for enhancing queries submitted to large collections of aligned parallel text residing in a bilingual digital library. Biblisha supports keyword queries as an intuitive way of specifying information needs. The keyword queries, initiated in Serbian or English, can be expanded semantically, morphologically, and into the other language, using different supporting monolingual and bilingual resources. Terminological and lexical resources are of various types, such as wordnets, electronic ...
Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović. "Keyword-Based Search on Bilingual Digital Libraries" in Semantic Keyword-Based Search on Structured Data Sources - Second COST Action IC1302 International KEYSTONE Conference, IKC 2016, Springer (2017). https://doi.org/10.1007/978-3-319-53640-8_10
OntoLex Publication Made Easy: A Dataset of Verbal Aspectual Pairs for Bosnian, Croatian and Serbian
This paper presents a new language resource for searching and exploring verbal aspectual pairs in BCS (Bosnian, Croatian and Serbian), created using the principles of Linguistic Linked Open Data (LLOD). Since there is no resource that helps learners of Bosnian, Croatian and Serbian as foreign languages recognize the aspect of a verb or its pairs, we created a new resource that gives users information about aspect as well as a link to the verb's aspectual pairs. The resource also contains external links to monolingual dictionaries, Wordnet and BabelNet. ...
Ranka Stanković, Maxim Ionov, Medina Bajtarević, Lorena Ninčević. "OntoLex Publication Made Easy: A Dataset of Verbal Aspectual Pairs for Bosnian, Croatian and Serbian" in Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, Turin, 20-25 May 2024, ELRA and ICCL (2024)
E-Connecting Balkan Languages
In this paper we present a versatile language processing tool that can be successfully used for many Balkan languages. This tool relies on several sophisticated textual and lexical resources that were developed for most Balkan languages. These resources are based on several de facto standards in natural language processing.
... for Serbian, and in bilingual context, for Serbian and English. In this paper we will show that the tools WS4LR and WS4QE are truly independent both from Serbian, for which they were initially developed, and from English, which seems to be in the background of many natural language processing tools ...
... Information Science and Technology. Bucureşti: Publishing house of the Romanian academy, Vol. 7, No.1-2, 2004. [21] D. Tufiş, S. Koeva, T. Erjavec, M. Gavrilidou, and C. Krstev. Building Language Resources and Translation Models for Machine Translation focused on South Slavic and Balkan Languages ...
... for them. 2. Integrated Language Resources In order to prove the usability of WS4LR and WS4QE for languages other than Serbian and English we used various resources, both textual and lexical. In the following sections we will briefly present these resources, what methodological framework ...
Cvetana Krstev, Ranka Stanković, Duško Vitas, Svetla Koeva. "E-Connecting Balkan Languages" in Proceedings of the Workshop on Multilingual resources, technologies and evaluation for Central and Eastern European Languages, 17 September 2009, eds. C. Vertan, S. Piperidis, E. Paskaleva and Milena Slavcheva, Borovets, Bulgaria : Association for Computational Linguistics Stroudsburg, PA, USA (2009)
Towards Automatic Definition Extraction for Serbian
This paper presents preliminary results of the automatic extraction of candidates for dictionary definitions from unstructured texts in Serbian, with the aim of speeding up dictionary development. Definitions in the dictionary of the Serbian Academy of Sciences and Arts (SANU) were used to model different types of definitions (descriptive, grammatical, referential and synonymous), which have different syntactic and lexical characteristics. The research corpus consists of 61,213 noun definitions, which were analyzed using morphological e-dictionaries and local grammars implemented as finite-state transducers in the corpus processing suite with open ...
... the Association for Computational Linguistics, 4, 17-30. Jin, Y., Kan, M. Y., Ng, J. P., & He, X. (2013). Mining scientific terms and their definitions: A study of the ACL anthology. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 780-790. Tissier ...
... Natural Language Processing (EMNLP 2017), Sep 2017, Copenhague, Denmark. pp. 254-263. Navigli, R. & Velardi, P. (2010). Learning Word-Class Lattices for Definition and Hypernym Extraction. In Proceedings of the Forty-Eighth Annual Meeting of the Association for Computational Linguistics. Uppsala ...
... embeddings. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 1522-1532. Barnbrook, G. (2002). Defining Language, A local grammar of definition sentences, Studies in Corpus Linguistics, (Vol. 11). John Benjamins Publishing. Gortan Premk, D. (1980). O gramatičkoj ...
Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić. "Towards Automatic Definition Extraction for Serbian" in Proceedings of the XIX EURALEX Congress of the European Association for Lexicography: Lexicography for Inclusion (Volume 2). 7-9 September (virtual), Democritus University of Thrace (2021)
Wordnet Development Using a Multifunctional Tool
Ivan Obradović, Ranka Stanković (2007)
In this paper we present a multifunctional tool for manipulating heterogeneous language resources. The tool handles electronic dictionaries, wordnets and aligned texts, and provides for their synchronous use in various tasks. We focus here on the description of the possibilities this tool offers in the development of wordnets. Besides the wordnet module which enables parallel handling of two wordnets, other modules, such as the module for morphological dictionaries and the module for aligned texts, as well as available finite ...
... Faculty of Mining and Geology Đušina 7, 11000 Belgrade, Serbia ranka@rgf.bg.ac.yu Abstract In this paper we present a multifunctional tool for manipulating heterogeneous language resources. The tool handles electronic dictionaries, wordnets and aligned texts, and provides for their s ...
... developed on basis of PWN and the top-ontology accepted in EuroWordNet, and aligned by using ILI. From a lexicographer’s point of view, the development of a wordnet, perceived as a specific form of dictionary and hierarchical thesaurus for a particular language, opens two critical issues ...
... abandoned. However, language specific concepts were also developed for each particular wordnet, as well as a set of concepts common to BalkaNet languages and unknown to PWN [10]. Once a concept has been accepted and placed within the conceptual framework of a particular language, the lexicographer ...
Ivan Obradović, Ranka Stanković. "Wordnet Development Using a Multifunctional Tool" in Proceedings of the International Workshop Computer Aided Language Processing (CALP) '2007, Borovets, Bulgaria, September 2007, - (2007)
Terminology Acquisition and Description Using Lexical Resources and Local Grammars
Acquisition of new terminology from specific domains and its adequate description within terminological dictionaries is a complex task, especially for languages that are morphologically complex such as Serbian. In this paper we present an approach to solving this task semi-automatically on the basis of lexical resources and local grammars developed for Serbian. Special attention is given to automatic inflectional class prediction for simple adjectives and nouns and the use of syntactic graphs for extraction of Multi-Word Unit (MWU) candidates for ...
... Computational Linguistics - EACL. Rodríguez, F. M. B., Noya, E. D., Otero, P. G., Martínez, M. L., Mato, E. M. M., Rojo, G., Docío, S. S. (2007). A Corpus and Lexical Resources for Multi-word Terminology Extraction in the Field of Economy in a Minority Language. Proc. of 3rd Language & Technology ...
... being created and introduced in Serbian making important the automation of their retrieval and incorporation in Serbian terminological dictionaries. Due to specific features of Serbian grammar, especially its rich morphology, this is a complex task, and corresponding language resources in the ...
... & H. Li (Eds.), Natural Language Processing and Information Systems (Vol. 6177, pp. 248-255): Springer Berlin Heidelberg. Justeson, J. S., & Katz, S. M. (1995). Technical terminology: some linguistic properties and an algorithm for identification in text. Natural Language Engineering, 1 (01): ...
Cvetana Krstev, Ranka Stanković, Ivan Obradović, Biljana Lazić. "Terminology Acquisition and Description Using Lexical Resources and Local Grammars" in Proceedings of the 11th Conference on Terminology and Artificial Intelligence, Granada, Spain, 2015, Granada : LexiCon (Universidad de Granada) (2015)
Old or New, We Repair, Adjust and Alter (Texts)
Cvetana Krstev, Ranka Stanković (2020)
In this paper we present how e-dictionaries and cascades of finite-state transducers implemented in the Unitex tool can be used to solve three text transformation problems: correcting texts after OCR, restoring diacritics, and switching between different language variants.
Keywords: text correction, OCR errors, diacritic restoration, language variants, electronic dictionary, finite-state transducers
... problem of OCR error detection and correction is still not considered solved (see, for example, (Kolak and Resnik, 2002)), especially for more “demanding” scripts and languages (Cyrillic, Arabic, etc.). Transformation from one language variant to another is usually not perceived as an error/correction ...
... true for problems of diacritic restoration, OCR errors correction and language variants transformation. In this paper we present an approach to solving three text mending problems for Serbian: OCR errors, diacritics omission and language variant switching. The common characteristic of these problems ...
... on Human Language Technology Research, 257–262. Morgan Kaufmann Publishers Inc., 2002 Krstev, Cvetana. Processing of Serbian – Automata, Texts and Electronic dictionaries. Faculty of Philology, University of Belgrade, 2008 Krstev, Cvetana, Ranka Stanković and Duško Vitas. “Knowledge and Rule-Based ...
Cvetana Krstev, Ranka Stanković. "Old or New, We Repair, Adjust and Alter (Texts)" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.3
A bilingual digital library for academic and entrepreneurial knowledge management
A generic knowledge management process of organization, storage and retrieval of knowledge can suitably be fitted in a digital library. In the digital and knowledge age digital libraries can be used in knowledge management to handle intellectual assets and support knowledge creation. A multilingual digital library either stores content in more than one language or provides multilingual query access to monolingual content. In Serbia 18 of 308 scientific journals regularly published are bi-lingual, with papers simultaneously being in English ...... the keyword and optionally selects a text collection to search (the default is all collections). Besides the keyword itself, it is necessary to choose the keyword language, and then click on the “Preview and modify terms for query” link. The system uses web services to find synonyms and translations ...
... the language and the collection (it is possible to simultaneously search through all available text collections). The user enters the search criteria in the search field, then adds additional criteria by clicking the “+” sign. Boolean operators “OR” and “AND” build the search query. “AND” is used ...
... Assistant Professor of Mathematics and Informatics at Faculty of Mining and Geology at University of Belgrade. Her scientific field is Human Language Technologies (HLT). She is teaching several courses related to informatics (traditional, online and blended) and she is head of the Computing Centre ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Dalibor Vorkapić. "A bilingual digital library for academic and entrepreneurial knowledge management" in Proceeding of 10th International Forum on Knowledge Asset Dynamics - IFKAD 2015: Culture, Innovation and Entrepreneurship: connecting the knowledge dots, Bari, Italy, 10-12 June 2015, Bari : IFKAD (2015)
Towards translation of educational resources using GIZA++
... on quantitative linguistics (QUALICO) in Belgrade, Serbia, April 26-29, 2012. University of Belgrade, 2013. [20] D. Vitas and C. Krstev. “Construction and Exploitation of X-Serbian Bitexts”. In Cristina Vertan and Walther v. Hahn (eds.) Multilingual Processing in Eastern and Southern EU Languages: ...
... of them hard to translate into and with relatively weak machine translation (MT) support. Phrase-based and syntax-based SMT models are developed to address language diversity and support the language independent nature of the methodology. For high-quality MT and to add value to existing infrastructure ...
... ~/corpus/edX.clean 1 80 Language Model Training A language model (LM) is used to ensure fluent output, built with the target language, in our case English. The following script creates the lm folder, moves into it, and finally executes the command that builds a 3-gram language model. mkdir ~/lm cd ...
Ivan Obradović, Dalibor Vorkapić, Ranka Stanković, Nikola Vulović, Miladin Kotorčević. "Towards translation of educational resources using GIZA++" in The Seventh International Conference on e-Learning (eLearning-2016), September 2016, Belgrade : Metropolitan University (2016)
A Mathematical Learning Environment Based on Serbian Language Resources
In recent years, in line with ever growing usage of Information technology, the learning environments are changing. The amount of available learning materials in various forms has increased. These new environments demand comprehensive learning systems, which enable management of the learning corpus with special attention paid to relevant lexical resources. In this paper we present the concept of a Mathematical Learning Environment in Serbian (MLES), which is based on a corpus of mathematical materials and various lexical resources, enabling ...... for C# programming language and MVC design pattern, as well as HTML and JavaScript, whereas SQL Server served as support for the database. The application is located at http://termi.rgf.bg.ac.rs/ and consists of 5 specific units: browse, search, update, bibliography and profiles. Termi currently ...
... corpus of mathematical content and provides mechanisms for processing and search of this content. It relies on existing lexical resources, morphological e-dictionaries and WordNet of Serbian, which have been developed within the University of Belgrade Human Language Technology group for several ...
... millennium. Proceedings of the Corpus Linguistics 2011 conference. Birmingham: University of Birmingham. [15] Hardie, A. (2012). CQPweb - combining power, flexibility and usability in a corpus analysis tool. International Journal of Corpus Linguistics. 17 (3), pp. 380–409. [16] Stanković ...
Radojičić Marija, Obradović Ivan, Stanković Ranka, Utvić Miloš, Kaplar Sebastijan. "A Mathematical Learning Environment Based on Serbian Language Resources" in Proceedings of the 7th International Scientific Conference Technics and Informatics in Education, Faculty of Technical Sciences, Čačak (2018)
An Integrated Environment for Management and Exploitation of Linguistic Resources
Ranka Stanković, Ivan Obradović (2009)
... ly used for management and exploitation of linguistic resources. Both the tools and the resources were developed within the University of Belgrade Human Language Technology Group. The tools we describe are WS4LR, a software tool that has been developed and used for solving different ...
... “Improvement of Queries using a Rule Based Procedure for Inflection of Compounds and Phrases”, Polibits, Special section: Natural Language Processing, Journal of Research and Development in Computer Science and Engineering, ed. G. Sidorov (ed.), Centro Innovacion y Desarrollo Tecnologico ...
... spoken, but also from France and the Netherlands. A national development team was formed for each language, which in the case of Serbian was the University of Belgrade HLT Group. Upon the termination of this project, the development of SWN continued, and this network to date contains ...
Ranka Stanković, Ivan Obradović. "An Integrated Environment for Management and Exploitation of Linguistic Resources" in Proceedings of the International Multiconference on Computer Science and Information Technology, Computational Linguistics – Applications Workshop (CLA09), Mrągowo, Poland, October 2009, Piscataway : IEEE (2009)
Integrisano okruženje za pripremu paralelizovanog korpusa
The development of aligned corpora requires preparing parallel texts for integration into an aligned corpus. This is a complex task that can be solved in different ways and that must be carried out in several steps. This paper first describes the procedure for preparing parallel texts for the aligned corpus that is used in the Human Language Technology Group of the University of Belgrade. It then gives a brief overview of the programs (XAlign, Concordancier, WS4LR), that is, the software tools used in the process. The lack of a comfortable environment ...
... bilingual corpora, Computational Linguistics”, Vol. 19/1, pp. 75 – 102. [3] Krstev Cvetana, Ranka Stanković, Duško Vitas, Ivan Obradović (2006): “WS4LR - a Workstation for Lexical Resources”, in Proceedings of the Fifth International Conference on Language Resources and Evaluation, Genoa, Italy, May ...
... Kaalep, V. Petkevič, D. Tufiş (1998): “Multext-East: Parallel and Comparable Corpora and Lexicons for Six Central and Eastern European Languages”, in Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics COLING-ACL '98. Montréal, Québec, Canada, pp. 315-319. [5] ...
... aligned parallel corpus with 20+ languages. In Proceedings of the Fifth International Conference on Language Resources and Evaluation, LREC'06, ELRA, Paris, 2006. [6] Tomaž Erjavec: Compiling and Using the IJS-ELAN Parallel Corpus. Informatica, 26(3), pp. 299-307, 2002. SUMMARY The development ...
Ivan Obradović, Ranka Stanković, Miloš Utvić. "Integrisano okruženje za pripremu paralelizovanog korpusa" in Zbornik radova međunarodnog simpozijuma Razlike između bosanskog/bošnjačkog, hrvatskog i srpskog jezika, Graz, Austria, April 2007, - (2007)
An Italian-Serbian Sentence Aligned Parallel Literary Corpus
This article presents the construction and relevance of an Italian-Serbian sentence-aligned parallel corpus, delving into the aligned sentences in order to facilitate effective translation between the two languages. The parallel corpus serves as a valuable resource for language experts, researchers, and language enthusiasts, fostering a deeper understanding of linguistic nuances and cultural expressions. By bridging the gap between Serbian and Italian, this corpus opens new avenues for cross-cultural communication and collaboration, and ultimately contributes to the improvement of language-related ...
Saša Moderc, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić. "An Italian-Serbian Sentence Aligned Parallel Literary Corpus" in Review of the National Center for Digitization, Belgrade : Faculty of Mathematics, University of Belgrade (2023). https://doi.org/10.5281/zenodo.11203388
Using Query Expansion for Cross-Lingual Mathematical Terminology Extraction
Velislava Stoykova, Ranka Stanković (2018)
Velislava Stoykova, Ranka Stanković. "Using Query Expansion for Cross-Lingual Mathematical Terminology Extraction" in Advances in Intelligent Systems and Computing, Springer International Publishing (2018). https://doi.org/10.1007/978-3-319-91189-2_16
Претрага корпуса заснована на употреби екстерних лексичких ресурса путем веб-сервиса
The paper discusses a hybrid approach to corpus search, illustrated with the tools OCWB and NoSketch Engine applied to a specialized corpus from the mining domain (RudKor) and the Corpus of Contemporary Serbian (SrpKor). The approach combines the existing capabilities of OCWB and NoSketch Engine, which base their search on the linguistic annotation of the corpus, with new search capabilities in the form of consulting external language resources (morphological electronic dictionaries of Serbian and the lexical database Serbian WordNet). The hybrid approach is implemented by upgrading the web interfaces that these tools use ...
... Еверт-Харди 2011: Stefan Evert and Andrew Hardie, „Twenty-first Century Corpus Workbench: Updating a Query Architecture for the New Millennium”, In: Proceedings of the Corpus Linguistics 2011 Conference, Birmingham, University of Birmingham. Еверт 2019: Stefan Evert and The OCWB Development Team, CQP ...
... Belgrade. Крстев и др. 2018: Cvetana Krstev, Ranka Stanković, Duško Vitas, ”Knowledge and Rule-Based Diacritic Restoration in Serbian”, In: Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, ISSN 2367-5675 (on-line) ...
... Miljana Mladenović, Ivan Obradović, Marko Vitas, Cvetana Krstev, “Resource based WordNet augmentation and enrichment”, In: Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27–29, 2018, Sofia, Bulgaria, ISSN 2367-5675 (on-line), 104–114, http://dcl ...
Милош Утвић, Ранка Станковић, Александра Томашевић, Михаило Шкорић, Биљана Лазић. "Претрага корпуса заснована на употреби екстерних лексичких ресурса путем веб-сервиса" in Научни састанак слависта у Вукове дане - Vol. 48/3 Српски језик и његови ресурси, Међународни славистички центар, Филолошки факултет, Универзитет у Београду (2019). https://doi.org/10.18485/msc.2019.48.3.ch12
Electronic Dictionaries - from File System to lemon Based Lexical Database
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...... Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić | Proceedings of the 11th International Conference on Language Resources and Evaluation - W23 6th Workshop on Linked Data in Linguistics : Towards Linguistic Data Science (LDL-2018), LREC 2018, Miyazaki, Japan, May 7-12, 2018 | 2018 | | http://dr ...
... , G., and Krstev, C. (1993). Electronic dictionary and text processing in Serbo-Croatian. Sprache–Kommunikation–Informatik, 1:225. 10. Language Resource References Krstev, Cvetana and Vitas, Duško. (2015). Serbian Morphological Dictionary - SMD. University of Belgrade, HLT Group and Jerteh ...
... other (lexical) data and the possibility to access data by using the standardized SPARQL query language. The model presented is based on the lemon model, but some modifications and extensions were necessary to enable full migration of complex grammatical structures and numerous inflected ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić. "Electronic Dictionaries - from File System to lemon Based Lexical Database" in Proceedings of the 11th International Conference on Language Resources and Evaluation - W23 6th Workshop on Linked Data in Linguistics : Towards Linguistic Data Science (LDL-2018), LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)