WS4LR - a Workstation for Lexical Resources
... workstation for lexical resources, a software tool developed within the Human Language Technology Group at the Faculty of Mathematics, University of Belgrade. The tool is aimed at manipulating heterogeneous lexical resources, and the need for such a tool came from the large volume of resources the Group ...
Human Language Technology group at the Faculty of Mathematics has been developing various lexical resources over quite a long period, reaching a considerable volume to date. Given the fact that these resources have been developed for many years, they have naturally been conceived within different
The Nooj System as Module within an Integrated Language Processing Environment
... and use of lexical resources, manage the exchange of data between and among these resources, and to enable the merging of large numbers of different individual electronic resources to form large global electronic resources, so conversion of NooJ resources to LMF format (Lexical Markup Framework) ...
... This environment named WS4LR (WorkStation for Lexical Resources) has been developed within the Human Language Technology Group (HLT) at the Faculty of Mathematics, University of Belgrade, and is aimed at manipulating heterogeneous lexical resources developed in the course of many years and within ...
XML and table output by application of different lexical resources. After selecting the type of noojapply.exe usage the user can choose the dictionaries and morphological grammars that he wishes to apply from a list of available lexical resources. Next, one or more text files or corpus should be
Combining Heterogeneous Lexical Resources
... traditional lexical resources cannot be directly used for the production of electronic resources, and almost none exist in electronic form, the Serbian resources presented in this paper have been manually produced, checked and double-checked. Our standpoint is that only when reliable lexical resources ...
also present an integrated programming tool that enables the integration of these diverse lexical resources, as well as possible applications. We envisage the use of these resources in defining and linking lexical data in a way that will enable their more effective retrieval, integration, and reuse across
A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others (2020) Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages ... ... notoriously requiring data such as neural networks. Our resources are publicly available at https://github.com/elexis-eu/MWSA. Keywords: lexical semantic resources, sense alignment, lexicography, language resource 1. Introduction Lexical semantic resources (LSRs) are knowledge repositories that provide ...
... Section 6. 2. Related work Aligning senses across lexical resources has been attempted in several lexicographical milieus over the recent years. Such resources mainly include open-source dictionaries, WordNet and collaboratively-curated resources, such as Wikipedia. The latter has been shown to be ...
... (2012) present UKB–a large-scale lexical-semantic resource containing pairwise sense alignments between a subset of nine resources in English and German which are mapped to a uniform representation. For Danish, aligning senses across modern lexical resources has been carried out in several projects ... Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others. "A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment" in Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, European Language Resources Association (ELRA) (2020)
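The sense-level annotation described in this entry can be pictured with a small data structure. The following is only an illustrative sketch: the class and field names are hypothetical and are not taken from the MWSA dataset or its repository; only the four relation names come from the abstract above.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    # The four relation types named in the abstract.
    EQUIVALENCE = "equivalence"
    BROADNESS = "broadness"
    NARROWNESS = "narrowness"
    RELATEDNESS = "relatedness"


@dataclass(frozen=True)
class SenseAlignment:
    """One aligned pair of senses for a lemma (hypothetical layout)."""
    lemma: str
    sense_a: str          # sense text from the first dictionary
    sense_b: str          # sense text from the second dictionary
    relation: Relation    # how sense_a relates to sense_b


# Broadness and narrowness swap when the pair is read in the other direction;
# equivalence and relatedness are symmetric.
_INVERSE = {
    Relation.BROADNESS: Relation.NARROWNESS,
    Relation.NARROWNESS: Relation.BROADNESS,
}


def invert(alignment: SenseAlignment) -> SenseAlignment:
    """Return the same alignment seen from the second dictionary's side."""
    return SenseAlignment(
        lemma=alignment.lemma,
        sense_a=alignment.sense_b,
        sense_b=alignment.sense_a,
        relation=_INVERSE.get(alignment.relation, alignment.relation),
    )
```

Storing the relation as directed (from `sense_a` to `sense_b`) is one possible design choice; it keeps a single record per pair while still allowing either dictionary to be treated as the starting point.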
Vebran Web Services for Corpus Query Expansion
Ranka Stanković, Miloš Utvić (2020) This paper discusses the development of the Vebran web services and their application in improving corpus search. The Vebran web services are used to consult external lexical resources for Serbian (mainly electronic morphological dictionaries and the Serbian WordNet) and to expand user queries in order to obtain more relevant results from Serbian corpora. ... language resources and 2) to enable querying language resources supported with available lexical resources. The language resources to be searched are various digital libraries and corpora, but in this paper, we will focus on corpus case study. The query expansion will rely on different lexical resources ...
... Engine in order to enable corpus query expansion. 3 Lexical resources In order to improve the current corpus search capabilities based on linguistic annotation, it is necessary to consult external lexical resources. The following lexical resources have been developed for Serbian by the HLT Group at ...
support query expansion based on lexical resources. Sections 2 and 3 describe language resources for Serbian, corpora that we can search and lexical resources that Natural Language Processing
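The idea of expanding a corpus query with lexical resources can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Vebran API: the function name, the toy morphological dictionary, and the toy synonym sets are all hypothetical stand-ins for the Serbian morphological e-dictionaries and the Serbian WordNet.

```python
# Toy morphological dictionary: lemma -> inflected forms (hypothetical sample data).
MORPH_DICT = {
    "rudnik": ["rudnik", "rudnika", "rudniku", "rudnikom", "rudnici", "rudnike"],
    "ruda": ["ruda", "rude", "rudi", "rudu", "rudom"],
}

# Toy synonym sets standing in for wordnet synsets (hypothetical sample data).
SYNSETS = [{"rudnik", "kop"}]


def expand_query(term, semantic=False):
    """Return the sorted set of word forms to search for in the corpus."""
    # Step 1: map the user's term to its lemma(s), if the dictionary knows it.
    lemmas = {term}
    for lemma, forms in MORPH_DICT.items():
        if term in forms:
            lemmas.add(lemma)
    # Step 2 (optional): add lemmas that share a synonym set with any found lemma.
    if semantic:
        for synset in SYNSETS:
            if lemmas & synset:
                lemmas |= synset
    # Step 3: replace every lemma with all of its inflected forms;
    # words missing from the dictionary are kept unchanged.
    expanded = set()
    for lemma in lemmas:
        expanded.update(MORPH_DICT.get(lemma, [lemma]))
    return sorted(expanded)


if __name__ == "__main__":
    print(expand_query("rudnika"))
    print(expand_query("rudnik", semantic=True))
```

In this sketch a query for one inflected form retrieves all forms of the same lemma, and the optional semantic step adds synonyms; a real service would also need to handle homographs and multi-word units.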
Development of Open Educational Resources (OER) for Natural Language Processing
In this paper we present the development of an online course at the edX BAEKTEL platform named “Lexical Recognition in the Natural Language Processing (NLP)”. It is based on the course of the same name for PhD studies at the University of Belgrade, Faculty of Philology. There are not many courses in Computational Linguistics (CL) on OER platforms, and there is none in Serbian either for CL or NLP. We have developed this course in order to improve this ...... the necessary knowledge to use the existing resources for NLP for Serbian and to develop new ones. Keywords: E-Learning, Open Educational Resources, Computational Linguistics, Lexical Resources, edX 1. INTRODUCTION Open educational resources (OER) publicly available on the web are growing ...
... consider complex rules for MWU inflection in Serbian. 10. The use of powerful morphological mode is presented that enables the use of lexical resources at sub-word level, as well as the use of information from e-dictionaries for output transformations by transducers. More types of variables ...
... CONCLUSION We hope that the developed OER for lexical recognition in NLP will be used in order to reduce the lack of similar courses. We hope that participants will easily acquire the necessary knowledge to use the existing resources for NLP for Serbian and that the number of resource ... Cvetana Krstev, Biljana Lazić, Ranka Stanković, Giovanni Schiuma, Miladin Kotorčević. "Development of Open Educational Resources (OER) for Natural Language Processing" in The Sixth International Conference on e-Learning (eLearning-2015), September 2015, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2015)
A Description of Morphological Features of Serbian: a Revision using Feature System Declaration
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...... ISO/TC 37/SC 4. ISO. (2009) ISO 12620 Terminology and other language and content resources – Data Categories – Specification of data categories and management of a data category registry for language resources Kešelj, V., Kešelj, T., and Zlatić, L. (2004). R{j}ecnik.com: English-Serbo-Croatian ...
... Workshop, Bratislava, Slovakia, 15-16 April, 2009. Metalanguage and encoding scheme design for digital lexicography : innovative solutions for lexical entry design in Slavic lexicography: proceedings. Bratislava: Ľ. Štúr Institute of Linguistics, Slovak Academy of Sciences, 2009, pp. 59-70. ...
Laporte, E. and Monceaux, A. (1999). "Elimination of lexical ambiguities by grammars. The ELAG system", Lingvisticae Investigationes XXII, Amsterdam-Philadelphie : Benjamins, pp. 341-367.
The Dictionary of the Serbian Academy: from the Text to the Lexical Database
In this paper we discuss the project of digitization of the Dictionary of the Serbo-Croatian Standard and Vernacular Language. Scanning and character recognition were a particular challenge, since various non-standard character set encodings were used in the course of the almost 60-year long production of the dictionary. The first aim of the project was to formalize the micro-structure of the dictionary articles in order to parse the digitized text and transform it into structured data stored in a relational lexical database. This approach ... ... Spohr, D., & Cimiano, P. (2011). Linking lexical resources and ontologies on the semantic web with lemon. In Extended Semantic Web Conference Springer, Berlin, Heidelberg, pp. 245-259. Monachini, M. & Khan, A. F. (2018). Towards the Construction of a Lexical Data and Technology Ecosystem: The Experience ...
... the developed model and software solution can be successfully used for the other volumes as well. Keywords: computer lexicography, lexical database, language resources, dictionary, Serbian language 1 Introduction The first volume of the Dictionary of the Serbo-Croatian Standard and Vernacular Language ...
... preferably use the same or compatible formal structure and markup language. This development led to further linking of lexical data and their integration with semantic resources, such as ontologies (McCrae et al., 2011). The DSA is rather special compared to similar dictionaries for other languages: ... Ranka Stanković, Rada Stijović, Duško Vitas, Cvetana Krstev, Olga Sabo. "The Dictionary of the Serbian Academy: from the Text to the Lexical Database" in Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts, Ljubljana : Ljubljana University Press, Faculty of Arts (2018)
Terminological and lexical resources used to provide open multilingual educational resources
... system consists of several software components administering at the same time language resources: grammars, lexical and textual resources (Image 1). 4. LEXICAL RESOURCES Morphological dictionaries are meant to be used by computers in the process of query expansion. Their usage is necessary because ...
a brief history and current state of the art of terminological resources are presented, followed by an overview of BAEKTEL (Blending Academic and Entrepreneurial Knowledge in Technology enhanced learning) resources, lexical resources, the process of terminology extraction and a presentation of TERMI
Речници у дигиталном добу - информатичка подршка за српски језик (Dictionaries in the Digital Age: Information Technology Support for the Serbian Language)
An Approach to Efficient Processing of Multi-Word Units
Efficient processing of Multi-Word Units in the course of development of morphological MWU dictionaries is not easy to achieve, especially when languages with complex morphological structures are concerned, such as Serbian. Manual development of this type of dictionaries is a tedious and extremely slow process. To alleviate this problem we turned to our multipurpose software tool, dubbed LeXimir, in the production of lemmas for e-dictionaries of multi-word units. In addition to that, we developed a procedure aimed at making ... ... and WNDictAuto.dll (Fig. 2). For communication with lexical resources LeXimir makes use of the NlpQuery.dll module. Modular organization of components provides two obvious benefits. In the first place, it enables the use of various resources in any part of the system, wherever they are needed. ...
... Proc. of the Workshop on Multilingual Resources, Technologies and Evaluation for Central and Eastern European Languages — RANLP09, pp. 23–29. Borovetz, Bulgaria (2009) 9. Krstev, C., Stanković, R., Vitas, D., Obradović, I.: The Usage of Various Lexical Resources and Tools to Improve the Performance ...
... point is that it seems that the identification and extraction of MWUs has attracted more attention of researchers than their lexical representation. Various approaches to lexical representation of MWUs were analyzed in detail by Savary [16]. Slavic languages are analyzed in [14] and arguments are presented ... Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas. "An Approach to Efficient Processing of Multi-Word Units" in Computational Linguistics - Applications, Studies in Computational Intelligence 458 no. 458, Berlin Heidelberg : Springer-Verlag (2013): 109-129. https://doi.org/10.1007/978-3-642-34399-5_6
Multi-word Expressions for Abusive Speech Detection in Serbian
This paper presents research on refining and improving the Serbian version of the Hurtlex dictionary, a multilingual lexicon of offensive words. We pay special attention to adding multi-word expressions that can be considered offensive, because such lexical entries are very important for achieving good results in a multitude of abusive language detection tasks. Serbian morphological dictionaries are used as a basis for data cleaning and dictionary creation. A connection to other lexical and semantic resources in Serbian is outlined, and the construction of a system for ... ... as abusive, as such lexical entries are very important in obtaining good results in a plethora of abusive language detection tasks. We use Serbian morphological dictionaries as a basis for data cleaning and MWE dictionary creation. A connection to other lexical and semantic resources in Serbian is outlined ...
... Fabrizio Sebastiani. 2006. SENTIWORDNET: A publicly available lexical resource for opinion mining. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy, May. European Language Resources Association (ELRA). Njagi Dennis Gitari, Zhang Zuping, ...
... 1621–1622. Biljana Lazić and Mihailo Škorić. From dela based dictionary to leximirka lexical database. Jelena Mitrović, Miljana Mladenović, and Cvetana Krstev. 2015. Adding mwes to serbian lexical resources using crowdsourcing. In poster presented at The 5th PARSEME general meeting. Iași, Romania ... Ranka Stanković, Jelena Mitrović, Danka Jokić, Cvetana Krstev. "Multi-word Expressions for Abusive Speech Detection in Serbian" in Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, Association for Computational Linguistics (2020)
A Data Driven Approach for Raw Material Terminology
Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja (2021) The research presented in this paper aims at creating a bilingual (sr-en), easily searchable, hypertext, born-digital, corpus-based terminological database of raw material terminology for dictionary production. The approach is based on linking dictionaries related to the raw material domain, both digitally born and printed, into a lexicon structure, aligning terminology from different dictionaries as much as possible. This paper presents the main features of this approach, data used for compilation of the terminological database, the procedure by which it has ... Keywords: raw materials, mining, terminology, dictionary, terminological application, mobile application, digitization, lexical data, corpora, linked open data ... this goal is to adopt the Linked (Open) Data (LOD) paradigm for publishing lexical resources, that is, to use URIs for unambiguously identifying lexical entries, their components and their relations in the web of data—to make lexical datasets accessible via http(s), to publish them in accordance with W3 ...
... on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 7–12 May 2018; European Language Resources Association (ELRA): Miyazaki, Japan, 2018. 40. Šandrih, B.; Krstev, C.; Stanković, R. Two approaches to compilation of bilingual multi-word terminology lists from lexical resources. Nat. Lang ...
... developing this system, a data driven approach is adopted, relying on available textual, lexical and terminological resources, both in printed and electronic form. Within the development of this system, printed resources, the paper dictionaries covering raw material terminology, were subjected to systematic ... Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja. "A Data Driven Approach for Raw Material Terminology" in Applied Sciences, MDPI AG (2021). https://doi.org/10.3390/app11072892
An Approach to Development of Bilingual Lexical Resources
... 102 language resources such as grammars in the form of finite automata and transducers, as well as various lexical resources. Bibliša is able to expand search queries both morphologically and semantically, as well as to another language. One type of lexical resources, morphological e-d ...
Ranka Stanković, Ivan Obradović, Aleksandra Trtovac. "An Approach to Development of Bilingual Lexical Resources" in Proceedings of the Fifth Balkan Conference in Informatics BCI 2012, Workshop on Computational Linguistics and Natural Language Processing of Balkan Languages – CLoBL 2012, September 2012, Novi Sad : BCI (2012)
Open Educational Resources in Serbia
... incorporate knowledge from various language and lexical resources. She is head of the Computer Centre for the Mining department, Chair of Technical Committee A037 Terminology at the Institute for Standardisation of Serbia and vice president of the Language Resources and Technologies Society (JERTEH). She actively ...
... OPEN EDUCATIONAL RESOURCES IN SERBIA AUTHOR(s) - Ivan Obradović, Ranka Stanković, Marija Blagojević, Danijela Milošević Abstract: This chapter provides a review of open educational resources in Serbia. It covers different aspects of open educational resources: policy, resources, licenses, ...
current state of open educational resources development and implementation in Serbia. Analysis of the results shows an affirmative direction of open educational resources implementation in Serbia and future possibilities. Key words: Open educational resources, BAEKTEL, metadata portal
Using Metadata For Content Indexing Within An OER Network
Ranka Stanković, Olivera Kitanović, Ivan Obradović, Roberto Linzalone, Giovanni Schiuma, Daniela Carlucci (2014) ... by scoring resources against keywords on basis of user search activity; Preselected groups of resources; Resource access level permissions by user group; Multilingual, allowing the user to change the language with most major languages supported; Automatic thumbnail creation for resources; Minimal hosting ...
... for a metadata portal indexing open educational resources within a network of institutions. The network is aimed at blending academic and entrepreneurial knowledge, by enabling higher education institutions to publish various academic learning resources e.g. video lectures, course planning materials ...
... corresponding metadata portal described in this paper is to provide structured access to information on open educational resources within the network. Keywords: OER, Open educational resources, metadata, TEL, Technology enhanced learning 1. INTRODUCTION Due to intense technological development there is a ... Ranka Stanković, Olivera Kitanović, Ivan Obradović, Roberto Linzalone, Giovanni Schiuma, Daniela Carlucci. "Using Metadata For Content Indexing Within An OER Network" in Proceedings of the Fifth International Conference on e-Learning, eLearning 2014, September 2014, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2014)
On the compatibility of lexical resources for NooJ
... Proceedings of the 2011 International NooJ Conference 1 ON THE COMPATIBILITY OF LEXICAL RESOURCES FOR NOOJ RANKA STANKOVIĆ, MILOŠ UTVIĆ, DUŠKO VITAS, CVETANA KRSTEV AND IVAN OBRADOVIĆ Abstract Lexical resources for many languages are provided for the NooJ linguistic development environment ...
for improvement in both directions. Introduction: Motivation, resources and task Lexical resources for NooJ are now available in a considerable number of different languages. The compatibility of these monolingual resources, namely the extent to which they mutually correspond is thus becoming
Resource-based WordNet Augmentation and Enrichment
In this paper we present an approach to support production of synsets for Serbian WordNet (SerWN) by adjusting Princeton WordNet (PWN) synsets using several bilingual English-Serbian resources. PWN synset definitions were automatically translated and post-edited, if needed, while candidate literals for Serbian synsets were obtained automatically from a list of translational equivalents compiled from bilingual resources. Preliminary results obtained from a set of 1248 selected PWN synsets show that the produced Serbian synsets contain 4024 literals, out of which 2278 were offered by the system we present in this paper, whereas experts added the remaining 1746. Approximately one half of ... ... SerWN. A brief description of these resources follows. Parallel list is a simple bilingual parallel list, developed gradually from various resources and used as an auxiliary resource in WS4LR (later upgraded and dubbed LeXimir), a workstation for lexical resources we have developed (Krstev et al., 2006) ...
... solved the word sense alignment (WSA) task by pairing senses with the same meaning from different lexical-semantic resources. Besides alignment with a developed wordnet, the use of other available resources for development and enrichment of wordnets have also been proposed. Thus, Oliver and Climent (2014) ...
... automatically enriching wordnets using other available lexical resources, the successfulness of the method is strongly correlated with the comprehensiveness of the resource used in the alignment process (Hristea, 2007). Different methods and resources can be used for alignment. One of the common approaches ... Ranka Stanković, Miljana Mladenović, Ivan Obradović, Marko Vitas, Cvetana Krstev. "Resource-based WordNet Augmentation and Enrichment" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018)
Using technology for knowledge transfer between academia and enterprises
Ivan Obradović, Ranka Stanković (2014) ... structure is outlined in Figure 3, is based on electronic language resources, namely, lexical resources, textual resources and grammars. Bilingual dictionaries in electronic form are one of the simplest multilingual lexical resources. However, for their full functionality in languages with complex ...
... tool for lexical resources management and query expansion developed at FMG (Stanković et al., 2011). Besides specific tools, the TEL platform has corresponding resources, which have already been briefly described at the beginning of this section. An important place among the resources is occupied ...
... et al., 2010) are thus also part of the lexical resources used by LSS. Besides Serbian, such resources exist for many other languages, including English and Russian, which are also envisaged as OER languages within our TEL platform. Another important lexical resource offering support for multilingual ... Ivan Obradović, Ranka Stanković. "Using technology for knowledge transfer between academia and enterprises" in Knowledge and Management Models for Sustainable Growth, Proc. of IFKAD 2014, 9th International Forum on Knowledge Asset Dynamics, 11-13 June 2013, Matera, Italy, Bari : IFKAD (2014)