The Nooj System as Module within an Integrated Language Processing Environment
Written Texts: An Overview of Resources and Basic Tools, Workshop on Balkan Language Resources and Tools, Thessaloniki, Greece, eds. S. Piperidis and V. Karkaletsis, pp. 97-104, 2003. Vossen, P. (ed.): EuroWordNet: A Multilingual Database with Lexical Semantic Networks, Kluwer Academic Publishers
... and use of lexical resources, manage the exchange of data between and among these resources, and to enable the merging of large numbers of different individual electronic resources to form large global electronic resources, so conversion of NooJ resources to LMF format (Lexical markup framework) ...
... to the addition of semantic information one lemma has to be separated in two or more lemmas, the copies of the original lemma can be made and appropriate semantic information added to each of them. Figure 6. The semantic separation of the lemma cyelo 4. Textual resources management 4.1. ...
Ranka Stanković, Duško Vitas, Cvetana Krstev. "The Nooj System as Module within an Integrated Language Processing Environment" in Proceedings of the 2007 International Nooj Conference, Cambridge Scholars Publishing (2008)
WS4LR - a Workstation for Lexical Resources
... workstation for lexical resources, a software tool developed within the Human Language Technology Group at the Faculty of Mathematics, University of Belgrade. The tool is aimed at manipulating heterogeneous lexical resources, and the need for such a tool came from the large volume of resources the Group ...
... Human Language Technology group at the Faculty of Mathematics has been developing various lexical resources over quite a long period, reaching a considerable volume to date. Given the fact that these resources have been developed for many years, they have naturally been conceived within different ...
Cvetana Krstev, Ranka Stanković, Duško Vitas, Ivan Obradović. "WS4LR - a Workstation for Lexical Resources" in Proceedings of the Fifth International Conference on Language Resources and Evaluation, Genoa, Italy, May 2006, ELRA - European Language Resources Association (2006)
Combining Heterogeneous Lexical Resources
Vitas, D. et al. (2003). Resources and Basic Tools for the Processing of Serbian Written Texts. Proc. of the Workshop on Balkan Language Resources, 1st Balkan Conference in Informatics. - Vossen, P. (ed.) (1998). EuroWordNet: A Multilingual Database with Lexical Semantic Networks. Dordrecht: Kluwer
... traditional lexical resources can not be directly used for the production of electronic resources, and almost none exist in electronic form, the Serbian resources presented in this paper have been manually produced, checked and double checked. Our standpoint is that only when reliable lexical resources ...
A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others (2020)
Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages ...
... notoriously requiring data such as neural networks. Our resources are publicly available at https://github.com/elexis-eu/MWSA. Keywords: lexical semantic resources, sense alignment, lexicography, language resource 1. Introduction Lexical semantic resources (LSRs) are knowledge repositories that provide ...
... (2012) present UKB–a large-scale lexical-semantic resource containing pairwise sense alignments between a subset of nine resources in English and German which are mapped to a uniform representation. For Danish, aligning senses across modern lexical resources has been carried out in several projects ...
... Wiktionary. On the other hand, there are a fewer number of manually aligned monolingual resources in other languages. For instance, there have been considerable efforts in aligning lexical semantic resources (LSRs) in German, particularly, the GermaNet–the German Wordnet (Hamp and Feldweg, 1997) ...
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others. "A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment" in Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, European Language Resources Association (ELRA) (2020)
Vebran Web Services for Corpus Query Expansion
Ranka Stanković, Miloš Utvić (2020)
This paper discusses the development of the Vebran web services and their application in improving corpus search. The Vebran web services are used to consult external lexical resources for Serbian (mainly electronic morphological dictionaries and the Serbian WordNet) and to expand user queries in order to obtain more relevant results from Serbian corpora.
... external lexical resources. The presented approach allows search results to include the inflectional paradigm of the lexemes in a given user query, as well as the word forms semantically related to them (synonyms, antonyms, hyperonyms, etc.) where semantic relations are available through the semantic network ...
... allowed to send a request specifying the term X and the relation (semantic or morphological) which should exist between X and the requested lemmas or word forms. Based on the client’s request, Vebran services consult external lexical resources (see Section 3) and generate a response to the client. Communication ...
... language resources and 2) to enable querying language resources supported with available lexical resources. The language resources to be searched are various digital libraries and corpora, but in this paper, we will focus on corpus case study. The query expansion will rely on different lexical resources ...
Ranka Stanković, Miloš Utvić. "Vebran Web Services for Corpus Query Expansion" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.5
A Description of Morphological Features of Serbian: a Revision using Feature System Declaration
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...
ISO/TC 37/SC 4. ISO. (2009) ISO 12620 Terminology and other language and content resources – Data Categories – Specification of data categories and management of a data category registry for language resources
... values 2 and 4 correspond to g and a, respectively.) The last line in the above description gives important additional information: nouns marked by a semantic marker +Const in an e-dictionary do not inflect (e.g. feminine personal names of foreign origin Karmen and numeral nouns dvadesetak ‘approximately ...
Cvetana Krstev, Ranka Stanković, Duško Vitas. "A Description of Morphological Features of Serbian: a Revision using Feature System Declaration" in Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010, Valletta, Malta : European Language Resources Association (2010)
The Dictionary of the Serbian Academy: from the Text to the Lexical Database
In this paper we discuss the project of digitization of the Dictionary of the Serbo-Croatian Standard and Vernacular Language. Scanning and character recognition were a particular challenge, since various non-standard character set encodings were used in the course of the almost 60-year long production of the dictionary. The first aim of the project was to formalize the micro-structure of the dictionary articles in order to parse the digitized text and transform it into structured data stored in a relational lexical database. This approach
... preferably use the same or compatible formal structure and markup language. This development led to further linking of lexical data and their integration with semantic resources, such as ontologies (McCrae et al., 2011). The DSA is rather special compared to similar dictionaries for other languages: ...
... 3.2 Dictionary markers Besides various semantic, accentual and grammatical (phonetic, morphological and, more recently, syntactic) information, the DSA also includes indications of the normative, functional, stylistic and socio-historical status of the lexical entries, as well as their spatial and temporal ...
Ranka Stanković, Rada Stijović, Duško Vitas, Cvetana Krstev, Olga Sabo. "The Dictionary of the Serbian Academy: from the Text to the Lexical Database" in Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts, Ljubljana : Ljubljana University Press, Faculty of Arts (2018)
Terminological and lexical resources used to provide open multilingual educational resources
... system consists of several software components administering at the same time language resources: grammars, lexical and textual resources (Image 1). 4. LEXICAL RESOURCES Morphological dictionaries are meant to be used by computers in the process of query expansion. Their usage is necessary because ...
... a brief history and current state of the art of terminological resources are presented, followed by an overview of BAEKTEL (Blending Academic and Entrepreneurial Knowledge in Technology enhanced learning) resources, lexical resources, the process of terminology extraction and a presentation of TERMI ...
Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić. "Terminological and lexical resources used to provide open multilingual educational resources" in The Seventh International Conference on eLearning (eLearning-2016), 29-30 September 2016, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2016)
Multi-word Expressions for Abusive Speech Detection in Serbian
This paper presents research on refining and improving the Serbian version of the Hurtlex dictionary, a multilingual lexicon of offensive words. We pay special attention to adding multi-word expressions (polylexemic units) that can be considered offensive, since such lexical entries are very important for achieving good results in a multitude of abusive language detection tasks. Serbian morphological dictionaries are used as a basis for data cleaning and dictionary creation. A connection to other lexical and semantic resources in Serbian is highlighted, and the construction of a system for ...
... as abusive, as such lexical entries are very important in obtaining good results in a plethora of abusive language detection tasks. We use Serbian morphological dictionaries as a basis for data cleaning and MWE dictionary creation. A connection to other lexical and semantic resources in Serbian is outlined ...
Andrea Esuli and Fabrizio Sebastiani. 2006. SENTIWORDNET: A publicly available lexical resource for opinion mining. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC'06), Genoa, Italy
Biljana Lazić and Mihailo Škorić. From DELA based dictionary to Leximirka lexical database. Jelena Mitrović, Miljana Mladenović, and Cvetana Krstev. 2015. Adding MWEs to Serbian lexical resources using crowdsourcing. In poster presented at The 5th PARSEME general meeting. Iași, Romania
A Data Driven Approach for Raw Material Terminology
The research presented in this paper aims at creating a bilingual (sr-en), easily searchable, hypertext, born-digital, corpus-based terminological database of raw material terminology for dictionary production. The approach is based on linking dictionaries related to the raw material domain, both digitally born and printed, into a lexicon structure, aligning terminology from different dictionaries as much as possible. This paper presents the main features of this approach, data used for compilation of the terminological database, the procedure by which it has
... this goal is to adopt the Linked (Open) Data (LOD) paradigm for publishing lexical resources, that is, to use URIs for unambiguously identifying lexical entries, their components and their relations in the web of data—to make lexical datasets accessible via http(s), to publish them in accordance with W3 ...
... on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 7–12 May 2018; European Language Resources Association (ELRA): Miyazaki, Japan, 2018. 40. Šandrih, B.; Krstev, C.; Stanković, R. Two approaches to compilation of bilingual multi-word terminology lists from lexical resources. Nat. Lang ...
Olivera Kitanović, Ranka Stanković, Aleksandra Tomašević, Mihailo Škorić, Ivan Babić, Ljiljana Kolonja. "A Data Driven Approach for Raw Material Terminology" in Applied Sciences, MDPI AG (2021). https://doi.org/10.3390/app11072892
Building Terminological Resources in an e-Learning Environment
... or multilingual terminological resources, corresponding terms are usually linked by appropriate mechanisms. In taxonomies semantic relations between terms are introduced, or more precisely, between concepts represented by specific terms. The elementary semantic relationship is the hypernym/hyponym ...
... of terms Taxonomy A hierarchical organization of terms RudOnto Figure 1: Semantic scale of terminological resources Although currently realized essentially as a taxonomy, RudOnto includes some semantic relations besides hypernymy/hyponymy, thus displaying some characteristics of a thesaurus ...
... core of the Web of knowledge – the Semantic Web, which can serve as an invaluable e-learning tool [4]. In the Semantic Web ontologies are integrated into repositories of learning objects, with the aim of organizing different concepts stored within these resources in what is known as “knowledge domain ...
Ranka Stanković, Ivan Obradović, Olivera Kitanović, Ljiljana Kolonja. "Building Terminological Resources in an e-Learning Environment" in Proceedings of the Third International Conference on e-Learning, eLearning-2012, September 2012, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2012)
Towards a Mining Equipment Ontology
... 2. TERMINOLOGICAL RESOURCES 2.1 Semantic scale of terminological resources In general, terminological resources in e-format can be organized in different ways and serve different purposes. Depending on their content, structure, and organization, terminological resources can be classified as ...
... relevant terminological resources in electronic format, preferably including relations of a semantic nature between terms. The simplest semantic relations are those between general and specific terms, such as coal mine, and open pit, as a specific type of coal mines. Such semantic relations result in ...
... organization of terms RudOnto Figure 1. Semantic scale of terminological resources The boundary between a thesaurus and an ontology is somewhat blurred, and some authors even consider ontologies and thesauruses as the same type of resources, with the difference resulting only from their ...
Ranka Stanković, Ivan Obradović, Olivera Kitanović, Ljiljana Kolonja. "Towards a Mining Equipment Ontology" in Proceedings of the 12th International Conference Research and Development in Mechanical Industry, RaDMI 2012, September 2012, Vrnjačka Banja, Serbia no. 1, Vrnjačka Banja, Serbia : SaTCIP (Scientific and Technical Center for Intellectual Property) Ltd. (2012)
Keyword-Based Search on Bilingual Digital Libraries
This paper outlines the main features of Biblisha, a tool that offers various possibilities of enhancing queries submitted to large collections of aligned parallel text residing in a bilingual digital library. Biblisha supports keyword queries as an intuitive way of specifying information needs. The keyword queries initiated, in Serbian or English, can be expanded semantically, morphologically, and into the other language, using different supporting monolingual and bilingual resources. Terminological and lexical resources are of various types, such as wordnets, electronic ...
Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović. "Keyword-Based Search on Bilingual Digital Libraries" in Semantic Keyword-Based Search on Structured Data Sources - Second COST Action IC1302 International KEYSTONE Conference, IKC 2016, Springer (2017). https://doi.org/10.1007/978-3-319-53640-8_10
Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis
This paper presents a model that enables the collection, preparation, metadata description, management, and exploitation, including full-text search, of documents from the domain of criminalistics written in Serbian. The proposed approach is applied to a web portal that gathers various texts originating from the journals of the Academy of Criminalistic and Police Studies, the Criminal Code of Serbia, the "Tara" and "Reiss" conferences, as well as from some doctoral dissertations related to this field of research. After text processing, a corpus containing over 5,500 pages of plain text was created and ...
... APPLICATIONS FOR LINGUISTIC RESOURCES The linguistics and lexical resources used for query expansion and text analysis are depicted in Figure 3 on the left, while on the right are main application components of the language support system. Main lexical resources include morphological dictionaries ...
Obradović, "The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines", in Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco, 28-30 May 2008, European Language Resources Association (ELRA), 2008 ...
Dalibor Vorkapić, Aleksandra Tomašević, Miljana Mladenović, Ranka Stanković, Nikola Vulović. "Digital Library From A Domain Of Criminalistics As A Foundation For A Forensic Text Analysis" in International Scientific Conference “Archibald Reiss Days” Thematic Conference Proceedings Of International Significance, Belgrade, 7-9 November 2017, Academy Of Criminalistic And Police Studies Belgrade (2017)
On the compatibility of lexical resources for NooJ
Lexical resources for many languages are provided for the NooJ linguistic development environment. Meta-data descriptions of morphosyntactic and semantic properties of these languages and their resources are a mandatory part of each language module. In this paper we analyze how well the meta-data actually describe resources for a chosen subset of languages and to what extent they are compatible across languages to support multilingual processing. We show that there is room for improvement in both directions.
... results of lexical analysis of the application of NooJ resources to aligned texts. Finally, a section is dedicated to some related issues of compatibility and standardization. The paper ends with concluding remarks. Comparison of annotation systems Morphological, syntactic and semantic information ...
... information in NooJ resources is represented by codes or tags, pertaining to morpho-syntactic and semantic categories, their properties and the features or values of these properties. These codes, such as N for nouns, Top for toponyms, Hyd for hydronyms, can be assigned to lexical entries and subsequently ...
Ranka Stanković, Miloš Utvić, Duško Vitas, Cvetana Krstev, Ivan Obradović. "On the compatibility of lexical resources for NooJ" in Automatic Processing of Various Levels of Linguistic Phenomena: Selected Papers from the 2011 International Nooj Conference, Cambridge Scholars Publishing (2012): 96-108
Softverski alati za korišćenje resursa za srpski jezik
Ivan Obradović, Ranka Stanković (2008)
... lexicon” which could be used in psycholinguistic research projects. PWN, the lexical data base that materializes the semantic network of concepts for English, is based on the representation of each concept by a set of synonymous word-sense pairs ...
... Serbian (SMD). Another highly important and developed resource is the Serbian wordnet (SWN), a lexical database representing the semantic network of words in Serbian. Within this group of resources, the multilingual ontological dictionary of proper names Prolex should also be mentioned. Besides ...
... description of lexical resources for Serbian, in Section Three the main functionalities of the WS4LR tool, and in Section Four some possibilities offered by the web application WS4QE. 2 Lexical resources In this Section we will give a brief description of some of the most important lexical resources ...
Ivan Obradović, Ranka Stanković. "Softverski alati za korišćenje resursa za srpski jezik" in INFOteka: časopis za informatiku i bibliotekarstvo, Belgrade, Serbia : Zajednica biblioteka univerziteta u Srbiji (2008)
Managing mining project documentation using human language technology
Purpose: This paper aims to develop a system, which would enable efficient management and exploitation of documentation in electronic form, related to mining projects, with information retrieval and information extraction (IE) features, using various language resources and natural language processing. Design/methodology/approach: The system is designed to integrate textual, lexical, semantic and terminological resources, enabling advanced document search and extraction of information. These resources are integrated with a set of Web services and applications, for different user profiles and use-cases. Findings: The ...
Keywords: Digital libraries, Information retrieval, Data mining, Human language technologies, Project documentation
Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja. "Managing mining project documentation using human language technology" in The Electronic Library (2018). https://doi.org/10.1108/EL-11-2017-0239
E-Connecting Balkan Languages
In this paper we present a versatile language processing tool that can be successfully used for many Balkan languages. This tool relies for its work on several sophisticated textual and lexical resources that were developed for most Balkan languages. These resources are based on several de facto standards in natural language processing.
... basis of the incorporated lexical resources [9]. The new tool WS4QE (shortened for Work Station for Query Expansion) was developed on the basis of WS4LR that enables expansion of queries submitted to the Google search machine [10]. The integrated lexical resources enable modifications of users ...
... them. 2. Integrated Language Resources In order to prove the usability of WS4LR and WS4QE for languages other than Serbian and English we used various resources, both textual and lexical. In the following sections we will briefly present these resources, what methodological framework was ...
et al. Combining Heterogeneous Lexical Resources, in Proc. of the Fourth International Conference LREC, Lisbon, Portugal, May 2004, vol. 4, pp. 1103-1106, 2004. [9] C. Krstev, R. Stanković, D. Vitas, I. Obradović. WS4LR: A Workstation for Lexical Resources, Proceedings of the 5th International
LRMI markup of OER content within the BAEKTEL project
... in more detail in section 2. In section 3 a review of semantic annotation implementation with examples of resource tagging is given. Section 4 of this paper outlines the key aspects of the LRMI standard for describing educational resources, including metadata schema and implementation of LRMI ...
... that powers edX courses. The approach for semantic annotation, respectively markup of edX.BAEKTEL resources is given in section 6. Last section 7 gives conclusions, followed by expectation of benefit and future implementation ...
... analiza teksta about: konačni automati Apart from edX resources, other OER published within BAEKTEL platform will be annotated as well in similar way. 7. CONCLUSION The future of the web, what some call Web 3.0, is based on semantic search and algorithms that will help machines make sense ...
Ranka Stanković, Daniela Carlucci, Olivera Kitanović, Nikola Vulović, Bojan Zlatić. "LRMI markup of OER content within the BAEKTEL project" in The Sixth International Conference on e-Learning (eLearning-2015), September 2015, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2015)
A Mathematical Learning Environment Based on Serbian Language Resources
In recent years, in line with ever growing usage of Information technology, the learning environments are changing. The amount of available learning materials in various forms has increased. These new environments demand comprehensive learning systems, which enable management of the learning corpus with special attention paid to relevant lexical resources. In this paper we present the concept of a Mathematical Learning Environment in Serbian (MLES), which is based on a corpus of mathematical materials and various lexical resources, enabling semantic search of mathematical content. A specific ...
... structures with well-defined symbols. This type of support for Serbian is still not available. Existing Serbian lexical resources and tools enable efficient text search, including semantic and morphological expansion of user queries, the latter being very important in highly inflective languages ...
... tool developed within this group that greatly enhances the potential of manipulating each particular lexical resource as well as several resources simultaneously [10]. Although the resources and tools have already been successfully used for a number of various language processing related tasks ...
Radojičić Marija, Obradović Ivan, Stanković Ranka, Utvić Miloš, Kaplar Sebastijan. "A Mathematical Learning Environment Based on Serbian Language Resources" in Proceedings of the 7th International Scientific Conference Technics and Informatics in Education, Faculty of Technical Sciences, Čačak (2018)