Search
35 items
-
A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others (2020) Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages ... ... Bibliographical References Ahmadi, S., Arcan, M., and McCrae, J. (2018). On lexicographical networks. In Workshop on eLexicography: Between Digital Humanities and Artificial Intelligence. Burgun, A. and Bodenreider, O. (2001). Comparing terms, concepts and semantic classes in WordNet and the Uni- ...
... Francisco. A. Authors’ affiliations 1Society for Danish Language and Literature (DSL), Copenhagen, Denmark {sn,tt}@dsl.dk 2Austrian Centre for Digital Humanities and Cultural Heritage, Austrian Academy of Sciences, Vienna, Austria tanja.wissik@oeaw.ac.at 3Istituto di Linguistica Computazionale “A. Zampolli– ...
... access to the Faculty's publications and to the works of its employees available in open access. - Repository search is available at www.dr.rgf.bg.ac.rs The Digital repository of The University of Belgrade Faculty of Mining and Geology archives faculty publications available in open access, as well as the employees' ... Sina Ahmadi, John P McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette S Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, [...] Ranka Stanković and others. "A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment" in Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, European Language Resources Association (ELRA) (2020)
-
Split-Desktop software for the analysis of fragment size distribution of blasted rock mass
Milanka Negovanović, Lazar Kričak, Stefan Milanović, Jovan Marković, Nikola Simić, Snežana Ignjatović (2023) Rock fragmentation is the most important indicator in assessing the effects of production blasting in surface mining. The degree of rock fragmentation has a major influence on the efficiency of the subsequent loading, hauling, crushing and grinding operations. Optimal rock fragmentation in production blasting reduces overall production costs. Reliable estimation of the fragment size of the blasted rock mass is therefore a very important issue, not only in blasting operations but in mining production as a whole. Various empirical models exist for predicting the fragment size distribution of blasted rock mass. The KUZ-RAM model enables ... Milanka Negovanović, Lazar Kričak, Stefan Milanović, Jovan Marković, Nikola Simić, Snežana Ignjatović. "Split-Desktop software for the analysis of fragment size distribution of blasted rock mass" in 9th International Conference Mining and environmental protection, Sokobanja, Serbia, 24 – 27. May 2023, Belgrade : University of Belgrade, Faculty of Mining and Geology (2023)
-
OntoLex Publication Made Easy: A Dataset of Verbal Aspectual Pairs for Bosnian, Croatian and Serbian
This paper presents a new language resource for searching and exploring verbal aspectual pairs in BCS (Bosnian, Croatian and Serbian), created using the principles of Linguistic Linked Open Data (LLOD). Since no resource exists to help learners of Bosnian, Croatian and Serbian as foreign languages recognize the aspect of a verb or its pairs, we created a new resource that gives users information about aspect as well as a link to the verb's aspectual pairs. The resource also contains external links to monolingual dictionaries, Wordnet and BabelNet. ... Ranka Stanković, Maxim Ionov, Medina Bajtarević, Lorena Ninčević. "OntoLex Publication Made Easy: A Dataset of Verbal Aspectual Pairs for Bosnian, Croatian and Serbian" in Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, Turin, 20-25 May 2024, ELRA and ICCL (2024)
-
Bridging Computational Lexicography and Corpus Linguistics: A Query Extension for OntoLex-FrAC
OntoLex, the dominant community standard for machine-readable lexical resources in the context of RDF, Linked Data and Semantic Web technologies, is currently being extended with a dedicated module for Frequency, Attestation and Corpus-based Information (OntoLex-FrAC). We propose a new component for OntoLex-FrAC that addresses the incorporation of corpus queries for (a) linking dictionaries with corpus engines, (b) enabling RDF-based web services to exchange corpus queries and response data dynamically, and (c) using conventional query languages to formalize the internal structure of collocations, word sketches and ... standardization, digital lexicography, OntoLex, corpus queries, linked data, Linguistic Linked Open Data. Christian Chiarcos, Ranka Stanković, Maxim Ionov, Gilles Sérasset. "Bridging Computational Lexicography and Corpus Linguistics: A Query Extension for OntoLex-FrAC" in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Turin, 20-25 May 2024, LREC (2024)
-
Terminology Acquisition and Description Using Lexical Resources and Local Grammars
Acquisition of new terminology from specific domains and its adequate description within terminological dictionaries is a complex task, especially for languages that are morphologically complex such as Serbian. In this paper we present an approach to solving this task semi-automatically on basis of lexical resources and local grammars developed for Serbian. Special attention is given to automatic inflectional class prediction for simple adjectives and nouns and the use of syntactic graphs for extraction of Multi-Word Unit (MWU) candidates for ...... we applied it to a collection of 74 papers in Serbian from the journal Infotheca. 6 The size of the corpus is 6 Infotheca - Journal for Digital Humanities (http://infoteka.bg.ac.rs/index.php/en/infoteca) Proceedings of the conference Terminology and Artificial Intelligence 2015 (Granada, Spain) ...
... Frantzi, K., Ananiadou, S., & Mima, H. (2000). Automatic recognition of multi-word terms: the C-value/NC-value method. International Journal on Digital Libraries, 3(2): 115-130. Gelbukh, A., Sidorov, G., Lavin-Villa, E., & Chanona-Hernandez, L. (2010). Automatic Term Extraction Using Log-Likelihood ... Cvetana Krstev, Ranka Stanković, Ivan Obradović, Biljana Lazić. "Terminology Acquisition and Description Using Lexical Resources and Local Grammars" in Proceedings of the 11th Conference on Terminology and Artificial Intelligence, Granada, Spain, 2015, Granada : LexiCon (Universidad de Granada) (2015)
-
Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities
This paper presents work on the development of the ELEXIS-sr corpus, the Serbian addition to the ELEXIS multilingual annotated corpus, which consists of semantic annotations and a word sense repository. ELEXIS is a parallel multilingual annotated corpus in ten European languages that can be used as a multilingual benchmark for evaluating low- and medium-resourced European languages. The focus of this paper is on multiword expressions and named entities, their recognition in the ELEXIS-sr sentence set, and their comparison with annotations in other languages. The first steps are discussed ... Cvetana Krstev, Ranka Stanković, Aleksandra Marković, Teodora Mihajlov. "Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities" in Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, Turin, May 25, 2024, ELRA and ICCL (2024)
-
Towards Automatic Definition Extraction for Serbian
The paper presents preliminary results of the automatic extraction of candidates for dictionary definitions from unstructured texts in Serbian, with the aim of speeding up dictionary development. Definitions in the dictionary of the Serbian Academy of Sciences and Arts (SASA) were used to model different types of definitions (descriptive, grammatical, referential and synonymous), which have different syntactic and lexical characteristics. The research corpus consists of 61,213 noun definitions, which were analysed using morphological e-dictionaries and local grammars implemented as finite-state transducers in the corpus processing package of open ... ... engineering for agriculture and tools and mechanization. Logic and philosophy, on the other hand, were covered by two high school textbooks focusing on humanities subjects. The domain of music is represented by two music high school textbooks: History of Music and Century of Jazz. The corpus is being developed ...
... Calzolari et al., pp. 3947–3955. Stijović, Rada, Ranka Stanković. Digitalno izdanje Rečnika SANU: formalni opis mikrostrukture Rečnika SANU. [Digital edition of the SASA Dictionary: a formal description of the microstructure of the SASA Dictionary (in Cyrillic)] In: Naučni sastanak slavista u Vukove ... Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić. "Towards Automatic Definition Extraction for Serbian" in Proceedings of the XIX EURALEX Congress of the European Association for Lexicography: Lexicography for Inclusion (Volume 2). 7-9 September (virtual), Democritus University of Thrace (2021)
-
Using English Baits to Catch Serbian Multi-Word Terminology
In this paper we present the first results in bilingual terminology extraction. The hypothesis of our approach is that if for a source language domain terminology exists as well as a domain aligned corpus for a source and a target language, then it is possible to extract the terminology for a target language. Our approach relies on several resources and tools: aligned domain texts, domain terminology for a source language, a terminology extractor for a target language, and a ...aligned texts, word alignment, terminology extraction, electronic dictionaries, morphological inflection... (inflected) dictionaries for Serbian and English; 4.1. Aligned/parallel corpus The English/Serbian textual resource was derived from the journal for Digital Humanities Infotheca3 that is published biannually in Open Access. 12 issues with 84 papers were aligned at sentence level resulting in 14,710 alignment ...
... extracted MWTs 15Phrase table often contains several similar entries of the same phrase. For example, at the digital library, for digital library, because digital library and of the digital library would represent four different entries within phrase table. We observed these as one phrase, in the manner ...
... 2008/. Vitas, D., Popović, L., Krstev, C., Obradović, I., Pavlović-Lažetić, G., and Stanojević, M. (2012). Srpski jezik u digitalnom dobu – The Serbian Language in the Digital Age. META-NET White Paper Series. Georg Rehm and Hans Uszkoreit (Series Editors). Springer. Available online at http://www ... Cvetana Krstev, Branislava Šandrih, Ranka Stanković. "Using English Baits to Catch Serbian Multi-Word Terminology" in Proceedings of the 11th International Conference on Language Resources and Evaluation, LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
-
SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian
Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, Aleksandra Marković (2019) In this paper we present a model for selecting good examples for a dictionary of Serbian and the development of the model's initial components. The method is based on a detailed analysis of various lexical and syntactic features in a corpus compiled from examples taken from five digitized volumes of the SASA dictionary. The initial feature set was inspired by similar approaches for other languages. The feature distribution of examples from this corpus is compared with the feature distribution of sentence samples excerpted from corpora containing various texts. The analysis showed that ... Serbian, good dictionary examples, automation of dictionary compilation, feature extraction, machine learning ... German and translated to Serbian. In order to represent domain knowledge, two scientific journals (labelled SJ) were used: The Journal for Digital Humanities Infotheca10 and Underground Mining Engineering11. The sample labelled DP, with 17 issues of the daily ...
... volumes of the SASA dictionary. Section 2 describes some steps towards modernization of the dictionary-making process and the development of the digital version of SASA dictionary, starting with retro-digitization process, followed by several ideas about modernization of dictionary-making and the ... Ranka Stanković, Branislava Šandrih, Rada Stijović, Cvetana Krstev, Duško Vitas, Aleksandra Marković. "SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian" in Electronic lexicography in the 21st century. Proceedings of the eLex 2019 conference, Lexical Computing CZ, s.r.o. (2019)
-
FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain
The paper gives a brief overview of the theory of frame semantics, on which the FrameNet lexical database is based. The design of this network is presented, along with the possibilities for its application. The lexical analysis applied in the FrameNet project is also presented, and the differences between frame-based and word-based analysis are pointed out. Several related frames evoked by words from the risk domain are then shown. The paper also presents the NLTK platform, through which it is possible to use ... ... and Nikola Ljubešic. 2018. “Towards semantic role labeling in Slovene and Croatian.” In Proceedings Conference on Language Technologies and Digital Humanities in Ljubljana, 93–98. Gildea, Daniel, and Daniel Jurafsky. 2002. “Automatic labeling of semantic roles.” Computational linguistics 28 (3): 245–288 ...
... learning (since it contains more than 13,000 LUs); as a valence dictionary; as a training dataset for semantic role labeling14 which makes it a rich digital language resource (with over 200,000 manually annotated sentences linked to over 1,200 semantic frames). 13. The property of verbs to take arguments ...Aleksandra Marković, Ranka Stanković, Natalija Tomić, Olivera Kitanović. "FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.1.1
-
Callovian (Middle Jurassic) Sphriganaria (Brachiopoda) from the Jordan Valley (Middle East)
Howard R. Feldman, Barbara V. Radulović, Fayez Ahmad. "Callovian (Middle Jurassic) Sphriganaria (Brachiopoda) from the Jordan Valley (Middle East)" in Historical Biology (2023). https://doi.org/10.1080/08912963.2022.2122823
-
A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian
Abusive speech on social media, including profanity, derogatory speech and hate speech, has reached pandemic levels. A system capable of detecting such texts could help make the internet and social media a better, more respectful virtual space. Research and commercial applications in this area have so far focused mainly on English. This paper presents work on building AbCoSER, the first corpus of abusive speech in Serbian. The corpus consists of 6,436 manually annotated ... ... pages 1621–1622, 2013. 22 Biljana Lazić and Mihailo Škorić. From DELA based dictionary to Leximirka lexical database. Infotheca - Journal for Digital Humanities, 19(2):81–98, 2020. doi:10.18485/infotheca.2019.19.2.4. 23 Nikola Ljubešić, Darja Fišer, and Tomaž Erjavec. The FRENK datasets of socially ...
... ion Computing methodologies → Natural language processing Keywords and phrases abusive language, hate speech, Serbian, Twitter, lexicon, corpus Digital Object Identifier 10.4230/OASIcs.LDK.2021.13 Funding Linked data development is supported by the COST Action CA18209-NexusLinguarum “European network ... Danka Jokić, Ranka Stanković, Cvetana Krstev, Branislava Šandrih. "A Twitter Corpus and Lexicon for Abusive Speech Detection in Serbian" in 3rd Conference on Language, Data and Knowledge (LDK 2021), MDPI AG (2021). https://doi.org/10.4230/OASIcs.LDK.2021.13
-
Српски језик у дигиталном добу -- The Serbian Language in the Digital Age
Duško Vitas, Ljubomir Popović, Cvetana Krstev, Ivan Obradović, Gordana Pavlović-Lažetić, Mladen Stanojević (2012)... Science, Univ. of Tartu: Tiit Roosmaa, Kadri Vider Ирска Ireland School of Computing, Dublin City Univ.: Josef van Genabith Исланд Iceland School of Humanities, Univ. of Iceland: Eiríkur Rögnvaldsson Италија Italy Consiglio Nazionale delle Ricerche, Istituto di Linguistica Computazionale “Antonio Zampolli”: ...
... Nicoletta Calzolari Human Language Technology Research Unit, Fondazione Bruno Kessler: Bernardo Magnini Кипар Cyprus Language Centre, School of Humanities: Jack Burston Летонија Latvia Tilde: Andrejs Vasiļjevs Institute of Mathematics and Computer Science, Univ. of Latvia: Inguna Skadiņa Литванија ...
... Utrecht Univ.: Jan Odijk Computational Linguistics, Univ. of Groningen: Gertjan van Noord Хрватска Croatia Institute of Linguistics, Faculty of Humanities and Social Science, Univ. of Zagreb: Marko Tadić 82 Чешка Czech Republic Institute of Formal and Applied Linguistics, Charles Univ. in Prague: ...Duško Vitas, Ljubomir Popović, Cvetana Krstev, Ivan Obradović, Gordana Pavlović-Lažetić, Mladen Stanojević. "Српски језик у дигиталном добу -- The Serbian Language in the Digital Age" in META-NET White Paper Series, G. Rehm, H. Uszkoreit (eds.), Springer (2012)
-
Two approaches to compilation of bilingual multi-word terminology lists from lexical resources
In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second one uses a term extraction tool. For both approaches, four experiments were performed with two parameters being ...Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Two approaches to compilation of bilingual multi-word terminology lists from lexical resources" in Natural Language Engineering, Cambridge University Press (CUP) (2020). https://doi.org/10.1017/S1351324919000615
-
Ontološki model upravljanja rizikom u rudarstvu
Olivera Kitanović (2021) Mining production involves complex technological systems, which creates the need to establish and improve risk management systems. The heterogeneity and volume of the data required for risk management call for a system that integrates them flexibly and enables their optimal use. The main goal of this dissertation is to develop an ontology for the mining domain and a risk management model based on it. Its realization also entails implementing information extraction algorithms for populating the ontology, as well as a corresponding software solution. Development of the model also includes a significant expansion of the mining corpus, as ... mining, risk, risk management, risk assessment, ontology, semantic network, information extraction, knowledge management, computational linguistics ... Intelligent Information Processing, 254–65. Springer. ---. 2009. “Ontology Population via NLP Techniques in Risk Management.” International Journal of Humanities and Social Science (IJHSS) 3 (3): 212–17. McCrae, John, Guadalupe Aguado-de-Cea, Paul Buitelaar, Philipp Cimiano, Thierry Declerck, Asunción ...
... be the right ones. Below is an overview of several of the most widely used software packages in the field of risk management. Capterra, which is part of Gartner Digital Markets, researches, tests and rates software solutions in various categories, including software solutions for managing ... Olivera Kitanović. Ontološki model upravljanja rizikom u rudarstvu, Beograd : [O. Kitanović], 2021