Претрага ⚒ Радови ⚒ Др РГФ - Репозиторијум РГФ

Per page

Sort by

6 items

Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface

Verginica Barbu Mititelu, Voula Giouli, Kilian Evang, Daniel Zeman, Petya Osenova, Carole Tiberius, Simon Krek, Stella Markantonatou, Ivelina Stoyanova, Ranka Stankovic, Christian Chiarcos (2024)

Predstavljamo trenutne aktivnosti na definisanju interfejsa leksikona i korpusa koji će služiti kao referenca u prikazu polileksemskih jedinica - višečlanih izraza - (različitih tipova - imenskih, glagolskih, itd.) u specijalizovanim leksikonima i povezivanju ovih unosa sa njihovim pojavljivanjima u korpusima. Konačni cilj je korišćenje ovakvih resursa za automatsko identifikovanje višečlanih izraza u tekstu. Uključivanje nekoliko prirodnih jezika ima za cilj univerzalnost rešenja koje nije usredsređeno na određeni jezik, kao i prilagođavanje idiosinkrazijama. Raspravljaju se izazovi u leksikografskom opisu višerečnih ...

multiword expression lexicon, corpus, proof-of-concept lexicon encoding

Verginica Barbu Mititelu, Voula Giouli, Kilian Evang, Daniel Zeman, Petya Osenova, Carole Tiberius, Simon Krek, Stella Markantonatou, Ivelina Stoyanova, Ranka Stankovic, Christian Chiarcos. "Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface" in Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, Turin, May 25, 2024, ELRA and ICCL (2024)
Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities

Cvetana Krstev, Ranka Stanković, Aleksandra Marković, Teodora Mihajlov (2024)

Овај рад представља активности на развоју корпуса ELEXIS-sr, српском додатку вишејезичном анотираном корпусу ELEXIS-а, који се састоји од семантичких анотација и репозиторија значења речи. ELEXIS је паралелни вишејезични анотирани корпус на десет европских језика, који може да се користи као вишејезички репер за евалуацију европских језика са мање и средње развијеним ресурсима. Фокус овог рада је на вишечланим изразима и именованим ентитетима, њиховом препознавању у скупу реченица ELEXIS-sr и поређењу са анотацијама на другим језицима. Разматрају се први кораци ...

полилексемске језинице, именовани ентитет, вишезначност значења речи, складиште смисла, LLOD

Cvetana Krstev, Ranka Stanković, Aleksandra Marković, Teodora Mihajlov. "Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities" in Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, Turin, May 25, 2024, ELRA and ICCL (2024)
Part of Speech Tagging for Serbian language using Natural Language Toolkit

Ranka Stanković, Boro Milovanović (2020)

Dok se razvijaju složeni algoritmi za NLP (obrada prirodnog jezika), osnovni zadaci kao što je označavanje ostaju veoma važni i još uvek izazovni. NLTK (Natural Language Toolkit) je moćna Python biblioteka za razvoj programa zasnovanih na NLP-u. Pokušavamo da iskoristimo ovu biblioteku za kreiranje PoS (vrsta reči) oznake za savremeni srpski jezik. Jedanaest različitih modela je kreirano korišćenjem NLTK API-ja za označavanje. Najbolji modeli se transformišu sa Brill tagerom da bi se poboljšala tačnost. Obučili smo modele na označenom ...

obrada prirodnog jezika, mašinsko učenje, neuronske mreže

... in MULTEXT-East | the Case of Serbian,” Informatica, vol. 28 no. 4 pp. 431–436, Dec. 2004. [10] M. Gavrilidou, P. Labropoulou, S. Piperidis, V. Giouli, N. Calzolari, M. Monachini, C. Soria, and K. Choukri, “Language Resources Production Models: the Case of the INTERA Multilingual Corpus and Terminology ...
Ranka Stanković, Boro Milovanović. "Part of Speech Tagging for Serbian language using Natural Language Toolkit" in 7th International Conference on Electrical, Electronic and Computing Engineering IcETRAN 2020, Academic Mind, Belgrade (2020)
Keyword-Based Search on Bilingual Digital Libraries

Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović (2017)

This paper outlines the main features of Biblisha, a tool that offers various possibilities of enhancing queries submitted to large collections of aligned parallel text residing in bilingual digital library. Biblishsa supports keyword queries as an intuitive way of specifying information needs. The keyword queries initiated, in Serbian or English, can be expanded, both semantically, morphologically and in other language, using different supporting monolingual and bilingual resources. Terminological and lexical resources are of various types, such as wordnets, electronic ...

Ranka Stanković, Cvetana Krstev, Duško Vitas, Nikola Vulović, Olivera Kitanović. "Keyword-Based Search on Bilingual Digital Libraries" in Semantic Keyword-Based Search on Structured Data Sources - Second COST Action IC1302 International KEYSTONE Conference, IKC 2016, Springer (2017). https://doi.org/10.1007/978-3-319-53640-8_10
Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian

Ranka Stanković, Branislava Šandrih, Cvetana Krstev, Miloš Utvić, Mihailo Škorić (2020)

The training of new tagger models for Serbian is primarily motivated by the enhancement of the existing tagset with the grammatical category of a gender. The harmonization of resources that were manually annotated within different projects over a long period of time was an important task, enabled by the development of tools that support partial automation. The supporting tools take into account different taggers and tagsets. This paper focuses on TreeTagger and spaCy taggers, and the annotation schema alignment ...

Part-of-Speech tagging, lemmatization, corpus, evaluation, Serbian, morphological dictionary

... sources for central and eastern european languages. Lan- guage resources and evaluation, 46(1):131–142. Gavrilidou, M., Labropoulou, P., Piperidis, S., Giouli, V., Calzolari, N., Monachini, M., Soria, C., and Choukri, K. (2006). Language resources production models: the case of the intera multilingual ...
Ranka Stanković, Branislava Šandrih, Cvetana Krstev, Miloš Utvić, Mihailo Škorić. "Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian" in Proceedings of the 12th Language Resources and Evaluation Conference, May Year: 2020, Marseille, France, European Language Resources Association (2020)
Advancing Sentiment Analysis in Serbian Literature: A Zero and Few-Shot Learning Approach Using the Mistral Model

Milica Ikonić Nešić, Saša Petalinkar, Mihailo Škorić, Ranka Stanković, Biljana Rujević (2024)

Ova studija predstavlja analizu sentimenta srpskih starih romana iz perioda 1840-1920, koristeći veliki jezički model (LLM) Mistral za tehniku učenja sa zasnovani na takozvanim "zero" i "few-shot" pokušajima. Glavni pristup uvodi inovacije osmišljavanjem istraživačkih upita (promptova) uključuju tekst sa uputstvom za klasifikaciju bez primera i na osnovu nekoliko primera, omogućavajući jezičkom modelu da klasifikuje osećanja u pozitivne, negativne ili objektivne kategorije. Ova metodologija ima za cilj da pojednostavi analizu osećanja ograničavanjem odgovora, čime se povećava preciznost ...

zero-shot, few-shot, sentiment, Serbian, Mistral model

Milica Ikonić Nešić, Saša Petalinkar, Mihailo Škorić, Ranka Stanković, Biljana Rujević. "Advancing Sentiment Analysis in Serbian Literature: A Zero and Few-Shot Learning Approach Using the Mistral Model" in In Proceedings of the Sixth International Conference on Computational Linguistics in Bulgaria (CLIB 2024), BAS (2024)

Претрага

6 items

Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface cite

Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities cite

Part of Speech Tagging for Serbian language using Natural Language Toolkit cite

Keyword-Based Search on Bilingual Digital Libraries cite

Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian cite

Advancing Sentiment Analysis in Serbian Literature: A Zero and Few-Shot Learning Approach Using the Mistral Model cite

Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy and the Lexicon-Corpus Interface

Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities

Part of Speech Tagging for Serbian language using Natural Language Toolkit

Keyword-Based Search on Bilingual Digital Libraries

Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian

Advancing Sentiment Analysis in Serbian Literature: A Zero and Few-Shot Learning Approach Using the Mistral Model