Search ⚒ Papers ⚒ DR RGF - RGF Repository

Search

186 items

Classification of Terms on a Positive-Negative Feelings Polarity Scale Based on Emoticons

Mihailo Škorić (2017)

The goal of this paper is to draw attention to the possibility of using emoticon-riddled text on the web in language-neutral sentiment analysis. It introduces several innovations in the existing framework of research and tests their effectiveness. It also presents a software tool especially made for that purpose, explains how it builds a database with sentimental value of terms and offers the user manual. Finally, it presents a software tool that tests the new database and gives some examples ...

data mining, information extraction, emotions, text on the web

... was written in C# programming language and it can be ran on Windows platform, using any version of the operating system that runs to 64 bit. The interface is user-friendly and entirely in Serbian. The goal in designing this software was that any researcher who speaks Serbian can use it independently, create ...
... use in their messages (in the form of emoticons or language-universal phrases) and assigning values of sentiment polarity to terms in which those determiners are located. As the determiners are language-independent, the system would be language-independent as well. If it turns out to be valid, this ...
... of Belgrade 1 Introduction When creating natural language understanding software, there are two widely accepted approaches: – Software that does not have a deep understanding of the meaning of written text, but only the grammar of the language that text is written on, which enables wider application ...
Mihailo Škorić. "Classification of Terms on a Positive-Negative Feelings Polarity Scale Based on Emoticons" in Infotheca, Faculty of Philology, University of Belgrade (2017). https://doi.org/10.18485/infotheca.2017.17.1.4
Infotheca (Q25460443) in Wikidata

Ranka Stanković, Lazar Davidović (2021)

Wikidata is a knowledge base of the Wikimedia Foundation that serves as a common source of various kinds of data used not only by other Wikipedia projects, but increasingly by numerous semantic web applications. In this paper we present an example of integrating Wikidata with digital libraries and external systems, as well as the possibility of speeding up data preparation and entry, using papers from the digital humanities journal Infotheca as an example.

semantic web, linked open data, Wikidata, Infotheca, journal metadata

... entry of data about the results of the research in the domain of digital humanities in Serbia, as well as about old Serbian novels, so as to increase the visibility of both the Serbian language, our cultural heritage and the results of the research in Serbia and certainly pave the way for many other data ...
... (P1476), in English and Serbian (both scripts, Cyrillic and Latin, for the sake of search); – Main subject of the creative work (P921), key words, where the existing ones are linked and new ones are added as instances with labels in Serbian and English; – Publisher (P123); – Language of the work or name ...
... COST Action CA16204 (2017-2021) metadata about Serbian novels included in the srpELTEC corpus is being entered into the knowledge base (Krstev et al. 2019) and Wikidata linked to various applications, one of which is Aurora. Members of JeRTeh Language Resources and Technologies Society too contributed ...
Ranka Stanković, Lazar Davidović. "Infotheca (Q25460443) in Wikidata" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.1.5
Knowledge and Rule-Based Diacritic Restoration in Serbian

Cvetana Krstev, Ranka Stanković, Duško Vitas (2018)

In this paper we present a procedure for the restoration of diacritics in Serbian texts written using the degraded Latin alphabet. The procedure relies on the comprehensive lexical resources for Serbian: the morphological electronic dictionaries, the Corpus of Contemporary Serbian and local grammars. Dictionaries are used to identify possible candidates for the restoration, while the data obtained from SrpKor and local grammars assists in making a decision between several candidates in cases of ambiguity. The evaluation results reveal that, depending on the text, accuracy ranges from 95.03% to 99.36%, while the precision (average 98.93%) is always higher than the recall (average 94.94%).

diacritic restoration, morphological dictionary, corpus, word n-grams, local grammars

... Knowledge and Rule-Based Diacritic Restoration in Serbian Cvetana Krstev, Ranka Stanković, Duško Vitas Digital Repository of the Faculty of Mining and Geology, University of Belgrade [DR RGF] Knowledge and Rule-Based Diacritic Restoration in Serbian | Cvetana Krstev, Ranka Stanković, Duško Vitas | ...
... RuThes thesaurus (Loukachevitch and Dobrov, 2014) and created for automatic processing of documents in information-analytical systems and natural language processing. These resources are linguistic ontologies uniting some principles of their organization from WordNet, information-retrieval thesauri ...
... generation allows better understanding of the differences between representation models of the thesauri. 2. Existing Russian Thesauri For the Russian language, there were at least four known projects for creating a wordnet. In the RussNet project (Azarowa, 2008), the authors planned to create a Russian ...
Cvetana Krstev, Ranka Stanković, Duško Vitas. "Knowledge and Rule-Based Diacritic Restoration in Serbian" in Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB 2018), May 27-29, 2018, Sofia, Bulgaria, Sofia : The Institute for Bulgarian Language Prof. Lyubomir Andreychin, Bulgarian Academy of Sciences (2018): 41-51
Serbian NER&Beyond: The Archaic and the Modern Intertwinned

Branislava Šandrih Todorović, Cvetana Krstev, Ranka Stanković, Milica Ikonić Nešić (2021)

In this paper we present a Serbian literary corpus that is being developed under the umbrella of the COST Action "Distant Reading for European Literary History" CA16204. Using this corpus of novels written more than a century ago, we developed and made publicly available a Named Entity Recognition (NER) system trained to recognize 7 different types of named entities, with a convolutional neural network (CNN), which achieves an F1 score of ≈91% on the test dataset. This model is further evaluated on a separate evaluation dataset. We conclude with a comparison ...

... Repository is available at: www.dr.rgf.bg.ac.rs Proceedings of Recent Advances in Natural Language Processing, pages 1252–1260 Sep 1–3, 2021. https://doi.org/10.26615/978-954-452-072-4_141 1252 Serbian NER&Beyond: The Archaic and the Modern Intertwinned Branislava Šandrih Todorović University ...
... guration. In our case, the model’s language was 12Translates as: In the meantime, Haji-Đera entered the room to wish agas good morning, when the monastery servant started offering coffee and brandy. 13Quick-start spaCy3 widget, https://spacy.io/usage/training#quickstart Serbian, containing the ner component ...
... Todorović, Cvetana Krstev, Ranka Stanković, Milica Ikonić Nešić | Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications | 2021 | | 10.26615/978-954-452-072-4_141 http://dr.rgf.bg.ac.rs/s/repo/item/0005139 ...
Branislava Šandrih Todorović, Cvetana Krstev, Ranka Stanković, Milica Ikonić Nešić. "Serbian NER&Beyond: The Archaic and the Modern Intertwinned" in Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications, INCOMA Ltd. Shoumen, BULGARIA (2021). https://doi.org/10.26615/978-954-452-072-4_141
Development Of The Serbian Geological Resources Portal

Ranka Stanković, Jelena Prodanović, Olivera Kitanović, Velizar Nikolić (2011)

... and GIS technologies. The largest part was realized with the use of the PHP (Hypertext Preprocessor) script language on server side and the XHTML (eXtensible HyperText Markup Language) language on the client side. In addition to this, the part of the web portal pertaining to geological terminology and ...
... JavaScript functions, while the search engine was developed with the use of PHP script language and AJAX. By using HTML and CSS for markup, JavaScript for the access to DOM elements, XML (Extensible Markup Language) or JSON (JavaScript Object Notation), data is downloaded from the server and the final ...
... 2023-10-14 03:28:12 Development Of The Serbian Geological Resources Portal Ranka Stanković, Jelena Prodanović, Olivera Kitanović, Velizar Nikolić Digital Repository of the Faculty of Mining and Geology, University of Belgrade [DR RGF] Development Of The Serbian Geological Resources Portal | Ranka Stanković ...
Ranka Stanković, Jelena Prodanović, Olivera Kitanović, Velizar Nikolić. "Development Of The Serbian Geological Resources Portal" in Proceedings of the 17th Meeting of the Association of European Geological Societies, Belgrade, Serbia : The Serbian Geological Society (2011)
An Integrated Environment for Management and Exploitation of Linguistic Resources

Ranka Stanković, Ivan Obradović (2009)

... this module enables update of existing synsets, but also creation of new ones. A new synset in one language (for example Serbian) may be created on basis of an existing synset in another language (for example English). In order to support this feature, the module provides access to bilingual ...
... BalkaNet languages are spoken, but also from France and the Netherlands. A national development team was formed for each language, which in the case of Serbian was the University of Belgrade HLT Group. Upon the termination of this project, the development of SWN continued, and ...
... original text and its translation into another language. Thus, they represent two texts having the same content, but in two different languages. The majority of parallel texts collected within the HLT Group are aligned, with Serbian most often being one of the languages. The procedure ...
Ranka Stanković, Ivan Obradović. "An Integrated Environment for Management and Exploitation of Linguistic Resources" in Proceedings of the International Multiconference on Computer Science and Information Technology, Computational Linguistics – Applications Workshop (CLA09), Mrągowo, Poland, October 2009, Piscataway : IEEE (2009)
Fourth Summer Datathon on Linguistic Linked Open Data

Tijana Radović, Ranka Stanković (2023)

The 4th Summer Datathon on Linguistic Linked Open Data (SD-LLOD-22) was held in Spain, in Cercedilla near Madrid, in May 2022, and was organized by the COST Action NexusLinguarum. The school gathered interested researchers, academics and students who wanted to acquire and/or expand their knowledge in the field of linguistic linked data science. During the school, a spectrum of topics from the field of linked data was presented, from various ontologies, through document integration, annotation and natural language text processing tools ...

linguistic linked open data, sentiment analysis, linked data, RDF

Tijana Radović, Ranka Stanković. "Fourth Summer Datathon on Linguistic Linked Open Data" in Infotheca, Faculty of Philology, University of Belgrade (2023). https://doi.org/10.18485/infotheca.2023.23.1.6
Electronic Dictionaries - from File System to lemon Based Lexical Database

Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić (2018)

In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...

... by using the standardized SPARQL query language. The model presented is based on the lemon model, but some modifications and extensions were necessary to enable full migration of complex grammatical structures and numerous inflected forms for Serbian. MULTEXT-East lexicons (Krstev et al., ...
... dictionaries. Therefore, in our model the class Form is used for inflected forms instead of variant forms, which is important for Serbian as a highly inflective language. Also, we adapted the lemon model to store all existing markers as a thesaurus of data categories and their values, which enabled ...
... Resources. In Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006, pages 1692–1697. Krstev, C., Stanković, R., and Vitas, D. (2010). A Description of Morphological Features of Serbian: a Revision using Feature System Declaration. In Nicoletta Calzo- ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Mihailo Škorić. "Electronic Dictionaries - from File System to lemon Based Lexical Database" in Proceedings of the 11th International Conference on Language Resources and Evaluation - W23 6th Workshop on Linked Data in Linguistics : Towards Linguistic Data Science (LDL-2018), LREC 2018, Miyazaki, Japan, May 7-12, 2018, European Language Resources Association (ELRA) (2018)
Building Terminological Resources in an e-Learning Environment

Ranka Stanković, Ivan Obradović, Olivera Kitanović, Ljiljana Kolonja (2012)

... repository for different types of terms: Serbian synonyms of the basic term, its available translational equivalent in the chosen language, and the inflectional forms of the Serbian term and its synonyms. Namely, as Serbian is a morphologically very rich language, there was a need to provide for all ...
... technologies. For each concept separate Serbian and English entries were created. In line with the standard requirements for glossaries, besides the basic Serbian and English terms, each entry contained a short definition of the term in the respective language. However, no synonyms were taken into ...
... functionality within the information system, an UML (Unified Modeling Language) engineering model with a special structure has been developed, whose main features are depicted in Figure 2. Assuming basic familiarity with this language we will briefly comment this model. The class Rečnik in the model ...
Ranka Stanković, Ivan Obradović, Olivera Kitanović, Ljiljana Kolonja. "Building Terminological Resources in an e-Learning Environment" in Proceedings of the Third International Conference on e-Learning, eLearning-2012, September 2012, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2012)
Parallel Bidirectionally Pretrained Taggers as Feature Generators

Ranka Stanković, Mihailo Škorić, Branislava Šandrih Todorović (2022)

In a setting where multiple automatic annotation approaches coexist and advance separately but none completely solve a specific problem, the key might be in their combination and integration. This paper outlines a scalable architecture for Part-of-Speech tagging using multiple standalone annotation systems as feature generators for a stacked classifier. It also explores automatic resource expansion via dataset augmentation and bidirectional training in order to increase the number of taggers and to maximize the impact of the composite system, which ...

annotation, natural language processing, feature extraction, composite structures, part of speech

Ranka Stanković, Mihailo Škorić, Branislava Šandrih Todorović. "Parallel Bidirectionally Pretrained Taggers as Feature Generators" in Applied Sciences, MDPI AG (2022). https://doi.org/10.3390/app12105028
An Approach to Efficient Processing of Multi-Word Units

Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas (2013)

Efficient processing of Multi-Word Units in the course of development of morphological MWU dictionaries is not easy to achieve, especially when languages with complex morphological structures are concerned, such as Serbian. Manual development of this type of dictionaries is a tedious and extremely slow process. To alleviate this problem we turned to our multipurpose software tool, dubbed LeXimir, in the production of lemmas for e-dictionaries of multi-word units. In addition to that, we developed a procedure aimed at making ...

Natural Language Processing, Grammatical Category, Lexical Representation, MWU, multi-word unit

... rs 1 2 Cvetana Krstev, Ivan Obradović, Ranka Stanković, and Duško Vitas 1 Introduction Morphological electronic dictionaries of Serbian for natural language processing (NLP) are being developed for many years now. Their development follows the methodology and format (known as DELAS/DELAF) presented ...
... for some languages this complex procedure can be skipped and a list of MWU forms can be produced from scratch. Serbian is, however, like all Slavic languages a highly inflectional language and such a shortcut procedure cannot be applied. We will illustrate this with two examples. The nominal MWUs ...
... languages other than Serbian and English, namely, for Bulgarian [8]. The new functionality for production of DELAC entries is also expected to perform successfully without any modifications for other languages. The prerequisites are that there exists a Unitex module for that language including: a dictionary ...
Cvetana Krstev, Ivan Obradović, Ranka Stanković, Duško Vitas. "An Approach to Efficient Processing of Multi-Word Units" in Computational Linguistics - Applications, Studies in Computational Intelligence 458 no. 458, Berlin Heidelberg : Springer-Verlag (2013): 109-129. https://doi.org/10.1007/978-3-642-34399-5_6
Old or New, We Repair, Adjust and Alter (Texts)

Cvetana Krstev, Ranka Stanković (2020)

In this paper we present how e-dictionaries and cascades of finite-state transducers implemented in the Unitex tool can be used to solve three text transformation problems: correcting texts after OCR, restoring diacritics, and switching between language variants.

text correction, OCR errors, diacritic restoration, language variants, electronic dictionary, finite-state transducers

... true for problems of diacritic restoration, OCR errors correction and language variants transformation. In this paper we present an approach to solving three text mending problems for Serbian: OCR errors, diacritics omission and language variant switching. The common characteristic of these problems is ...
... print of the original text, and its language and alphabet. OCR software today is of good quality compared to its first versions, even when produced for personal rather than professional use, and it is applicable to a large number of languages and scripts, including Serbian Cyrillic. However, OCR of old printed ...
... using a noisy channel model”. In Proceedings of the second international conference on Human Language Technology Research, 257–262. Morgan Kaufmann Publishers Inc., 2002 Krstev, Cvetana. Processing of Serbian – Automata, Texts and Electronic dictionaries. Faculty of Philology, University of Belgrade ...
Cvetana Krstev, Ranka Stanković. "Old or New, We Repair, Adjust and Alter (Texts)" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.3
Towards ELTeC-LLOD: European Literary Text Collection Linguistic Linked Open Data

Ranka Stanković, Christian Chiarcos, Miloš Utvić, Olivera Kitanović (2023)

This paper describes a case study on generating linked data created from annotated text corpora using the NLP Interchange Format (NIF). A subset of the ELTeC corpus, consisting of 900 novels from the period 1840-1920 in 9 European languages, served as the basis for this research. The annotated version of the novels, in the so-called TEI level-2 format, was transformed into NIF, an RDF/OWL-based format that aims to achieve interoperability between natural language processing tools, language resources and ...

linked open data, corpus, SrpELTeC, NIF

Ranka Stanković, Christian Chiarcos, Miloš Utvić, Olivera Kitanović. "Towards ELTeC-LLOD: European Literary Text Collection Linguistic Linked Open Data" in LDK 2023 – 4th Conference on Language, Data and Knowledge, 12-15 September in Vienna, Austria, Lisabon : NOVA FCSH - CLUNL (2023). https://doi.org/10.34619/srmk-injj
Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources

Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović (2017)

Large collections of textual documents represent an example of big data that requires the solution of three basic problems: the representation of documents, the representation of information needs and the matching of the two representations. This paper outlines the introduction of document indexing as a possible solution to document representation. Documents within a large textual database developed for geological projects in the Republic of Serbia for many years were indexed using methods developed within digital humanities: bag-of-words and named ...

... [22]. When language processing methods and techniques are used for generating a document surrogate, they rely heavily on lexical resources, which is especially important in the case of languages with rich morphology, such as Serbian, and South-Slavic languages in general. Although Serbian belongs to ...
... (corpora and e-dictionaries), as well as applications for basic language processing (tokenization, Part-Of-Speech (POS) tagging, morphological analysis), information retrieval and extraction [26]. Several successful applications of Serbian language resources and tools in tasks related to document indexing ...
... in this paper is based on morphological electronic dictionaries and finite-state transducers for Serbian [12]. 3.1 Used Resources Lexical Resources. The resources for natural language processing of Serbian consisting of lexical resources and local grammars are being developed using the finite-state ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources" in Trans. Computational Collective Intelligence - Lecture Notes in Computer Science 26, Springer (2017). https://doi.org/10.1007/978-3-319-59268-8_8
Srbija u OneGeology Europe

Danka Blagojević, Ranka Stanković, Petar Stejić, Velizar Nikolić (2014)

The Geological Survey of Serbia, as the lead institution of the OneGeologyEurope Project, together with the Faculty of Mining and Geology and the Ministry of Natural Resources, Mining and Spatial Planning, joined the international OneGeology Europe Project in May 2013, at an already advanced stage of the Project. By the end of 2013 they had completed the activities that should lead to full membership in the Project, whereby the Republic of Serbia found its place on the Geological Map of Europe 1:1M. The Geological Map of Serbia 1:1M is a compiled, that is, simplified version of the OGK 1:500 ...

OneGeologyEurope, metadata, data harmonization, multilingual meta-information system

... obtain in Serbian, as shown in Figure 8. Fig 8: 1GE Portal with interface in Serbian language. Completing the project involves the remaining final activities after which the map should be available on the 1G-E Portal, which include: - Validation ...
... Serbia. The tasks of each member ОneGeology-Europe Project was as following: - Metadata entry (in English and each native language) - Translation (to each native language: common geological vocabulary, keywords, portal components, metadata titles and abstract of all existing records) - Harmonization ...
Danka Blagojević, Ranka Stanković, Petar Stejić, Velizar Nikolić. "Srbija u OneGeology Europe" in Zapisnici Srpskog geološkog društva za 2013. godinu, Beograd : Srpsko geološko društvo (2014)
Језички модели, шта је то?

Михаило Шкорић (2023)

Language models

Михаило Шкорић. "Језички модели, шта је то?" in Језик данас, Нови Сад : Матица српска (2023)
Corpus-based bilingual terminology extraction in the power engineering domain

Tanja Ivanović, Ranka Stanković, Branislava Šandrih Todorović, Cvetana Krstev (2022)

This paper presents the resources and tools used for the extraction and evaluation of bilingual English-Serbian terminology in the power engineering domain. The resources consist of existing general and domain lexica and a domain parallel corpus; the tools include term extractors for both languages and a tool for aligning segments that belong to corpus sentences. The system was tested by varying the matching function that determines the presence of an extracted term in the aligned segment (chunk), ranging from very loose to strict. Evaluation of the results showed that the precision of term extraction ...

Library and Information Sciences, Communication, Language and Linguistics

Tanja Ivanović, Ranka Stanković, Branislava Šandrih Todorović, Cvetana Krstev. "Corpus-based bilingual terminology extraction in the power engineering domain" in Terminology, John Benjamins Publishing Company (2022). https://doi.org/10.1075/term.20038.iva
Football terminology: compilation and transformation into OntoLex-Lemon resource

Jelena Lazarević, Ranka Stanković, Mihailo Škorić, Biljana Rujević (2023)

This paper presents a project in development, the creation of the first digital football dictionary in Serbian, as well as a demonstration of the application of the OntoLex model and its modules. The OntoLex-FrAC module includes information on frequency and usage examples extracted from a corpus. In this case, a domain-specific corpus named СрФудКо was created, containing football news articles in Serbian. Multi-word terms were automatically extracted from the Serbian corpus and then manually evaluated and classified as sports-related or ...

linked open data, corpus, СрФудКо, OntoLex, OntoLex-FrAC

Jelena Lazarević, Ranka Stanković, Mihailo Škorić, Biljana Rujević. "Football terminology: compilation and transformation into OntoLex-Lemon resource" in LDK 2023 – 4th Conference on Language, Data and Knowledge, 12-15 September in Vienna, Austria, Lisabon : NOVA FCSH - CLUNL (2023). https://doi.org/10.34619/srmk-injj
Terminology Acquisition and Description Using Lexical Resources and Local Grammars

Cvetana Krstev, Ranka Stanković, Ivan Obradović, Biljana Lazić (2015)

Acquisition of new terminology from specific domains and its adequate description within terminological dictionaries is a complex task, especially for languages that are morphologically complex such as Serbian. In this paper we present an approach to solving this task semi-automatically on basis of lexical resources and local grammars developed for Serbian. Special attention is given to automatic inflectional class prediction for simple adjectives and nouns and the use of syntactic graphs for extraction of Multi-Word Unit (MWU) candidates for ...

... especially its rich morphology, this is a complex task, and corresponding language resources in the form of morphological e-dictionaries and grammars need to be applied (Vitas et al., 2012). For that reason, in the case of Serbian, it is not enough to extract terminology from the domain, but it also ...
... acquisition in Serbian. Rapid changes in many knowledge domains mean that new terms are continuously being created and introduced in Serbian making important the automation of their retrieval and incorporation in Serbian terminological dictionaries. Due to specific features of Serbian grammar, especially ...
... Terminological Database Using a Transducer Cascade. Proc. of Recent Advances in Natural Language Processing. (pp. 17-23). Baldwin, T., & Kim, S. N. (2010). Multiword expressions Handbook of Natural Language Processing, second edition. (267-292): CRC Press. Cerbah, F., & Daille, B. (2007). A ...
Cvetana Krstev, Ranka Stanković, Ivan Obradović, Biljana Lazić. "Terminology Acquisition and Description Using Lexical Resources and Local Grammars" in Proceedings of the 11th Conference on Terminology and Artificial Intelligence, Granada, Spain, 2015, Granada : LexiCon (Universidad de Granada) (2015)
Indexing of textual databases based on lexical resources: A case study for Serbian

Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović (2015)

In this paper we describe an approach to improvement of information retrieval results for large textual databases by pre-indexing documents using bag-of-words and Named Entity Recognition. The approach was applied on a database of geological projects financed by the Republic of Serbia in the last half century. Each document within this database is described by metadata, consisting of several fields such as title, domain, keywords, abstract, geographical location and the like. A bag of words was produced from these ...

... boundaries are not taken into consideration. This can partially solve the problem of the rich morphology that characterizes Serbian, as a language belonging to the South-Slavic Language family. For instance, scanning with lignit ‘lignite’ will also retrieve inflected forms lignita, lignitu, lignitom, etc ...
... bases lemmatization on morphological electronic dictionaries and finite state transducers for Serbian [6]. 4.1 Used Resources Lexical Resources. The resources for natural language processing of Serbian consisting of lexical resources and local grammars are being developed using the finite-state ...
... query into SQL (Structured Query Language) form. The query generated in such a way searches the text of the subset of attributes in the database that correspond to the selected criteria of search. 4 The Improved Solution One of the problems of full text search in Serbian is its rich morphology, where ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Indexing of textual databases based on lexical resources: A case study for Serbian" in Semantic Keyword-based Search on Structured Data Sources : First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers, Springer (2015). https://doi.org/10.1007/978-3-319-27932-9_15
