Search
182 items
-
Bilingual lexical extraction based on word alignment for improving corpus search
Jelena Andonovski, Branislava Šandrih, Olivera Kitanović. "Bilingual lexical extraction based on word alignment for improving corpus search" in The Electronic Library, Emerald (2019). https://doi.org/10.1108/EL-03-2019-0056
-
Corpus-based bilingual terminology extraction in the power engineering domain
This paper presents the resources and tools used for the extraction and evaluation of bilingual English-Serbian terminology in the power engineering domain. The resources consist of existing general and domain lexica and a domain parallel corpus; the tools include term extractors for both languages and a tool for aligning segments that belong to corpus sentences. The system was tested by varying the matching function that determines the presence of an extracted term in an aligned segment (chunk), ranging from very loose to strict. Evaluation of the results showed that the precision of term extraction is ... Tanja Ivanović, Ranka Stanković, Branislava Šandrih Todorović, Cvetana Krstev. "Corpus-based bilingual terminology extraction in the power engineering domain" in Terminology, John Benjamins Publishing Company (2022). https://doi.org/10.1075/term.20038.iva
-
Medical Domain Document Classification via Extraction of Taxonomy Concepts from MeSH Ontology
Mihailo Škorić, Mauro Dragoni (2019) This paper is a result of a task that was presented to attendants of Keyword Search in Big Linked Data summer school, that was organized by Vienna University of Technology, under the Keystone COST action in the summer of 2017. It presents a specific approach to the classification via creation of minimal document surrogates based on the US National medical library’s MeSH ontology, which is derived from the Medical Subject Headings thesaurus. In a series of previously classified medically ... results are evaluated in compresence to previous manual classification of the same documents. KEYWORDS: document classification, MeSH, ontology, information extraction. PAPER SUBMITTED: 21 April 2019 PAPER ACCEPTED: 30 August 2019 Mihailo Škorić mihailo.skoric@rgf.bg.ac.rs University of Belgrade Belgrade ...
... Medical Domain Document Classification via Extraction of Taxonomy Concepts from MeSH Ontology Mihailo Škorić, Mauro Dragoni Digital Repository of the Faculty of Mining and Geology, University of Belgrade [DR RGF] Medical Domain Document Classification via Extraction of Taxonomy Concepts from MeSH Ontology ...
... the employees' publications. - The Repository is available at: www.dr.rgf.bg.ac.rs Scientific paper Medical domain document classification via extraction of taxonomy concepts from MeSH ontology UDC 004.82:025.43MESH DOI 10.18485/infotheca.2019.19.1.3 ABSTRACT: This paper is a result of a task presented ... Mihailo Škorić, Mauro Dragoni. "Medical Domain Document Classification via Extraction of Taxonomy Concepts from MeSH Ontology" in Infotheca, Faculty of Philology, University of Belgrade (2019). https://doi.org/10.18485/infotheca.2019.19.1.3
-
Classification of Terms on a Positive-Negative Feelings Polarity Scale Based on Emoticons
Mihailo Škorić (2017) The goal of this paper is to draw attention to the possibility of using emoticon-riddled text on the web in language-neutral sentiment analysis. It introduces several innovations in the existing framework of research and tests their effectiveness. It also presents a software tool especially made for that purpose, explains how it builds a database with sentimental value of terms and offers the user manual. Finally, it presents a software tool that tests the new database and gives some examples ... should be included in natural language research (Ptaszynski et al., 2011). 1.2 Basic information about the experiment Goal of the experiment is to test a new approach to text extraction using emoticon extraction in a new way, by combination of the following three ideas: – Emoticons that will be used ...
... presents a software tool that tests the new database and gives some examples of the analysis of the obtained results. KEYWORDS: data mining, information extraction, emotions, text on the web. PAPER SUBMITTED: 24 January 2017 PAPER ACCEPTED: 25 March 2017 Mihailo Škorić miks@tesla.rcub.bg.ac.rs University ...
... research to quickly and efficiently collect large amounts of information. Developing intelligent systems that work with information: – Information retrieval: retrieval of specific information in the text, as well as finding information that cannot be precisely defined. Classification of texts according ... Mihailo Škorić. "Classification of Terms on a Positive-Negative Feelings Polarity Scale Based on Emoticons" in Infotheca, Faculty of Philology, University of Belgrade (2017). https://doi.org/10.18485/infotheca.2017.17.1.4
-
Managing mining project documentation using human language technology
Purpose: This paper aims to develop a system, which would enable efficient management and exploitation of documentation in electronic form, related to mining projects, with information retrieval and information extraction (IE) features, using various language resources and natural language processing. Design/methodology/approach: The system is designed to integrate textual, lexical, semantic and terminological resources, enabling advanced document search and extraction of information. These resources are integrated with a set of Web services and applications, for different user profiles and use-cases. Findings: The ... Digital libraries, Information retrieval, Data mining, Human language technologies, Project documentation. Aleksandra Tomašević, Ranka Stanković, Miloš Utvić, Ivan Obradović, Božo Kolonja. "Managing mining project documentation using human language technology" in The Electronic Library (2018). https://doi.org/10.1108/EL-11-2017-0239
-
The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines
In this paper we present how resources and tools developed within the Human Language Technology Group at the University of Belgrade can be used for tuning queries before submitting them to a web search engine. We argue that the selection of words chosen for a query, which are of paramount importance for the quality of results obtained by the query, can be substantially improved by using various lexical resources, such as morphological dictionaries and wordnets. These dictionaries enable semantic ... LR web services, MultiWord Expressions & Collocations, Information Extraction, Information Retrieval ... web search engine the user is typically interested in information available on the web related to a particular topic. The result of this query is a selection of web pages the search engine determines as relevant to the query. The information the user is interested in can generally be expressed ...
... engines are faced with a problem they are practically unable to cope with. For example, let us consider that we wish to search the web for the information on beli luk ‘garlic’. When searching with the two constituent keywords beli ‘white’ AND luk ‘onion’ the search engine would typically ...
... professional journals that deals with economic issues. The journal’s web site is supported by a search engine that enables its readers to retrieve information from journal’s archive. The used log file thus gives a good insight in users’ queries. Many of the multi word queries are of no interest since ... Krstev Cvetana, Stanković Ranka, Vitas Duško, Obradović Ivan. "The Usage of Various Lexical Resources and Tools to Improve the Performance of Web Search Engines" in LREC 2008: Conference on Language Resources and Evaluation, Marrakesh, Morocco, May 2008, European Language Resources Association (ELRA) (2008)
-
Creation of a Training Dataset for Question-Answering Models in Serbian
The development and application of artificial intelligence in language technologies have advanced significantly in recent years, especially for the task of question answering (QA). While existing resources for QA tasks have been developed for the major world languages, Serbian has been relatively neglected in this area. This paper presents an initiative to create a large and diverse dataset for training question-answering models in Serbian, which will contribute to the advancement of language technologies for Serbian. In addition to numerous studies on language models ... artificial intelligence, natural language processing, language resources, annotated datasets, information extraction, question answering. Ranka Stanković, Jovana Rađenović, Maja Ristić, Dragan Stankov. "Creation of a Training Dataset for Question-Answering Models in Serbian" in South Slavic Languages in the Digital Environment JuDig Book of Abstracts, University of Belgrade - Faculty of Philology, Serbia, November 21-23, 2024, University of Belgrade - Faculty of Philology (2024)
-
Keyword Extraction from Parallel Abstracts of Scientific Publications
... node and used in the procedure of keyword candidate ranking and extraction. This method does not require linguistic knowledge (apart from stemming or lemmatization) as it is derived purely from the statistical and structural information of the network [10]. In this study, we use the SBKE method on a ...
... abstract (scientific paper). In the case when only one set of annotated keywords is available, the evaluation of the keyword extraction is performed as in the standard information retrieval tasks. Hence, precision (P), recall (R) and the F1 score are used for the evaluation. When comparing the performance ...
... English – shorter texts. This is the reason why we stopped after the first phase (called keyword extraction in the SBKE method). Usually, SBKE performs better on longer texts (containing more information on the structural properties of the input text), but here we can explore the performance on the shorter ... Slobodan Beliga, Olivera Kitanović, Ranka Stanković, Sanda Martinčić-Ipšić. "Keyword Extraction from Parallel Abstracts of Scientific Publications" in Semantic Keyword-Based Search on Structured Data Sources - Third International KEYSTONE Conference, IKC 2017 Gdańsk, Poland, September 11–12, 2017 Revised Selected Papers and COST Action IC1302 Reports, Springer (2017)
-
Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++
Branislava Šandrih, Ranka Stanković (2020) In science, industry and many research fields, terminology develops rapidly. Most often, the language that serves as the lingua franca for most of these fields is English. As a consequence, in many fields domain terms are coined in English and later translated into other languages. In this paper we present an approach to the automatic extraction of bilingual terminology for the English-Serbian language pair that relies on an aligned bilingual domain corpus, a terminology extractor for the target language and a tool for chunk alignment. We examine the performance of the method on the domain ... informacija;" (goal of finding information). A Software solution for multi-word units extraction displayed in Figure 2 offers possibilities for general NLP processing on selected corpus (applying lexical resources, generating bag of words and extraction of unknown words), extraction of selected syntactic patterns ...
... word-alignment, phrase extraction, phrase scoring and creation of lexicalised reordering tables, GIZA++ (Och and Ney, 2000) was used, together with the grow-diag-final symmetrisation heuristic (Koehn et al., 2003). Each pair of aligned chunks from this list also contained information about inverse and direct ...
... Terminology Extraction 5 TermEx Infotheca Vol. 19, No. 2, December 2019 123 Šandrih B., Stanković R., “Extraction of Bilingual ...”, pp. 119–138 This list can be either an external resource from the same domain or obtained from the text. The only system developed specifically for the extraction of MWTs ... Branislava Šandrih, Ranka Stanković. "Extraction of Bilingual Terminology Using Graphs, Dictionaries and GIZA++" in Infotheca, Faculty of Philology, University of Belgrade (2020). https://doi.org/10.18485/infotheca.2019.19.2.6
-
Rule-based Automatic Multi-word Term Extraction and Lemmatization
In this paper we present a rule-based method for multi-word term extraction that relies on extensive lexical resources in the form of electronic dictionaries and finite-state transducers for modelling various syntactic structures of multi-word terms. The same technology is used for lemmatization of extracted multi-word terms, which is unavoidable for highly inflected languages in order to pass extracted data to evaluators and subsequently to terminological e-dictionaries and databases. The approach is illustrated on a corpus of Serbian texts from ...... example, the MWT geološki informacioni sistem (AXAXN) ‘geological information system’ would be given precedence over informacioni sistem (AXN) ‘information system’. However, in this case, both would be accepted as MWT. Extraction graphs perform simple word lemmatization the result of which need ...
... to use some of them, such as indexing or document information retrieval, for term extraction. The current application is developed and tested within a Windows environment, while a corresponding web application, which would offer term extraction from texts in various domains to a wider community ...
... e-dictionaries will further improve systems for information retrieval, information extraction, query expansion and the like. One useful application can also be the creation of bilingual and multilingual terminological dictionaries, which would provide coverage of terms from a specific domain. In ... Ranka Stanković, Cvetana Krstev, Ivan Obradović, Biljana Lazić, Aleksandra Trtovac. "Rule-based Automatic Multi-word Term Extraction and Lemmatization" in Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia, 23–28 May 2016, European Language Resources Association (2016)
-
Comparison of sequential and single extraction in order to estimate environmental impact of metals from fly ash
coal fly ash, single-agent extraction, sequential extraction, microwave ovens, ultrasound. Aleksandra Tasić, Ivana Sredović-Ignjatović, Ljubiša Ignjatović, Marija Ilić, Mališa Antić. "Comparison of sequential and single extraction in order to estimate environmental impact of metals from fly ash" in Journal of the Serbian Chemical Society (2016). https://doi.org/10.2298/JSC160307038T
-
Some examples of interactions between certain rare earth elements and soil
... rapidly. Developing the methodology of sequential extraction enables better predictions and further understanding of the elements fate as well as their potential hazard to the environment. SUPPLEMENTARY MATERIAL Additional data and information is available electronically at the pages of journal ...
... happened in several different phases and the extraction in phases with milder extraction agents had occurred. In case of neodymium, the nature of soil is important in particular, since only in certain extraction phases there came to ion extraction, and also due to slightly different neodymium ...
... stic for humus. The extraction was noticed in ion exchange phase (Phase I) and in extraction phases with acids (Phases IV and V). The results show that in sand soil about 80 % of neodymium ions have been extracted and in the clay type of soil 96 % (Table VI). The extraction from humus soil was ... Zlatko Nikolovski, Jelena Isailović, Dejan Jeremić, Sabina Kovač, Ilija Brčeski. "Some examples of interactions between certain rare earth elements and soil" in Journal of the Serbian Chemical Society, National Library of Serbia (2021). https://doi.org/10.2298/JSC211006095N
-
Two approaches to compilation of bilingual multi-word terminology lists from lexical resources
In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second one uses a term extraction tool. For both approaches, four experiments were performed with two parameters being ... Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Two approaches to compilation of bilingual multi-word terminology lists from lexical resources" in Natural Language Engineering, Cambridge University Press (CUP) (2020). https://doi.org/10.1017/S1351324919000615
-
Towards Automatic Definition Extraction for Serbian
This paper presents preliminary results of the automatic extraction of candidates for dictionary definitions from unstructured texts in Serbian, with the aim of speeding up dictionary development. Definitions in the dictionary of the Serbian Academy of Sciences and Arts (SANU) were used to model different types of definitions (descriptive, grammatical, referential and synonymous), which have different syntactic and lexical characteristics. The research corpus consists of 61,213 noun definitions, which were analysed using morphological e-dictionaries and local grammars implemented as finite-state transducers in a corpus processing package of open ... production by automatically associating some grammatical information to lemmas, namely, word class, word forms and different types of markers (Krstev 2008). Since the research related to extraction of dictionary examples has shown that information extraction from a corpus can be used to speed up the work on ...
... presented in the concluding section. 2 Methodologies for Definition Extraction A According to the standard “ISO 1951:2007, Presentation/representation of entries in dictionaries — Requirements, recommendations and information” a definition is “A statement that describes a concept and permits its ...
... 2021), we focused our present research on the extraction of the sentences contained in the definition. The extraction also implies recognition of paradigmatic lexical relations, e.g. synonyms, antonyms, hypernyms, hyponyms. The problem of automatic extraction of definitions from the text has not been ... Ranka Stanković, Cvetana Krstev, Rada Stijović, Mirjana Gočanin, Mihailo Škorić. "Towards Automatic Definition Extraction for Serbian" in Proceedings of the XIX EURALEX Congress of the European Association for Lexicography: Lexicography for Inclusion (Volume 2). 7-9 September (virtual), Democritus University of Thrace (2021)
-
Terminological and lexical resources used to provide open multilingual educational resources
Open educational resources (OER) within BAEKTEL (Blending Academic and Entrepreneurial Knowledge in Technology enhanced learning) network will be available in different languages, mostly in the languages of Western Balkans, Russian and English. University of Belgrade (UB) hosts a central repository based on: BAEKTEL Metadata Portal (BMP), terminological web application for management, browse and search of terminological resources, web services for linguistic support (query expansion, information retrieval, OER indexing, etc.), annotation of selected resources and OER repository on local edX ...... option. Successful methods used in automatic term extraction can be applied to units that belong to the general lexica, as well. The potential expansion of such resources would inevitably lead to a more fruitful information retrieval and extraction, providing an invaluable education resource, applicable ...
... ips. Applications of such ontologies, alongside with the automatic term extraction, which will be further discussed, can be found in machine translation, automatic indexing, building lexical knowledge bases and information retrieval [12]. Once they are extracted, completed ontologies represent an ...
... semiautomatic approach for term recognition, extraction and lemmatization. Picture 1 illustrates steps in terminology extraction. Crucial resources are morphological dictionaries and grammars. They are combined with some statistical measures for term extraction. The first step is analysis of terms in ... Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić. "Terminological and lexical resources used to provide open multilingual educational resources" in The Seventh International Conference on eLearning (eLearning-2016), 29-30 September 2016, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2016)
-
Project REASONING: Characterization and technological procedures for recycling and reusing of the rudnik mine flotation tailings
Vesna Cvetkov, Vladimir Simić, Stefan Petrović, Filip Arnaut, Milena Kostović, Dragan Radulović, Jovica Stojanović, Vladimir Jovanović, Dejan Todorović, Nina Nikolić, Jelena Senćanski, Grozdanka Bogdanović, Dragana Marilović (2024) Vesna Cvetkov, Vladimir Simić, Stefan Petrović, Filip Arnaut, Milena Kostović, Dragan Radulović, Jovica Stojanović, Vladimir Jovanović, Dejan Todorović, Nina Nikolić, Jelena Senćanski, Grozdanka Bogdanović, Dragana Marilović. "Project REASONING: Characterization and technological procedures for recycling and reusing of the rudnik mine flotation tailings" in 5th Congress Geologists of the Republic of North Macedonia, Ohrid, 28-29. 10. 2024, Македонско геолошко друштво (2024)
-
Groundwater management by riverbank filtration and an infiltration channel, the case of Obrenovac, Serbia
Dušan Polomčić, Bojan Hajdin, Zoran Stevanović, Dragoljub Bajić, Katarina Hajdin. "Groundwater management by riverbank filtration and an infiltration channel, the case of Obrenovac, Serbia" in Hydrogeology Journal, Berlin, Heidelberg : Springer, International Association of Hydrogeologists (2013). https://doi.org/10.1007/s10040-013-1025-9
-
Global trend and negative synergy: Climate changes and groundwater over-extraction
Stevanović Zoran (2013) Stevanović Zoran. "Global trend and negative synergy: Climate changes and groundwater over-extraction" in Proceedings of the International Conference “Climate Change Impact on Water Resources”, 17-18 Oct.2013, Belgrade, Belgrade:Institute of Wat. Manag. J.Cerni & WSDAC (2013): 42-45
-
Terminology Acquisition and Description Using Lexical Resources and Local Grammars
Acquisition of new terminology from specific domains and its adequate description within terminological dictionaries is a complex task, especially for languages that are morphologically complex such as Serbian. In this paper we present an approach to solving this task semi-automatically on basis of lexical resources and local grammars developed for Serbian. Special attention is given to automatic inflectional class prediction for simple adjectives and nouns and the use of syntactic graphs for extraction of Multi-Word Unit (MWU) candidates for ... a, E., & Chanona-Hernandez, L. (2010). Automatic Term Extraction Using Log-Likelihood Based Comparison with General Reference Corpus. In C. Hopfe, Y. Rezgui, E. Métais, A. Preece & H. Li (Eds.), Natural Language Processing and Information Systems (Vol. 6177, pp. 248-255): Springer Berlin Heidelberg ...
... edge of the caterpillar”. 4.2 Extraction of MWUs from domain texts The extraction of MWUs from a text is preceded by the retrieval of new simple word terms from it and their incorporation in the existing system of morphological e-dictionaries as MWU extraction relies heavily on existing lexical ...
... covering the dictionary of lemmas FST class prediction can be divided into two parts: one is extraction of implicit knowledge and the other is actual prediction of FST class for a new lemma. Extraction of implicit knowledge in the form of a dataset with word endings, grammatical categories and ... Cvetana Krstev, Ranka Stanković, Ivan Obradović, Biljana Lazić. "Terminology Acquisition and Description Using Lexical Resources and Local Grammars" in Proceedings of the 11th Conference on Terminology and Artificial Intelligence, Granada, Spain, 2015, Granada : LexiCon (Universidad de Granada) (2015)
-
Application of contour blasting for the extraction of dimension stone blocks
Kričak Lazar, Negovanović Milanka, Janković Ivan, Zeković D., Mitrović S.. "Application of contour blasting for the extraction of dimension stone blocks" in Proceedings of the 2nd International Conference „Harmony of nature and spirituality in stone“, Kragujevac, Serbia:Stone Studio Association, (2012): 79-85