Search
70 items
-
A Lexical Approach to Acronyms and their Definitions
In this paper we present a comprehensive approach to acronyms for Natural-Language Processing (NLP) of Serbian texts. The proposed procedure includes extraction of acronyms and their definitions that are usual Multi-Word Units (MWUs), shallow parsing of MWUs that enables MWU lemmatization and production of entries in morphological electronic dictionaries, both for MWU and acronyms, that are provided with grammatical, syntactic, semantic and domain information. This approach enables representation that reflects complex relations between acronyms and their definitions.... training corpora, while those based on lexical resources do not have them listed in lexicons. However, their adequate treatment is crucial for many applications, e.g. text-to-speech systems (Taylor, 2009), machine translation (Wolinski et al., 1995), indexing for information retrieval and text cl ...
... biomedical text. In Pacific Symposium on Biocomputing, volume 8. World Scientific. Spasic, I., S. Ananiadou, J. McNaught, and A. Kumar, 2005. Text mining and ontologies in biomedicine: making sense of raw text. Briefings in bioinformatics, 6(3):239–251. Taylor, Paul, 2009. Text-to-speech synthesis ...
... tual Incompletness. In Proc. of the Corpus Linguistics Conference, Birmingham. Liberman, Mark Y and Kenneth W Church, 1992. Text analysis and word pronunciation in text-to-speech synthesis. Advances in speech signal processing:791–831. Moon, S., S. Pakhomov, and G. B. Melton, 2012. Automated ...Cvetana Krstev, Duško Vitas, Ranka Stanković. "A Lexical Approach to Acronyms and their Definitions" in Proceedings of the 7th Language & Technology Conference, November 27-29, 2015, Poznań, Poland, Springer (2015)
-
EUROLAN 2021: Introduction to Linked Data for Linguistics Online Training School
The first training school organized by the COST Action NexusLinguarum was held from 8 to 12 February 2021, with the aim of teaching students, researchers and practitioners the basics of linguistic data science. During the training, participants were introduced to a wide range of topics: from the Semantic Web, RDF and ontologies, to modelling and querying language data using state-of-the-art ontological models and tools. The school was held as part of the EUROLAN series of summer schools and was organized virtually (online) by several institutes; ...linguistic data science, linked data in linguistics, language data, EUROLAN, NexusLinguarum, COST Action, training school... (McCrae et al. 2017; Declerck, Tiberius, and Wandl-Vogt 2017; Stanković et al. 2018) – Linguistic linked data generation; (Cimiano et al. 2020) – Corpora and linked data; (Chiarcos 2012) – Linguistic annotations; (Fäth et al. 2020) – NLP Interchange Format; (Hellmann et al. 2013) – Tools and applications ...
... Elena Montiel-Ponsoda. 2017. “Towards a Module for Lexicography in OntoLex.” In LDK Workshops, 74–84. Chiarcos, Christian. 2012. “Interoperability of corpora and annotations.” In Linked Data in Linguistics, 161–179. Springer. Chiarcos, Christian, Maxim Ionov, Jesse de Does, Katrien Depuydt, Fahad Khan, ...Milan Dojchinovski, Julia Bosque Gil, Jorge Gracia, Ranka Stanković. "EUROLAN 2021: Introduction to Linked Data for Linguistics Online Training School" in Infotheca, Faculty of Philology, University of Belgrade (2021). https://doi.org/10.18485/infotheca.2021.21.1.7
-
Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian
The training of new tagger models for Serbian is primarily motivated by the enhancement of the existing tagset with the grammatical category of a gender. The harmonization of resources that were manually annotated within different projects over a long period of time was an important task, enabled by the development of tools that support partial automation. The supporting tools take into account different taggers and tagsets. This paper focuses on TreeTagger and spaCy taggers, and the annotation schema alignment ...... texts used in this research are shown in Table 2. The text 1984, Serbian translation of Orwell’s novel, was annotated according to the MULTEXT-East specification and included in MULTEXT-East resources (version 3) (Krstev et al., 2004). The text Verne, Serbian translation of the novel Around the ...
... on four different manually annotated set of texts. Test set was compiled of 10% of each text used for training, and it can give a rough idea on how models perform when tagging similar, already familiar text. Verne, History and Novels represent texts previously unknown to the taggers and show their ...
... result when tagging unfamiliar text. Although TreeTagger TT19 seems to have better overall results, the performance of both taggers drops significantly when tagging unknown text. Figure 1: Part-of-Speech tagging accuracy per token on test sets, for each of trained models Figure 2: nPoS-tagging accuracy ...Ranka Stanković, Branislava Šandrih, Cvetana Krstev, Miloš Utvić, Mihailo Škorić. "Machine Learning and Deep Neural Network-Based Lemmatization and Morphosyntactic Tagging for Serbian" in Proceedings of the 12th Language Resources and Evaluation Conference, May 2020, Marseille, France, European Language Resources Association (2020)
-
Integracija heterogenih tekstualnih resursa
Ranka Stanković, Ivan Obradović (2007) The paper describes an approach to the integration of heterogeneous textual resources for Serbian by means of a complex software tool developed specifically for this purpose. The structure and basic components of the developed system are described. The possibilities for improving the resources through mutual exchange of information, offered by the developed integrated environment, are also presented. Finally, the possibility of applying the integrated heterogeneous resources to query expansion, and to text search in general, is described, and some directions for further development are indicated.... components of the system we developed under the name of WS4LR (WorkStation for Lexical Resources), which synchronously handles corpora of Serbian, multilingual aligned corpora, a system of morphological dictionaries for Serbian, the Serbian wordnet and the multilingual ontology of proper names Prolex ...
... where part of the functions of WS4LR would be accessible via the internet, and which would at the same time provide for integration of WS4LR and the corpora of Serbian that are also partially accessible via the internet. A related public web service for query expansion is also planned, as well as a ...Ranka Stanković, Ivan Obradović. "Integracija heterogenih tekstualnih resursa" in Zbornik radova međunarodnog simpozijuma Razlike između bosanskog/bošnjačkog, hrvatskog i srpskog jezika, Graz, Austria, April 2007, - (2007)
-
Terminological and lexical resources used to provide open multilingual educational resources
Open educational resources (OER) within BAEKTEL (Blending Academic and Entrepreneurial Knowledge in Technology enhanced learning) network will be available in different languages, mostly in the languages of Western Balkans, Russian and English. University of Belgrade (UB) hosts a central repository based on: BAEKTEL Metadata Portal (BMP), terminological web application for management, browse and search of terminological resources, web services for linguistic support (query expansion, information retrieval, OER indexing, etc.), annotation of selected resources and OER repository on local edX ...... consists of morphological dictionaries, WordNet, domain specific terminological resources such as GeolISSterm, RudOnto, aligned texts in TMX format, corpora etc. Special attention will be given to Termi, newly developed application for terminology management. Keywords: Open Educational Resources, Lexical ...
... rely greatly on various NLP tools to help them cater to a large number of students from all over the world. These tools may include assessment of text and speech, writing assistants, automatic generation of exercises, wrap up questions and online instructional environments [3]. The main goal of ...
... transducers applied on domain corpus to extract terminology. Examples of patterns are presented in [15]. After applying these transducers on domain text extracted potential terms were evaluated. Results presented in previous paper were satisfying enough to speed up the development of a terminological ...Biljana Lazić, Danica Seničić, Aleksandra Tomašević, Bojan Zlatić. "Terminological and lexical resources used to provide open multilingual educational resources" in The Seventh International Conference on eLearning (eLearning-2016), 29-30 September 2016, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2016)
-
Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection
Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Duško Vitas, Mihailo Škorić, Milica Ikonić Nešić (2022) In this paper we present the Serbian part of the ELTeC multilingual corpus of novels written in the time period 1840-1920. The corpus is being built in order to test various distant reading methods and tools with the aim of re-thinking the European literary history. We present the various steps that led to the production of the Serbian sub-collection: the novel selection and retrieval, text preparation, structural annotation, POS-tagging, lemmatization and named entity recognition. The Serbian sub-collection was published ...Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Duško Vitas, Mihailo Škorić, Milica Ikonić Nešić. "Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection" in Proceedings of the Language Resources and Evaluation Conference, June 2022, Marseille, France, European Language Resources Association (2022)
-
Development of Open Educational Resources (OER) for Natural Language Processing
In this paper we present the development of an online course at the edX BAEKTEL platform named “Lexical Recognition in the Natural Language Processing (NLP)”. It is based on the course of the same name for PhD studies at the University of Belgrade, Faculty of Philology. There are not many courses in Computational Linguistics (CL) on OER platforms, and there is none in Serbian either for CL or NLP. We have developed this course in order to improve this ...... but also in written form as parallel (multilingual) corpora of lessons and texts, supported by electronic terminological resources[10], services, and functionalities for searching and browsing of terminological resources and using them for text annotation. The project consortium 10 consists ...
... speech tagging and information extraction, question answering, text summarization, collocations and information retrieval, sentiment analysis and semantics, discourse, machine translation, regular expressions, language models, text classification, and name entity recognition. All of them combine ...
... them. Text analyses can be performed at the levels of strings, morphology, and syntax. Some of the functions are: developing and applying electronic dictionaries of simple words and multi-word units; pattern matching with queries in form of regular expressions and graphs; text tra ...Cvetana Krstev, Biljana Lazić, Ranka Stanković, Giovanni Schiuma, Miladin Kotorčević. "Development of Open Educational Resources (OER) for Natural Language Processing" in The Sixth International Conference on e-Learning (eLearning-2015), September 2015, Belgrade, Serbia, Belgrade : Belgrade Metropolitan University (2015)
-
Part of Speech Tagging for Serbian language using Natural Language Toolkit
Ranka Stanković, Boro Milovanović (2020) While complex algorithms for NLP (Natural Language Processing) are being developed, basic tasks such as tagging remain very important and still challenging. NLTK (Natural Language Toolkit) is a powerful Python library for developing NLP-based programs. We try to use this library to create a PoS (part-of-speech) tagger for contemporary Serbian. Eleven different models were created using the NLTK tagging API. The best models are transformed with the Brill tagger to improve accuracy. We trained the models on an annotated ...... each token in the text. The program that performs tagging is called tagger. The taggers can be created in multiple ways. In this paper, we will create a tagger for Serbian with a help of a Python library NLTK (Natural Language Toolkit). Besides just exposing more than 50 corpora and lexical resources ...
... of tagger models packaged in NLTK that can be trained. Every tagger has an evaluation procedure that strips down the tags from the given text, tags the text with the newly created tagger and reports the accuracy on all tokens. This measure will be used for comparing different taggers. The simplest ...
... 83 90.51 86.95 Training Time 1143s 1343s 3074s Useful tagger model is one which generalizes well to the text from the other domains. That’s why we tested our best taggers on the text that stayed out of the training and validation phases. Results can be seen in Figure 3. Fig. 3. Accuracy ...Ranka Stanković, Boro Milovanović. "Part of Speech Tagging for Serbian language using Natural Language Toolkit" in 7th International Conference on Electrical, Electronic and Computing Engineering IcETRAN 2020, Academic Mind, Belgrade (2020)
-
An Integrated Environment for Management and Exploitation of Linguistic Resources
Ranka Stanković, Ivan Obradović (2009) ... “highlighting”, namely by representing them in blue, in order to make them more easily recognizable in the text. The text in English is on the left hand side, and the corresponding text in Serbian on the right. Given the fact that the compound “poreska obaveza” was not in the dictionary of compounds ...
... all forms of literals of a chosen synset in a given text, with the possibility of adding hypernym literals. D. Aligned texts WS4LR contains a module for processing of parallel texts which have previously been aligned using the text alignment tool XAlign. The module enables the tr ...
... house, nursery, glasshouse” from the corresponding synsets in English wordnet were included in query. B. Aligned text search When a bilingual query is applied to an aligned text, WS4QE generates a filtered aligned document in TMX format. Namely, based on the expansion of the query, which ...Ranka Stanković, Ivan Obradović. "An Integrated Environment for Management and Exploitation of Linguistic Resources" in Proceedings of the International Multiconference on Computer Science and Information Technology, Computational Linguistics – Applications Workshop (CLA09), Mrągowo, Poland, October 2009, Piscataway : IEEE (2009)
-
Multi-word Expressions for Abusive Speech Detection in Serbian
This paper presents research on refining and improving the Serbian version of the Hurtlex dictionary, a multilingual lexicon of offensive words. We pay special attention to adding multi-word expressions that can be considered offensive, since such lexical entries are very important for achieving good results in a multitude of abusive language detection tasks. Serbian morphological dictionaries are used as a basis for data cleaning and dictionary creation. The connection with other lexical and semantic resources for Serbian is highlighted, and the construction of a system for ...... abusiveness are found in text, it is marked as very abusive (Gitari et al., 2015; Pedersen, 2020); (3) Training of classifiers for recognizing abusive speech in text using the lexicon content as the training set (Wiegand et al., 2018). On the other hand, high quality corpora of hate speech, offensive ...
... occurrence in the examined text (Pamungkas and Patti, 2019), or a numerical value corresponding to the number of abusive words and its level of abusiveness (Razavi et al., 2010); (2) When applying rules for classification of offensive content, the authors may decide to classify the text in a certain category ...
... 0 67.5 0.0 62.8 yes 28.9 20.2 24.5 0.0 24.0 81.9 27.2 Table 3: MWEs classified as yes, no, maybe and part of speech of trigger words. and other corpora previously compiled. The distribution of MWEs by part of speech categories of their trigger word is presented in Table 3. Further analysis showed ...Ranka Stanković, Jelena Mitrović, Danka Jokić, Cvetana Krstev. "Multi-word Expressions for Abusive Speech Detection in Serbian" in Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, Association for Computational Linguistics (2020)
-
On the compatibility of lexical resources for NooJ
Lexical resources for many languages are provided for the NooJ linguistic development environment. Meta-data descriptions of morphosyntactic and semantic properties of these languages and their resources are a mandatory part of each language module. In this paper we analyze how well the meta-data actually describe resources for a chosen subset of languages and to what extent are they compatible across languages to support multilingual processing. We show that there is place for improvement in both directions.... the text dictionary (hraniti,V+FLX=BRANITI+Sem=cons+Prelaz=pov), although it does not exist in the *.def file. Conversely, semantic codes geo (place), etn (ethnic), ust (institution) etc. appear in the *.def file but they cannot be found in the text dictionary despite the fact that the text contains ...
... establish a one-to-one correspondence between the aligned segments and the original text in French. An example follows, showing the introductory chapter title and its first sentence in each of the seven languages:<text lng=”fr”> I Dans lequel Phileas Fogg et Passepartout s'acceptent ...
... par Phileas Fogg, esq., l'un des membres les plus singuliers et les plus remarqués du Reform-Club de Londres, ...... <text lng=”en”> Chapter I in which Phileas Fogg and Passepartout accept each other, the one as master, the other as man Mr ... Ranka Stanković, Miloš Utvić, Duško Vitas, Cvetana Krstev, Ivan Obradović. "On the compatibility of lexical resources for NooJ" in Automatic Processing of Various Levels of Linguistic Phenomena: Selected Papers from the 2011 International Nooj Conference, Cambridge Scholars Publishing (2012): 96-108
Using technology for knowledge transfer between academia and enterprises
Ivan Obradović, Ranka Stanković (2014) ... texts and corpora. Aligned texts are pairs of texts in different languages, mainly an original and its translation, aligned on some structural level, most often the sentence. Aligned texts in LSS are in the standard, Translation Memory eXchange (TMX) format, which is XML-compliant. Corpora are large ...
... described in this section, and a common portal for indexing OER and other supporting TEL content throughout the network. Audio, video and written text materials from all partner institution nodes will be indexed and annotated with metadata, thus providing enhanced searching capabilities. Namely, ...Ivan Obradović, Ranka Stanković. "Using technology for knowledge transfer between academia and enterprises" in Knowledge and Management Models for Sustainable Growth, Proc. of IFKAD 2014, 9th International Forum on Knowledge Asset Dynamics, 11-13 June 2013, Matera, Italy, Bari : IFKAD (2014)
Wordnet Development Using a Multifunctional Tool
Ivan Obradović, Ranka Stanković (2007) In this paper we present a multifunctional tool for manipulating heterogeneous language resources. The tool handles electronic dictionaries, wordnets and aligned texts, and provides for their synchronous use in various tasks. We focus here on the description of the possibilities this tool offers in the development of wordnets. Besides the wordnet module which enables parallel handling of two wordnets, other modules, such as the module for morphological dictionaries and the module for aligned texts, as well as available finite ...... original PWN synset and words he/she has already selected for the target synset. Then, if a highlighted word found in the text in English does not have a highlighted match in the text in the target language, the lexicographer should inspect the sentence in the target language for a possible match, ...
... senses to all chosen words. It goes without saying that other linguistic resources, such as electronic dictionaries, bilingual word lists and corpora can be of invaluable help to the lexicographer in accomplishing this task. In this paper we present a multifunctional tool which, among ...
... configured to handle simultaneously up to 10 dictionaries, which can be monolingual or translational dictionaries, but also thesauri or plain corpora. Thus, VisDic went a step further as a tool which can do more than just editing and browsing wordnets. In addition to that, and contrary to the ...Ivan Obradović, Ranka Stanković. "Wordnet Development Using a Multifunctional Tool" in Proceedings of the International Workshop Computer Aided Language Processing (CALP) '2007, Borovets, Bulgaria, September 2007, - (2007)
Serbian NER&Beyond: The Archaic and the Modern Intertwinned
In this paper we present the Serbian literary corpus being developed under the umbrella of the COST Action “Distant Reading for European Literary History” CA16204. Using this corpus of novels written more than a century ago, we developed and made publicly available a Named Entity Recognition (NER) system trained to recognize 7 different named entity types, with a convolutional neural network (CNN), which has an F1 score of ≈91% on the test dataset. This model was further evaluated on a separate evaluation dataset. We conclude with a comparison ...... evaluation. Web users can navigate to http://ner.jerteh.rs/ in order to apply the SrpCNNER model directly on input text. The model can also be applied to a custom-size collection of text files using the previously mentioned NER&Beyond web platform. story), https://zenodo.org/communities/eltec 7 SrpELTeC ...
... entity, so the evaluators were asked to identify and annotate them when they occur in text. SrpNER does not recognize WORK entity either, but these annotations were in many cases added by volunteer readers during text correction. Afterwards, students were given different novel chapters along with the ...
... distribution of different entity types over SrpELTeC-gold novels. The first four digits of text identifiers represent the year of the first publication of a novel. For some novels, NER was not performed on the whole text, but rather on randomly selected chapters. These annotated samples were also included ...Branislava Šandrih Todorović, Cvetana Krstev, Ranka Stanković, Milica Ikonić Nešić. "Serbian NER&Beyond: The Archaic and the Modern Intertwinned" in Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications, INCOMA Ltd. Shoumen, BULGARIA (2021). https://doi.org/10.26615/978-954-452-072-4_141
Building learning capacity by blending different sources of knowledge
... texts and corpora. Aligned texts are pairs of texts in different languages, mainly an original and its translation, aligned on some structural level, most often the sentence. Aligned texts in BMP are in the standard, Translation Memory eXchange (TMX) format, which is XML-compliant. Corpora are large ...
... main features, edX offers interactive online learning software, which provides for production of multimedia educational materials, by combining text, images and videos. Exercises are also included, enabling students to check immediately their understanding of the concepts introduced by the ...Ivan Obradović, Ranka Stanković, Olivera Kitanović, Dalibor Vorkapić. "Building learning capacity by blending different sources of knowledge" in International Journal of Learning and Intellectual Capital (2016). https://doi.org/10.1504/IJLIC.2016.075698
Претрага корпуса заснована на употреби екстерних лексичких ресурса путем веб-сервиса
The paper considers a hybrid approach to corpus search, illustrated on the example of the tools OCWB and NoSketch Engine applied to a special corpus from the mining domain (RudKor) and the Corpus of Contemporary Serbian (SrpKor). The approach combines the existing capabilities of OCWB and NoSketch Engine, which base their search on the linguistic annotation of the corpus, with new search capabilities in the form of consulting external language resources (morphological electronic dictionaries of Serbian and the lexical database Serbian WordNet). The hybrid approach is implemented by extending the web interfaces that these tools use ...... of corpora: lexicographic, grammatical, dialectal, regional, non-standard, corpora of the language as a non-native language, domain-specific corpora, etc. Section 2 of the paper describes, in general terms, the linguistic annotation of the corpora RudKor and SrpKor2013, as well as the possibilities of searching these corpora ...
... „Improvements in Part-of-Speech Tagging with an Application to German”, In: Armstrong, S. et al. (eds.) Natural Language Processing Using Very Large Corpora, Dordrecht: Springer, 13–25. Miloš V. Utvić, Ranka M. Stanković, Aleksandra Đ. Tomašević, Mihailo Đ. Škorić, Biljana Đ. Lazić THE CORPUS SEARCH ...Милош Утвић, Ранка Станковић, Александра Томашевић, Михаило Шкорић, Биљана Лазић. "Претрага корпуса заснована на употреби екстерних лексичких ресурса путем веб-сервиса" in Научни састанак слависта у Вукове дане - Vol. 48/3 Српски језик и његови ресурси, Међународни славистички центар, Филолошки факултет, Универзитет у Београду (2019). https://doi.org/10.18485/msc.2019.48.3.ch12
A Description of Morphological Features of Serbian: a Revision using Feature System Declaration
In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack of a support for some types of categories. At the same time, various descriptions often describe exactly the same ...... français. Langue française 87. Paris: Larousse. Erjavec, T. (2004) MULTEXT-East Version 3: Multilingual Morphosyntactic Specifications, Lexicons and Corpora. In: Proc. of the Fourth Intl. Conf. on Language Resources and Evaluation, LREC'04, pp. 1535 - 1538, ELRA, Paris. Erjavec, T. MULTEXT-east mo ...Cvetana Krstev, Ranka Stanković, Vitas Duško. "A Description of Morphological Features of Serbian: a Revision using Feature System Declaration" in Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2010, Valetta, Malta : European Language Resources Association (2010)
Developing Termbases for Expert Terminology under the TBX Standard
... Age of Multilingual Corpora. The Journal of Specialized Translation, 18:7-29, 2012. Uwe Reinke. State of the Art in Translation Memory Technology. Translation: Computation, Corpora, Cognition, 3(1), 2013. Laurent Romary. TBX Goes TEI - Implementing a TBX Basic Extension for the Text Encoding Initiative ...
... translation (SMT), an approach developed at IBM in the late 1980s, now the state-of-the-art paradigm in MT. The exponential growth of aligned multilingual corpora greatly improved the efficiency and accuracy of SMT in general, and many tools based on this approach, such as Google Translate, are thus being more ...
... are still bound to maintain their importance in the case of expert terminology in domains where aligned corpora are sparse [10], such as, for example mining engineering or geology. In order to secure terminological consistency in one or more termbases, and to ...Ranka Stanković, Ivan Obradović, and Miloš Utvić. "Developing Termbases for Expert Terminology under the TBX Standard" in Natural Language Processing for Serbian - Resources and Applications, Belgrade : University of Belgrade, Faculty of Mathematics (2014)
A Tel Platform Blending Academic And Entrepreneurial Knowledge
... platform provides electronic terminological resources, parallel (multilingual) corpora of lessons and texts in written form, and functionalities for searching and browsing of terminological resources and using them for text annotation. The contents of these resources conform to the methodic/didactic ...
... language support system also handles aligned texts or bitexts, pairs of semantically equivalent texts in different languages, such as an original text and its translation, that are aligned on a structural level (paragraph, sentence, phrase, etc.). Aligned texts in BAEKTEL enable better understanding ...Ivan Obradović, Ranka Stanković, Jelena Prodanović, Olivera Kitanović. "A Tel Platform Blending Academic And Entrepreneurial Knowledge" in Proceedings of the Fourth International Conference on e-Learning (eLearning-2013), September 2013, Belgrade, Serbia, Belgrade, Serbia : Belgrade Metropolitan University (2013)
An Approach to Development of Bilingual Lexical Resources
... of Philology, University of Belgrade. [4] Obradović, I., Stanković, R., Utvić, M. 2008. An Integrated Environment for Development of Parallel Corpora (in Serbian). In: Die Unterschiede zwischen dem Bosnischen/Bosniakischen, Kroatischen und Serbischen (pp. 563-578), B. Tošović (Ed.). Berlin: ...Stanković Ranka, Obradović Ivan, Trtovac Aleksandra. "An Approach to Development of Bilingual Lexical Resources" in Proceedings of the Fifth Balkan Conference in Informatics BCI 2012, Workshop on Computational Linguistics and Natural Language Processing of Balkan Languages – CLoBL 2012, September 2012, Novi Sad : BCI (2012)