Search
70 items
-
Advancing Sentiment Analysis in Serbian Literature: A Zero and Few-Shot Learning Approach Using the Mistral Model
This study presents a sentiment analysis of old Serbian novels from the period 1840-1920, using the Mistral large language model (LLM) with learning techniques based on so-called "zero-shot" and "few-shot" attempts. The main approach innovates by designing research prompts that include text with classification instructions, either without examples or with a few examples, enabling the language model to classify sentiment into positive, negative, or objective categories. This methodology aims to simplify sentiment analysis by constraining the responses, thereby increasing precision ...
Milica Ikonić Nešić, Saša Petalinkar, Mihailo Škorić, Ranka Stanković, Biljana Rujević. "Advancing Sentiment Analysis in Serbian Literature: A Zero and Few-Shot Learning Approach Using the Mistral Model" in Proceedings of the Sixth International Conference on Computational Linguistics in Bulgaria (CLIB 2024), BAS (2024)
-
WS4LR - a Workstation for Lexical Resources
... Balkan Languages, in Proc. of 1st International Wordnet Conference, Mysore, India Veronis, J. (ed.) (2000) Parallel Text processing: Alignment and Use of Translation Corpora, Dordrecht: Kluwer Academic Publishers Vossen, P. (ed.) (1998) EuroWordNet: A Multilingual Database with Lexical ...
... pair of semantically equivalent texts in different languages, such as an original text and its translation, that are aligned on a structural level (paragraph, sentence, phrase, etc.) is known as an aligned text or bitext. Aligned texts are usually constructed in two main steps: in the first ...
... important reason is the fact that in text recognition by Intex/Unitex the usage of all dictionaries is not always necessary, or even recommended. For example, dictionaries of English personal names transcribed according to Serbian orthography should not be applied to a text that makes no reference to such ...
Cvetana Krstev, Ranka Stanković, Duško Vitas, Ivan Obradović. "WS4LR - a Workstation for Lexical Resources" in Proceedings of the Fifth International Conference on Language Resources and Evaluation, Genoa, Italy, May 2006, ELRA - European Language Resources Association (2006)
-
Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources
Large collections of textual documents represent an example of big data that requires the solution of three basic problems: the representation of documents, the representation of information needs and the matching of the two representations. This paper outlines the introduction of document indexing as a possible solution to document representation. Documents within a large textual database developed for geological projects in the Republic of Serbia for many years were indexed using methods developed within digital humanities: bag-of-words and named ...
... Surrogates can also contain an abstract and/or a snippet, a relevant text fragment. The content of a document surrogate, or its part, can be generated automatically by extracting and selecting specific terms (words) from the document text. Language processing methods and techniques developed within the ...
... textual content of the geological project. Future plans include digitalization and full text archiving of the project content, followed by the implementation of the approach described in this paper to this future full text database. 2.2 The Initial Solution for Document Retrieval The initial solution for ...
... normalizing length [8]. The improved system ranking uses several measures, starting with tf-idf measure based on frequencies of words allocated to the text, text length, and the document frequency [14]. Further development included modification of tf-idf with cosine normalization (tfc tfc), tfc nfc term weighting ...
Ranka Stanković, Cvetana Krstev, Ivan Obradović, Olivera Kitanović. "Improving Document Retrieval in Large Domain Specific Textual Databases Using Lexical Resources" in Trans. Computational Collective Intelligence - Lecture Notes in Computer Science 26, Springer (2017). https://doi.org/10.1007/978-3-319-59268-8_8
-
The Nooj System as Module within an Integrated Language Processing Environment
... information retrieval and related areas. If query is further combined with ILI, a multilingual wordnet pivot, the possibility of searching text resources (web, corpus, text) in different languages with a single query is opened. NooJ supports morphological query expansion and expansion of queries by graphs ...
... pair of semantically equivalent texts in different languages, such as an original text and its translation, that are aligned on a structural level (paragraph, sentence, phrase, etc.) is known as an aligned text or bitext. One of the supported formats is the Translation Memory eXchange format ...
... resources management 4.1. Parallel Text Management The WS4LR module for management of aligned parallel texts uses texts which have previously been aligned using Xalign as an alignment tool (Bonhomme 2001). Parallel texts which usually originate from a text in one language and its translation ...
Ranka Stanković, Duško Vitas, Cvetana Krstev. "The Nooj System as Module within an Integrated Language Processing Environment" in Proceedings of the 2007 International Nooj Conference, Cambridge Scholars Publishing (2008)
-
A bilingual digital library for academic and entrepreneurial knowledge management
A generic knowledge management process of organization, storage and retrieval of knowledge can suitably be fitted in a digital library. In the digital and knowledge age digital libraries can be used in knowledge management to handle intellectual assets and support knowledge creation. A multilingual digital library either stores content in more than one language or provides multilingual query access to monolingual content. In Serbia 18 of 308 scientific journals regularly published are bi-lingual, with papers simultaneously being in English ...
... corpus processing system, based on automata-oriented technology (Utvić et al., 2007). Text preparation, alignment and generation of TMX documents are done within a special-purpose tool ACIDE (Aligned Corpora Integrated Development Environment) (Utvić et al., 2007). The TMX document consists of ...
... document (eXtensible Markup Language) according to TEI (Text Encoding Initiative) guidelines. Practically, this means that the main divisions of a text as well as its titles, paragraphs and segments (sentences) have to be XML tagged. Any text editing software with support for well-formedness checking ...
... of elements designed for description of each text collection. Metadata are stored in Serbian and in English, where the attribute lang specifies the language of metadata element. Metadata elements provide information on: − text collections: text collection title (journal or project), ISSN and ...
Ranka Stanković, Cvetana Krstev, Biljana Lazić, Dalibor Vorkapić. "A bilingual digital library for academic and entrepreneurial knowledge management" in Proceeding of 10th International Forum on Knowledge Asset Dynamics - IFKAD 2015: Culture, Innovation and Entrepreneurship: connecting the knowledge dots, Bari, Italy, 10-12 June 2015, Bari : IFKAD (2015)
-
Development and Evaluation of Three Named Entity Recognition Systems for Serbian - The Case of Personal Names
In this paper we present a rule- and lexicon-based system for the recognition of Named Entities (NE) in Serbian newspaper texts that was used to prepare a gold standard annotated with personal names. It was further used to prepare training sets for four different levels of annotation, which were further used to train two Named Entity Recognition (NER) systems: Stanford and spaCy. All obtained models, together with a rule- and lexicon-based system were evaluated on ...
... Sophia Ananiadou, and Jun'ichi Tsujii. 2012. BRAT: a Web-based Tool for NLP-Assisted Text Annotation. In Proceedings of the Demonstrations Session at EACL 2012. Duško Vitas and Cvetana Krstev. 2012. Processing of Corpora of Serbian using Electronic Dictionaries. Prace Filologiczne LXIII:279–292. ...
... al., 2012) is a web-based tool for text annotation, i.e., for adding notes to existing text documents. It is designed for structured annotation, allowing embedded annotations, which are especially convenient for NER. Annotations are external, so for each text file, an additional annotation file ...
... Petrović After running STANFORDNER on a text, an output is provided in already mentioned CoNLL02 format. We used CoNLL02 → BRAT converter available within NER&BEYOND online tool. Finally, for both SPACY NER and STANFORDNER output files, we applied ANN + TEXT → XML converter offered by Gemini, also ...
Branislava Šandrih, Cvetana Krstev, Ranka Stanković. "Development and Evaluation of Three Named Entity Recognition Systems for Serbian - The Case of Personal Names" in Proceedings - Natural Language Processing in a Deep Learning World, Incoma Ltd., Shoumen, Bulgaria (2019). https://doi.org/10.26615/978-954-452-056-4_122
-
Чији је пример? Анализа лексичких обележја на примерима Речника САНУ [Whose Example Is It? An Analysis of Lexical Features in Examples from the SANU Dictionary]
This paper asks whether the author of a text can be identified by analyzing its lexical features alone. To try to answer this question, we examined the examples given within the dictionary entries of individual lexemes in the SANU Dictionary, as recorded in five volumes (namely I, II, XVIII, XIX, and XX). Each example is taken from a source, which is indicated by an abbreviation given in parentheses. Out of more than 5,000 available sources, we opted for ...
... text summarization, dependency parsing, part-of-speech tagging, lemmatization, named entity recognition, text classification ...
... речник српскохрватског књижевног и народног језика сану, Београд: САНУ, VII‒XXVI. Витас/Крстев 2012: Duško Vitas & Cvetana Krstev, Processing of Corpora of Serbian Using Electronic Dictionaries. Prace Filologiczne, vol. LXIII, Warszawa, 279–292. Вуловић/Ђинђић/Радоњић 2008: Наташа Вуловић, Марија ...
... Београд: Институт за српски језик САНУ, 115–119. Едер и др. 2016: M. Eder, J. Rybicki & M. Kestemont, Stylometry with R: a package for computational text analysis, R Journal 8(1): 107–121. Закон о Речнику Српске академије наука и уметности: http://www.mpn.gov.rs/wp-content/uploads/2015/08/zakon_o_r ...
Бранислава Б. Шандрих, Ранка М. Станковић, Мирјана С. Гочанин. "Чији је пример? Анализа лексичких обележја на примерима Речника САНУ" in Српски језик и његови ресурси, Међународни славистички центар, Филолошки факултет, Универзитет у Београду (2019). https://doi.org/10.18485/msc.2019.48.3.ch13
-
Using Lexical Resources for Irony and Sarcasm Classification
The paper presents a language dependent model for classification of statements into ironic and non-ironic. The model uses various language resources: morphological dictionaries, sentiment lexicon, lexicon of markers and a WordNet based ontology. This approach uses various features: antonymous pairs obtained using the reasoning rules over the Serbian WordNet ontology (R), antonymous pairs in which one member has positive sentiment polarity (PPR), polarity of positive sentiment words (PSP), ordered sequence of sentiment tags (OSA), Part-of-Speech tags of words (POS) ...
... Patti, Andrea Bolioli, and Luigi Di Caro. 2012. Annotating irony in a novel italian corpus for sentiment analysis. In Proc. of the 4th Workshop on Corpora for Research on Emotion Sentiment and Social Signals. 1–7. [16] Yanfen Hao and Tony Veale. 2010. An Ironic Fist in a Velvet Glove: Creative Mis-r ...
... from RDF triples (?a swn:irony ?z), where a is the first and z is the last member of the observed rule. According to [1], existence of irony in a text is characterized by markers. Those are literary devices which indicate that irony is present, but if we remove them, ironic meaning does not change ...
Miljana Mladenović, Cvetana Krstev, Jelena Mitrović, Ranka Stanković. "Using Lexical Resources for Irony and Sarcasm Classification" in Proceedings of the 8th Balkan Conference in Informatics (BCI '17), New York, NY, USA: ACM (2017). https://doi.org/
-
Ontološki model upravljanja rizikom u rudarstvu [An Ontological Model of Risk Management in Mining]
Olivera Kitanović (2021)
Mining production involves complex technological systems, which creates the need to establish and improve risk management systems. The heterogeneity and volume of the data required for risk management call for a system that integrates them flexibly and enables their optimal use. The main goal of this dissertation is the development of an ontology for the mining domain and of a risk management model based on it. Its realization also entails the implementation of information extraction algorithms for populating the ontology, as well as a corresponding software solution. Development of the model also includes a significant expansion of the mining corpus, as ...
mining, risk, risk management, risk assessment, ontology, semantic network, information extraction, knowledge management, computational linguistics
... Proactive Risk Management Based On The Text Mining Classification.” International Journal of Engineering Research and Technology 1. Roller, Stephen, Douwe Kiela, and Maximilian Nickel. 2018. “Hearst Patterns Revisited: Automatic Hypernym Detection from Large Text Corpora.” ArXiv Preprint ArXiv:1806.03191 ...
... Sharing?” International Journal of Human-Computer Studies 43 (5–6): 907–28. Hearst, Marti A. 1992. “Automatic Acquisition of Hyponyms from Large Text Corpora.” In Coling 1992 Volume 2: The 15th International Conference on Computational Linguistics. Henley, Ernest J, and Hiromitsu Kumamoto. 1996. “Pr ...
... Symposium: Exploring Attitude and Affect in Text: Theories and Applications, Stanford, CA. Ružin, Tatjana Ž. 2015. “Glagoli Uzrokovanja u Engleskom i Srpskom Jeziku.” PhD Thesis, Univerzitet u Beogradu-Filološki fakultet. Salton, Gerard. 1989. “Automatic Text Processing: The Transformation, Analysis, ...
Olivera Kitanović. Ontološki model upravljanja rizikom u rudarstvu, Beograd : [O. Kitanović], 2021
-
Surface functional groups and degree of carbonization of selected chars from different processes and feedstock
Marija Ilić, Franz-Hubert Haegel, Aleksandar Lolić, Zoran Nedić, Tomislav Tosti, Ivana Sredović Ignjatović, Andreas Linden, Nicolai D. Jablonowski, Heinrich Hartmann (2022)
The knowledge of the structural and chemical properties of biochars is decisive for their application as technical products. For this reason, methods for the characterization of biochars that are generally applicable and allow quality control are highly desired. Several methods that have shown potential in other studies were used to investigate two activated carbons and seven biochars from different processes and feedstock. The chars were chosen to cover a wide range of chemical composition and structural properties as a hardness ...
Biochar, different production processes, different feedstocks, surface-active groups, X-ray, FT-IR, XPS
... Clara, CA with a wavelength of 532 nm and a power of 25 mW was used as the excitation source. Spectra were fitted with OriginPro 2019 (OriginLab Corporation, Northampton, MA, USA) after appropriate baseline correction as described in the Results. Spectral induced polarization. SIP measurements ...
... S2 Fig in S1 File. The scattering angles are given as values of 2θ and for better comparison with some literature also as distances in Å in the text. [Figure residue: sample labels AC2, SW500f, HW500f, BW550s, HW1100g, PW700g, CS180h, M200h; peak labels Qtz, Cal, Alb; axis ticks 20 to 80] ...
Marija Ilić, Franz-Hubert Haegel, Aleksandar Lolić, Zoran Nedić, Tomislav Tosti, Ivana Sredović Ignjatović, Andreas Linden, Nicolai D. Jablonowski, Heinrich Hartmann. "Surface functional groups and degree of carbonization of selected chars from different processes and feedstock" in PLOS ONE (2022). https://doi.org/10.1371/journal.pone.0277365