Tomaž Erjavec

2.7k total citations
141 papers, 1.4k citations indexed

About

Tomaž Erjavec is a scholar working on Artificial Intelligence, Language and Linguistics and Information Systems. According to data from OpenAlex, Tomaž Erjavec has authored 141 papers receiving a total of 1.4k indexed citations (citations by other indexed papers that have themselves been cited), including 109 papers in Artificial Intelligence, 46 papers in Language and Linguistics and 13 papers in Information Systems. Recurrent topics in Tomaž Erjavec's work include Natural Language Processing Techniques (96 papers), Topic Modeling (34 papers) and Lexicography and Language Studies (21 papers). Tomaž Erjavec is often cited by papers focused on Natural Language Processing Techniques (96 papers), Topic Modeling (34 papers) and Lexicography and Language Studies (21 papers). Tomaž Erjavec collaborates with scholars based in Slovenia, Croatia and the United States. Tomaž Erjavec's co-authors include Dan Tufiş, Darja Fišer, Nikola Ljubešić, Nancy Ide, Bruno Pouliquen, Camelia Ignat, Ralf Steinberger, Sašo Džeroski, Dániel Varga and Simon Krek. Tomaž Erjavec has published in prestigious journals such as SHILAP Revista de lepidopterología, Language Resources and Evaluation and Science of Computer Programming.

In The Last Decade

Tomaž Erjavec

127 papers receiving 1.2k citations

Peers — A (Enhanced Table)

Peers by citation overlap · career bar shows stage (early→late) cites · hero ref

Name | Country | h | Values (× = multiple of Tomaž Erjavec's value) | Papers | Cites
Tomaž Erjavec | Slovenia | 18 | 1.2k · 339 · 99 · 56 · 46 | 141 | 1.4k
Harold Somers | United Kingdom | 14 | 968 (0.8×) · 284 (0.8×) · 111 (1.1×) · 41 (0.7×) · 30 (0.7×) | 65 | 1.2k
Stefan Evert | Germany | 20 | 934 (0.8×) · 284 (0.8×) · 99 (1.0×) · 55 (1.0×) · 66 (1.4×) | 73 | 1.2k
Nicoletta Calzolari | Italy | 21 | 1.4k (1.1×) · 373 (1.1×) · 124 (1.3×) · 174 (3.1×) · 33 (0.7×) | 105 | 1.6k
Hans Uszkoreit | Germany | 20 | 1.4k (1.2×) · 235 (0.7×) · 186 (1.9×) · 96 (1.7×) · 22 (0.5×) | 135 | 1.6k
Roger Garside | United Kingdom | 10 | 609 (0.5×) · 251 (0.7×) · 113 (1.1×) · 36 (0.6×) · 56 (1.2×) | 16 | 944
Laurent Romary | France | 17 | 945 (0.8×) · 148 (0.4×) · 191 (1.9×) · 94 (1.7×) · 25 (0.5×) | 134 | 1.2k
Hans van Halteren | Netherlands | 15 | 1.1k (0.9×) · 126 (0.4×) · 231 (2.3×) · 34 (0.6×) · 74 (1.6×) | 56 | 1.2k
Scott Piao | United Kingdom | 15 | 550 (0.5×) · 118 (0.3×) · 93 (0.9×) · 47 (0.8×) · 55 (1.2×) | 57 | 694
Lou Burnard | United Kingdom | 10 | 528 (0.4×) · 235 (0.7×) · 118 (1.2×) · 15 (0.3×) · 30 (0.7×) | 43 | 848
Josef Ruppenhofer | Germany | 18 | 1.5k (1.2×) · 308 (0.9×) · 171 (1.7×) · 65 (1.2×) · 66 (1.4×) | 89 | 1.7k

Countries citing papers authored by Tomaž Erjavec


This map shows the geographic impact of Tomaž Erjavec's research. It shows the number of citations coming from papers published by authors working in each country. You can also color the map by specialization and compare the number of citations received by Tomaž Erjavec with the expected number of citations based on a country's size and research output (numbers larger than one mean the country cites Tomaž Erjavec more than expected).

Fields of papers citing papers by Tomaž Erjavec

Field groups: Physical Sciences, Health Sciences, Life Sciences, Social Sciences

This network shows the impact of papers produced by Tomaž Erjavec. Nodes represent research fields, and links connect fields that are likely to share authors. Colored nodes show fields that tend to cite the papers produced by Tomaž Erjavec. The network helps show where Tomaž Erjavec may publish in the future.

Co-authorship network of co-authors of Tomaž Erjavec

This figure shows the co-authorship network connecting the top 25 collaborators of Tomaž Erjavec. A scholar is included among the top collaborators of Tomaž Erjavec based on the total number of citations received by their joint publications. Widths of edges represent the number of papers authors have co-authored together. Node borders signify the number of papers an author published with Tomaž Erjavec. Tomaž Erjavec is excluded from the visualization to improve readability, since they are connected to all nodes in the network.

All Works

20 of 20 papers shown
1. Krek, Simon, et al. (2020). Gigafida 2.0: The Reference Corpus of Written Standard Slovene. Language Resources and Evaluation. 3340–3345. 6 indexed citations
2. Erjavec, Tomaž, et al. (2019). Korpusna analiza klitik in njim podobnih elementov v slovenskem knjižnem jeziku 16. stoletja. SHILAP Revista de lepidopterología. 12. 3–19. 2 indexed citations
3. Erjavec, Tomaž, et al. (2018). The Sloleks Morphological Lexicon and its Future Development. 2 indexed citations
4. Erjavec, Tomaž, et al. (2018). Opus-MontenegrinSubs 1.0: First electronic corpus of the Montenegrin language. Työväentutkimus Vuosikirja. 24–28. 3 indexed citations
5. Krek, Simon, et al. (2018). Leksikon besednih oblik Sloleks in smernice njegovega razvoja. 2 indexed citations
6. Ljubešić, Nikola, Tomaž Erjavec, & Darja Fišer. (2016). Corpus-Based Diacritic Restoration for South Slavic Languages. Language Resources and Evaluation. 3612–3616. 9 indexed citations
7. Erjavec, Tomaž, et al. (2016). Gold-Standard Datasets for Annotation of Slovene Computer-Mediated Communication. 29–40. 3 indexed citations
8. Ljubešić, Nikola & Tomaž Erjavec. (2016). Corpus vs. Lexicon Supervision in Morphosyntactic Tagging: the Case of Slovene. Language Resources and Evaluation. 1527–1531. 15 indexed citations
9. Ljubešić, Nikola, et al. (2016). Normalising Slovene data: historical texts vs. user-generated content. 20 indexed citations
10. Podpečan, Vid, Tomaž Erjavec, Senja Pollak, et al. (2015). Text mining platform for NLP workflow design, replication and reuse. Lirias (KU Leuven). 2 indexed citations
11. Erjavec, Tomaž. (2012). The goo300k corpus of historical Slovene. Language Resources and Evaluation. 2257–2260. 5 indexed citations
12. Erjavec, Tomaž, et al. (2011). OD BIOGRAFSKEGA LEKSIKONA DO ZNANSTVENOKRITIČNE IZDAJE: VPRAŠANJE TRAJNOSTI ELEKTRONSKIH BESEDIL. 55(1).
13. Sharoff, Serge, et al. (2008). Designing and evaluating a Russian tagset. Language Resources and Evaluation. 279–285. 28 indexed citations
14. Erjavec, Tomaž, et al. (2008). A Low Cost Approach to Building a Japanese-Slovene Parallel Corpus. IEICE Technical Report; IEICE Tech. Rep. 108(50). 7–10. 2 indexed citations
15. Erjavec, Tomaž & Simon Krek. (2008). The JOS Morphosyntactically Tagged Corpus of Slovene. Language Resources and Evaluation. 8 indexed citations
16. Erjavec, Tomaž. (2002). Compiling and Using the IJS-ELAN Parallel Corpus. Informatica (Slovenia). 26. 3 indexed citations
17. Ide, Nancy, Tomaž Erjavec, & Dan Tufiş. (2001). Automatic Sense Tagging Using Parallel Corpora. 83–90. 18 indexed citations
18. Džeroski, Sašo, Tomaž Erjavec, & Jakub Zavrel. (2000). Morphosyntactic Tagging of Slovene: Evaluating Taggers and Tagsets. Language Resources and Evaluation. 17 indexed citations
19. Erjavec, Tomaž & Nancy Ide. (1998). The MULTEXT-East Corpus. Language Resources and Evaluation. 971–974. 27 indexed citations
20. Erjavec, Tomaž, Ann Marie Lawson, & Laurent Romary. (1998). East meets West: Producing Multilingual Resources in a European Context. Language Resources and Evaluation. 8 indexed citations

Rankless uses publication and citation data sourced from OpenAlex, an open and comprehensive bibliographic database. While OpenAlex provides broad and valuable coverage of the global research landscape, it—like all bibliographic datasets—has inherent limitations. These include incomplete records, variations in author disambiguation, differences in journal indexing, and delays in data updates. As a result, some metrics and network relationships displayed in Rankless may not fully capture the entirety of a scholar's output or impact.


Rankless by CCL
2026