Tomaž Erjavec

2.7k citations

141 papers · 1.4k indexed · h-index 18

Artificial Intelligence top 1%
- Natural Language Processing Techniques 96
- Topic Modeling 34
- Authorship Attribution and Profiling 17
- Semantic Web and Ontologies 15
- Text Readability and Simplification 14
Language and Linguistics top 1%
- Lexicography and Language Studies 21
- Linguistics and language evolution 21
- Linguistics, Language Diversity, and Identity 11
Linguistics and Language top 10%
Human-Computer Interaction top 10%
Information Systems top 10%

Co-authors: Dan Tufiş, Darja Fišer, Nikola Ljubešić, Nancy Ide, Bruno Pouliquen, Camelia Ignat, Ralf Steinberger, Sašo Džeroski
Cited by: Artificial Intelligence, Language and Linguistics, Linguistics and Language
Journals: SHILAP Revista de lepidopterología (9 papers), Language Resources and Evaluation (25 papers), Science of Computer Programming (1 paper)
Partner nations: Slovenia, Croatia, United States

In The Last Decade

127 papers receiving 1.2k citations

Countries citing papers authored by Tomaž Erjavec

This map shows the geographic impact of Tomaž Erjavec's research: the number of citations coming from papers published by authors working in each country. The map can also be colored by specialization, which compares the number of citations received by Tomaž Erjavec with the number expected from a country's size and research output (values greater than one mean the country cites Tomaž Erjavec more than expected).

Fields of papers citing papers by Tomaž Erjavec

Field groups: Physical Sciences, Health Sciences, Life Sciences, Social Sciences

This network shows the impact of papers produced by Tomaž Erjavec. Nodes represent research fields, and links connect fields that are likely to share authors. Colored nodes show fields that tend to cite the papers produced by Tomaž Erjavec. The network helps show where Tomaž Erjavec may publish in the future.

Co-authorship network

The 25 scholars most cited alongside Tomaž Erjavec, linked wherever they have co-authored with each other.

Border = papers with Tomaž Erjavec. Line = papers co-authored together. Tomaž Erjavec links everyone, so they are left out of the graph.

All Works

20 of 20 papers shown

#	Title	Venue	Authors	Year	Citations
1	Gigafida 2.0: The Reference Corpus of Written Standard Slovene	Language Resources and Evaluation	Simon Krek, Špela Arhar Holdt, Tomaž Erjavec, Jaka Čibej, Andraž Repar, Polona Gantar, Nikola Ljubešić, Iztok Kosem, Kaja Dobrovoljc	2020	6
2	Korpusna analiza klitik in njim podobnih elementov v slovenskem knjižnem jeziku 16. stoletja	SHILAP Revista de lepidopterología	Alenka Jelovšek, Tomaž Erjavec	2019	2
3	The Sloleks Morphological Lexicon and its Future Development	-	Kaja Dobrovoljc, Simon Krek, Tomaž Erjavec	2018	2
4	Opus-MontenegrinSubs 1.0: First electronic corpus of the Montenegrin language	Työväentutkimus Vuosikirja	Petar Božović, Tomaž Erjavec, Jörg Tiedemann, Nikola Ljubešić, Vojko Gorjanc	2018	3
5	Leksikon besednih oblik Sloleks in smernice njegovega razvoja	-	Kaja Dobrovoljc, Simon Krek, Tomaž Erjavec	2018	2
6	Corpus-Based Diacritic Restoration for South Slavic Languages	Language Resources and Evaluation	Nikola Ljubešić, Tomaž Erjavec, Darja Fišer	2016	9
7	Gold-Standard Datasets for Annotation of Slovene Computer-Mediated Communication	-	Tomaž Erjavec, Jaka Čibej, Špela Arhar Holdt, Nikola Ljubešić, Darja Fišer	2016	3
8	Corpus vs. Lexicon Supervision in Morphosyntactic Tagging: the Case of Slovene	Language Resources and Evaluation	Nikola Ljubešić, Tomaž Erjavec	2016	15
9	Normalising Slovene data: historical texts vs. user-generated content	-	Nikola Ljubešić, Katja Zupan, Darja Fišer, Tomaž Erjavec	2016	20
10	Text mining platform for NLP workflow design, replication and reuse	Lirias (KU Leuven)	Matic Perovšek, Vid Podpečan, Janez Kranjc, Tomaž Erjavec, Senja Pollak, Ngoc Quynh Do Thi, Xiao Liu, Cameron Smith, Marc Cavazza, Nada Lavrač	2015	2
11	The goo300k corpus of historical Slovene	Language Resources and Evaluation	Tomaž Erjavec	2012	5
12	OD BIOGRAFSKEGA LEKSIKONA DO ZNANSTVENOKRITIČNE IZDAJE: VPRAŠANJE TRAJNOSTI ELEKTRONSKIH BESEDIL	-	Tomaž Erjavec, Jan Jona Javoršek, Matija Ogrin, Petra Vide Ogrin	2011	0
13	Designing and evaluating a Russian tagset	Language Resources and Evaluation	Serge Sharoff, Михаил Копотев, Tomaž Erjavec, Anna Feldman, Dagmar Divjak	2008	28
14	A Low Cost Approach to Building a Japanese-Slovene Parallel Corpus	IEICE Technical Report; IEICE Tech. Rep.	Kristina Hmeljak, Tomaž Erjavec	2008	2
15	The JOS Morphosyntactically Tagged Corpus of Slovene	Language Resources and Evaluation	Tomaž Erjavec, Simon Krek	2008	8
16	Compiling and Using the IJS-ELAN Parallel Corpus	Informatica (Slovenia)	Tomaž Erjavec	2002	3
17	Automatic Sense Tagging Using Parallel Corpora	-	Nancy Ide, Tomaž Erjavec, Dan Tufiş	2001	18
18	Morphosyntactic Tagging of Slovene: Evaluating Taggers and Tagsets	Language Resources and Evaluation	Sašo Džeroski, Tomaž Erjavec, Jakub Zavrel	2000	17
19	The MULTEXT-East Corpus	Language Resources and Evaluation	Tomaž Erjavec, Nancy Ide	1998	27
20	East meets West: Producing Multilingual Resources in a European Context	Language Resources and Evaluation	Tomaž Erjavec, Ann Marie Lawson, Laurent Romary	1998	8

About Tomaž Erjavec

Tomaž Erjavec is a scholar working on Language and Linguistics, Artificial Intelligence and Human-Computer Interaction, who has authored 141 papers that have together received 1.4k indexed citations. Recurring topics across this work include Natural Language Processing Techniques (96 papers), Topic Modeling (34 papers), Lexicography and Language Studies (21 papers), Linguistics and language evolution (21 papers), Authorship Attribution and Profiling (17 papers), Semantic Web and Ontologies (15 papers), Text Readability and Simplification (14 papers) and Linguistics, Language Diversity, and Identity (11 papers). The work is most often cited by research in Artificial Intelligence (1.2k citations), Language and Linguistics (339 citations) and Linguistics and Language (37 citations). Tomaž Erjavec has collaborated with scholars based in Slovenia, Croatia and the United States. Frequent co-authors include Dan Tufiş, Darja Fišer, Nikola Ljubešić, Nancy Ide, Bruno Pouliquen, Camelia Ignat, Ralf Steinberger, Sašo Džeroski, Dániel Varga and Simon Krek. Their work appears in journals such as SHILAP Revista de lepidopterología, Language Resources and Evaluation and Science of Computer Programming.

Rankless uses publication and citation data sourced from OpenAlex, an open and comprehensive bibliographic database. While OpenAlex provides broad and valuable coverage of the global research landscape, it—like all bibliographic datasets—has inherent limitations. These include incomplete records, variations in author disambiguation, differences in journal indexing, and delays in data updates. As a result, some metrics and network relationships displayed in Rankless may not fully capture the entirety of a scholar's output or impact.
