Frank Smadja

2.2k total citations · 1 hit paper
14 papers, 1.3k citations indexed

About

Frank Smadja is a scholar working on Artificial Intelligence, Language and Linguistics, and Developmental and Educational Psychology. According to data from OpenAlex, Frank Smadja has authored 14 papers receiving a total of 1.3k indexed citations (citations by other indexed papers that have themselves been cited), including 13 papers in Artificial Intelligence, 3 papers in Language and Linguistics, and 3 papers in Developmental and Educational Psychology. Recurrent topics in Frank Smadja's work include Natural Language Processing Techniques (12 papers), Topic Modeling (8 papers), and Text Readability and Simplification (4 papers). Frank Smadja is often cited by papers focused on Natural Language Processing Techniques (12 papers), Topic Modeling (8 papers), and Text Readability and Simplification (4 papers). Frank Smadja collaborates with scholars based in the United States and Israel. Frank Smadja's co-authors include Kathleen McKeown, Vasileios Hatzivassiloglou, Grigory Begelman, Philipp Keller, Yoelle Maarek, Jong‐Seok Lim, and Michael Elhadad. Frank Smadja has published in journals such as Computational Linguistics, Computational Intelligence, and Literary and Linguistic Computing.

In The Last Decade

Frank Smadja

14 papers receiving 1.1k citations

Hit Papers

Retrieving collocations from text: Xtract (1993)

Peers

Frank Smadja
Roxana Gîrju, United States
Judith L. Klavans, United States
Claudia Leacock, United States
Douglas E. Appelt, United States
Collin F. Baker, United States
Harold Somers, United Kingdom

Figure: citations per year of peers, relative to Frank Smadja (= 1×).

Countries citing papers authored by Frank Smadja


This map shows the geographic impact of Frank Smadja's research. It shows the number of citations coming from papers published by authors working in each country. You can also color the map by specialization and compare the number of citations received by Frank Smadja with the expected number of citations based on a country's size and research output (numbers larger than one mean the country cites Frank Smadja more than expected).
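The comparison described above reduces to a ratio of observed to expected citations. The source does not say how Rankless derives the expected count, so the sketch below takes it as a given input. The function name and the country figures are hypothetical and serve only to illustrate how the ratio is read.

```python
def citation_ratio(observed: float, expected: float) -> float:
    """Return observed citations divided by expected citations.

    A value above 1 means the country cites the author more than expected;
    a value below 1 means less than expected.
    How `expected` is estimated is an assumption left to the caller.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    return observed / expected


# Hypothetical counts for illustration only (not Rankless or OpenAlex data):
# country -> (observed citations, expected citations)
examples = {
    "Country A": (30, 20.0),
    "Country B": (5, 10.0),
}

for country, (observed, expected) in examples.items():
    ratio = citation_ratio(observed, expected)
    label = "more than expected" if ratio > 1 else "not more than expected"
    print(f"{country}: {ratio:.2f} ({label})")
```

With these made-up inputs, Country A would be colored as citing more than expected and Country B as citing less.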

Fields of papers citing papers by Frank Smadja

Domains shown: Physical Sciences, Health Sciences, Life Sciences, Social Sciences

This network shows the impact of papers produced by Frank Smadja. Nodes represent research fields, and links connect fields that are likely to share authors. Colored nodes show fields that tend to cite the papers produced by Frank Smadja. The network helps show where Frank Smadja may publish in the future.

Co-authorship network of co-authors of Frank Smadja

This figure shows the co-authorship network connecting the top 25 collaborators of Frank Smadja. A scholar is included among the top collaborators of Frank Smadja based on the total number of citations received by their joint publications. Widths of edges represent the number of papers authors have co-authored together. Node borders signify the number of papers an author published with Frank Smadja. Frank Smadja is excluded from the visualization to improve readability, since they are connected to all nodes in the network.

All Works

14 of 14 papers shown
1. Begelman, Grigory, Philipp Keller, & Frank Smadja. (2006). Automated Tag Clustering: Improving search and exploration in the tag space. 248 indexed citations
2. McKeown, Kathleen, Frank Smadja, & Vasileios Hatzivassiloglou. (1996). Translating Collocations for Bilingual Lexicons: A Statistical Approach. Computational Linguistics. 22(1). 1–38. 292 indexed citations
3. Smadja, Frank & Kathleen McKeown. (1994). Translating collocations for use in bilingual lexicons. 152–152. 9 indexed citations
4. Smadja, Frank. (1993). Retrieving collocations from text: Xtract. Computational Linguistics. 19(1). 143–177. 512 indexed citations
5. Smadja, Frank. (1992). Xtract: An overview. Computers and the Humanities. 26(5-6). 399–413. 11 indexed citations
6. Smadja, Frank. (1992). Extracting collocations from text. An application: language generation. 9 indexed citations
7. Smadja, Frank. (1992). How to Compile a Bilingual Collocational Lexicon Automatically. 21 indexed citations
8. Smadja, Frank & Kathleen McKeown. (1991). Using collocations for language generation. Computational Intelligence. 7(4). 229–239. 33 indexed citations
9. Smadja, Frank. (1991). From N-grams to collocations. 279–284. 29 indexed citations
10. McKeown, Kathleen, et al. (1990). Natural language generation in COMET. 103–139. 23 indexed citations
11. Smadja, Frank & Kathleen McKeown. (1990). Automatically extracting and representing collocations for language generation. The COCOON platform (University of Paris). 252–259. 88 indexed citations
12. Smadja, Frank. (1989). Lexical Co-occurrence: The Missing Link. Literary and Linguistic Computing. 4(3). 163–168. 36 indexed citations
13. Smadja, Frank. (1989). Dictionaries for Language Generation Accounting for Co-occurrence Knowledge. Columbia Academic Commons (Columbia University). 1 indexed citation
14. Smadja, Frank & Yoelle Maarek. (1989). Full-Text Indexing Based on Lexical Relations. Columbia Academic Commons (Columbia University). 7 indexed citations

Rankless uses publication and citation data sourced from OpenAlex, an open and comprehensive bibliographic database. While OpenAlex provides broad and valuable coverage of the global research landscape, it—like all bibliographic datasets—has inherent limitations. These include incomplete records, variations in author disambiguation, differences in journal indexing, and delays in data updates. As a result, some metrics and network relationships displayed in Rankless may not fully capture the entirety of a scholar's output or impact.


Rankless by CCL
2026