Michael Cafarella

7.8k total citations · 4 hit papers
98 papers, 4.9k citations indexed

About

Michael Cafarella is a scholar working on Artificial Intelligence, Information Systems, and Computer Networks and Communications. According to data from OpenAlex, Cafarella has authored 98 papers receiving a total of 4.9k indexed citations (citations by other indexed papers that have themselves been cited), including 48 papers in Artificial Intelligence, 39 papers in Information Systems and 38 papers in Computer Networks and Communications. Recurrent topics in Cafarella's work include Advanced Database Systems and Queries (24 papers), Web Data Mining and Analysis (23 papers) and Data Quality and Management (22 papers). Cafarella is often cited by papers focused on Advanced Database Systems and Queries (24 papers), Web Data Mining and Analysis (23 papers) and Data Quality and Management (22 papers). Cafarella collaborates with scholars based in the United States, Israel and France. Co-authors include Oren Etzioni, Stephen Soderland, Michele Banko, Alon Halevy, Alexander Yates, Doug Downey, Daniel S. Weld, Ana-Maria Popescu, Tal Shaked and Eugene Wu. Cafarella has published in journals such as PLoS ONE, Chemistry of Materials and Communications of the ACM.

In The Last Decade

Michael Cafarella

94 papers receiving 4.4k citations

Hit Papers

[Chart: hit papers by year; visible label: "Open information extracti..."]

Author Peers

Peers are selected by citation overlap in the author's most active subfields.

Author Last Decade Papers Cites
Michael Cafarella 3.3k 2.0k 1.1k 992 623 98 4.9k
Riccardo Rosati 3.6k 1.1× 1.2k 0.6× 481 0.4× 2.2k 2.2× 648 1.0× 145 4.3k
Barbara Pernici 1.8k 0.5× 2.7k 1.4× 420 0.4× 2.0k 2.0× 527 0.8× 208 4.6k
Mausam Mausam 2.1k 0.6× 558 0.3× 343 0.3× 339 0.3× 256 0.4× 81 2.8k
Xiaoyuan Su 1.0k 0.3× 2.1k 1.0× 324 0.3× 469 0.5× 186 0.3× 22 2.6k
Naomie Salim 2.2k 0.7× 1.1k 0.5× 188 0.2× 215 0.2× 158 0.3× 213 4.1k
Stephen Soderland 5.8k 1.7× 2.4k 1.2× 859 0.8× 487 0.5× 344 0.6× 65 6.6k
Steffen Rendle 3.9k 1.2× 5.6k 2.8× 1.3k 1.2× 734 0.7× 493 0.8× 32 6.9k
Alon Y. Levy 3.5k 1.0× 1.7k 0.9× 421 0.4× 4.0k 4.1× 2.4k 3.8× 91 5.3k
Michael Lesk 2.1k 0.6× 1.0k 0.5× 122 0.1× 582 0.6× 256 0.4× 102 3.4k
Jiajing Wu 805 0.2× 1.5k 0.8× 112 0.1× 1.0k 1.0× 112 0.2× 139 3.3k

Countries citing papers authored by Michael Cafarella

This map shows the geographic impact of Michael Cafarella's research. It shows the number of citations coming from papers published by authors working in each country. You can also color the map by specialization and compare the number of citations received by Michael Cafarella with the expected number of citations based on a country's size and research output (numbers larger than one mean the country cites Michael Cafarella more than expected).
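The comparison described above can be read as a ratio of observed to expected citations. The page does not state how the expected count is computed, so the following is only a minimal sketch under an assumption: that the expected citations from a country equal the author's total citations multiplied by that country's share of overall research output. The function name, parameters, and the numbers are all hypothetical.

```python
def specialization_ratio(observed: float, author_total: float, country_share: float) -> float:
    """Return observed / expected citations for one country.

    ASSUMPTION (not from the source page): expected citations are
    author_total * country_share, where country_share is the country's
    fraction of overall research output (between 0 and 1).
    """
    if not 0 < country_share <= 1:
        raise ValueError("country_share must be in (0, 1]")
    if author_total <= 0:
        raise ValueError("author_total must be positive")
    expected = author_total * country_share
    return observed / expected


# Hypothetical numbers for illustration only.
ratio = specialization_ratio(observed=300, author_total=1000, country_share=0.2)
cites_more_than_expected = ratio > 1
```

Under this reading, a ratio above one means the country cites the author more than its research output alone would suggest, and a ratio below one means it cites the author less.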

Fields of papers citing papers by Michael Cafarella

Legend: Physical Sciences, Health Sciences, Life Sciences, Social Sciences

This network shows the impact of papers produced by Michael Cafarella. Nodes represent research fields, and links connect fields that are likely to share authors. Colored nodes show fields that tend to cite the papers produced by Michael Cafarella. The network helps show where Michael Cafarella may publish in the future.

Co-authorship network of co-authors of Michael Cafarella

This figure shows the co-authorship network connecting the top 25 collaborators of Michael Cafarella. A scholar is included among the top collaborators of Michael Cafarella based on the total number of citations received by their joint publications. Widths of edges represent the number of papers authors have co-authored together. Node borders signify the number of papers an author published with Michael Cafarella. Michael Cafarella is excluded from the visualization to improve readability, since they are connected to all nodes in the network.

All Works

1. Liu, Chunwei, et al. (2024). Press ECCS to Doubt (Your Causal Graph). 6–15.
2. Perron, Matthew, Raul Castro Fernandez, David J. DeWitt, Michael Cafarella, & Samuel Madden. (2023). Cackle: Analytical Workload Cost and Performance Stability With Elastic Pools. Proceedings of the ACM on Management of Data. 1(4). 1–25. 3 indexed citations
3. Cafarella, Michael, et al. (2023). Using Machine Learning to Construct Hedonic Price Indices. SSRN Electronic Journal. 1 indexed citation
4. Cafarella, Michael, et al. (2020). Constructing Expressive Relational Queries with Dual-Specification Synthesis. Conference on Innovative Data Systems Research. 2 indexed citations
5. Flinn, Jason, et al. (2018). Sledgehammer: cluster-fueled debugging. Operating Systems Design and Implementation. 545–560. 3 indexed citations
6. Kolli, Aasheesh, et al. (2016). HARE: Hardware accelerator for regular expressions. 1–12. 38 indexed citations
7. Anderson, Michael R., et al. (2016). Runtime Support for Human-in-the-Loop Feature Engineering System. IEEE Data(base) Engineering Bulletin. 39. 62–84. 4 indexed citations
8. Chow, Michael, Kaushik Veeraraghavan, Michael Cafarella, & Jason Flinn. (2016). DQBarge: improving data-quality tradeoffs in large-scale internet services. Operating Systems Design and Implementation. 771–786. 7 indexed citations
9. Kolli, Aasheesh, et al. (2016). HARE: hardware accelerator for regular expressions. International Symposium on Microarchitecture. 1–12. 29 indexed citations
10. Anderson, Michael R., Victor Bittorf, Matthew Burgess, et al. (2013). Brainwash: A data system for feature engineering. Conference on Innovative Data Systems Research. 68 indexed citations
11. Cafarella, Michael, et al. (2013). Ringtail: Feature Selection For Easier Nowcasting. 49–54. 7 indexed citations
12. Cafarella, Michael. (2009). Extracting and Querying a Comprehensive Web Database. Conference on Innovative Data Systems Research. 22 indexed citations
13. Cafarella, Michael, Alon Halevy, Yang Zhang, Daisy Zhe Wang, & Eugene Wu. (2008). Uncovering the Relational Web. 91 indexed citations
14. Cafarella, Michael, Christopher Ré, Dan Suciu, Oren Etzioni, & Michele Banko. (2007). Structured querying of web text. Conference on Innovative Data Systems Research. 21 indexed citations
15. Banko, Michele, et al. (2007). Open information extraction from the web. International Joint Conference on Artificial Intelligence. 2670–2676. 831 indexed citations
16. Cafarella, Michael, Dan Suciu, & Oren Etzioni. (2007). Navigating Extracted Data with Schema Discovery. 15 indexed citations
17. Cafarella, Michael, Christopher Ré, Dan Suciu, & Oren Etzioni. (2007). Structured Querying of Web Text Data: A Technical Challenge. Conference on Innovative Data Systems Research. 225–234. 39 indexed citations
18. Etzioni, Oren, Michele Banko, & Michael Cafarella. (2006). Machine reading. National Conference on Artificial Intelligence. 1517–1519. 68 indexed citations
19. Cafarella, Michael, Oren Etzioni, & Dan Suciu. (2006). Structured Queries Over Web Text. IEEE Data(base) Engineering Bulletin. 29(4). 45–51. 4 indexed citations
20. Etzioni, Oren, Michael Cafarella, Doug Downey, et al. (2004). Methods for domain-independent information extraction from the web: an experimental comparison. National Conference on Artificial Intelligence. 391–398. 73 indexed citations

Rankless uses publication and citation data sourced from OpenAlex, an open and comprehensive bibliographic database. While OpenAlex provides broad and valuable coverage of the global research landscape, it—like all bibliographic datasets—has inherent limitations. These include incomplete records, variations in author disambiguation, differences in journal indexing, and delays in data updates. As a result, some metrics and network relationships displayed in Rankless may not fully capture the entirety of a scholar's output or impact.

Rankless by CCL
2026