John R. Hershey

11.2k total citations · 3 hit papers
102 papers, 4.8k citations indexed

About

John R. Hershey is a scholar working on Signal Processing, Artificial Intelligence and Computational Mechanics. According to data from OpenAlex, John R. Hershey has authored 102 papers receiving a total of 4.8k indexed citations (citations from other indexed papers that have themselves been cited), including 85 papers in Signal Processing, 72 papers in Artificial Intelligence and 14 papers in Computational Mechanics. Recurrent topics in John R. Hershey's work include Speech and Audio Processing (81 papers), Speech Recognition and Synthesis (62 papers) and Music and Audio Processing (55 papers). John R. Hershey is often cited by papers focused on Speech and Audio Processing (81 papers), Speech Recognition and Synthesis (62 papers) and Music and Audio Processing (55 papers). John R. Hershey collaborates with scholars based in the United States, Japan and Germany. John R. Hershey's co-authors include Shinji Watanabe, Peder A. Olsen, Jonathan Le Roux, Takaaki Hori, Hakan Erdoğan, Zhong-Qiu Wang, Steven J. Rennie, Suyoun Kim, Tomoki Hayashi and Felix Weninger. John R. Hershey has published in prestigious journals such as The Journal of the Acoustical Society of America, IEEE Signal Processing Magazine and IEEE Journal of Selected Topics in Signal Processing.

In The Last Decade

John R. Hershey

96 papers receiving 4.5k citations

Hit Papers

Approximating the Kullback Leibler Divergence Between Gau...

Peers

Michael L. Seltzer (United States)
Paris Smaragdis (United States)
Jonathan Le Roux (United States)
Bhiksha Raj (United States)
Eng Siong Chng (Singapore)
H. Bourlard (Switzerland)

Figure: citations per year of peers, relative to John R. Hershey (= 1×). Reinhold Haeb‐Umbach appears among the plotted peers.

Countries citing papers authored by John R. Hershey


This map shows the geographic impact of John R. Hershey's research. It shows the number of citations coming from papers published by authors working in each country. You can also color the map by specialization and compare the number of citations received by John R. Hershey with the expected number of citations based on a country's size and research output (numbers larger than one mean the country cites John R. Hershey more than expected).

Fields of papers citing papers by John R. Hershey

Legend: Physical Sciences, Health Sciences, Life Sciences, Social Sciences

This network shows the impact of papers produced by John R. Hershey. Nodes represent research fields, and links connect fields that are likely to share authors. Colored nodes show fields that tend to cite the papers produced by John R. Hershey. The network helps show where John R. Hershey may publish in the future.

Co-authorship network of co-authors of John R. Hershey

This figure shows the co-authorship network connecting the top 25 collaborators of John R. Hershey. A scholar is included among the top collaborators of John R. Hershey based on the total number of citations received by their joint publications. Widths of edges represent the number of papers authors have co-authored together. Node borders signify the number of papers an author published with John R. Hershey. John R. Hershey is excluded from the visualization to improve readability, since they are connected to all nodes in the network.

All Works

20 of 20 papers shown
2. Hamilton, Mark, Andrew Zisserman, John R. Hershey, & William T. Freeman. (2024). Separating the “Chirp” from the “Chat”: Self-supervised Visual Grounding of Sound and Language. 13117–13127. 4 indexed citations
3. Tzinis, Efthymios, Scott Wisdom, Aren Jansen, et al. (2021). Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds. arXiv (Cornell University). 2 indexed citations
4. Turpault, Nicolas, Romain Serizel, Scott Wisdom, et al. (2020). Sound Event Detection and Separation: a Benchmark on DESED Synthetic Soundscapes. arXiv (Cornell University). 20 indexed citations
5. Wisdom, Scott, Efthymios Tzinis, Hakan Erdoğan, et al. (2020). Unsupervised Sound Separation Using Mixture Invariant Training. Neural Information Processing Systems. 33. 3846–3857. 8 indexed citations
6. Le Roux, Jonathan, et al. (2019). Phasebook and Friends: Leveraging Discrete Representations for Source Separation. IEEE Journal of Selected Topics in Signal Processing. 13(2). 370–382. 40 indexed citations
7. Wang, Quan, Hannah Muckenhirn, Kevin Wilson, et al. (2019). VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking. 2728–2732. 217 indexed citations
8. Wang, Zhong-Qiu, Jonathan Le Roux, DeLiang Wang, & John R. Hershey. (2018). End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction. 2708–2712. 79 indexed citations
9. Hori, Takaaki, Shinji Watanabe, & John R. Hershey. (2017). Joint CTC/attention decoding for end-to-end speech recognition. 518–529. 93 indexed citations
10. Watanabe, Shinji, Takaaki Hori, Suyoun Kim, John R. Hershey, & Tomoki Hayashi. (2017). Hybrid CTC/Attention Architecture for End-to-End Speech Recognition. IEEE Journal of Selected Topics in Signal Processing. 11(8). 1240–1253. 483 indexed citations
11. Erdoğan, Hakan, John R. Hershey, Shinji Watanabe, Michael Mandel, & Jonathan Le Roux. (2016). Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks. 1981–1985. 216 indexed citations
12. Le Roux, Jonathan, John R. Hershey, & Felix Weninger. (2015). Deep NMF for speech separation. 66–70. 77 indexed citations
13. Yoshino, Koichiro, Shinji Watanabe, Jonathan Le Roux, & John R. Hershey. (2013). Statistical Dialogue Management using Intention Dependency Graph. International Joint Conference on Natural Language Processing. 962–966. 3 indexed citations
14. Févotte, Cédric, Jonathan Le Roux, & John R. Hershey. (2013). Non-negative dynamical system with application to speech and audio. 3158–3162. 31 indexed citations
15. Rennie, Steven J., Peder A. Olsen, John R. Hershey, & Trausti Kristjansson. (2006). The Iroquois Model: Using Temporal Dynamics to Separate Speakers. Conference of the International Speech Communication Association. 24–30. 7 indexed citations
16. Hershey, John R., Trausti Kristjansson, & Zhengyou Zhang. (2004). Model-based fusion of bone and air sensors for speech enhancement and robust speech recognition. Conference of the International Speech Communication Association. 139. 9 indexed citations
17. Marks, Tim K., J. Cooper Roddey, Javier R. Movellan, & John R. Hershey. (2004). Joint Tracking of Pose, Expression, and Texture using Conditionally Gaussian Filters. Neural Information Processing Systems. 17. 889–896. 8 indexed citations
18. Movellan, Javier R., John R. Hershey, & Josh Susskind. (2004). Real-Time Video Tracking Using Convolution HMMs. Angiology. 45(6). 461–7. 5 indexed citations
19. Hershey, John R. & Michael A. Casey. (2001). Audio-Visual Sound Separation Via Hidden Markov Models. Neural Information Processing Systems. 14. 1173–1180. 41 indexed citations
20. Hershey, John R. & Javier R. Movellan. (1999). Audio Vision: Using Audio-Visual Synchrony to Locate Sounds. Neural Information Processing Systems. 12. 813–819. 157 indexed citations

Rankless uses publication and citation data sourced from OpenAlex, an open and comprehensive bibliographic database. While OpenAlex provides broad and valuable coverage of the global research landscape, it—like all bibliographic datasets—has inherent limitations. These include incomplete records, variations in author disambiguation, differences in journal indexing, and delays in data updates. As a result, some metrics and network relationships displayed in Rankless may not fully capture the entirety of a scholar's output or impact.


Rankless by CCL
2026