John R. Hershey

11.2k citations

102 papers · 4.8k indexed · 3 hit papers · h-index 32

Signal Processing top 0.05%
Artificial Intelligence top 0.2%
Computational Mechanics top 1%
Computer Vision and Pattern Recognition top 2%
Cognitive Neuroscience top 5%

Co-authors: Shinji Watanabe, Peder A. Olsen, Jonathan Le Roux, Takaaki Hori, Hakan Erdoğan, Zhong-Qiu Wang, Steven J. Rennie, Suyoun Kim
Topics: Speech and Audio Processing (81 papers), Speech Recognition and Synthesis (62 papers), Music and Audio Processing (55 papers)
Cited by: Signal Processing, Artificial Intelligence, Computational Mechanics
Journals: The Journal of the Acoustical Society of America, IEEE Signal Processing Magazine, IEEE Journal of Selected Topics in Signal Processing
Partner nations: United States, Japan, Germany

In The Last Decade

96 papers receiving 4.5k citations

Hit Papers

What are hit papers?

Hit papers significantly outperform the citation benchmark for their cohort. A paper qualifies if it has ≥500 total citations, achieves ≥1.5× the top-1% citation threshold for papers in the same subfield and year (this is the minimum needed to enter the top 1%, not the average within it), or reaches the top citation threshold in at least one of its specific research topics.
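
The rule above can be sketched as a small predicate. This is only an illustration, not Rankless's actual implementation: it assumes the three criteria are alternatives (any one is enough), and the function name, parameter names, and the example threshold of 300 citations are all hypothetical.

```python
def is_hit_paper(total_citations: int,
                 top1_threshold: float,
                 tops_a_topic: bool = False) -> bool:
    """Return True if any one of the three quoted criteria is met.

    top1_threshold: minimum citations needed to enter the top 1% for the
        paper's subfield and year (hypothetical input, not from the page).
    tops_a_topic: whether the paper reaches the top citation threshold in
        at least one of its research topics (hypothetical input).
    """
    meets_absolute = total_citations >= 500
    meets_relative = total_citations >= 1.5 * top1_threshold
    return meets_absolute or meets_relative or tops_a_topic


# Illustrative values only: the 300-citation threshold is made up.
print(is_hit_paper(740, top1_threshold=300))  # True (at least 500 citations)
print(is_hit_paper(483, top1_threshold=300))  # True (483 >= 1.5 * 300)
print(is_hit_paper(120, top1_threshold=300))  # False (no criterion met)
```

Under this reading, a paper with fewer than 500 citations can still qualify through the relative or topic-level criterion.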

Approximating the Kullback Leibler Divergence Between Gaussian Mixture Models
2007 · 740 citations · John R. Hershey, Peder A. Olsen

Hybrid CTC/Attention Architecture for End-to-End Speech Recognition
2017 · 483 citations · Shinji Watanabe, Takaaki Hori et al. · IEEE Journal of Selected Topics in Signal Processing

Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks
2015 · 447 citations · Hakan Erdoğan, John R. Hershey et al.

Countries citing papers authored by John R. Hershey

This map shows the geographic impact of John R. Hershey's research. It shows the number of citations coming from papers published by authors working in each country. You can also color the map by specialization and compare the number of citations received by John R. Hershey with the expected number of citations based on a country's size and research output (numbers larger than one mean the country cites John R. Hershey more than expected).

Fields of papers citing papers by John R. Hershey

Legend: Physical Sciences, Health Sciences, Life Sciences, Social Sciences

This network shows the impact of papers produced by John R. Hershey. Nodes represent research fields, and links connect fields that are likely to share authors. Colored nodes show fields that tend to cite the papers produced by John R. Hershey. The network helps show where John R. Hershey may publish in the future.

Co-authorship network of co-authors of John R. Hershey

This figure shows the co-authorship network connecting the top 25 collaborators of John R. Hershey. A scholar is included among the top collaborators of John R. Hershey based on the total number of citations received by their joint publications. Widths of edges represent the number of papers authors have co-authored together. Node borders signify the number of papers an author published with John R. Hershey. John R. Hershey is excluded from the visualization to improve readability, since they are connected to all nodes in the network.

All Works

20 of 20 papers shown

#	Work (title · year · venue · authors)	Indexed citations
1	Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on Hearables · 2025 · (unknown), Chandan K. Reddy, Scott Wisdom, (unknown), John R. Hershey, Richard F. Lyon	0
2	Separating the “Chirp” from the “Chat”: Self-supervised Visual Grounding of Sound and Language · 2024 · Mark Hamilton, Andrew Zisserman, John R. Hershey, William T. Freeman	4
3	Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds · 2021 · arXiv (Cornell University) · Efthymios Tzinis, Scott Wisdom, Aren Jansen, Shawn Hershey, Tal Remez, Dan Ellis, John R. Hershey	2
4	Sound Event Detection and Separation: a Benchmark on Desed Synthetic Soundscapes · 2020 · arXiv (Cornell University) · Nicolas Turpault, Romain Serizel, Scott Wisdom, Hakan Erdoğan, John R. Hershey, Eduardo Fonseca, Prem Seetharaman, Justin Salamon	20
5	Unsupervised Sound Separation Using Mixture Invariant Training · 2020 · Neural Information Processing Systems · Scott Wisdom, Efthymios Tzinis, Hakan Erdoğan, Ron Weiss, Kevin Wilson, John R. Hershey	8
6	Phasebook and Friends: Leveraging Discrete Representations for Source Separation · 2019 · IEEE Journal of Selected Topics in Signal Processing · Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, (unknown), John R. Hershey	40
7	VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking · 2019 · Quan Wang, Hannah Muckenhirn, Kevin Wilson, Prashant Sridhar, Zelin Wu, John R. Hershey, Rif A. Saurous, Ron J. Weiss, Jia Ye, Ignacio López Moreno	217
8	End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction · 2018 · Zhong-Qiu Wang, Jonathan Le Roux, DeLiang Wang, John R. Hershey	79
9	Joint CTC/attention decoding for end-to-end speech recognition · 2017 · Takaaki Hori, Shinji Watanabe, John R. Hershey	93
10	Hybrid CTC/Attention Architecture for End-to-End Speech Recognition · 2017 · IEEE Journal of Selected Topics in Signal Processing · Shinji Watanabe, Takaaki Hori, Suyoun Kim, John R. Hershey, Tomoki Hayashi	483
11	Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks · 2016 · Hakan Erdoğan, John R. Hershey, Shinji Watanabe, Michael Mandel, Jonathan Le Roux	216
12	Deep NMF for speech separation · 2015 · Jonathan Le Roux, John R. Hershey, Felix Weninger	77
13	Statistical Dialogue Management using Intention Dependency Graph · 2013 · International Joint Conference on Natural Language Processing · Koichiro Yoshino, Shinji Watanabe, Jonathan Le Roux, John R. Hershey	3
14	Non-negative dynamical system with application to speech and audio · 2013 · Cédric Févotte, Jonathan Le Roux, John R. Hershey	31
15	The Iroquois Model: Using Temporal Dynamics to Separate Speakers · 2006 · Conference of the International Speech Communication Association · Steven J. Rennie, Peder A. Olsen, John R. Hershey, Trausti Kristjansson	7
16	Model-based fusion of bone and air sensors for speech enhancement and robust speech recognition. · 2004 · Conference of the International Speech Communication Association · John R. Hershey, Trausti Kristjansson, Zhengyou Zhang	9
17	Joint Tracking of Pose, Expression, and Texture using Conditionally Gaussian Filters · 2004 · Neural Information Processing Systems · Tim K. Marks, J. Cooper Roddey, Javier R. Movellan, John R. Hershey	8
18	Real-Time Video Tracking Using Convolution HMMs · 2004 · Angiology · Javier R. Movellan, John R. Hershey, Josh Susskind	5
19	Audio-Visual Sound Separation Via Hidden Markov Models · 2001 · Neural Information Processing Systems · John R. Hershey, Michael A. Casey	41
20	Audio Vision: Using Audio-Visual Synchrony to Locate Sounds · 1999 · Neural Information Processing Systems · John R. Hershey, Javier R. Movellan	157

About John R. Hershey

John R. Hershey is a scholar working on Signal Processing, Artificial Intelligence and Computational Mathematics, having authored 102 papers that have together received 4.8k indexed citations. Recurring topics across this work include Speech and Audio Processing (81 papers), Speech Recognition and Synthesis (62 papers) and Music and Audio Processing (55 papers). The work is most often cited by research in Signal Processing (3.6k citations), Artificial Intelligence (3.0k citations) and Computational Mechanics (850 citations). John R. Hershey has collaborated with scholars based in the United States, Japan and Germany. Frequent co-authors include Shinji Watanabe, Peder A. Olsen, Jonathan Le Roux, Takaaki Hori, Hakan Erdoğan, Zhong-Qiu Wang, Steven J. Rennie, Suyoun Kim, Tomoki Hayashi and Felix Weninger. Their work appears in journals such as The Journal of the Acoustical Society of America, IEEE Signal Processing Magazine and IEEE Journal of Selected Topics in Signal Processing.

Rankless uses publication and citation data sourced from OpenAlex, an open and comprehensive bibliographic database. While OpenAlex provides broad and valuable coverage of the global research landscape, it—like all bibliographic datasets—has inherent limitations. These include incomplete records, variations in author disambiguation, differences in journal indexing, and delays in data updates. As a result, some metrics and network relationships displayed in Rankless may not fully capture the entirety of a scholar's output or impact.
