Naoyuki Kanda

4.2k total citations · 1 hit paper
82 papers, 2.3k citations indexed

About

Naoyuki Kanda is a scholar working on Artificial Intelligence, Signal Processing, and Computer Vision and Pattern Recognition. According to data from OpenAlex, Naoyuki Kanda has authored 82 papers receiving a total of 2.3k indexed citations (citations by other indexed papers that have themselves been cited), including 72 papers in Artificial Intelligence, 61 papers in Signal Processing and 9 papers in Computer Vision and Pattern Recognition. Recurrent topics in Naoyuki Kanda's work include Speech Recognition and Synthesis (64 papers), Speech and Audio Processing (53 papers) and Music and Audio Processing (48 papers). Naoyuki Kanda is often cited by papers focused on Speech Recognition and Synthesis (64 papers), Speech and Audio Processing (53 papers) and Music and Audio Processing (48 papers). Naoyuki Kanda collaborates with scholars based in the United States, Japan and China. Co-authors include Takuya Yoshioka, Kenji Nagamatsu, Jinyu Li, Yusuke Fujita, Shota Horiguchi, Shinji Watanabe, Jian Wu, Xiong Xiao, Zhuo Chen and Yu Wu. Naoyuki Kanda has published in prestigious journals such as Advanced Materials, Nano Letters and Nanoscale.

In The Last Decade

Naoyuki Kanda

81 papers receiving 2.2k citations

Hit Papers


What are hit papers?

Hit papers significantly outperform the citation benchmark for their cohort. A paper qualifies if it has ≥500 total citations, achieves ≥1.5× the top-1% citation threshold for papers in the same subfield and year (this is the minimum needed to enter the top 1%, not the average within it), or reaches the top citation threshold in at least one of its specific research topics.
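The three criteria above can be read as alternatives, any one of which is enough. A minimal sketch of that reading follows; the `Paper` fields, the function name and the threshold value used in the example are hypothetical illustrations, not the actual data model or benchmark values behind this page.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    total_citations: int
    # Hypothetical input: minimum citations needed to enter the top 1%
    # of papers in the same subfield and year.
    top1_threshold: float
    # Hypothetical input: True if the paper reaches the top citation
    # threshold in at least one of its research topics.
    reaches_topic_threshold: bool


def is_hit_paper(paper: Paper) -> bool:
    """Return True if any one of the three stated criteria holds."""
    return (
        paper.total_citations >= 500
        or paper.total_citations >= 1.5 * paper.top1_threshold
        or paper.reaches_topic_threshold
    )


# The threshold of 300 here is an assumed placeholder, not a real benchmark.
example = Paper("Example paper", 832, 300.0, False)
print(is_hit_paper(example))
```

Under this reading, a paper with 832 total citations qualifies on the first criterion alone, whatever its subfield threshold is.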

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

2022 · 832 citations · Sanyuan Chen, Shujie Liu et al.

Peers

Field columns: AI, SP, CVPR, ECP, MC

Shang-Wen Li, United States

Mei-Yuh Hwang, United States

Sanyuan Chen, China

Zhengyang Chen, China

Jia Ye, China

Jean-François Bonastre, France

Réda Dehak, France

Xiangang Li, China

Oliver Watts, United Kingdom

Chanwoo Kim, United States

[Chart: citations per year relative to Naoyuki Kanda (= 1×), shown for peer Shang-Wen Li. Values in page order, interleaved with the column labels: 1.2k ×0.6, AI, 582 ×0.4, SP, 239 ×1.4, CVPR, 138 ×0.9, ECP, 124 ×1.3, MC]

Countries citing papers authored by Naoyuki Kanda


This map shows the geographic impact of Naoyuki Kanda's research. It shows the number of citations coming from papers published by authors working in each country. You can also color the map by specialization and compare the number of citations received by Naoyuki Kanda with the expected number of citations based on a country's size and research output (numbers larger than one mean the country cites Naoyuki Kanda more than expected).
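The comparison with an expected citation count can be illustrated as a simple observed-to-expected ratio. The page does not state how the expected count is computed, so this sketch assumes it is the scholar's total citations multiplied by the country's share of world research output. The function name, parameters and numbers are all hypothetical.

```python
def citation_ratio(observed: int, total_citations: int, country_share: float) -> float:
    """Observed citations from a country divided by the expected count.

    Assumption (not from the source): expected = total_citations * country_share,
    where country_share is the country's fraction of world research output.
    """
    expected = total_citations * country_share
    if expected <= 0:
        raise ValueError("expected citation count must be positive")
    return observed / expected


# Hypothetical numbers: 300 citations from a country that produces 10% of
# world output, for a scholar with 2000 citations in total.
ratio = citation_ratio(observed=300, total_citations=2000, country_share=0.10)
print(round(ratio, 2))
```

A ratio above one corresponds to the case described above, where the country cites the scholar more than expected.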

Fields of papers citing papers by Naoyuki Kanda

Physical Sciences · Health Sciences · Life Sciences · Social Sciences

This network shows the impact of papers produced by Naoyuki Kanda. Nodes represent research fields, and links connect fields that are likely to share authors. Colored nodes show fields that tend to cite the papers produced by Naoyuki Kanda. The network helps show where Naoyuki Kanda may publish in the future.

Co-authorship network of co-authors of Naoyuki Kanda

This figure shows the co-authorship network connecting the top 25 collaborators of Naoyuki Kanda. A scholar is included among the top collaborators of Naoyuki Kanda based on the total number of citations received by their joint publications. Widths of edges represent the number of papers authors have co-authored together. Node borders signify the number of papers an author published with Naoyuki Kanda. Naoyuki Kanda is excluded from the visualization to improve readability, since they are connected to all nodes in the network.

All Works


20 of 20 papers shown

1. Vinnikov, Alon, et al. (2025). Summary of the NOTSOFAR-1 challenge: Highlights and learnings. Computer Speech & Language. 93. 101796. 1 indexed citations

2. Eskimez, Şefik Emre, Xiaofei Wang, Zhen Xiao, et al. (2024). Total-Duration-Aware Duration Modeling for Text-to-Speech Systems. 2290–2294. 1 indexed citations

3. Chen, Zhuo, Naoyuki Kanda, Şefik Emre Eskimez, et al. (2024). SpeechX: Neural Codec Language Model as a Versatile Speech Transformer. IEEE/ACM Transactions on Audio Speech and Language Processing. 32. 3355–3364. 18 indexed citations

4. Nakanishi, Yusuke, Naoyuki Kanda, Yasufumi Takahashi, et al. (2024). Superatomic Layer of Cubic Mo₄S₄ Clusters Connected by Cl Cross‐Linking. Advanced Materials. 36(39). e2404249. 7 indexed citations

5. Yang, Ziyi, Mahmoud Khademi, Xu Yi‐chong, et al. (2024). i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data. 1615–1627. 2 indexed citations

6. Kanda, Naoyuki, Xiaofei Wang, Junkun Chen, et al. (2024). Diarist: Streaming Speech Translation with Speaker Diarization. 10866–10870. 3 indexed citations

7. Yang, Ziyi, Yuwei Fang, Chenguang Zhu, et al. (2023). i-Code: An Integrative and Composable Multimodal Learning Framework. Proceedings of the AAAI Conference on Artificial Intelligence. 37(9). 10880–10890. 20 indexed citations

8. Kanda, Naoyuki, Xiong Xiao, Yashesh Gaur, et al. (2022). Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers Using End-to-End Speaker-Attributed ASR. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 8082–8086. 18 indexed citations

9. Kanda, Naoyuki, et al. (2022). Streaming Multi-Talker ASR with Token-Level Serialized Output Training. Interspeech 2022. 3774–3778. 2 indexed citations

10. Lu, Liang, Naoyuki Kanda, Jinyu Li, & Yifan Gong. (2021). Streaming End-to-End Multi-Talker Speech Recognition. IEEE Signal Processing Letters. 28. 803–807. 26 indexed citations

11. Kanda, Naoyuki, Zhong Meng, Liang Lu, et al. (2021). Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR. 6503–6507. 9 indexed citations

12. Wu, Jian, Zhuo Chen, Sanyuan Chen, et al. (2021). Investigation of Practical Aspects of Single Channel Speech Separation for ASR. 3066–3070. 9 indexed citations

13. Kanda, Naoyuki, Guoli Ye, Yashesh Gaur, et al. (2021). End-to-End Speaker-Attributed ASR with Transformer. 4413–4417. 14 indexed citations

14. Kanda, Naoyuki, Yashesh Gaur, Xiaofei Wang, Zhong Meng, & Takuya Yoshioka. (2020). Serialized Output Training for End-to-End Overlapped Speech Recognition. 2797–2801. 31 indexed citations

15. Kanda, Naoyuki, Yusuke Nakanishi, Dan Liu, et al. (2020). Efficient growth and characterization of one-dimensional transition metal tellurides inside carbon nanotubes. Nanoscale. 12(33). 17185–17190. 23 indexed citations

16. Fujita, Yusuke, Naoyuki Kanda, Shota Horiguchi, et al. (2019). End-to-End Neural Speaker Diarization with Self-Attention. 296–303. 116 indexed citations

17. Fujita, Yusuke, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, & Shinji Watanabe. (2019). End-to-End Neural Speaker Diarization with Permutation-Free Objectives. 4300–4304. 126 indexed citations

18. Horiguchi, Shota, Naoyuki Kanda, & Kenji Nagamatsu. (2019). Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation. 4180–4184. 1 indexed citations

19. Kanda, Naoyuki, Ryu Takeda, & Yasunari Obuchi. (2013). Elastic spectral distortion for low resource speech recognition with deep neural networks. 309–314. 81 indexed citations

20. Obuchi, Yasunari, Ryu Takeda, & Naoyuki Kanda. (2012). Voice activity detection based on augmented statistical noise suppression. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. 1–4. 3 indexed citations

Rankless uses publication and citation data sourced from OpenAlex, an open and comprehensive bibliographic database. While OpenAlex provides broad and valuable coverage of the global research landscape, it—like all bibliographic datasets—has inherent limitations. These include incomplete records, variations in author disambiguation, differences in journal indexing, and delays in data updates. As a result, some metrics and network relationships displayed in Rankless may not fully capture the entirety of a scholar's output or impact.
