Naoyuki Kanda

4.2k total citations · 1 hit paper
82 papers, 2.3k citations indexed

About

Naoyuki Kanda is a scholar working on Artificial Intelligence, Signal Processing, and Computer Vision and Pattern Recognition. According to data from OpenAlex, Kanda has authored 82 papers receiving a total of 2.3k indexed citations (citations by other indexed papers that have themselves been cited), including 72 papers in Artificial Intelligence, 61 papers in Signal Processing and 9 papers in Computer Vision and Pattern Recognition. Recurrent topics in Kanda's work include Speech Recognition and Synthesis (64 papers), Speech and Audio Processing (53 papers) and Music and Audio Processing (48 papers). Kanda is often cited by papers focused on Speech Recognition and Synthesis (64 papers), Speech and Audio Processing (53 papers) and Music and Audio Processing (48 papers). Kanda collaborates with scholars based in the United States, Japan and China. Co-authors include Takuya Yoshioka, Kenji Nagamatsu, Jinyu Li, Yusuke Fujita, Shota Horiguchi, Shinji Watanabe, Jian Wu, Xiong Xiao, Zhuo Chen and Yu Wu. Kanda has published in journals such as Advanced Materials, Nano Letters and Nanoscale.

In The Last Decade

Naoyuki Kanda

81 papers receiving 2.2k citations

Hit Papers

WavLM: Large-Scale Self-Supervised Pre-Training for Full ... (2022)

Peers

Shang-Wen Li (United States)
Mei-Yuh Hwang (United States)
Jia Ye (China)
Oliver Watts (United Kingdom)
Chanwoo Kim (United States)

Citations per year, relative to Naoyuki Kanda (= 1×)

Countries citing papers authored by Naoyuki Kanda

This map shows the geographic impact of Naoyuki Kanda's research. It shows the number of citations coming from papers published by authors working in each country. You can also color the map by specialization and compare the number of citations received by Naoyuki Kanda with the expected number of citations based on a country's size and research output (numbers larger than one mean the country cites Naoyuki Kanda more than expected).

Fields of papers citing papers by Naoyuki Kanda

Physical Sciences, Health Sciences, Life Sciences, Social Sciences

This network shows the impact of papers produced by Naoyuki Kanda. Nodes represent research fields, and links connect fields that are likely to share authors. Colored nodes show fields that tend to cite the papers produced by Naoyuki Kanda. The network helps show where Naoyuki Kanda may publish in the future.

Co-authorship network of co-authors of Naoyuki Kanda

This figure shows the co-authorship network connecting the top 25 collaborators of Naoyuki Kanda. A scholar is included among the top collaborators of Naoyuki Kanda based on the total number of citations received by their joint publications. Widths of edges represent the number of papers authors have co-authored together. Node borders signify the number of papers an author published with Naoyuki Kanda. Naoyuki Kanda is excluded from the visualization to improve readability, since they are connected to all nodes in the network.

All Works

20 of 20 papers shown
1. Vinnikov, Alon, et al. (2025). Summary of the NOTSOFAR-1 challenge: Highlights and learnings. Computer Speech & Language. 93. 101796–101796. 1 indexed citation
2. Eskimez, Şefik Emre, Xiaofei Wang, Zhen Xiao, et al. (2024). Total-Duration-Aware Duration Modeling for Text-to-Speech Systems. 2290–2294. 1 indexed citation
3. Chen, Zhuo, Naoyuki Kanda, Şefik Emre Eskimez, et al. (2024). SpeechX: Neural Codec Language Model as a Versatile Speech Transformer. IEEE/ACM Transactions on Audio Speech and Language Processing. 32. 3355–3364. 18 indexed citations
4. Nakanishi, Yusuke, Naoyuki Kanda, Yasufumi Takahashi, et al. (2024). Superatomic Layer of Cubic Mo4S4 Clusters Connected by Cl Cross‐Linking. Advanced Materials. 36(39). e2404249–e2404249. 7 indexed citations
5. Yang, Ziyi, Mahmoud Khademi, Xu Yi‐chong, et al. (2024). i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data. 1615–1627. 2 indexed citations
6. Kanda, Naoyuki, Xiaofei Wang, Junkun Chen, et al. (2024). Diarist: Streaming Speech Translation with Speaker Diarization. 10866–10870. 3 indexed citations
7. Yang, Ziyi, Yuwei Fang, Chenguang Zhu, et al. (2023). i-Code: An Integrative and Composable Multimodal Learning Framework. Proceedings of the AAAI Conference on Artificial Intelligence. 37(9). 10880–10890. 20 indexed citations
8. Kanda, Naoyuki, Xiong Xiao, Yashesh Gaur, et al. (2022). Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers Using End-to-End Speaker-Attributed ASR. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 8082–8086. 18 indexed citations
9. Kanda, Naoyuki, et al. (2022). Streaming Multi-Talker ASR with Token-Level Serialized Output Training. Interspeech 2022. 3774–3778. 2 indexed citations
10. Lu, Liang, Naoyuki Kanda, Jinyu Li, & Yifan Gong. (2021). Streaming End-to-End Multi-Talker Speech Recognition. IEEE Signal Processing Letters. 28. 803–807. 26 indexed citations
11. Kanda, Naoyuki, Zhong Meng, Liang Lu, et al. (2021). Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR. 6503–6507. 9 indexed citations
12. Wu, Jian, Zhuo Chen, Sanyuan Chen, et al. (2021). Investigation of Practical Aspects of Single Channel Speech Separation for ASR. 3066–3070. 9 indexed citations
13. Kanda, Naoyuki, Guoli Ye, Yashesh Gaur, et al. (2021). End-to-End Speaker-Attributed ASR with Transformer. 4413–4417. 14 indexed citations
14. Kanda, Naoyuki, Yashesh Gaur, Xiaofei Wang, Zhong Meng, & Takuya Yoshioka. (2020). Serialized Output Training for End-to-End Overlapped Speech Recognition. 2797–2801. 31 indexed citations
15. Kanda, Naoyuki, Yusuke Nakanishi, Dan Liu, et al. (2020). Efficient growth and characterization of one-dimensional transition metal tellurides inside carbon nanotubes. Nanoscale. 12(33). 17185–17190. 23 indexed citations
16. Fujita, Yusuke, Naoyuki Kanda, Shota Horiguchi, et al. (2019). End-to-End Neural Speaker Diarization with Self-Attention. 296–303. 116 indexed citations
17. Fujita, Yusuke, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, & Shinji Watanabe. (2019). End-to-End Neural Speaker Diarization with Permutation-Free Objectives. 4300–4304. 126 indexed citations
18. Horiguchi, Shota, Naoyuki Kanda, & Kenji Nagamatsu. (2019). Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation. 4180–4184. 1 indexed citation
19. Kanda, Naoyuki, Ryu Takeda, & Yasunari Obuchi. (2013). Elastic spectral distortion for low resource speech recognition with deep neural networks. 309–314. 81 indexed citations
20. Obuchi, Yasunari, Ryu Takeda, & Naoyuki Kanda. (2012). Voice activity detection based on augmented statistical noise suppression. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. 1–4. 3 indexed citations

Rankless uses publication and citation data sourced from OpenAlex, an open and comprehensive bibliographic database. While OpenAlex provides broad and valuable coverage of the global research landscape, it—like all bibliographic datasets—has inherent limitations. These include incomplete records, variations in author disambiguation, differences in journal indexing, and delays in data updates. As a result, some metrics and network relationships displayed in Rankless may not fully capture the entirety of a scholar's output or impact.

Rankless by CCL
2026