ISMIR 2024

Full Proceedings

Proceedings of the 25th International Society for Music Information Retrieval Conference
San Francisco, California, USA and Online, November 10-14, 2024 (ISBN: 978-1-7327299-4-0)

Papers

Saraga Audiovisual: A Large Multimodal Open Data Collection for the Analysis of Carnatic Music 61-69
Adithi Shankar, Genís Plaja-Roglans, Thomas Nuttall, Martín Rocamora, Xavier Serra
X-Cover: Better Music Version Identification System by Integrating Pretrained ASR Model 70-77
Xingjian Du, Mingyu Liu, Pei Zou, Xia Liang, Zijie Wang, Huidong Liang, Bilei Zhu
FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs 86-94
Hitoshi Suda, Shunsuke Yoshida, Tomohiko Nakamura, Satoru Fukayama, Jun Ogata
Can LLMs “Reason” in Music? An Evaluation of LLMs’ Capability of Music Understanding and Generation 103-110
Ziya Zhou, Yuhang Wu, Zhiyue Wu, Xinyue Zhang, Ruibin Yuan, Yinghao Ma, Lu Wang, Emmanouil Benetos, Wei Xue, Yike Guo
Towards Automated Personal Value Estimation in Song Lyrics 137-145
Andrew M. Demetriou, Jaehun Kim, Sandy Manolios, Cynthia Liem
Audio Conditioning for Music Generation via Discrete Bottleneck Features 146-153
Simon Rouard, Yossi Adi, Jade Copet, Axel Roebel, Alexandre Defossez
Automatic Detection of Moral Values in Music Lyrics 164-172
Vjosa Preniqi, Iacopo Ghinassi, Julia Ive, Kyriaki Kalimeri, Charalampos Saitis
Six Dragons Fly Again: Reviving 15th-Century Korean Court Music With Transformers and Novel Encoding 217-224
Danbinaerin Han, Mark R. H. Gotham, DongMin Kim, Hannah Park, Sihun Lee, Dasaem Jeong
Lessons Learned From a Project to Encode Mensural Music on a Large Scale With Optical Music Recognition 225-231
David Rizo, Jorge Calvo-Zaragoza, Patricia García-Iasci, Teresa Delgado-Sánchez
Notewise Evaluation for Music Source Separation: A Case Study for Separated Piano Tracks 248-255
Yigitcan Özer, Hans-Ulrich Berendes, Vlora Arifi-Müller, Fabian-Robert Stöter, Meinard Müller
Automatic Estimation of Singing Voice Musical Dynamics 256-263
Jyoti Narang, Nazif Can Tamer, Viviana De La Vega, Xavier Serra
Diff-a-Riff: Musical Accompaniment Co-Creation via Latent Diffusion Models 272-280
Javier Nistal, Marco Pasini, Cyran Aouameur, Maarten Grachten, Stefan Lattner
Exploring Internet Radio Across the Globe With the MIRAGE Online Dashboard 281-287
Ngan V.T. Nguyen, Elizabeth Acosta, Tommy Dang, David Sears
MIDI-to-Tab: Guitar Tablature Inference via Masked Language Modeling 288-294
Andrew C. Edwards, Xavier Riley, Pedro Pereira Sarmento, Simon Dixon
Transcription-Based Lyrics Embeddings: Simple Extraction of Effective Lyrics Embeddings From Audio 295-303
Jaehun Kim, Florian Henkel, Camilo Landau, Samuel E. Sandberg, Andreas F. Ehmann
From Real to Cloned Singer Identification 327-334
Dorian Desblancs, Gabriel Meseguer-Brocal, Romain Hennequin, Manuel Moussallam
Field Study on Children’s Home Piano Practice: Developing a Comprehensive System for Enhanced Student-Teacher Engagement 381-388
Seikoh Fukuda, Yuko Fukuda, Masamichi Hosoda, Ami Motomura, Eri Sasao, Masaki Matsubara, Masahiro Niitsuma
HAISP: A Dataset of Human-AI Songwriting Processes From the AI Song Contest 397-404
Lidia J. Morris, Rebecca Leger, Michele Newman, John Ashley Burgoyne, Ryan Groves, Natasha Mangal, Jin Ha Lee
Cue Point Estimation Using Object Detection 405-412
Giulia Argüello, Luca A. Lanzendörfer, Roger Wattenhofer
SpecMaskGIT: Masked Generative Modeling of Audio Spectrogram for Efficient Audio Synthesis and Beyond 420-428
Marco Comunità, Zhi Zhong, Akira Takahashi, Shiqi Yang, Mengjie Zhao, Koichi Saito, Yukara Ikemiya, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji
Long-Form Music Generation With Latent Diffusion 429-437
Zach Evans, Julian D. Parker, CJ Carr, Zachary Zukowski, Josiah Taylor, Jordi Pons
Towards Zero-Shot Amplifier Modeling: One-to-Many Amplifier Modeling via Tone Embedding Control 446-453
Yu-Hua Chen, Yen-Tung Yeh, Yuan-Chiao Cheng, Jui-Te Wu, Yu-Hsiang Ho, Jyh-Shing Roger Jang, Yi-Hsuan Yang
Unsupervised Synthetic-to-Real Adaptation for Optical Music Recognition 462-469
Noelia N. Luna-Barahona, Adrián Roselló, María Alfaro-Contreras, David Rizo, Jorge Calvo-Zaragoza
Cluster and Separate: A GNN Approach to Voice and Staff Prediction for Score Engraving 503-510
Francesco Foscarin, Emmanouil Karystinaios, Eita Nakamura, Gerhard Widmer
Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-Efficient Approach 520-528
Pedro Ramoneda, Vsevolod E. Eremenko, Alexandre D’Hooge, Emilia Parada-Cabaleiro, Xavier Serra
Purposeful Play: Evaluation and Co-Design of Casual Music Creation Applications With Children 529-539
Michele Newman, Lidia J. Morris, Jun Kato, Masataka Goto, Jason Yip, Jin Ha Lee
PiCoGen2: Piano Cover Generation With Transfer Learning Approach and Weakly Aligned Data 555-562
Chih-Pin Tan, Hsin Ai, Yi-Hsin Chang, Shuen-Huei Guan, Yi-Hsuan Yang
Diff-MST: Differentiable Mixing Style Transfer 563-570
Soumya Sai Vanka, Christian J. Steinmetz, Jean-Baptiste Rolland, Joshua D. Reiss, George Fazekas
Continual Learning for Music Classification 596-602
Pedro González-Barrachina, María Alfaro-Contreras, Jorge Calvo-Zaragoza
A Kalman Filter Model for Synchronization in Musical Ensembles 618-624
Hugo T. Carvalho, Min Susan Li, Massimiliano Di Luca, Alan M. Wing
Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation 625-633
Alain Riou, Stefan Lattner, Gaëtan Hadjeres, Michael Anslow, Geoffroy Peeters
Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music With Lightweight Finetuning 634-641
Fang Duo Tsai, Shih-Lun Wu, Haven Kim, Bo-Yu Chen, Hao-Chung Cheng, Yi-Hsuan Yang
ST-ITO: Controlling Audio Effects for Style Transfer With Inference-Time Optimization 661-668
Christian J. Steinmetz, Shubhr Singh, Marco Comunità, Ilias Ibnyahya, Shanxin Yuan, Emmanouil Benetos, Joshua D. Reiss
ComposerX: Multi-Agent Symbolic Music Composition With LLMs 669-679
Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo
Do Music Generation Models Encode Music Theory? 680-687
Megan Wei, Michael Freeman, Chris Donahue, Chen Sun
Sanidha: A Studio Quality Multi-Modal Dataset for Carnatic Music 705-712
Venkatakrishnan Vaidyanathapuram Krishnan, Noel Alben, Anish A. Nair, Nathaniel Condit-Schultz
Combining Audio Control and Style Transfer Using Latent Diffusion 721-728
Nils Demerlé, Philippe Esling, Guillaume Doras, David Genova
Lyrics Transcription for Humans: A Readability-Aware Benchmark 737-744
Ondřej Cífka, Hendrik Schreiber, Luke Miner, Fabian-Robert Stöter
A Critical Survey of Research in Music Genre Recognition 745-782
Owen Green, Bob L. T. Sturm, Georgina Born, Melanie Wald-Fuhrmann
Exploring the Inner Mechanisms of Large Generative Music Models 791-798
Marcel A. Vélez Vásquez, Charlotte Pouw, John Ashley Burgoyne, Willem Zuidema
Quantitative Analysis of Melodic Similarity in Music Copyright Infringement Cases 799-806
Saebyul Park, Halla Kim, Jiye Jung, Juyong Park, Jeounghoon Kim, Juhan Nam
Robust Lossy Audio Compression Identification 807-813
Hendrik Vincent Koops, Gianluca Micchi, Elio Quinton
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models 825-833
Benno Weck, Ilaria Manco, Emmanouil Benetos, Elio Quinton, George Fazekas, Dmitry Bogdanov
MidiCaps: A Large-Scale MIDI Dataset With Text Captions 858-865
Jan Melechovsky, Abhinaba Roy, Dorien Herremans
A New Dataset, Notation Software, and Representation for Computational Schenkerian Analysis 866-873
Stephen Ni-Hahn, Weihan Xu, Zirui Yin, Rico Zhu, Simon Mak, Yue Jiang, Cynthia Rudin
DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation 874-881
Zachary Novack, Julian McAuley, Taylor Berg-Kirkpatrick, Nicholas J. Bryan
Towards Universal Optical Music Recognition: A Case Study on Notation Types 914-921
Juan Carlos Martinez-Sevilla, David Rizo, Jorge Calvo-Zaragoza
Toward a More Complete OMR Solution 930-937
Guang Yang, Muru Zhang, Lin Qiu, Yanming Wan, Noah A. Smith
STONE: Self-Supervised Tonality Estimator 954-961
Yuexuan Kong, Vincent Lostanlen, Gabriel Meseguer-Brocal, Stella Wong, Mathieu Lagrange, Romain Hennequin
Towards Assessing Data Replication in Music Generation With Music Similarity Metrics on Raw Audio 1004-1011
Roser Batlle-Roca, Wei-Hsiang Liao, Xavier Serra, Yuki Mitsufuji, Emilia Gómez
Hierarchical Generative Modeling of Melodic Vocal Contours in Hindustani Classical Music 1020-1028
Nithya Nadig Shikarpur, Krishna Maneesha Dendukuri, Yusong Wu, Antoine Caillon, Cheng-Zhi Anna Huang
SymPAC: Scalable Symbolic Music Generation With Prompts and Constraints 1029-1036
Haonan Chen, Jordan B. L. Smith, Janne Spijkervet, Ju-Chiang Wang, Pei Zou, Bochen Li, Qiuqiang Kong, Xingjian Du
Towards Musically Informed Evaluation of Piano Transcription Models 1068-1075
Patricia Hu, Lukáš Samuel Marták, Carlos Eduardo Cancino-Chacón, Gerhard Widmer