ISMIR 2021
Full Proceedings
Proceedings of the 22nd International Society for Music Information Retrieval Conference
Online, Nov 7-12, 2021 (ISBN: 978-1-7327299-0-2)
Papers
Four-way Classification of Tabla Strokes with Models Adapted from Automatic Drum Transcription 19-26
Rohit M A, Amitrajit Bhattacharjee, Preeti Rao
OMR-assisted transcription: a case study with early prints 35-41
María Alfaro-Contreras, David Rizo, Jose M. Inesta, Jorge Calvo-Zaragoza
Identification of rhythm guitar sections in symbolic tablatures 58-65
Louis Bigo, David Regnier, Nicolas Martin
On-Line Audio-to-Lyrics Alignment Based on a Reference Performance 66-73
Charles Brazier, Gerhard Widmer
Visualizing Intertextual Form with Arc Diagrams: Contour and Schema-based Methods 74-80
Aaron Carter-Enyi, Gilad Rabinovitch, Nathaniel Condit-Schultz
Unsupervised Domain Adaptation for Document Analysis of Music Score Images 81-87
Francisco J. Castellanos, Antonio-Javier Gallego, Jorge Calvo-Zaragoza
Codified audio language modeling learns useful representations for music information retrieval 88-96
Rodrigo Castellon, Chris Donahue, Percy Liang
Variable-Length Music Score Infilling via XLNet and Musically Specialized Positional Encoding 97-104
Chin-Jui Chang, Chun-Yi Lee, Yi-Hsuan Yang
SurpriseNet: Melody Harmonization Conditioning on User-controlled Surprise Contours 105-112
Yi-Wei Chen, Hung-Shin Lee, Yen-Hsing Chen, Hsin-Min Wang
Semi-supervised violin fingering generation using variational autoencoders 113-120
Vincent K.M. Cheung, Hsuan-Kai Kao, Li Su
Listen, Read, and Identify: Multimodal Singing Language Identification of Music 121-127
Keunwoo Choi, Yuxuan Wang
Cosine Contours: a Multipurpose Representation for Melodies 135-142
Bas Cornelissen, Willem Zuidema, John Ashley Burgoyne
Controllable deep melody generation via hierarchical music structure representation 143-150
Shuqi Dai, Zeyu Jin, Celso Gomes, Roger Dannenberg
MSTRE-Net: Multistreaming Acoustic Modeling for Automatic Lyrics Transcription 151-158
Emir Demirel, Sven Ahlbäck, Simon Dixon
Towards Automatic Instrumentation by Learning to Separate Parts in Symbolic Multitrack Music 159-166
Hao-Wen Dong, Chris Donahue, Taylor Berg-Kirkpatrick, Julian McAuley
An Empirical Evaluation of End-to-End Polyphonic Optical Music Recognition 167-173
Sachinda Edirisooriya, Hao-Wen Dong, Julian McAuley, Taylor Berg-Kirkpatrick
A Hardanger Fiddle Dataset with Performances Spanning Emotional Expressions and Annotations Aligned using Image Registration 174-181
Anders Elowsson, Olivier Lartillot
Building the MetaMIDI Dataset: Linking Symbolic and Audio Musical Data 182-188
Jeffrey Ens, Philippe Pasquier
Modeling and Inferring Proto-Voice Structure in Free Polyphony 189-196
Christoph Finkensiep, Martin A Rohrmeier
PKSpell: Data-Driven Pitch Spelling and Key Signature Estimation 197-204
Francesco Foscarin, Nicolas Audebert, Raphael Fournier-S’Niehotta
An interpretable music similarity measure based on path interestingness 213-219
Giovanni Gabbolini, Derek Bridge
Leveraging Hierarchical Structures for Few-Shot Musical Instrument Recognition 220-228
Hugo F Flores Garcia, Aldo Aguilar, Ethan Manilow, Bryan Pardo
What if the ‘When’ Implies the ‘What’?: Human harmonic analysis datasets clarify the relative role of the separate steps in automatic tonal analysis 229-236
Mark R H Gotham, Rainer Kleinertz, Christof Weiss, Meinard Müller, Stephanie Klauk
Let’s agree to disagree: Consensus Entropy Active Learning for Personalized Music Emotion Recognition 237-245
Juan S. Gómez-Cañón, Estefania Cano, Yi-Hsuan Yang, Perfecto Herrera, Emilia Gomez
Sequence-to-Sequence Piano Transcription with Transformers 246-253
Curtis Hawthorne, Ian Simon, Rigel Swavely, Ethan Manilow, Jesse Engel
A semi-automated workflow paradigm for the distributed creation and curation of expert annotations 262-269
Johannes Hentschel, Fabian C. Moss, Markus Neuwirth, Martin A Rohrmeier
BeatNet: CRNN and Particle Filtering for Online Joint Beat, Downbeat and Meter Tracking 270-277
Mojtaba Heydari, Frank Cwitkowitz, Zhiyao Duan
Joint Estimation of Note Values and Voices for Audio-to-Score Piano Transcription 278-284
Yuki Hiramatsu, Eita Nakamura, Kazuyoshi Yoshii
De-centering the West: East Asian Philosophies and the Ethics of Applying Artificial Intelligence to Music 301-309
Rujing Huang, Bob L. T. Sturm, Andre Holzapfel
A Benchmarking Initiative for Audio-domain Music Generation using the FreeSound Loop Dataset 310-317
Tun Min Hung, Bo-Yu Chen, Yen Tung Yeh, Yi-Hsuan Yang
EMOPIA: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation 318-325
Hsiao-Tzu Hung, Joann Ching, Seungheon Doh, Nabin Kim, Juhan Nam, Yi-Hsuan Yang
Piano Sheet Music Identification Using Marketplace Fingerprinting 326-333
Kevin Ji, Daniel Yang, Timothy Tsai
Learning a cross-domain embedding space of vocal and mixed audio with a structure-preserving triplet loss 334-341
Keunhyoung Kim, Jongpil Lee, Sangeun Kum, Juhan Nam
Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation 342-349
Qiuqiang Kong, Yin Cao, Haohe Liu, Keunwoo Choi, Yuxuan Wang
Artist Similarity Using Graph Neural Networks 350-357
Filip Korzeniowski, Sergio Oramas, Fabien Gouyon
“Finding Home”: Understanding How Music Supports Listeners’ Mental Health through a Case Study of BTS 358-365
Jin Ha Lee, Arpita Bhattacharya, Ria Antony, Nicole Santero, Anh Le
Cross-cultural Mood Perception in Pop Songs and its Alignment with Mood Detection Algorithms 366-373
Harin Lee, Frank Höger, Marc Schönwiesner, Minsu Park, Nori Jacoby
A unified model for zero-shot music source separation, transcription and synthesis 381-388
Liwei Lin, Gus Xia, Qiuqiang Kong, Junyan Jiang
Pitch-Informed Instrument Assignment using a Deep Convolutional Network with Multiple Kernel Shapes 389-395
Carlos Lordelo, Emmanouil Benetos, Simon Dixon, Sven Ahlbäck
SpecTNT: a Time-Frequency Transformer for Music Audio 396-403
Wei-Tsung Lu, Ju-Chiang Wang, Minz Won, Keunwoo Choi, Xuchen Song
AugmentedNet: A Roman Numeral Analysis Network with Synthetic Training Examples and Additional Tonal Tasks 404-411
Néstor Nápoles López, Mark R H Gotham, Ichiro Fujinaga
MINGUS: Melodic Improvisation Neural Generator Using Seq2Seq 412-419
Vincenzo Madaghiele, Pasquale Lisena, Raphael Troncy
User-centered evaluation of lyrics-to-audio alignment 420-427
Ninon Lizé Masclef, Andrea Vaglio, Manuel Moussallam
A Modular System for the Harmonic Analysis of Musical Scores using a Large Vocabulary 435-442
Andrew McLeod, Martin A Rohrmeier
A deep learning method for enforcing coherence in Automatic Chord Recognition 443-451
Gianluca Micchi, Katerina Kosta, Gabriele Medeot, Pierre Chanquion
Modeling beat uncertainty as a 2D distribution of period and phase: a MIR task proposal 452-459
Martin A Miguel, Diego Fernandez Slezak
A case study of deep enculturation and sensorimotor synchronization to real music 460-467
Olof Misgeld, Torbjörn L Gulz, Jūra Miniotaitė, Andre Holzapfel
Symbolic Music Generation with Diffusion Models 468-475
Gautam Mittal, Jesse Engel, Curtis Hawthorne, Ian Simon
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis With GANs 484-492
Javier Nistal, Stefan Lattner, Gaël Richard
Phase-Aware Joint Beat and Downbeat Estimation Based on Periodicity of Metrical Structure 493-499
Takehisa Oyama, Ryoto Ishizuka, Kazuyoshi Yoshii
Agreement Among Human and Automated Transcriptions of Global Songs 500-508
Yuto Ozaki, John M McBride, Emmanouil Benetos, Peter Pfordresher, Joren Six, Adam Tierney, Polina Proutskova, Emi Sakai, Haruka Kondo, Haruno Fukatsu, Shinya Fujii, Patrick E. Savage
Automatic Recognition of Texture in Renaissance Music 509-516
Emilia Parada-Cabaleiro, Maximilian Schmitt, Anton Batliner, Bjorn W. Schuller, Markus Schedl
Is Disentanglement enough? On Latent Representations for Controllable Music Generation 517-524
Ashis Pati, Alexander Lerch
Pulse clarity metrics developed from a deep learning beat tracking model 525-530
Nicolás Pironio, Diego Fernandez Slezak, Martin A Miguel
On the Veracity of Local, Model-agnostic Explanations in Audio Classification: Targeted Investigations with Adversarial Examples 531-538
Verena Praher, Katharina Prinz, Arthur Flexer, Gerhard Widmer
Is there a “language of music-video clips” ? A qualitative and quantitative study 539-546
Laure Prétet, Gaël Richard, Geoffroy Peeters
Tabla Gharana Recognition from Audio music recordings of Tabla Solo performances 547-554
Gowriprasad R, Venkatesh V, Hema A Murthy, R Aravind, Sri Rama Murty K
Navigating noise: Modeling perceptual correlates of noise-related semantic timbre categories with audio features 555-561
Lindsey Reymore, Emmanuelle Beauvais-Lacasse, Bennett Smith, Stephen McAdams
Quantitative User Perceptions of Music Recommendation List Diversity 562-568
Kyle Robinson, Dan Brown
CRASH: Raw Audio Score-based Generative Modeling for Controllable High-resolution Drum Sound Synthesis 579-585
Simon Rouard, Gaëtan Hadjeres
Curriculum Learning for Imbalanced Classification in Large Vocabulary Automatic Chord Recognition 586-593
Luke O Rowe, George Tzanetakis
Deep Embeddings and Section Fusion Improve Music Segmentation 594-601
Justin Salamon, Oriol Nieto, Nicholas J. Bryan
Multi-Task Learning of Graph-based Inductive Representations of Music Content 602-609
Antonia Saravanou, Federico Tomasi, Rishabh Mehrotra, Mounia Lalmas
DadaGP: A Dataset of Tokenized GuitarPro Songs for Sequence Models 610-617
Pedro Pereira Sarmento, Adarsh Kumar, CJ Carr, Zack Zukowski, Mathieu Barthet, Yi-Hsuan Yang
Does Track Sequence in User-generated Playlists Matter? 618-625
Harald Victor Schweiger, Emilia Parada-Cabaleiro, Markus Schedl
A Differentiable Cost Measure for Intonation Processing in Polyphonic Music 626-633
Simon J Schwär, Sebastian Rosenzweig, Meinard Müller
Improving Music Performance Assessment With Contrastive Learning 634-641
Pavan M Seshadri, Alexander Lerch
Tracing Affordance and Item Adoption on Music Streaming Platforms 642-649
Dougal Shakespeare, Camille Roth
Computational analysis of melodic mode switching in raga performance 657-664
Nithya Nadig Shikarpur, Asawari Keskar, Preeti Rao
SinTra: Learning an inspiration model from a single multi-track music segment 665-672
Qingwei Song, Qiwei Sun, Dongsheng Guo, Haiyong Zheng
Musical Tempo Estimation Using a Multi-scale Network 682-689
Xiaoheng Sun, Qiqi He, Gao Yongwei, Wei Li
On the Integration of Language Models into Sequence to Sequence Architectures for Handwritten Music Recognition 690-696
Pau Torras, Arnau Baró, Lei Kang, Alicia Fornés
Kiite Cafe: A Web Service for Getting Together Virtually to Listen to Music 697-704
Kosetsu Tsukuda, Keisuke Ishida, Masahiro Hamasaki, Masataka Goto
Toward an Understanding of Lyrics-viewing Behavior While Listening to Music on a Smartphone 705-713
Kosetsu Tsukuda, Masahiro Hamasaki, Masataka Goto
The Words Remain the Same: Cover Detection with Lyrics Transcription 714-721
Andrea Vaglio, Romain Hennequin, Manuel Moussallam, Gael Richard
Supervised Metric Learning For Music Structure Features 730-737
Ju-Chiang Wang, Jordan B. L. Smith, Wei-Tsung Lu, Xuchen Song
Learning Pitch-Class Representations from Score-Audio Pairs of Classical Music 746-753
Christof Weiss, Johannes Zeitler, Tim Zunner, Florian Schuberth, Meinard Müller
Training Deep Pitch-Class Representations With a Multi-Label CTC Loss 754-761
Christof Weiss, Geoffroy Peeters
Emotion Embedding Spaces for Matching Music to Stories 777-785
Minz Won, Justin Salamon, Nicholas J. Bryan, Gautham Mysore, Xavier Serra
CollageNet: Fusing arbitrary melody and accompaniment into a coherent song 786-793
Abudukelimu Wuerkaixi, Christodoulos Benetatos, Zhiyao Duan, Changshui Zhang
Aligning Unsynchronized Part Recordings to a Full Mix Using Iterative Subtractive Alignment 810-817
Daniel Yang, Kevin Ji, Timothy Tsai
ADTOF: A large dataset of non-synthetic music for automatic drum transcription 818-824
Mickael Zehren, Marco Alunno, Paolo Bientinesi
Learn by Referencing: Towards Deep Metric Learning for Singing Assessment 825-832
Huan Zhang, Yiliang Jiang, Tao Jiang, Hu Peng
