How Far Can Chord-Symbol Time-Series Adaptation Carry Genre Identity? Capabilities and Boundaries in Multi-Genre Chord-Symbol Modeling

Hugging Face Daily Papers

Summary

This paper evaluates how small adaptation interfaces (LoRA, IA3, BitFit, prefix tuning, full fine-tuning) extend a frozen Music Transformer to eleven target genres for chord-symbol time-series modeling. Results show consistent harmonic prediction improvement but limited genre identity representation, concluding that chord symbols alone are insufficient to capture complete genre identity.
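To give a rough sense of why these interfaces count as "small", the sketch below counts trainable parameters for a single linear layer under each scheme. It relies on the standard definitions (LoRA adds two low-rank factors, IA3 adds one learned scaling vector, BitFit trains only the bias). The layer width of 512 and rank of 8 are illustrative assumptions, not values reported by the paper, and prefix tuning is omitted because its size depends on prefix length rather than on any one layer.

```python
def lora_params(d_out: int, d_in: int, rank: int) -> int:
    # LoRA trains B (d_out x rank) and A (rank x d_in); the base weight stays frozen.
    return rank * (d_in + d_out)


def ia3_params(d_out: int) -> int:
    # IA3 trains one scaling vector applied elementwise to the layer's activations.
    return d_out


def bitfit_params(d_out: int) -> int:
    # BitFit trains only the bias vector of the layer.
    return d_out


def full_params(d_out: int, d_in: int) -> int:
    # Full fine-tuning updates the whole weight matrix plus the bias.
    return d_out * d_in + d_out


# Hypothetical layer size and rank, chosen only for illustration.
D_MODEL, RANK = 512, 8
counts = {
    "LoRA": lora_params(D_MODEL, D_MODEL, RANK),
    "IA3": ia3_params(D_MODEL),
    "BitFit": bitfit_params(D_MODEL),
    "full fine-tuning": full_params(D_MODEL, D_MODEL),
}

if __name__ == "__main__":
    for name, n in counts.items():
        print(f"{name:>16}: {n} trainable parameters per layer")
```

Under these assumed sizes, the lightweight interfaces train well under a tenth of what full fine-tuning does for the same layer, which is the sense in which the summary calls them small.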

Harmony is a compact symbolic layer where mathematical pitch relations, acoustic consonance, and musical convention meet. This report treats chord-symbol sequences not as a complete representation of music, but as an interpretable, controllable time series for genre-local harmonic modeling. Starting from a frozen pop-jazz Music Transformer checkpoint, I evaluate how far small adaptation interfaces can extend the model to eleven target genres: blues, bossa nova, Bach chorales, country, electronic, folk, funk, gospel, hip-hop, R&B/soul, and rock. The main evaluation compares LoRA, IA3, BitFit, prefix tuning, and full fine-tuning over 11 genres and 3 seeds, a complete 165-cell grid. All five methods improve over the frozen base on held-out chord prediction, with macro gains from +2.89 to +3.61 points; LoRA and IA3 score highest, but Wilcoxon tests with Holm and Benjamini-Hochberg correction do not support a decisive winner. A matched-data-size control sharpens this: when genres are sub-sampled to a common corpus size, IA3 stays on top but LoRA's full-data edge disappears and it falls to last, indicating the small gaps are partly data-driven. A control-token baseline is also strong, and wrong-genre adapters often beat the frozen base, suggesting much of the effect comes from lightweight conditioning over a reusable harmonic base rather than one particular adapter family. Additional diagnostics (rank sweeps, wrong-genre rotation, a base-checkpoint ablation, chord-only genre classification, generated-output statistics, real-song evaluation, and duplicate analysis) support a bounded conclusion: chord-symbol adaptation reliably improves genre-local harmonic prediction, but chord symbols alone do not carry complete genre identity. The report therefore avoids claims about perceived genre authenticity or full musical quality, which require controlled listener or musician evaluation.
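The multiple-comparison step mentioned above can be sketched as follows. This is a minimal pure-Python illustration of the standard Holm (step-down) and Benjamini-Hochberg (step-up) adjustments, not the paper's analysis code. The p-values are made-up placeholders standing in for pairwise Wilcoxon results between methods.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):
        idx = order[rank]
        # The k-th smallest p-value (k = rank + 1) is scaled by m / k.
        candidate = min(1.0, pvalues[idx] * m / (rank + 1))
        running_min = min(running_min, candidate)  # enforce monotonicity
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four pairwise method comparisons.
raw = [0.01, 0.04, 0.03, 0.005]

if __name__ == "__main__":
    print("Holm:", holm_adjust(raw))
    print("BH:  ", bh_adjust(raw))
```

Holm controls the family-wise error rate and is the more conservative of the two. Benjamini-Hochberg controls the false discovery rate. Reporting both, as the abstract does, shows whether a "no decisive winner" conclusion depends on the choice of correction.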

Cached at: 06/08/26, 11:15 AM

Source: https://huggingface.co/papers/2606.07334

Abstract

Small adaptation interfaces extend a frozen Music Transformer model to multiple genres, showing consistent improvement in harmonic prediction but limited genre identity representation.

Harmony is a compact symbolic layer where mathematical pitch relations, acoustic consonance, and musical convention meet. This report treats chord-symbol sequences not as a complete representation of music, but as an interpretable, controllable time series for genre-local harmonic modeling. Starting from a frozen pop-jazz Music Transformer checkpoint, I evaluate how far small adaptation interfaces can extend the model to eleven target genres: blues, bossa nova, Bach chorales, country, electronic, folk, funk, gospel, hip-hop, R&B/soul, and rock. The main evaluation compares LoRA, IA3, BitFit, prefix tuning, and full fine-tuning over 11 genres and 3 seeds, a complete 165-cell grid. All five methods improve over the frozen base on held-out chord prediction, with macro gains from +2.89 to +3.61 points; LoRA and IA3 score highest, but Wilcoxon tests with Holm and Benjamini-Hochberg correction do not support a decisive winner. A matched-data-size control sharpens this: when genres are sub-sampled to a common corpus size, IA3 stays on top but LoRA’s full-data edge disappears and it falls to last, indicating the small gaps are partly data-driven. A control-token baseline is also strong, and wrong-genre adapters often beat the frozen base, suggesting much of the effect comes from lightweight conditioning over a reusable harmonic base rather than one particular adapter family. Additional diagnostics (rank sweeps, wrong-genre rotation, a base-checkpoint ablation, chord-only genre classification, generated-output statistics, real-song evaluation, and duplicate analysis) support a bounded conclusion: chord-symbol adaptation reliably improves genre-local harmonic prediction, but chord symbols alone do not carry complete genre identity. The report therefore avoids claims about perceived genre authenticity or full musical quality, which require controlled listener or musician evaluation.
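The grid arithmetic in the abstract (5 methods, 11 genres, 3 seeds, 165 cells) and one plausible reading of a "macro gain" can be sketched as below. The abstract does not spell out the averaging, so the definition here is an assumption: average each genre's score over seeds, subtract the frozen-base score for that genre, then take the unweighted mean across genres. The seed ids and all scores are synthetic placeholders, not results from the paper.

```python
from itertools import product
from statistics import mean

METHODS = ["LoRA", "IA3", "BitFit", "prefix tuning", "full fine-tuning"]
GENRES = [
    "blues", "bossa nova", "Bach chorales", "country", "electronic", "folk",
    "funk", "gospel", "hip-hop", "R&B/soul", "rock",
]
SEEDS = [0, 1, 2]  # hypothetical seed ids

# Every (method, genre, seed) combination is one cell of the evaluation grid.
GRID = list(product(METHODS, GENRES, SEEDS))


def macro_gain(adapted, base, method):
    """Unweighted mean over genres of (seed-averaged adapted score - base score).

    adapted maps (method, genre, seed) to a score in points;
    base maps genre to the frozen model's score in points.
    """
    per_genre = []
    for genre in GENRES:
        seed_mean = mean(adapted[(method, genre, seed)] for seed in SEEDS)
        per_genre.append(seed_mean - base[genre])
    return mean(per_genre)


# Synthetic placeholder scores: every adapted cell sits 3 points above the base.
base_scores = {genre: 50.0 for genre in GENRES}
adapted_scores = {cell: 53.0 for cell in GRID}

if __name__ == "__main__":
    print(len(GRID), "cells")
    print("macro gain for LoRA:", macro_gain(adapted_scores, base_scores, "LoRA"))
```

An unweighted mean across genres keeps a large corpus such as rock from dominating the headline number. That matters for the matched-data-size control the abstract describes.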

Models citing this paper: 17 (four shown)

PearlLeeStudio/TheArtist-MusicTransformer-lora-bossa
PearlLeeStudio/TheArtist-MusicTransformer-pop-baseline
PearlLeeStudio/TheArtist-MusicTransformer-ft-pop80
PearlLeeStudio/TheArtist-MusicTransformer-ft-pop67
