ArtifactNet: Detecting AI-Generated Music via Forensic Residual Physics
Summary
ArtifactNet is a lightweight neural network framework that detects AI-generated music by analyzing codec-specific artifacts in audio signals, achieving F1 = 0.9829 on a new 6,183-track benchmark (ArtifactBench) with 49x fewer parameters than competing methods. The approach applies forensic-physics principles: a bounded-mask UNet extracts codec residuals, which a compact CNN then classifies, while codec-aware training reduces cross-codec drift by 83%.
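The "cross-codec drift" figure can be read as the change in the model's AI-probability when the same track is re-encoded with a lossy codec. A minimal sketch of how such a drift score might be computed; the `cross_codec_drift` helper and the probability values below are illustrative, not from the paper:

```python
import numpy as np

def cross_codec_drift(p_wav, p_codec):
    """Mean absolute change in predicted AI-probability when the same
    tracks are re-encoded with a lossy codec (illustrative metric)."""
    p_wav = np.asarray(p_wav, dtype=float)
    p_codec = np.asarray(p_codec, dtype=float)
    return float(np.mean(np.abs(p_wav - p_codec)))

# Hypothetical scores for three AI tracks before/after MP3 transcoding.
before = [0.99, 0.97, 0.98]
after_naive = [0.05, 0.02, 0.03]   # codec-naive model: scores collapse
after_aware = [0.85, 0.80, 0.82]   # codec-aware model: scores stay high

print(cross_codec_drift(before, after_naive))  # large drift
print(cross_codec_drift(before, after_aware))  # small drift
```

Under this reading, codec-aware training (augmenting with WAV/MP3/AAC/Opus encodes) drives the drift down, which matches the paper's reported Delta = 0.95 → 0.16.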
Cached at: 04/20/26, 08:27 AM
Source: https://huggingface.co/papers/2604.16254
Abstract
ArtifactNet uses a lightweight neural network framework to detect AI-generated music by analyzing codec-specific artifacts in audio signals, achieving superior performance compared to existing methods through codec-aware training and efficient architecture design.
We present ArtifactNet, a lightweight framework that detects AI-generated music by reframing the problem as forensic physics – extracting and analyzing the physical artifacts that neural audio codecs inevitably imprint on generated audio. A bounded-mask UNet (ArtifactUNet, 3.6M parameters) extracts codec residuals from magnitude spectrograms, which are then decomposed via HPSS into 7-channel forensic features for classification by a compact CNN (0.4M parameters; 4.0M total). We introduce ArtifactBench, a multi-generator evaluation benchmark comprising 6,183 tracks (4,383 AI from 22 generators and 1,800 real from 6 diverse sources). Each track is tagged with bench_origin for fair zero-shot evaluation. On the unseen test partition (n=2,263), ArtifactNet achieves F1 = 0.9829 with FPR = 1.49%, compared to CLAM (F1 = 0.7576, FPR = 69.26%) and SpecTTTra (F1 = 0.7713, FPR = 19.43%) evaluated under identical conditions with published checkpoints. Codec-aware training (4-way WAV/MP3/AAC/Opus augmentation) further reduces cross-codec probability drift by 83% (Delta = 0.95 → 0.16), resolving the primary codec-invariance failure mode. These results establish forensic physics – direct extraction of codec-level artifacts – as a more generalizable and parameter-efficient paradigm for AI music detection than representation learning, using 49x fewer parameters than CLAM and 4.8x fewer than SpecTTTra.
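The abstract's pipeline (extract a codec residual, decompose it via HPSS, stack channels for a CNN) can be sketched with a median-filter HPSS in the style of Fitzgerald's method. This is a minimal illustration, not the paper's implementation: the exact 7-channel composition is not specified in the abstract, so the stack below uses only the residual plus its harmonic and percussive parts as an assumed example.

```python
import numpy as np
from scipy.ndimage import median_filter

def hpss_masks(S, kernel=17):
    """Split a magnitude spectrogram into harmonic/percussive parts via
    median filtering (Fitzgerald-style HPSS). Illustrative only."""
    H = median_filter(S, size=(1, kernel))   # smooth along time -> harmonic
    P = median_filter(S, size=(kernel, 1))   # smooth along freq -> percussive
    eps = 1e-10
    mask_h = H / (H + P + eps)               # soft masks that sum to ~1
    mask_p = P / (H + P + eps)
    return S * mask_h, S * mask_p

rng = np.random.default_rng(0)
residual = rng.random((128, 256))            # stand-in for a codec residual
harm, perc = hpss_masks(residual)

# Hypothetical forensic feature stack fed to the classifier CNN;
# the paper's actual 7 channels are not detailed in the abstract.
features = np.stack([residual, harm, perc])
print(features.shape)   # (3, 128, 256)
```

Because the soft masks sum to approximately one, the harmonic and percussive channels recompose to the original residual, so the decomposition adds structure without discarding information.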
arXiv page: https://arxiv.org/abs/2604.16254
PDF: https://arxiv.org/pdf/2604.16254
Project page: https://demo.intrect.io/
Models citing this paper: 1
intrect/artifactnet (Audio Classification, updated about 5 hours ago): https://huggingface.co/intrect/artifactnet
Datasets citing this paper: 1
intrect/artifactbench (Viewer, updated about 6 hours ago, 4.4k • 59): https://huggingface.co/datasets/intrect/artifactbench
Similar Articles
APEX: Large-scale Multi-task Aesthetic-Informed Popularity Prediction for AI-Generated Music
APEX is a large-scale multi-task learning framework that predicts both popularity and aesthetic quality of AI-generated music using frozen audio embeddings. The model demonstrates strong generalization across different generative architectures by jointly predicting engagement signals and perceptual quality dimensions.
The BEST local AI music generator is here! Free & unlimited
ACE-Step 1.5 XL is an open-source music generator that surpasses Suno & Udio in quality and speed, running unlimited on a 12 GB GPU with ~120× real-time generation.
MuseNet
OpenAI released MuseNet, a deep neural network based on GPT-2 architecture that generates 4-minute musical compositions with 10 instruments by learning patterns from hundreds of thousands of MIDI files. The model can combine multiple music styles and blend them in novel ways.
Understanding the source of what we see and hear online
OpenAI announces tools and research efforts to help verify content authenticity, including text watermarking, metadata approaches, and expanded image detection with C2PA metadata integration for tracking AI-generated and edited content.
Deezer says 44% of new music uploads are AI-generated, most streams are fraudulent
Deezer reports that 44% of new uploads are AI-generated, mostly for fraudulent streaming, and says it keeps such tracks out of recommendations; Lyria 3, Suno, and Udio are cited as enablers.