Sony Group’s blueprint for AI music detection tech is promising. Here’s what it’s working on…



MBW Reacts is a series of analytical commentaries from Music Business Worldwide written in response to major recent entertainment events or news stories. Only MBW+ subscribers have unlimited access to these articles.


Last week, Nikkei Asia reported that researchers at Sony Group had been working on technology to identify copyrighted music embedded in AI-generated tracks.

The story was widely picked up, with coverage framing the development as a kind of next-generation detection tool that could help songwriters claim compensation from AI developers.

But the broader underlying research by the team at Sony AI appears to go considerably further than that framing suggests.

In a blog post published in December, Sony AI highlighted three papers accepted at major academic conferences in 2025 for AI and audio research.

The research, according to the blog post, is focused on “musical integrity in the age of machine learning, exploring attribution, recognition, and protection,” and is “part of a growing body of work exploring how AI can unlearn what doesn’t belong to it, how connections between musical segments can be identified, and how effective current audio authentication methods are.”

As we noted last week, this work is part of Sony AI’s broader research, and the company has not announced any particular product or commercial rollout.

Sony AI, according to its About page, was established as a division of Japan-headquartered tech and entertainment giant Sony Group in April 2020 to “pursue groundbreaking research in AI and robotics to unleash human imagination and creativity with AI”. Sony AI has offices in North America, Europe, India, and Japan.

Here’s what Sony AI’s researchers are working on…

1. Attribution: ‘Unlearning’ can trace which songs shaped an AI model’s output, even when nothing sounds alike

Sony AI’s blog post introduces the first challenge as attribution, or “understanding which training data influenced what an AI system creates.”

As the blog puts it, “when an unlicensed generative model composes a new song from a text prompt, it doesn’t include any record of attribution. But Sony AI’s researchers believe it can still be determined.”

The paper, titled Large-Scale Training Data Attribution for Music Generative Models via Unlearning, was accepted at the NeurIPS 2025 Creative AI Track. It proposes a method for identifying which songs in an AI model’s training data most affected a specific generated output. Rather than comparing generated tracks against a catalog of existing music, it works by selectively “forgetting” the generated track from the model, then measuring which training songs are most affected by that removal.
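
The forget-then-measure idea can be sketched with a toy numerical model. This is purely illustrative, not the paper’s actual method: the paper works with a real text-to-music model, and every name, number, and formula below is an assumption chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a generative model: least-squares weights fitted on
# "training songs" represented as feature vectors.
train = rng.normal(size=(128, 64))      # 128 training "songs"
targets = train @ rng.normal(size=64)   # what the model learned to reproduce

w = np.linalg.lstsq(train, targets, rcond=None)[0]

# A "generated track" that closely echoes training song 3.
gen_x = train[3] + 0.01 * rng.normal(size=64)
gen_y = targets[3]

# "Unlearn" the generated track: one gradient-ascent step on its loss,
# nudging the model away from reproducing that output.
grad = 2 * (gen_x @ w - gen_y) * gen_x
w_unlearned = w + 0.05 * grad

# Attribution score: how much each training song's loss grows after
# unlearning. The most-affected song is the inferred influence.
before = (train @ w - targets) ** 2
after = (train @ w_unlearned - targets) ** 2
top_song = int(np.argmax(after - before))
print(top_song)  # song 3 should rank first
```

The point of the sketch is the ranking signal: removing the generated output from the model perturbs it most along the directions it shares with its true influences, so those training items show the largest loss change.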

To test the approach, the researchers ran it against other methods. The so-called “unlearning” method produced sharper results, with influence concentrated in a small number of training tracks, while similarity-based methods showed broader, less focused patterns. When used to identify a known training track, the system achieved perfect identification while the model’s overall quality remained unchanged.

The authors describe their work as the first to explore attribution on a text-to-music model trained on a large, diverse dataset. They frame it as a practical framework for applying unlearning-based attribution at scale.

Conclusion: By “unlearning” a generated track and observing the ripple effects, this method can pinpoint which training songs influenced an AI’s output, even when the output doesn’t obviously resemble them. As Sony AI’s blog notes, “by showing what happens when models forget, Sony AI’s researchers hope to help recognise the works of the original artists.”

Read the full paper here


2. Recognition: Segment-level matching can catch the kind of borrowing AI actually does

Sony AI’s blog frames the second strand as recognition, or mapping “the relationships between works.”

As the blog explains: “Two songs may not be identical, but they may still share a melody, rhythm, or phrasing that links them across eras or items in a given catalogue.”

The paper, accepted at ICML 2025, introduces CLEWS [Supervised Contrastive Learning from Weakly-Labeled Audio Segments for Musical Version Matching]. The system detects when two recordings are different versions of the same piece. The key innovation is that it works with 20-second audio snippets rather than whole tracks. As the authors note, the segments that matter in real-world cases are much shorter than full song length.
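
The segment-level idea can be illustrated with a minimal sketch, under stated assumptions: this is not the CLEWS model (which uses a learned contrastive encoder on real audio), just a demonstration of why comparing segments rather than whole tracks catches partial reuse. The `embed`, `segments`, and `match_score` functions are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)

def embed(segment):
    # Stand-in for a learned contrastive encoder: L2-normalised features,
    # so the dot product below is cosine similarity.
    return segment / (np.linalg.norm(segment) + 1e-9)

def segments(track, seg_len):
    # Split a track into non-overlapping fixed-length segments.
    return [track[i:i + seg_len] for i in range(0, len(track) - seg_len + 1, seg_len)]

def match_score(a, b, seg_len=20):
    # Score a pair of tracks by their best-matching segment pair:
    # two recordings count as related if ANY segments align closely,
    # even when the tracks as a whole are different.
    ea = [embed(s) for s in segments(a, seg_len)]
    eb = [embed(s) for s in segments(b, seg_len)]
    return max(float(x @ y) for x in ea for y in eb)

original = rng.normal(size=100)
# A "cover" that reuses one 20-sample segment of the original, rescaled.
cover = rng.normal(size=100)
cover[40:60] = 1.5 * original[20:40]
unrelated = rng.normal(size=100)

print(match_score(original, cover) > match_score(original, unrelated))  # True
```

A whole-track comparison would dilute the shared 20-sample passage across 100 samples of unrelated material; the segment-wise maximum keeps it visible.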

On two public benchmarks, CLEWS outperformed all existing methods. While competing systems saw steep accuracy drops with shorter audio clips, CLEWS maintained high accuracy down to just 10 seconds. The paper lists plagiarism and near-duplicate detection among its applications.

Conclusion: CLEWS can identify shared musical material between recordings at the segment level, even in short clips. As Sony AI’s blog puts it, this kind of fine-grained detection “can help copyright protection and content monitoring systems, helping identify near-duplicates or unauthorised versions that might slip past traditional matching tools.”

You can read the full paper here


3. Protection: Can audio watermarking survive AI compression?

Sony AI’s blog frames the third strand, protection, around a blunt question: “Can current watermarking methods withstand real-world transformations?”

As the blog notes: “As audio compression becomes increasingly powered by neural networks… the very signals that watermarking systems rely on to prove authenticity are being erased.”

The paper, accepted at INTERSPEECH 2025, introduces RAW-Bench [Robust Audio Watermarking Benchmark], a framework that tests how well watermarking algorithms hold up against 20 real-world distortions including compression, background noise, reverb, and time stretching. The researchers tested four publicly available algorithms on a dataset spanning music, speech, and environmental sounds.

The key finding concerns neural audio codecs, the AI-powered compression tools used to shrink audio files. Against the Descript Audio Codec, every watermarking algorithm scored zero on full-message accuracy, meaning not a single watermark was fully recovered intact. Even after retraining two algorithms to resist these attacks, both still scored zero on this measure. Some algorithms managed partial bit recovery, but at levels too low to be practically useful.

The explanation is simple: watermarks hide information inside audio, while neural codecs strip out anything inaudible. Since codecs typically come last in the processing chain, they get the final word.
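
That failure mode can be demonstrated with a toy benchmark in the spirit of RAW-Bench. This is an illustration under stated assumptions, not any of the tested algorithms: the watermark here is a crude sub-threshold perturbation, and coarse rounding stands in for a lossy neural codec.

```python
import numpy as np

rng = np.random.default_rng(2)

def embed(audio, bits, strength=1e-3):
    # Hide each bit as a tiny (+/- strength) nudge on one sample,
    # small enough to be "inaudible" relative to the signal.
    marked = audio.copy()
    marked[: len(bits)] += strength * (2 * np.asarray(bits) - 1)
    return marked

def decode(original, received, n_bits):
    # Recover bits from the sign of the deviation from the original.
    return ((received[:n_bits] - original[:n_bits]) > 0).astype(int)

def bit_accuracy(bits, decoded):
    return float(np.mean(np.asarray(bits) == decoded))

audio = rng.normal(size=1000)
bits = rng.integers(0, 2, size=64)
marked = embed(audio, bits)

# No distortion: the message is fully recoverable.
clean = bit_accuracy(bits, decode(audio, marked, 64))

# Codec-like lossy transform: coarse quantisation discards the tiny,
# "inaudible" details, which is exactly where the watermark lives.
compressed = np.round(marked, 1)
lossy = bit_accuracy(bits, decode(audio, compressed, 64))

print(clean, lossy)
```

On the clean copy every bit survives; after quantisation, bit accuracy collapses toward chance, and full-message recovery (all 64 bits correct) becomes effectively impossible, mirroring the zero full-message scores the paper reports against neural codecs.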

Conclusion: Current audio watermarking cannot survive AI-powered compression. As Sony AI’s blog suggests, “future watermarking systems may need to collaborate with codecs rather than fight against them, embedding identity in ways that persist through transformation rather than being filtered out by it.”

Read the full paper here.


The bigger picture

Together, these three papers describe a layered technical framework: attribution traces influence at the model level, recognition maps relationships at the fragment level, and watermarking benchmarks reveal where current protections fall short.

Sony AI says that its researchers “are helping define how balancing innovation with responsibility can work in the future of generative music: with AI that remembers its sources, hears its connections, and safeguards its signal”.

Looking ahead, Sony AI’s research in this area doesn’t appear to be slowing down.

In a separate blog post published in February, Sony’s AI research unit said it will have more than 10 papers accepted at ICLR 2026, spanning “generative modeling, diffusion, multimodal representation learning, and creator-focused AI systems.”

Among the topics listed is “AI-assisted music post-production.”
