Who Is Really Making AI Music — The Songwriter, Not the Machine

We analyzed roughly 650,000 AI music generations, an aggregate cross-platform sample spanning late 2025 into 2026, to answer one question the "AI slop" headlines skip: when a person sits down with an AI music tool, who actually authors the music? The data is consistent. People bring their own lyrics, mark up the song's structure by hand, cast specific voices, and come back to do it again. One-click, walk-away generation accounts for a minority of activity. In the sample, the tool acts as a studio that performs a song the person has already written.

Three figures from the dataset set the scope.

~41% of prompts run over 1,000 characters, roughly the length of a full lyric sheet.

452,000+ occurrences of "chorus" across prompts: structure marked up section by section.

~17.6× as many vocal words as instrumental ones: voice and performers specified by the user.

The narrative the data tests against

The public number for AI music describes a different side of the industry. In April 2026, Deezer reported that about 44% of the tracks uploaded to its platform every day are now AI-generated — nearly 75,000 a day — and that an estimated 85% of the streams those tracks pull are fraudulent. Those figures produced the headlines about a flood of slop and bots gaming the royalty pool.

The Deezer figure measures one variable: what gets uploaded to a streaming catalog. It is a consumption-side measurement — output landing in a library, much of it routed at scale to harvest royalties. It does not measure what happens upstream, when an individual opens a tool and makes something. Our dataset measures that second step. The two are frequently conflated, and the distinction is the whole question.

The two figures measure different things: what a person authors, and what reaches a streaming catalog.

Two measurements, not one

Deezer's 44% measures what reaches a streaming service: the consumption side. The ~650,000-generation dataset measures what people make with a tool: the creation side.

Finding 1 — people supply the words

If the tool were writing the songs, prompts would be short: a genre, a mood, a "make me something sad." In the sample, that is the minority case. Roughly 41% of all prompts run past 1,000 characters — the single largest bucket — while one-line prompts under 50 characters account for about 9%.

~41% of prompts exceed 1,000 characters: the largest bucket, typically a full set of verses, chorus, and bridge.

~9% of prompts are under 50 characters: the short one-line case is a minority.

A prompt over a thousand characters is not a description of a song; it is the text of one: verses, chorus, and bridge that a person wrote and pasted in. In the sample, the single most common use of the tool is to hand it a finished lyric and ask it to perform the arrangement. The authorship of the words sits with the user.
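The length split above can be reproduced with a few lines of Python. This is a minimal sketch, not the pipeline used for the report: the function names are hypothetical, and because the article gives only the under-50 and over-1,000 cut-offs, everything in between is collapsed into one middle bucket.

```python
from collections import Counter


def length_bucket(prompt: str) -> str:
    """Assign a prompt to a length bucket by character count.

    Only the <50 and >1,000 thresholds come from the article; the single
    middle bucket is an assumption made for this sketch.
    """
    n = len(prompt)
    if n < 50:
        return "under 50"
    if n <= 1000:
        return "50 to 1,000"
    return "over 1,000"


def bucket_shares(prompts):
    """Return each bucket's share of all prompts as a fraction of 1."""
    counts = Counter(length_bucket(p) for p in prompts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}
```

Calling `bucket_shares` on a list of prompt strings returns a dictionary keyed by bucket label, so a result such as `{"over 1,000": 0.41, ...}` would correspond to the ~41% figure.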

Finding 2 — people arrange the structure by hand

Once the words exist, the next decisions are structural: where the chorus lands, when the bridge breaks, how the track opens and closes. The data shows users making those decisions explicitly.

The most frequent terms across the 650,000 prompts are not moods or genres but structural tags. The tag "chorus" appears over 452,000 times and "verse" over 410,000, with "outro", "bridge", "pre-chorus", and "intro" all ranking near the top. This is the markup of a song being charted section by section before it is recorded. Where the interface mode is known, users select the advanced mode, which gives finer control over the arrangement, more often (about 46%) than the one-tap simple mode (about 38%). Given a choice between an undirected generation and a controlled one, the sample leans toward control.

What a 1,000-character prompt typically contains: a full lyric, marked up section by section, with voice and instruments named. Illustration.
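Counts like the ones for "chorus" and "verse" can be produced by scanning each prompt for structural tags. The sketch below assumes a simple case-insensitive whole-word match; the report does not say how its counts were tokenized, so the term list, the pattern, and the function name are illustrative only.

```python
import re
from collections import Counter

# "pre-chorus" is listed before "chorus" so the longer tag wins at the same
# position and is not also counted as a plain "chorus".
STRUCTURE_PATTERN = re.compile(
    r"\b(pre-chorus|chorus|verse|bridge|intro|outro)\b",
    re.IGNORECASE,
)


def count_structure_terms(prompts) -> Counter:
    """Count whole-word occurrences of structural tags across all prompts."""
    counts = Counter()
    for prompt in prompts:
        for match in STRUCTURE_PATTERN.finditer(prompt):
            counts[match.group(1).lower()] += 1
    return counts
```

Because the match is whole-word, "Introduce" does not count as "intro" and plural forms such as "verses" are not counted; a production count would need an explicit decision on both.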

Finding 3 — people specify the sound and the voice

Users also describe the sound, and they are specific. The most common descriptive words concern voice and instrumentation: "vocal" appears over 130,000 times, alongside "male", "female", "emotional", "warm", "soft", "piano", "guitar", and "bass". Vocal terms outnumber requests for purely instrumental tracks by roughly 17.6 to 1.

The data indicates a casting and instrumentation decision made by the user — a warm male vocal, an emotional piano line, a defined feel — with the tool executing the specification rather than choosing it.
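A ratio like 17.6 to 1 comes from dividing one term count by another. The sketch below shows that arithmetic under loud assumptions: the article does not list which words were treated as "vocal" or "instrumental", so both term sets here are hypothetical placeholders, as is the function name.

```python
import re

# Hypothetical term lists; the report does not publish the ones it used.
VOCAL_TERMS = {"vocal", "vocals", "voice", "singer"}
INSTRUMENTAL_TERMS = {"instrumental", "instrumentals"}


def vocal_to_instrumental_ratio(prompts):
    """Return vocal-term count divided by instrumental-term count.

    Returns None when no instrumental term appears, since the ratio is
    undefined in that case.
    """
    vocal = 0
    instrumental = 0
    for prompt in prompts:
        for word in re.findall(r"[a-z]+", prompt.lower()):
            if word in VOCAL_TERMS:
                vocal += 1
            elif word in INSTRUMENTAL_TERMS:
                instrumental += 1
    if instrumental == 0:
        return None
    return vocal / instrumental
```

The result is sensitive to the term lists: adding or dropping a single common word on either side can move the ratio substantially, which is why the lists should be stated alongside any published figure.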

Finding 4 — the long tail is occasion-driven, not batch output

The slop narrative implies anonymous, high-volume output. The distribution of song types points the other way. Regular songs are the bulk of activity, but the long tail is where the personal detail concentrates: covers, raps, birthday songs, jingles, lullabies, songs for game characters, 8-bit tracks, beats. Each maps to a specific person and a specific occasion.

Song type as a share of all generations. Regular songs dominate; the long tail — covers, raps, birthday songs, jingles — is where occasion-driven, personal creation concentrates.

Source: Aggregate cross-platform sample of ~650,000 AI music generations, late 2025–2026.

The distribution is more consistent with individuals making songs for particular moments than with the batch output of a content farm.

Finding 5 — the activity is global and written in mother tongues

About 93% of prompts are written in Latin-script languages, but that reflects the writing system, not the music. Underneath, the sessions cluster into roughly ten distinct musical worlds, among them English-language pop; instrumental and cinematic scores; Spanish-language ballads and devotionals; Brazilian sertanejo and funk; Southeast Asian dangdut and koplo; and Eastern European and Balkan songs. The sample includes wedding songs in Javanese, prayers in Spanish, and birthday tributes that switch between Russian, Armenian, and English mid-verse.

Languages and genres cluster into roughly ten groups; the six named here are among the largest. Illustration.

This distribution is consistent with an old behavior — making a song for someone — running on a new tool, rather than with automated catalog-filling.

Adoption: the activity is growing

The activity is expanding. Over the window examined, weekly generation volume grew more than 20-fold, and daily volume roughly tripled within a single quarter.

23× growth in weekly generation volume over roughly 18 months.

~3× growth in daily volume within a single quarter.

Sustained growth at that rate is consistent with repeat usage rather than one-time curiosity. The pattern fits users returning because the tool handles the production work that previously required a studio, a band, and a budget, while the inputs only a person can supply (the words, the intent, the occasion) stay with the user.
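A growth multiple of this kind is the ratio of volume in a late period to volume in an early one. The sketch below assumes the simplest reading, last week divided by first week, using ISO calendar weeks; the report does not say which weeks or what smoothing it used, and the function names are hypothetical.

```python
from collections import Counter
from datetime import date


def weekly_counts(days):
    """Count generations per ISO (year, week) from an iterable of dates."""
    counts = Counter()
    for day in days:
        iso = day.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return counts


def growth_multiple(counts):
    """Return the last week's volume divided by the first week's volume."""
    if not counts:
        raise ValueError("no weeks to compare")
    weeks = sorted(counts)  # (year, week) tuples sort chronologically
    return counts[weeks[-1]] / counts[weeks[0]]
```

For example, `growth_multiple(weekly_counts(days))` on a list of `date` objects gives the first-to-last multiple. Partial weeks at either end of the window would distort it, so a real analysis would likely trim them or average several weeks on each side.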

Context: sixty years of the same objection

"AI music has no soul" is approximately the review every new music tool has drawn for sixty years, and the prior cases show which part of the objection holds.

Each new tool drew the same verdict on arrival; AI music is the current instance.

In 1968, Wendy Carlos's Switched-On Bach was both a hit and a scandal: critics called the Moog synthesizer cold and not a real instrument, and musicians' unions warned it would put orchestras out of work. It became foundational to modern pop and electronic music. The Roland TR-808 drum machine drew nearly the same charges in 1980 — no soul, it will replace drummers — then became the rhythmic basis of hip-hop, house, and pop, prominent enough that Kanye West titled an album 808s & Heartbreak after it.

The record also shows where critics were partly right. When hip-hop producers began sampling other people's records in the 1980s, "anyone can loop a break" was the cheap dismissal, but the copyright disputes underneath were real and the lawsuits reshaped the craft. New tools do sometimes cause measurable harm, and AI music's streaming-fraud problem is a current instance. The relevant point is narrower: those harms are separate from the question of who authors a given track.

The Auto-Tune case is the clearest. Cher's "Believe" made it famous in 1998 and T-Pain made it ubiquitous; the backlash followed — cheating, robot voice, no feeling — and in 2009 Jay-Z released "D.O.A. (Death of Auto-Tune)" to declare it finished. It was not. It split into two simultaneous uses, a correction tool and a deliberate aesthetic running from T-Pain to Bon Iver to Kanye, and in both the person remained the author while the tool executed the intent. Across these cases, the objection about machine soul was separate from the question of authorship, which turned on whether a person was using the tool to say something.

Conclusion: what the data says about authorship

On the creation side, the evidence is unambiguous. Across hundreds of thousands of generations, the human writes the lyrics, maps the structure, casts the voice, and chooses the occasion the song is for; the tool performs the arrangement. In the sample, the author is the person, and the AI executes the brief.

The "AI slop" account is real, but it describes a comparatively small population gaming streaming platforms — a consumption-side problem — not the population making music with these tools. For that population, the data describes a studio-shaped role for the tool: it supplies the production that once required a studio, a band, and a budget, and leaves authorship with the person. That distinction is the premise behind Lacuna.