Who Is Really Making AI Music — The Songwriter, Not the Machine

We analyzed roughly 650,000 AI music generations from across the AI music space, spanning late 2025 into 2026, to answer one question the "AI slop" headlines skip: on the creation side, when a person sits down with an AI music tool, who actually authors the music? The data is consistent. People bring their own lyrics, mark up the song's structure by hand, cast specific voices, and come back to do it again. One-click, walk-away generation accounts for a small share of activity. In the sample, the tool acts as a studio that performs a song the person has already written.
Three figures from the dataset set the scope.
The narrative the data tests against
The public number for AI music describes a different side of the industry. In April 2026, Deezer reported that about 44% of the tracks uploaded to its platform every day are now AI-generated — nearly 75,000 a day — and that an estimated 85% of the streams those tracks pull are fraudulent. Those figures produced the headlines about a flood of slop and bots gaming the royalty pool.
The Deezer figure measures one variable: what gets uploaded to a streaming catalog. It is a consumption-side measurement of output landing in a library, much of it routed at scale to harvest royalties. It does not measure what happens upstream, when an individual opens a tool and makes something. Our dataset measures that upstream step. The two are frequently conflated, and the distinction is the whole question.
Finding 1 — people supply the words
If the tool were writing the songs, prompts would be short: a genre, a mood, a "make me something sad." In the sample, that is the minority case. Roughly 41% of all prompts run past 1,000 characters — the single largest bucket — while one-line prompts under 50 characters account for about 9%.
A prompt over a thousand characters is rarely a description of a song; it is usually the text of one: verses, chorus, and bridge that a person wrote and pasted in. In the sample, the most common use of the tool is to hand it a finished lyric and ask it to perform the arrangement. The authorship of the words sits with the user.
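The length split described above can be reproduced with a few lines of Python. This is a minimal sketch, not the pipeline used for the analysis: only the "under 50" and "over 1,000" cut-offs come from the text, while the middle bucket edges and the function names `length_bucket` and `bucket_shares` are assumptions made for illustration.

```python
from collections import Counter

# Bucket edges in characters as (inclusive lower, exclusive upper, label).
# Only the <50 and >1,000 cut-offs come from the article; the middle
# edges are assumed for illustration.
BUCKETS = [
    (0, 50, "under 50"),
    (50, 200, "50-199"),
    (200, 1001, "200-1,000"),
]

def length_bucket(prompt: str) -> str:
    """Return the label of the length bucket a prompt falls into."""
    n = len(prompt)
    for lower, upper, label in BUCKETS:
        if lower <= n < upper:
            return label
    return "over 1,000"

def bucket_shares(prompts):
    """Return each bucket's share of all prompts as a fraction of 1."""
    counts = Counter(length_bucket(p) for p in prompts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}
```

Run over the full prompt set, a function like `bucket_shares` would yield the kind of distribution quoted above, with the "over 1,000" share compared against the "under 50" share.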
Finding 2 — people arrange the structure by hand
Once the words exist, the next decisions are structural: where the chorus lands, when the bridge breaks, how the track opens and closes. The data shows users making those decisions explicitly.
The most frequent terms across the 650,000 prompts are not moods or genres but structural tags. The term "chorus" appears over 452,000 times and "verse" over 410,000, with "outro", "bridge", "pre-chorus", and "intro" all ranking near the top. This is the markup of a song being charted section by section before it is recorded. Where the interface mode is known, users select the advanced mode, which offers finer control over the arrangement, more often (about 46%) than the one-tap simple mode (about 38%). Given a choice between an undirected generation and a controlled one, the sample leans toward control.
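A count of structural terms like the one above can be sketched as a whole-word, case-insensitive match over each prompt. The term list is taken from the text; the matching rules (whole words only, "pre-chorus" counted separately from "chorus") and the name `count_structure_terms` are assumptions, since the article does not describe how the terms were tallied.

```python
import re
from collections import Counter

# "pre-chorus" is listed first so it is matched as one term instead of
# being counted as a bare "chorus".
STRUCTURE_TERMS = re.compile(
    r"\b(pre-chorus|chorus|verse|outro|bridge|intro)\b",
    re.IGNORECASE,
)

def count_structure_terms(prompts):
    """Count whole-word occurrences of structural terms across prompts."""
    counts = Counter()
    for prompt in prompts:
        for match in STRUCTURE_TERMS.finditer(prompt):
            counts[match.group(1).lower()] += 1
    return counts
```

Under these rules, plurals such as "verses" and longer words such as "introduction" are not counted, so the totals would be a conservative estimate.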
Finding 3 — people specify the sound and the voice
Users also describe the sound, and they are specific. The most common descriptive words concern voice and instrumentation: "vocal" appears over 130,000 times, alongside "male", "female", "emotional", "warm", "soft", "piano", "guitar", and "bass". Vocal terms outnumber requests for purely instrumental tracks by roughly 17.6 to 1.
The data indicates a casting and instrumentation decision made by the user — a warm male vocal, an emotional piano line, a defined feel — with the tool executing the specification rather than choosing it.
Finding 4 — the long tail is occasion-driven, not batch output
The slop narrative implies anonymous, high-volume output. The distribution of song types points the other way. Regular songs are the bulk of activity, but the long tail is where the personal detail concentrates: covers, raps, birthday songs, jingles, lullabies, songs for game characters, 8-bit tracks, beats. Each maps to a specific person and a specific occasion.
The distribution is more consistent with individuals making songs for particular moments than with the batch output of a content farm.
Finding 5 — the activity is global and written in mother tongues
About 93% of prompts are written in Latin-script languages, but that reflects the writing system, not the music. Underneath, the sessions cluster into roughly ten distinct musical worlds, among them English-language pop, instrumental and cinematic scores, Spanish-language ballads and devotionals, Brazilian sertanejo and funk, Southeast Asian dangdut and koplo, and Eastern European and Balkan songs. The sample includes wedding songs in Javanese, prayers in Spanish, and birthday tributes that switch between Russian, Armenian, and English mid-verse.
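One way to see why a script share says little about language is to look at how a script can be detected at all. The sketch below labels a prompt by the script of most of its letters, using the first word of each character's Unicode name. It is an assumed approach for illustration only (the article does not say how script was determined), and `dominant_script` is a hypothetical helper.

```python
import unicodedata
from collections import Counter

def dominant_script(text: str) -> str:
    """Return the most common script among the letters in text.

    The script is approximated by the first word of each letter's
    Unicode name, e.g. "LATIN", "CYRILLIC", "ARMENIAN".
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split(" ")[0]] += 1
    if not counts:
        return "UNKNOWN"
    return counts.most_common(1)[0][0]
```

A Javanese wedding song, a Spanish prayer, and an English pop lyric written in the Latin alphabet would all come back as "LATIN" here, which is why the script figure and the musical clusters have to be read separately.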
This distribution is consistent with an old behavior — making a song for someone — running on a new tool, rather than with automated catalog-filling.
Adoption: the activity is growing
The activity is expanding. Over the window examined, weekly generation volume grew more than 20-fold, and daily volume roughly tripled within a single quarter.
Sustained growth at that rate is consistent with repeat usage rather than one-time curiosity. The likeliest reading is that users return because the tool handles the production work that previously required a studio, a band, and a budget, while the inputs only a person can supply (the words, the intent, the occasion) stay with the user.
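A fold-growth figure like the one above depends on how the periods are defined. As a minimal sketch, assuming each generation carries a calendar date and that growth is measured as the last week's volume divided by the first week's, the calculation could look like this. The article does not state the exact method, and the names `weekly_counts` and `fold_growth` are hypothetical.

```python
from collections import Counter
from datetime import date

def weekly_counts(days):
    """Count generations per ISO (year, week), in chronological order."""
    counts = Counter()
    for d in days:
        iso = d.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(sorted(counts.items()))

def fold_growth(counts):
    """Return last-period volume divided by first-period volume."""
    values = list(counts.values())
    if len(values) < 2 or values[0] == 0:
        raise ValueError("need at least two periods and a non-zero first period")
    return values[-1] / values[0]
```

Comparing only the first and last weeks is sensitive to partial weeks at either end of the window, so an analysis along these lines would normally trim incomplete weeks before computing the ratio.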
Context: sixty years of the same objection
"AI music has no soul" is approximately the review every new music tool has drawn for sixty years, and the prior cases show which part of the objection holds.
In 1968, Wendy Carlos's Switched-On Bach was both a hit and a scandal: critics called the Moog synthesizer cold and not a real instrument, and musicians' unions warned it would put orchestras out of work. It became foundational to modern pop and electronic music. The Roland TR-808 drum machine drew nearly the same charges in 1980 — no soul, it will replace drummers — then became the rhythmic basis of hip-hop, house, and pop, prominent enough that Kanye West titled an album 808s & Heartbreak after it.
The record also shows where critics were partly right. When hip-hop producers began sampling other people's records in the 1980s, "anyone can loop a break" was the cheap dismissal, but the copyright disputes underneath were real and the lawsuits reshaped the craft. New tools do sometimes cause measurable harm, and AI music's streaming-fraud problem is a current instance. The relevant point is narrower: those harms are separate from the question of who authors a given track.
The Auto-Tune case is the clearest. Cher's "Believe" made it famous in 1998 and T-Pain made it ubiquitous; the backlash followed — cheating, robot voice, no feeling — and in 2009 Jay-Z released "D.O.A. (Death of Auto-Tune)" to declare it finished. It was not. It split into two simultaneous uses, a correction tool and a deliberate aesthetic running from T-Pain to Bon Iver to Kanye, and in both the person remained the author while the tool executed the intent. Across these cases, the objection about machine soul was separate from the question of authorship, which turned on whether a person was using the tool to say something.
Conclusion: what the data says about authorship
On the creation side, the evidence points one way. Across roughly 650,000 generations, the person writes the lyrics, maps the structure, casts the voice, and chooses the occasion the song is for; the tool performs the arrangement. In the sample, the author is the person, and the AI executes the brief.
The "AI slop" account is real, but it describes a comparatively small population gaming streaming platforms — a consumption-side problem — not the population making music with these tools. For that population, the data describes a studio-shaped role for the tool: it supplies the production that once required a studio, a band, and a budget, and leaves authorship with the person. That distinction is the premise behind Lacuna.