A Guide to Music Generation Prompting with ElevenLabs

An unofficial AI song generation manual
Good songs tell stories, create feelings, and make people move. They don't explain themselves, apologize for existing, or try too hard to be clever.

Recent Project Examples

Quick Start

Basic API Call

curl -X POST https://api.elevenlabs.io/v1/music/detailed \
  -H "Content-Type: application/json" \
  -H "xi-api-key: YOUR_KEY" \
  -d @your_composition.json \
  --output song.mp3
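For scripted runs, the same call can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_request` and `save_song` are hypothetical, and it assumes the response body can be written straight to disk, as the curl call's `--output song.mp3` implies.

```python
import json
import urllib.request

# Endpoint taken from the curl call in this guide.
API_URL = "https://api.elevenlabs.io/v1/music/detailed"


def build_request(composition: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a composition plan."""
    body = json.dumps(composition).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json", "xi-api-key": api_key},
        method="POST",
    )


def save_song(req: urllib.request.Request, path: str = "song.mp3") -> None:
    """Send the request and write the raw response bytes to disk.

    Assumption: the body is usable as-is, mirroring `--output song.mp3`.
    """
    with urllib.request.urlopen(req) as resp, open(path, "wb") as out:
        out.write(resp.read())
```

Keeping request construction separate from sending makes it possible to inspect the payload before spending credits on a generation.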

Minimum Viable Composition

{
  "composition_plan": {
    "positive_global_styles": ["electronic pop", "125 BPM"],
    "negative_global_styles": ["slow", "acoustic"],
    "sections": [
      {
        "section_name": "Full Song",
        "positive_local_styles": ["energetic"],
        "negative_local_styles": ["boring"],
        "duration_ms": 60000,
        "lines": ["Your lyrics here"]
      }
    ]
  }
}
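If you generate many variations, it can help to build this structure in code rather than by hand. The sketch below reproduces the minimum viable composition as a Python dict and writes it to `your_composition.json`; the function names and default arguments are illustrative assumptions, not part of the API.

```python
import json


def minimal_composition(lyrics, bpm=125, duration_ms=60000):
    """Return the minimum viable composition from this guide as a dict."""
    return {
        "composition_plan": {
            "positive_global_styles": ["electronic pop", f"{bpm} BPM"],
            "negative_global_styles": ["slow", "acoustic"],
            "sections": [
                {
                    "section_name": "Full Song",
                    "positive_local_styles": ["energetic"],
                    "negative_local_styles": ["boring"],
                    "duration_ms": duration_ms,
                    "lines": list(lyrics),
                }
            ],
        }
    }


def write_composition(composition, path="your_composition.json"):
    """Serialize the plan as plain JSON (no comments) for use with curl -d @file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(composition, fh, indent=2, ensure_ascii=False)
```

`ensure_ascii=False` keeps accented lyrics (for example Spanish lines) readable in the saved file.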

JSON Structure Deep Dive

Global Styles (Apply to Entire Song)

"positive_global_styles": [
  "genre",        // Primary: "electronic pop", "dark techno", "hyperpop"
  "BPM",          // Critical: "125 BPM", "140 BPM", "90 BPM"
  "vocal_style",  // Character: "sultry", "aggressive", "theatrical"
  "production",   // Sound: "bass-heavy", "synth-driven", "minimal"
  "mood"          // Feeling: "chaotic", "mysterious", "playful"
]

Section Anatomy

{
  "section_name": "Chorus",     // Standard: Intro, Verse, Pre-Chorus, Chorus, Bridge, Outro
  "duration_ms": 24000,         // 3000-120000 ms per section
  "positive_local_styles": [],  // Section-specific additions
  "negative_local_styles": [],  // Section-specific exclusions
  "lines": []                   // Lyrics with performance notes
}
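The section fields and the 3000-120000 ms range above are easy to check before sending anything. A minimal sketch, assuming those five keys are all required (the guide lists them but does not say which are optional); `section_problems` is a hypothetical helper name.

```python
# Keys and duration limits as listed in this guide; treating all keys as
# required is an assumption.
REQUIRED_KEYS = {
    "section_name",
    "duration_ms",
    "positive_local_styles",
    "negative_local_styles",
    "lines",
}
MIN_MS, MAX_MS = 3000, 120000


def section_problems(section: dict) -> list[str]:
    """Return human-readable problems for one section (empty list if none)."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(section))]
    duration = section.get("duration_ms")
    if isinstance(duration, int) and not MIN_MS <= duration <= MAX_MS:
        problems.append(f"duration_ms {duration} outside {MIN_MS}-{MAX_MS}")
    return problems
```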

Performance Notation in Lyrics

"lines": [
  "Main lyric line here",
  "[whispered] Soft delivery",
  "(ad-lib) Yeah! Uh!",
  "[falsetto] High note section",
  "(Marx! Marx!) Background chants",
  "[gasping] Breathless delivery"
]

Style Engineering

The Power of Negative Styles

Negative styles are MORE POWERFUL than you think. Use them to prevent:
  • Genre bleeding: "negative_global_styles": ["country", "folk", "jazz"]
  • Energy drops: "negative_global_styles": ["slow", "ballad", "ambient"]
  • Vocal issues: "negative_global_styles": ["monotone", "spoken word", "whispered"]

BPM Sweet Spots

  • 70-90: Slow burn, dark, menacing
  • 90-110: Hip-hop, trip-hop, groovy
  • 120-128: Pop, house, standard dance
  • 140: Dubstep, trap, aggressive
  • 160-180: Drum & bass, hardcore
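The ranges above can be kept as a small lookup table so a script can sanity-check a tempo before it goes into the global styles. A minimal sketch; the table simply restates the list above, and `bpm_feels` and `bpm_style` are hypothetical helper names.

```python
# (low, high, feel) tuples restating the sweet spots listed above.
BPM_RANGES = [
    (70, 90, "slow burn, dark, menacing"),
    (90, 110, "hip-hop, trip-hop, groovy"),
    (120, 128, "pop, house, standard dance"),
    (140, 140, "dubstep, trap, aggressive"),
    (160, 180, "drum & bass, hardcore"),
]


def bpm_feels(bpm: int) -> list[str]:
    """Return every feel whose range contains bpm (empty list if none does)."""
    return [feel for low, high, feel in BPM_RANGES if low <= bpm <= high]


def bpm_style(bpm: int) -> str:
    """Format a tempo the way the style lists in this guide write it."""
    return f"{bpm} BPM"
```

A tempo of 90 sits on the boundary and matches two ranges, while tempos such as 115 or 150 match none; an empty result is not an error, only a sign that you are outside the listed sweet spots.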

Vocal Style Matrix

Style       | Use For          | Avoid With
theatrical  | Drama, humor     | Minimal production
breathy     | Intimate, sexy   | Heavy bass
aggressive  | Punk, metal      | Slow tempos
falsetto    | Emotional peaks  | Low-energy sections
chanting    | Hooks, hypnotic  | Complex lyrics

Avoiding AI Weirdness

❌ COMMON AI FAILS

  1. Non-words: "Satisfize", "Chorizo-ing"
  2. Awkward scansion: Lines that don't fit the beat
  3. Random melodic jumps: Unsingable intervals
  4. Energy mismatches: Soft vocals over aggressive beats

✅ PREVENTION STRATEGIES

1. Natural Language Only

// BAD
"Marx-ifying your senses"
"Chorizolicious fever dream"

// GOOD
"Marx is calling out your name"
"Chorizo fever in my brain"

2. Syllable Counting

Count syllables to match rhythm:
  • 4/4 time = multiples of 4 or 8 syllables work best
  • Leave space for breath between phrases
  • Test by speaking lyrics in rhythm
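A rough syllable count can be automated as a first pass before the speak-aloud check. The sketch below is a naive vowel-group heuristic for English only, not a pronunciation dictionary, and it will miscount some words; the function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups (naive English heuristic)."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return 0
    groups = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("name"), except in "-le" and "-ee" endings.
    if word.endswith("e") and not word.endswith(("le", "ee")) and groups > 1:
        groups -= 1
    return max(groups, 1)


def line_syllables(line: str) -> int:
    """Sum the estimated syllables of every word in a lyric line."""
    return sum(count_syllables(word) for word in line.split())
```

For example, `line_syllables("Chorizo fever in my brain")` gives 8 and `line_syllables("Marx is calling out your name")` gives 7. Treat the numbers as a prompt to re-read a line aloud, not as a verdict.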

3. Concrete Imagery

// VAGUE (AI struggles)
"Feelings of desire"

// SPECIFIC (AI succeeds)
"Behind the freezer door"

4. Musical Directions

Be explicit about delivery:
"positive_local_styles": [
  "on-beat vocals",     // Prevents off-beat weirdness
  "clear enunciation",  // Prevents mumbling
  "melodic stability"   // Prevents random pitch jumps
]

Advanced Techniques

1. Hook Engineering

Create earworms through:
  • Repetition: Same phrase 2-3 times
  • Call-response: Question → Answer pattern
  • Sonic branding: Recurring sound/chant (e.g., "Marx! Marx!")

2. Energy Mapping

Intro       ▁▁▂▂  (20% energy)
Verse 1     ▂▂▃▃  (40% energy)
Pre-Chorus  ▃▃▄▅  (60% building)
Chorus      ▅▅▆▇  (85% peak)
Verse 2     ▃▃▄▄  (50% energy)
Bridge      ▄▅▆▇  (70-90% climb)
Final       ▇▇██  (100% maximum)
Outro       ▆▄▂▁  (Fade out)

3. Transition Smoothing

Add transition cues:
"lines": [
  "...",
  "(building to chorus)",  // AI understands this
  "...",
  "[drums intensify]"      // Production cue
]

4. Multilingual Integration

"lines": [
  "English main line",
  "¡Español for emphasis!",
  "(French whisper: oh là là)"
]

5. Genre Fusion Formula

"positive_global_styles": [
  "primary_genre",    // 60% influence
  "secondary_genre",  // 30% influence
  "accent_genre"      // 10% spice
]

Genre Templates

Dark Electronic

{
  "positive_global_styles": [
    "dark techno", "140 BPM", "industrial",
    "heavy bass", "distorted vocals", "menacing"
  ],
  "negative_global_styles": [
    "happy", "major key", "acoustic", "soft"
  ]
}

Hyperpop Chaos

{
  "positive_global_styles": [
    "hyperpop", "150 BPM", "autotuned",
    "glitchy", "chaotic", "maximalist production"
  ],
  "negative_global_styles": [
    "minimal", "organic", "subtle", "relaxed"
  ]
}

Sultry R&B

{
  "positive_global_styles": [
    "R&B", "90 BPM", "sultry vocals",
    "smooth bass", "intimate", "late-night vibes"
  ],
  "negative_global_styles": [
    "aggressive", "fast", "harsh", "childish"
  ]
}

Comedy Rap

{
  "positive_global_styles": [
    "hip-hop", "95 BPM", "comedic delivery",
    "bouncy beat", "clear enunciation", "playful"
  ],
  "negative_global_styles": [
    "serious", "mumble rap", "dark", "aggressive"
  ]
}

Pro Tips from Testing

  1. Test Incrementally: Generate 30-second tests before full songs
  2. Version Control: Number your JSONs (v1, v2, v3...)
  3. A/B Testing: Make two versions with one variable changed
  4. Streaming Mode: Use /v1/music/stream for real-time preview
  5. Batch Generation: Queue multiple versions overnight
  6. Documentation: Annotate working copies of your JSONs with // comments, but strip them before sending. Standard JSON has no comment syntax, so don't rely on the API ignoring them
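Because standard JSON parsers reject // comments, a safe workflow is to keep an annotated working copy and send a stripped one. The sketch below removes // line comments while leaving any // inside quoted strings (such as URLs or lyrics) untouched; `strip_line_comments` is a hypothetical helper, and it does not handle /* block */ comments.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments outside of JSON strings; newlines are preserved."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def load_annotated(path: str) -> dict:
    """Read an annotated JSON file and parse it after stripping comments."""
    with open(path, encoding="utf-8") as fh:
        return json.loads(strip_line_comments(fh.read()))
```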

Troubleshooting

Issue: Melody sounds "off"

Fix: Add "melodic stability" and "on-beat vocals" to positive styles

Issue: Energy drops mid-song

Fix: Add "consistent energy" globally, avoid "dynamic range"

Issue: Vocals unclear

Fix: Use "clear enunciation" and avoid "reverb-heavy"

Issue: Wrong genre bleeding in

Fix: Be aggressive with negative styles, list everything to avoid

Issue: Boring/repetitive

Fix: Vary your local styles per section, add "progressive arrangement"

Quality Checklist

Before generating, verify:
  • BPM specified in global styles
  • All sections total 60-180 seconds
  • Lyrics scan naturally when spoken
  • Performance directions in brackets/parentheses
  • Negative styles prevent unwanted genres
  • Energy arc makes sense
  • No made-up words or AI-speak
  • Concrete, specific imagery
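The mechanical items on this checklist (BPM present, total duration, negative styles set) can be verified by a script; the rest still need a human ear. A minimal sketch, assuming the `composition_plan` layout shown in this guide and that tempos are written like "125 BPM"; `preflight` is a hypothetical helper name.

```python
import re


def preflight(composition: dict) -> list[str]:
    """Return the mechanical checklist failures for a composition (empty if none)."""
    plan = composition["composition_plan"]
    issues = []

    # BPM specified in global styles, written like "125 BPM".
    styles = plan.get("positive_global_styles", [])
    if not any(re.fullmatch(r"\d+ BPM", style) for style in styles):
        issues.append("no BPM in positive_global_styles")

    # All sections total 60-180 seconds.
    total_s = sum(section["duration_ms"] for section in plan["sections"]) / 1000
    if not 60 <= total_s <= 180:
        issues.append(f"total duration {total_s:.0f}s outside 60-180s")

    # Negative styles are present to block unwanted genres.
    if not plan.get("negative_global_styles"):
        issues.append("no negative_global_styles set")

    return issues
```

Short test generations will fail the duration check by design; in that case ignore that one message rather than loosening the range.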

Next-Level Composition Psychology

The AI responds to:
  • Confidence: Bold, declarative style descriptions work better
  • Specificity: "125 BPM" > "fast tempo"
  • Cultural references: It knows genre conventions
  • Production terminology: Use real music production terms
  • Emotional clarity: One clear mood > mixed emotions
Remember: The AI wants to make good music. Give it clear, professional direction and it will deliver.
Document Version 1.0 | Based on extensive testing with ElevenLabs Music API

🚫 Songwriting Anti-Patterns & Solutions

How to Write Songs That Don't Sound AI-Generated

The Uncanny Valley of AI Music

What Makes a Song Sound "AI"?

  1. Prosody Violations - Words that don't match the rhythm
  2. Semantic Drift - Lyrics that lose coherent meaning
  3. Emotional Inconsistency - Mood swings that don't make sense
  4. Melodic Randomness - Notes that don't follow musical logic
  5. Energy Mismatches - Production that fights the vocals

Anti-Pattern #1: The Word Salad

❌ AI-Generated Garbage

"Mystical sensations flowing through existence
Dancing particles of emotional persistence
Universe calling with synthetic dreams
Reality fracturing at quantum seams"

✅ Human-Sounding Alternative

"Late night in your apartment
City lights through the window
You're dancing in the kitchen
To a song on the radio"
Rule: Use concrete scenes, not abstract concepts

Anti-Pattern #2: The Rhythm Wrecker

❌ Broken Scansion

"I'm desperately wanting to find you" (10 syllables)
"Come to me" (3 syllables)
"The universe is calling out for our love" (11 syllables)
"Yeah" (1 syllable)

✅ Rhythmically Consistent

"I've been searching for you" (6)
"Come and find me too" (5)
"Every star above us" (6)
"Knows our love is true" (5)
Rule: Count syllables, maintain patterns
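Once each line has a syllable count, "maintain patterns" can be checked mechanically. The sketch below uses the counts annotated in the two examples above and tests two simple properties: how far apart the shortest and longest lines are, and whether the counts repeat every two lines. Both function names and the two-line period are assumptions chosen for illustration.

```python
# Syllable counts as annotated in the two examples above.
BROKEN = [10, 3, 11, 1]
CONSISTENT = [6, 5, 6, 5]


def syllable_range(counts: list[int]) -> int:
    """Difference between the longest and shortest line, in syllables."""
    return max(counts) - min(counts)


def repeats_every(counts: list[int], period: int = 2) -> bool:
    """True if each line has the same count as the line `period` lines earlier."""
    return all(counts[i] == counts[i - period] for i in range(period, len(counts)))
```

Here `syllable_range(BROKEN)` is 10 and `repeats_every(BROKEN)` is False, while the consistent verse has a range of 1 and repeats cleanly. A small range with a repeating pattern is a reasonable target, not a hard rule.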

Anti-Pattern #3: The Fake Slang

❌ AI Trying to Be Cool

"Vibing on that fleek sauce
Getting all algorithmically boss
My neural network's on fire
Quantum entangled desire"

✅ Actual Slang That Works

"Got me feeling some type of way
Your moves are crazy, I must say
This energy's off the charts
You're playing games with my heart"
Rule: Use real slang or none at all

Anti-Pattern #4: The Mood Whiplash

❌ Emotional Chaos

Verse: "I'm so depressed and lonely"
Chorus: "PARTY TIME! WOO! YEAH!"
Bridge: "Contemplating existence..."
Outro: "RAGE AGAINST THE MACHINE!"

✅ Emotional Journey

Verse: "Starting to feel the rhythm"
Chorus: "Now we're dancing freely"
Bridge: "This moment's everything"
Outro: "Never want this night to end"
Rule: Emotions should evolve, not randomly switch

The Comedy Problem

How to Be Funny Without Being Cringe

❌ AI "Humor"

  • Random references ("Banana hammock Tuesday!")
  • Forced wordplay ("Meat me at the meat meet")
  • Over-explaining jokes ("That's funny because...")

✅ Actually Funny

  • Unexpected truth ("Your mom likes my playlist")
  • Clever innuendo (implied, not stated)
  • Callback humor (reference earlier lines)
  • Situational absurdity (believable but weird)

The Innuendo Matrix

Subtlety Levels

Level 1 - Too Obvious ❌ "I want to put my meat in your mouth"
Level 2 - Just Right ✅ "Serving up something hot tonight"
Level 3 - Too Vague ❌ "Things are happening with stuff"

Double Meaning Masterclass

Surface Meaning | Hidden Meaning | Line Example
Food service    | Sexual tension | "Order up, coming hot"
Temperature     | Arousal        | "Thermometer's rising"
Workplace       | Roleplay       | "Working overtime tonight"
Spanish food    | Passion        | "That chorizo heat"

Song Structure Psychology

The Attention Curve

0-8 sec:     Hook them (MUST grab attention)
8-20 sec:    Set scene (establish world)
20-45 sec:   Build tension (create need)
45-60 sec:   Release (satisfy with chorus)
60-90 sec:   Develop (add complexity)
90-120 sec:  Climax (peak energy)
120+ sec:    Resolution (satisfying end)
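To see where your sections fall on this curve, you can add up their `duration_ms` values and look up the phase in which each one starts. A minimal sketch; the phase table restates the timings above, and `phase_at` and `section_phases` are hypothetical helper names.

```python
# (start_s, end_s, phase) restating the attention curve above; 120+ is Resolution.
PHASES = [
    (0, 8, "Hook"),
    (8, 20, "Set scene"),
    (20, 45, "Build tension"),
    (45, 60, "Release"),
    (60, 90, "Develop"),
    (90, 120, "Climax"),
]


def phase_at(seconds: float) -> str:
    """Return the attention-curve phase that contains the given timestamp."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    for start, end, name in PHASES:
        if start <= seconds < end:
            return name
    return "Resolution"


def section_phases(sections: list[dict]) -> list[tuple[str, str]]:
    """Pair each section name with the phase in which that section starts."""
    result, elapsed_ms = [], 0
    for section in sections:
        result.append((section["section_name"], phase_at(elapsed_ms / 1000)))
        elapsed_ms += section["duration_ms"]
    return result
```

If the first chorus does not start until the "Develop" phase, that is a hint the intro and verse may be running long.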

The Energy Map That Works

Intro:       ████░░░░░░  40%
Verse 1:     ██████░░░░  60%
Pre-Chorus:  ███████░░░  70%
Chorus:      █████████░  90%
Verse 2:     ██████░░░░  60%
Chorus:      █████████░  90%
Bridge:      ████████░░  80%
Final:       ██████████  100%
Outro:       ████░░░░░░  40%

Lyrical Frameworks That Never Fail

1. The Story Arc

  • Setup: Where/Who/When
  • Conflict: The problem/desire
  • Rising: Things intensify
  • Climax: Peak moment
  • Resolution: How it ends

2. The List Song

  • Verse 1: List examples
  • Chorus: The main point
  • Verse 2: More examples
  • Bridge: The twist

3. The Conversation

  • Verse 1: You said...
  • Verse 2: I said...
  • Chorus: We both know...
  • Bridge: But really...

4. The Progression

  • Verse 1: Beginning
  • Verse 2: Middle
  • Bridge: Transformation
  • Final: End state

Vocal Delivery Specifications

What Actually Works

"[whispered]" - Start of intimate sections
"(ad-lib: yeah)" - Between main lines
"[falsetto]" - Emotional peaks only
"(harmony: ooh)" - Background vocals
"[spoken]" - Breakdowns, not verses

What Sounds Robotic

"[randomly yelling]"
"(constant ad-libs every line)"
"[switching delivery mid-word]"
"(unclear what this means)"

The Spanish Meat Cookbook 🌶️

Actual Spanish Meats (Use These)

  • Chorizo - Spicy sausage
  • Jamón ibérico - Premium ham
  • Morcilla - Blood sausage
  • Lomo - Pork loin
  • Cecina - Cured beef
  • Sobrasada - Spreadable sausage
  • Butifarra - Catalan sausage

Temperature/Preparation Terms

  • "Sizzling" > "Hot meating"
  • "Grilled to perfection" > "Meat-ified"
  • "Slow-cooked" > "Meat processing"
  • "Marinated" > "Meat-soaked"

Marx Foodservice Specific Guidelines

What Works

  • Marx as a chant/hook
  • Specific location references (break room, freezer)
  • Actual food service terminology
  • College setting details
  • Workplace hierarchy humor

What Doesn't

  • Over-explaining the company
  • Making up food service terms
  • Forcing "Marx" into weird compounds
  • Generic workplace references

Testing Your Lyrics

The Speak-Aloud Test

  1. Read your lyrics out loud
  2. Do they sound like something a human would say?
  3. Can you say them in rhythm?
  4. Do they make sense without music?

The Cringe Test

  1. Would you be embarrassed if someone found these lyrics?
  2. Would a stranger understand the humor?
  3. Is the innuendo clever or just vulgar?
  4. Would this work at a party?

The Energy Test

  1. Map energy levels 1-10 for each section
  2. Does the progression make sense?
  3. Are transitions smooth?
  4. Is the climax actually climactic?
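Once the levels are written down, the last three questions can be partly answered by looking at the numbers: where the peak sits and how large the jumps between neighbouring sections are. A minimal sketch; the example levels are the percentages from the energy map earlier in this guide scaled to 1-10, and `energy_report` is a hypothetical helper name.

```python
# Energy map from this guide scaled to a 1-10 range; "Chorus 2" is renamed
# here only to keep the section labels distinct.
EXAMPLE_LEVELS = [
    ("Intro", 4), ("Verse 1", 6), ("Pre-Chorus", 7), ("Chorus", 9),
    ("Verse 2", 6), ("Chorus 2", 9), ("Bridge", 8), ("Final", 10), ("Outro", 4),
]


def energy_report(levels: list[tuple[str, int]]) -> dict:
    """Summarize an ordered list of (section_name, energy 1-10) pairs."""
    for name, value in levels:
        if not 1 <= value <= 10:
            raise ValueError(f"{name}: energy {value} is outside 1-10")
    # Signed change in energy between each pair of neighbouring sections.
    jumps = [
        (a_name, b_name, b_val - a_val)
        for (a_name, a_val), (b_name, b_val) in zip(levels, levels[1:])
    ]
    peak_name, _ = max(levels, key=lambda pair: pair[1])
    largest = max(jumps, key=lambda jump: abs(jump[2])) if jumps else None
    return {"peak": peak_name, "largest_jump": largest}
```

For the example map the peak is "Final" and the largest jump is the drop from "Final" to "Outro", which is what a deliberate fade-out should look like. A large jump in the middle of the song is the one worth a second listen.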

Common Fixes

Problem: "It sounds like a robot wrote this"

Fix: Add specific details, real places, actual emotions

Problem: "The rhythm is off"

Fix: Count syllables, emphasize the right words

Problem: "It's not funny"

Fix: Use surprise, not randomness

Problem: "Too vulgar/obvious"

Fix: Imply, don't state; suggest, don't show

Problem: "Energy is flat"

Fix: Vary section dynamics, add builds and drops

The Ultimate Checklist

Before sending to ElevenLabs:
  • Every line sounds natural when spoken
  • Syllables match the beat structure
  • Energy progression makes sense
  • Humor lands without explanation
  • No made-up words or compounds
  • Specific, concrete imagery
  • Emotional consistency throughout
  • Performance directions are clear
  • No "AI-isms" or robot speak