Lyrics

Custom lyrics should follow the format described below.

Structure

Lyrics are divided into sections. Each section starts with a tag in square brackets, with its lyrics written on the lines below. Separate sections with a blank line.

For example:

[verse 1]
The first line of the verse
The second line of the verse
The third line of the verse

[chorus]
The first line of the chorus
The second line of the chorus
The third line of the chorus

[guitar solo]

[outro]
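
As a rough illustration of this structure, the sketch below splits a lyrics string into sections in Python. It is not part of any official tooling: the names TAG_LINE and split_sections are hypothetical, and it assumes a tag line contains nothing but one or more bracketed tags.

```python
import re

# Assumption: a tag line consists only of bracketed tags,
# e.g. "[verse 1]" or "[verse 1][whispered]".
TAG_LINE = re.compile(r"^\s*(\[[^\[\]]+\])+\s*$")


def split_sections(lyrics: str) -> list[tuple[str, list[str]]]:
    """Split lyrics into (tag_line, lyric_lines) pairs, in order of appearance."""
    sections: list[tuple[str, list[str]]] = []
    for raw in lyrics.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate sections
        if TAG_LINE.match(line):
            sections.append((line, []))  # start a new section
        elif sections:
            sections[-1][1].append(line)  # lyric line of the current section
        else:
            raise ValueError(f"lyric line before any section tag: {line!r}")
    return sections


example = """\
[verse 1]
The first line of the verse
The second line of the verse

[chorus]
The first line of the chorus

[guitar solo]

[outro]
"""

for tag, lines in split_sections(example):
    print(tag, len(lines))
```

Sections without lyrics, such as [guitar solo], simply come back with an empty list of lines.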

Stylistic Reading

The lyrics in each section can contain additional hints for how you want them to be sung:

  • Use CAPITAL LETTERS to indicate strong emphasis
  • Use looooonger words or dots... to indicate a pause or an elongated note
  • Put background vocals in parentheses: "Sing it again (again)"
  • Put additional information in parentheses: "Sing it again (guitars intensify)"

General Guidelines

To create an optimal set of lyrics, there are a number of best practices to consider.

  • Keep the number of lines per section to 3-5.
  • Keep each line to 6-10 syllables.
  • If it is difficult for a human to sing then it is also difficult for the AI to sing. Try to make your lyrics flow naturally.
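
The first two guidelines can be checked mechanically. The Python sketch below is one possible way to do it, under loud assumptions: estimate_syllables and check_section are hypothetical helpers, and the syllable count is a crude vowel-group heuristic for English that tends to overcount silent letters, so treat its warnings as prompts to reread a line rather than as errors.

```python
import re

WORD = re.compile(r"[a-z]+(?:'[a-z]+)?")
VOWEL_GROUP = re.compile(r"[aeiouy]+")


def estimate_syllables(line: str) -> int:
    """Rough estimate: one syllable per run of vowels, at least one per word."""
    words = WORD.findall(line.lower())
    return sum(max(1, len(VOWEL_GROUP.findall(word))) for word in words)


def check_section(lines: list[str]) -> list[str]:
    """Return warnings for one section's lyric lines (empty list if none)."""
    warnings = []
    # Sections without lyrics (e.g. a solo) are skipped by the line-count check.
    if lines and not 3 <= len(lines) <= 5:
        warnings.append(f"section has {len(lines)} lines (guideline: 3-5)")
    for line in lines:
        count = estimate_syllables(line)
        if not 6 <= count <= 10:
            warnings.append(f"about {count} syllables (guideline: 6-10): {line!r}")
    return warnings


for warning in check_section(["Sing it again", "Sing it again"]):
    print(warning)
```

A check like this cannot judge whether lyrics flow naturally; singing the lines aloud remains the better test for the third guideline.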

Common Section Tags

A few common tags you may consider using for your sections:

Category               Tag                  Description
Basic Structure        [Intro]              Opening, establish atmosphere
Basic Structure        [Verse] / [Verse 1]  Verse, narrative progression
Basic Structure        [Pre-Chorus]         Pre-chorus, build energy
Basic Structure        [Chorus]             Chorus, emotional climax
Basic Structure        [Bridge]             Bridge, transition or elevation
Basic Structure        [Outro]              Ending, conclusion
Dynamic Sections       [Build]              Energy gradually rising
Dynamic Sections       [Drop]               Electronic music energy release
Dynamic Sections       [Breakdown]          Reduced instrumentation, space
Instrumental Sections  [Instrumental]       Pure instrumental, no vocals
Instrumental Sections  [Guitar Solo]        Guitar solo
Instrumental Sections  [Piano Interlude]    Piano interlude
Special Tags           [Fade Out]           Fade out ending
Special Tags           [Silence]            Silence

Providing Hints To Sections

You can add a hint to a section tag in two ways: after a hyphen inside the brackets, or in an additional set of brackets, like so:

[intro - building energy]

[verse 1][whispered]

Do not stack too many instructions per bracket, or the AI model will struggle to follow them. As a rule of thumb, give each section only one direction.
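
To make the two hint styles concrete, here is a small Python sketch that reads a tag line and separates the section name from its hints. The function names parse_tag_line and has_too_many_hints are hypothetical. It assumes the hyphen separator is surrounded by spaces, as in the example above, so that tags such as [Pre-Chorus] or [ad-lib] are not split.

```python
import re

BRACKET = re.compile(r"\[([^\[\]]+)\]")


def parse_tag_line(tag_line: str) -> tuple[str, list[str]]:
    """Return (section_name, hints) for a tag line in either hint style."""
    groups = [group.strip() for group in BRACKET.findall(tag_line)]
    if not groups:
        raise ValueError(f"no bracketed tag found in {tag_line!r}")
    # Hyphen style: the first bracket holds "name - hint".
    name, separator, inline_hint = groups[0].partition(" - ")
    hints = [inline_hint.strip()] if separator else []
    # Bracket style: every further bracket is treated as a hint.
    hints.extend(groups[1:])
    return name.strip(), hints


def has_too_many_hints(tag_line: str) -> bool:
    """Apply the rule of thumb: at most one direction per section."""
    _, hints = parse_tag_line(tag_line)
    return len(hints) > 1


print(parse_tag_line("[intro - building energy]"))
print(parse_tag_line("[verse 1][whispered]"))
print(has_too_many_hints("[chorus - explosive][falsetto]"))
```

Flagging any tag line with more than one hint is a deliberately strict reading of the rule of thumb; loosen the threshold if two directions work well for your material.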

Some examples of instructions that can be passed are:

Type              Tag                  Effect
Vocal             [raspy vocal]        Raspy, textured vocals
Vocal             [whispered]          Whispered
Vocal             [falsetto]           Falsetto
Vocal             [powerful belting]   Powerful, high-pitched singing
Vocal             [spoken word]        Rap/recitation
Vocal             [harmonies]          Layered harmonies
Vocal             [call and response]  Call and response
Vocal             [ad-lib]             Improvised embellishments
Energy & emotion  [high energy]        High energy, passionate
Energy & emotion  [low energy]         Low energy, restrained
Energy & emotion  [building energy]    Increasing energy
Energy & emotion  [explosive]          Explosive energy
Energy & emotion  [melancholic]        Melancholic
Energy & emotion  [euphoric]           Euphoric
Energy & emotion  [dreamy]             Dreamy
Energy & emotion  [aggressive]         Aggressive