Announcing OCTAVE, the first language model built for TTS. Available now.

Creator studio

The easiest way to create social media content like commercials, podcasts, or videos.
TTS Projects

Trusted By

Design Works Logo
Lge Logo
Woven Logo
Softbank Logo
Humana Logo
Aecho AI Logo
Betteryou Logo
Nestwork Logo
Innovax Systems Logo
Jammy Chat Logo
Aura Health Logo
Wonsulting Logo
Memorang Logo
Flourish Logo
Climb Together Logo
Sentra Logo
Althea Logo
Study Fetch Logo
Tone AI Logo
Thumos Logo
Ream Logo
New Computer Logo
Everfriends AI Logo
Mynd Logo
Pressmaster AI Logo
Nancy AI Logo
Parrot Prep Logo
Stimuler Logo
Quantanite Logo

Projects

Structure, edit, and generate long-form audio with precision in our Projects interface, currently in preview. Add multiple chapters, assign unique voices to different sections, and specify acting instructions for specific phrases.


Any emotion or speaking style, on command

Octave is the first TTS system that can take natural language instructions to change emotional delivery and speaking style. Give directions like "sound sarcastic" or "whisper fearfully." For the first time, creators have total control.

Create a voice with any prompt with Voice Design

Create any AI voice you can imagine, like a "sarcastic medieval peasant," with a brief prompt or an evocative script.

"sarcastic medieval peasant"

Full prompt: "The speaker is a medieval peasant with a cockney accent, raspy voice, dripping with sarcasm."


"literature professor"

Full prompt: "A retired Black female literature professor who analyzes poetry with precise academic language and references to her own published criticism."


"charming cowboy"

Full prompt: "The speaker is a grizzled old cowboy with a folksy Texan drawl Southern accent, speaking in a charismatic tone with a deep but relaxed vibe."


"sitcom inner monologue"

Full prompt: "The star of a popular sitcom, with frequent inner monologues about her life."


"dungeon master"

Full prompt: "A know-it-all dungeons and dragons dungeon master speaking excitedly with a lisp."


"warm English narrator"

Full prompt: "The speaker is a sophisticated British female narrator with a gentle, warm voice, recounting the ending of a classic romance novel."


"unserious movie trailer guy"

Full prompt: "The speaker is an American, deep middle-aged male film trailer narrator for a film about chickens."


"raspy evil vampire"

Full prompt: "A villainous undead vampire, with a horrifying raspy voice, and a slight Transylvanian accent."


"reminiscing"

Full prompt: "A middle-aged African American man, reminiscing with a slightly gravelly voice and a tone of hard-earned wisdom."


"nature documentary narrator"

Full prompt: "The speaker is a distinguished British narrator, whose voice carries a deep sense of wisdom and curiosity."


Specify any character or personality

In a blind comparison study with over 100 human raters, Octave’s outputs were favored over outputs from ElevenLabs Voice Design in terms of audio quality, naturalness, and how well speech generations matched descriptions of the desired voice, across 120 diverse prompts.

Find the perfect voice in our Voice library

Find a pre-made voice that aligns with your narrative from our Voice Library, with new voices added weekly.


The first LLM built for text to speech

Unlike conventional TTS that merely “reads” words, Octave is a speech-language model that understands what words mean in context, unlocking a new level of expressiveness and nuance—and new AI voice capabilities. Octave can predict the tune, rhythm, and timbre of speech, inferring when to whisper secrets, shout triumphantly, or calmly explain a fact. 


Developer Tools

Developer-first APIs that let you take advantage of Octave TTS's context awareness, control your speaker's expression, create long-form content, and create your own voices.
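As a rough illustration of how a voice prompt and per-phrase acting instructions might be organized before being sent to a TTS API, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Segment` class, the `build_payload` helper, and the field names `voice_prompt`, `segments`, `text`, and `instruction` are hypothetical and are not taken from the Octave API. The developer documentation is the authority on the real request format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One span of script text with an optional acting instruction."""
    text: str
    instruction: Optional[str] = None  # e.g. "whisper fearfully"


def build_payload(voice_prompt: str, segments: List[Segment]) -> dict:
    """Assemble a hypothetical request body.

    All field names here are illustrative assumptions, not the real API.
    """
    if not voice_prompt.strip():
        raise ValueError("voice_prompt must not be empty")
    items = []
    for seg in segments:
        item = {"text": seg.text}
        # Only attach an instruction when one was given for this span.
        if seg.instruction:
            item["instruction"] = seg.instruction
        items.append(item)
    return {"voice_prompt": voice_prompt, "segments": items}


payload = build_payload(
    "The speaker is a medieval peasant with a cockney accent, "
    "raspy voice, dripping with sarcasm.",
    [
        Segment("Oh, another tax. How generous of the king."),
        Segment("Keep your voice down, the guards are near.", "whisper fearfully"),
    ],
)
```

Keeping the voice description separate from per-segment instructions mirrors the split described on this page: a voice is designed once from a prompt, while delivery directions can change from phrase to phrase.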


References


Octave TTS: the first text-to-speech system that understands what it’s saying

Today we’re launching Octave (Omni-capable text and voice engine), the first LLM for text-to-speech. Unlike conventional TTS that merely “reads” words, Octave is a speech-language model that understands what words mean in context, unlocking a new level of expressiveness and nuance—and new AI voice capabilities.

Learn more

Developer Documentation

Explore our documentation with concise guides, hands-on tutorials, and an in-depth API reference—crafted to support your integration.

Explore the docs

Octave TTS Prompting Guide

While other text-to-speech models simply “read” words, Octave Text-to-Speech (TTS) is built on a language model, enabling it to interpret the meaning of text. With Octave, you can customize voices for any character, guide emotional delivery, and bring stories to life with human-like expression. The Octave speech-language model (speech LM) is a state-of-the-art voice AI model trained on data that captures the nuances of human vocal expression. It can interpret plot twists, emotional cues, and character traits within a script or prompt, transforming them into lifelike speech. To help you create the best possible samples and fully leverage the capabilities of this speech LM, we’ve compiled the following tips and tricks.

Learn more
