Announcing OCTAVE, the first language model built for TTS. Available now.

The foundational voice AI model for any interface

Real-time, customizable voice intelligence powered by empathic AI

[Interactive demo: preset voices (Kora, Aura, Ito, Stella, Finn, Dacher, Whimsy); customization dimensions (expressivity, femininity, speed, pitch, accent, extroversion, raspiness, formality, rhythm, enunciation); integrations (Hume LLM, web search, tool use, external or custom LLM, TTS injection); use cases (NPC, assistant, coach, agent, tutor, clinician); interfaces (app UI, Twilio, TypeScript, Python, React, API).]

Trusted By

Design Works
LGE
Woven
SoftBank
Humana
Aecho AI
Betteryou
Nestwork
Innovax Systems
Jammy Chat
Aura Health
Wonsulting
Memorang
Flourish
Climb Together
Sentra
Althea
Study Fetch
Tone AI
Thumos
Ream
New Computer
Everfriends AI
Mynd
Pressmaster AI
Nancy AI
Parrot Prep
Stimuler
Quantanite

Flagship Model: EVI 2

Our latest voice-to-voice model converses rapidly and fluently with users, understands their tone of voice, and responds with the right tone of its own. Capable of emulating a wide range of personalities, accents, and speaking styles, it can be tailored to each application and user.

Our model capabilities

Multimodal emotional intelligence

EVI 2 merges language and voice into a single model trained specifically for emotional intelligence, enabling it to emphasize the right words, laugh or sigh at appropriate times, and much more, guided by prompting to suit your use case.
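For instance, the expressive behaviors described above can be steered entirely through the prompt. The snippet below is a minimal sketch of what such a prompt and session configuration might look like; the field names are illustrative assumptions, not EVI's documented schema.

```typescript
// A minimal sketch of prompt-guided expressiveness. Only the idea of steering
// emphasis, laughs, and sighs via prompting comes from the text above; the
// configuration shape and field names are assumptions for illustration.
const expressivePrompt = `
You are a warm, upbeat product guide.
- Emphasize the key words when explaining a benefit.
- Laugh briefly if the user makes a joke; sigh softly when empathizing.
- Keep each reply under three sentences.
`.trim();

// Hypothetical session configuration an app might send when opening a chat.
const sessionSettings = {
  systemPrompt: expressivePrompt, // assumed field name
  temperature: 0.7,               // assumed sampling control
};

console.log(JSON.stringify(sessionSettings, null, 2));
```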

Voice customization without the risks

Create synthetic voices unique to any app or user, without voice cloning. Our novel approach lets you modulate EVI 2’s voice along dimensions like timbre, pitch, nasality, perceived gender, and more, then control its tone and speaking style with your prompt.

Support for any LLM and tool

Let EVI 2 generate all the language, or integrate any external LLM or tool. EVI 2 will seamlessly incorporate any output into the conversation without sacrificing expressiveness, personality, speaking style, or instruction-following.
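As a rough illustration, the sketch below shows one way an application might pair EVI 2 with its own LLM over a WebSocket session: the app receives the user's transcribed speech, asks its external model for a reply, and hands the text back for EVI 2 to voice. The endpoint URL, message types, and field names here are placeholders and assumptions, not the documented API.

```typescript
// A sketch of external-LLM integration, assuming a WebSocket chat session.
// The URL, message types, and field names are illustrative placeholders;
// consult the EVI documentation for the real schema.
const EVI_URL = "wss://example.invalid/evi/chat"; // placeholder endpoint

// Stand-in for whatever external LLM the application already uses.
async function externalLlm(userText: string): Promise<string> {
  return `Here's a reply to: "${userText}"`;
}

const socket = new WebSocket(`${EVI_URL}?api_key=${process.env.HUME_API_KEY}`);

socket.addEventListener("message", async (event) => {
  // Assuming text frames carrying JSON messages.
  const msg = JSON.parse(String(event.data));

  // Hypothetical message type carrying the user's transcribed speech.
  if (msg.type === "user_message") {
    const reply = await externalLlm(msg.text);

    // Hypothetical injection message: hand the external LLM's text to EVI 2,
    // which voices it in the session's configured style and personality.
    socket.send(JSON.stringify({ type: "assistant_input", text: reply }));
  }
});
```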

Explore EVI 2's capabilities

Compelling personalities (Aura) with EVI 2
"Hey Aura..."
Empathically expressive speech with EVI 2
"I’m launching something I'm excited about…"
Compelling personalities (Whimsy) with EVI 2
"Hey Whimsy..."
Rapping on command with EVI 2
"Can you freestyle rap about yourself?"
Prompting rate of speech with EVI 2
"Can you speak faster from now on?"
Nonverbal vocalizations with EVI 2
"Could you laugh maniacally for us?"
Inventing new vocal expressions with EVI 2
"Now can you make a sound of joy and enthusiasm?"
Emergent multilingual capabilities with EVI 2
"Can you speak Spanish?"
Compelling personalities (Stella) with EVI 2
"Hey Stella..."



Resources


What is emotion science?

How can artificial intelligence achieve the level of emotional intelligence required to understand what makes us happy? As AI becomes increasingly integrated into our daily lives, the need for AI to understand emotional behaviors and what they signal about our intentions and preferences has never been more critical.

Learn more

Are emotional expressions universal?

Do people around the world express themselves in the same way? Does a smile mean the same thing worldwide? And how about a chuckle, a sigh, or a grimace? These questions about the cross-cultural universality of expressions are among the more important and long-standing in behavioral sciences like psychology and anthropology—and central to the study of emotion.

Learn more

Introducing Voice Control

We’re introducing Voice Control, a novel interpretability-based method that brings precise control to AI voice customization without the risks of voice cloning. Our tool gives developers control over 10 voice dimensions, labeled “masculine/feminine,” “assertiveness,” “buoyancy,” “confidence,” “enthusiasm,” “nasality,” “relaxedness,” “smoothness,” “tepidity,” and “tightness.” Unlike prompt-based approaches, Voice Control enables continuous adjustments along these dimensions, allowing fine-grained tuning and making voice modifications reproducible across sessions.

Learn more
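A rough sketch of what reproducible, continuous adjustment along those ten dimensions could look like from an application's side is shown below. The dimension names come from this page; the payload shape, base voice field, and value range are assumptions for illustration only.

```typescript
// Continuous, reproducible voice adjustments along the ten Voice Control
// dimensions named above. Only the dimension names are from this page; the
// request shape, value range, and "baseVoice" field are assumptions.
type VoiceDimension =
  | "masculine/feminine" | "assertiveness" | "buoyancy" | "confidence"
  | "enthusiasm" | "nasality" | "relaxedness" | "smoothness"
  | "tepidity" | "tightness";

// Numeric offsets (here assumed to run from -1 to +1) rather than prompt
// wording, so the same numbers recreate the same voice in every session.
const adjustments: Partial<Record<VoiceDimension, number>> = {
  "masculine/feminine": -0.3,
  assertiveness: 0.5,
  nasality: -0.2,
  smoothness: 0.8,
};

// A hypothetical request an app could persist and replay to rebuild a voice.
const createVoiceRequest = {
  baseVoice: "Kora", // one of the preset voices shown on this page
  dimensions: adjustments,
};

console.log(JSON.stringify(createVoiceRequest, null, 2));
```

Because the adjustments are plain numbers rather than prompt phrasing, they can be stored alongside an app's user settings and replayed later, which is what makes the modifications reproducible across sessions.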
