Just a couple of weeks after ElevenLabs debuted its generative AI voice engine that lets you create a voice using text prompts, Hume AI is now offering a series of AI voice bots within an easy-to-use app wrapper that you can use from a web browser.
The app pulls from the company’s own speech-language model, EVI 2, with additional LLMs used on a ‘supplemental’ basis, including Claude 3.5 Haiku from Anthropic, and feels positioned as a competitor to OpenAI’s ChatGPT Advanced Voice Mode (which just hit Mac and Windows).
And, while I’m impressed by just how simple it is to get started, there’s definitely some fine-tuning to be done with some aspects of the app so far.
I’ve been testing it with some general prompts and have found some parts of it to be genuinely impressive, while others fall behind.
Hands-on with the Hume AI app
Introducing the new Hume App. Featuring brand new assistants that combine voices and personalities generated by our speech-language model, EVI 2, with supplemental LLMs and tools like the new Claude 3.5 Haiku from @AnthropicAI. pic.twitter.com/Tej3f7mBFW (November 4, 2024)
The fun of the Hume AI app is that it compartmentalizes multiple voices, each with its own tone and style to make it feel like you’re choosing to speak to different ‘people’ for different topics.
There’s one for quick, chatbot-style answers, for example, while another is focused on philosophical advice. Each functions the same way – you click, and speak through the mic, and there’s no Hume account required if you want to give it a go.
I asked the Quick Answers chatbot how tall the Eiffel Tower is, and got, well, a quick answer, followed by additional information about how it’s been added to over time, and how big certain sections of it are.
I asked the Storytelling one for a story about a car, and while I wasn’t expecting a Pixar-rivalling epic, it tripped over itself multiple times. It repeated lines and even changed voice at one point, which was pretty jarring, but it was happy to receive additional prompts to help direct the flow of the story (sadly, the story it provided, of a car called Cara looking for a power source, is unlikely to win any awards any time soon).
On the other hand, there’s some overlap between some of the voices, and I actually found that a nice way of acknowledging there’s no one true answer.
I asked the Spirituality voice how I could better live in the moment, and it suggested feeling the breeze through my hair and the sun on my skin, and, bizarrely, that I try eating a mango.
The same prompt on the Deeper Questions bot produced suggestions to smell my morning coffee and, for some reason, to notice the way sunlight hits a desk. Interestingly, the Deeper Questions bot, like the Storytelling one, kept repeating some lines of dialog.
I’m definitely curious to see how things expand from here, and I think Hume has a solid base to build from if it can fix those minor teething issues.