ElevenLabs, the viral AI-powered platform for creating synthetic voices, has raised a new round of cash.
Today, the startup announced the closing of a $19 million Series A round co-led by entrepreneurs Nat Friedman and Daniel Gross along with Andreessen Horowitz. Other participants included Creator Ventures, SV Angel, Instagram co-founder Mike Krieger, Oculus co-founder Brendan Iribe, DeepMind and Inflection AI co-founder Mustafa Suleyman, and O’Reilly Media founder Tim O’Reilly.
A source familiar with the matter tells TechCrunch that the tranche values ElevenLabs at $99 million post-money, a respectable figure, especially considering the startup launched just over a year ago.
“This investment will be used to continue building ElevenLabs’ cutting-edge research hub for voice AI and to launch a range of add-on products to support specific vertical markets such as publishing, gaming, entertainment and conversational applications,” co-founder and CEO Mati Staniszewski told TechCrunch via email.
ElevenLabs, which has made headlines in recent months for both good and abhorrent reasons, was founded by Staniszewski, who formerly worked at Palantir, and his childhood friend Piotr Dabkowski, a former Google employee. Inspired by the mediocre dubbing of American films they saw growing up in Poland, their native country, the two set out to design a platform that could do better, leveraging artificial intelligence, of course.
ElevenLabs can transform text into speech using synthetic voices, cloned voices or completely new artificial voices that mimic the sounds of people of various genders, ages and ethnicities. The company’s AI text-to-speech (TTS) models are language-agnostic, allowing enterprise customers to refine them and create their own proprietary speech models.
Coinciding with the Series A raise, 15-employee ElevenLabs is launching Projects, a workflow for creating and editing long-form spoken content. With Projects, users can generate dialogue segments and even audiobooks without having to leave the platform.
“For business-to-business partners, our technology can be used in areas such as creating multilingual, scalable audiobooks, voicing characters in video games, voicing digital items, helping blind people access written content online, and AI radio feeds,” said Staniszewski.
ElevenLabs, which launched in beta at the end of January, caught on rather quickly thanks to the extremely high quality of its generated voices, fast generation times and generous free tier. But as mentioned above, the publicity hasn’t always been good, especially once bad actors began exploiting the platform for their own ends.
Users of 4chan, the infamous message board known for its conspiratorial content, used ElevenLabs’ tool to share hateful messages that impersonated celebrities like actress Emma Watson. Elsewhere, The Verge’s James Vincent was able to use ElevenLabs to clone the voices of targets in seconds, generating audio samples containing everything from threats of violence to expressions of racism and transphobia.
In response, ElevenLabs said it would introduce a number of new security measures, such as limiting voice cloning to paid accounts, banning users who repeatedly violate its terms of service, and providing a new AI detection tool.
The detection tool launches today. Called the AI Speech Classifier and available as an API to select partners, it’s designed to detect if an uploaded audio sample contains AI-generated content from ElevenLabs.
“Ensuring that AI platforms can be adopted safely is a key challenge for the entire generative AI industry, including text, image and voice platforms,” Staniszewski said. “We need to ensure that people are educated about the nature of the generative media landscape and know that such content is out there. We are committed to building tools to help people detect AI-generated content, in the interest of transparency.”
A voluntary detection tool, assuming it even works as advertised, will not necessarily deter bad behavior. But there’s another elephant in the room that ElevenLabs hasn’t addressed: the existential threat its technology poses to voice actors.
Motherboard has reported that voice actors are increasingly being asked to sign away the rights to their voices so that customers can use AI to generate synthetic versions that could eventually replace them, at times for no additional compensation. Internal emails seen by The New York Times, meanwhile, indicate that Activision Blizzard, one of the world’s largest game publishers, is working on tools for AI-assisted voice cloning.
It would appear that ElevenLabs sees this as the natural progression of things, advertising its work with publishers like Storytel and media platforms like TheSoul Publishing and MNTN for audiobooks, video games and radio content. (Storytel and TheSoul Publishing are strategic investors.) The company says it has more than one million registered users in the creative, entertainment and publishing spaces who have created a decade’s worth of audio content.
ElevenLabs plans to eventually extend its AI models to voice dubbing, following in the footsteps of startups like Papercup and Deepdub and building what it calls a foundation for being able to transfer emotion and intonation from one language to another.
“This will make it possible to dub any video into any language in an engaging, effective and scalable way, all while maintaining the original voice of the speaker,” writes ElevenLabs in a press release. “[We are] already conducting a series of tests with industry partners to enable AI voice acting at scale.”
With $21 million in the bank ($2 million of which came from a pre-seed round in January), ElevenLabs is laser-focused on beating its rivals in the burgeoning generative voice space. They include incumbents such as Amazon, Google and Microsoft as well as startups such as Murf, Tavus, Resemble AI, Respeecher, Play.ht and Lovo.