Is anyone still doubting the all-consuming, unstoppable nature of AI? Maybe I should make an 'Anti-AI Copers' Sequel...
Previously, AI voice cloning tech did not have the quality and expressiveness to be taken seriously. Open-source solutions like Coqui TTS and its XTTS model served as a temporary foothold in the field, but with noticeable audio artifacts. Open-source competition lagged behind proprietary solutions such as OpenAI's TTS models or ElevenLabs, yet even those had enough inconsistencies to justify hiring actual people to voice act.
More recently, AI development, far from entering a Winter, has only continued to improve in quality. It turns out that you can reliably train AIs on any format of information so long as it can be tokenized and assigned embeddings (excitingly, there have been recent advancements utilizing AI in chemistry); audio is no exception. The leaderboard for TTS models can be found here.
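For anyone wondering what "tokenized and assigned embeddings" looks like for audio, here is a minimal sketch. Big caveat: current TTS models generally use learned neural codecs to produce their tokens; I am swapping in plain mu-law quantization (the old telephony trick) because it fits in a few lines. The function names, the 256-token vocabulary, the embedding size of 8, and the random embedding table are all my own assumptions for illustration, not taken from any specific model.

```python
import math
import random

def mu_law_tokenize(samples, n_tokens=256):
    """Map float samples in [-1, 1] to integer token ids in [0, n_tokens - 1]."""
    mu = n_tokens - 1
    tokens = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip to the valid range
        # Mu-law companding: compress amplitude logarithmically, keep the sign.
        y = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
        # Shift from [-1, 1] to [0, mu] and round to the nearest integer id.
        tokens.append(int((y + 1) / 2 * mu + 0.5))
    return tokens

def make_embedding_table(n_tokens=256, dim=8, seed=0):
    """Hypothetical embedding table: one random vector per token id.
    In a real model these vectors would be learned during training."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_tokens)]

# 10 ms of a 440 Hz sine wave at 16 kHz stands in for real audio.
sample_rate = 16000
wave = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(160)]

tokens = mu_law_tokenize(wave)          # audio -> discrete token ids
table = make_embedding_table()
embedded = [table[t] for t in tokens]   # token ids -> vectors a model can consume
```

Once audio is a sequence of integer ids like this, it can be fed to the same kind of sequence model used for text, which is the point being made above.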
In the past few days there have been two watershed moments for AI text-to-speech and voice cloning: the OpenAudio model and now the updated ElevenLabs V3.
RIP voice actors and anyone else who uses their voice for a living!
@MongoloidJoe how do you feel about ASMR artists possibly getting replaced by robots in the near future?