Podcast recording and editing platform Podcastle is joining other companies in the AI-powered text-to-speech race by releasing its own AI model, called Asyncflow v1.0. An API for developers will also be available, allowing them to integrate the text-to-speech model directly into their apps.
Thanks to the new model, the company is able to offer more than 450 AI voices that can narrate your text. The startup said that it developed the technology and model in such a way that its training and inference costs are low, giving it an advantage over competitors.
With the move, Podcastle joins a number of startups, including ElevenLabs, Speechify, and WellSaid, that have developed technology and AI models to convert any kind of text into a voice clip narrated by AI. This technology spans use cases like marketing, advertising, content creation, education, and corporate training.
Podcastle’s founder, Arto Yeritsyan, told TechCrunch that the company had always wanted to build a text-to-speech model, but the training costs and data requirements were very high.
“We wanted to build a robust text-to-speech model since our inception. However, the costs of development were very high. Thanks to recent large language model developments, we were able to reach a breakthrough last year to get to a place where we could build a high-quality voice model without needing a ton of data,” Yeritsyan said.
The company was also aided in its efforts by its $13.5 million Series A fundraise last year.
Yeritsyan said that while Podcastle charges around $40 per 500 minutes of text-to-speech conversion, ElevenLabs charges $99 for the same.
Podcastle’s voice cloning feature is getting an upgrade as well, with a quicker training process.
Earlier, the training process involved reading roughly 70 different sentences. Now, it needs just a few seconds of recording from you to create a clone of your voice. The new process also uses Podcastle’s Magic Dust AI, which was released last year, to improve audio recording quality.

In our testing, the voice created with the new process sounded a bit robotic, though it mimicked our tone. The company said that it will improve the feature over time. Plus, you can train on different samples of your voice to get different results.
Podcastle said that apart from costs, having tools for audio, video, podcasts, and AI-powered narration under one redesigned site will give it an edge over competitors. Yeritsyan said that while the majority of users use Podcastle to work on audio content, video is catching up as well.