Stability AI launches its first text-to-audio AI platform, allowing users to create songs from text prompts

by The Technical Blogs


Stability AI, a London-based generative AI company, has recently unveiled a new text-to-audio AI platform called “Stable Audio.” The platform, powered by artificial intelligence, represents the company’s first foray into music and sound generation. It can produce songs of up to 90 seconds in length, making it suitable for a variety of projects, including commercials, audiobooks, and video games.

The company has been one of the prominent leaders in the AI world, but until now it was mostly known for AI-generated visuals. With the introduction of its first text-to-audio generative AI platform, it is now in direct competition with other industry leaders, including OpenAI, Google, and Meta.

Reportedly, the Stable Audio platform uses a diffusion model, the same type of AI model that powers the company’s more popular image platform, Stable Diffusion. In the case of Stable Audio, however, the model has been trained on audio data instead of images. This allows users to generate songs or background audio, making it a versatile tool for a variety of projects.

Additionally, the Stable Audio platform addresses the limitations of conventional audio diffusion models through music-specific training and by incorporating text metadata that specifies the starting and ending times of a song. This lets users choose the length of the generated song, within the platform’s limit of 90 seconds, which is a valuable feature for music production.

Previously, audio diffusion models could only generate audio clips of fixed durations. This limited their ability to produce complete songs. Stability AI has improved the model to provide users of Stable Audio with greater flexibility in determining the length of the generated song, granting them more control over the creative process.
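To make the idea of timing metadata concrete, here is a minimal sketch of how a fixed-length training window might be tagged with start and end times, and how a requested length might be applied to a fixed-length output. It is an illustration only, not Stability AI’s implementation: the function names, the dictionary keys, and the 90-second window are all assumptions made for this example.

```python
import random

WINDOW_SECONDS = 90.0  # assumed fixed window length; hypothetical value


def timing_metadata(track_seconds, window_seconds=WINDOW_SECONDS, rng=random):
    """Pick a window from a track and describe where it sits in time."""
    # Latest point at which a full window still fits (0 for short tracks).
    latest_start = max(0.0, track_seconds - window_seconds)
    start = rng.uniform(0.0, latest_start)
    # Short tracks end before the window does, so clamp to the track length.
    end = min(track_seconds, start + window_seconds)
    return {
        "start_seconds": start,
        "end_seconds": end,
        "track_seconds": track_seconds,
    }


def trim_to_requested(samples, requested_seconds, sample_rate):
    """Keep only the requested duration from a fixed-length output."""
    keep = int(round(requested_seconds * sample_rate))
    return samples[:keep]
```

In this sketch, a model that sees such values alongside each training clip could learn what the beginning, middle, and end of a track sound like, so a user asking for a 30-second piece would receive a clip that finishes at 30 seconds instead of an arbitrary cut from a longer one.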

“Stable Audio represents the cutting-edge audio generation research by Stability AI’s generative audio research lab, Harmonai,” the company said in a statement, as reported by The Verge. “We continue to improve our model architectures, datasets, and training procedures to improve output quality, controllability, inference speed, and output length.”

According to the company’s statement, the Stable Audio platform has been trained on an extensive dataset of over 800,000 audio files, including music, sound effects, and individual instrument stems. The dataset also includes text metadata from AudioSparx, a stock music licensing company. In total, the dataset covers 19,500 hours of diverse sounds. Stability AI notes that it has secured the necessary permissions to utilise copyrighted materials through its partnership with a licensing company.

Stability AI is offering three distinct pricing tiers for users who want to use the platform.

  • The free version grants users the ability to generate up to 45 seconds of audio for a maximum of 20 tracks per month.
  • The Professional level is priced at $11.99 and allows users to create 500 tracks, each of which can be up to 90 seconds in duration.
  • The Enterprise subscription is available for companies seeking customised usage plans and pricing structures.

Notably, in the free version, users are restricted from using the audio they generate with Stable Audio for commercial purposes.
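For a rough sense of scale, the sketch below works out the maximum amount of audio each paid-for or free allowance could yield. The `Tier` class and its field names are hypothetical, and the calculation assumes the Professional allowance of 500 tracks is monthly, like the free tier, which the figures above do not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    tracks_per_month: int        # assumed monthly for both tiers
    max_seconds_per_track: int

    def max_minutes_per_month(self) -> float:
        # Upper bound: every track generated at the maximum length.
        return self.tracks_per_month * self.max_seconds_per_track / 60


FREE = Tier("Free", tracks_per_month=20, max_seconds_per_track=45)
PROFESSIONAL = Tier("Professional", tracks_per_month=500, max_seconds_per_track=90)

for tier in (FREE, PROFESSIONAL):
    print(f"{tier.name}: up to {tier.max_minutes_per_month():.0f} minutes of audio")
# Free: up to 15 minutes of audio
# Professional: up to 750 minutes of audio
```

Under these assumptions, the Professional tier tops out at 750 minutes (12.5 hours) of generated audio, compared with 15 minutes on the free tier.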

Meanwhile, text-to-audio generation is not a new concept. Many prominent players in generative AI have been experimenting with the idea for some time. For instance, in August, Meta unveiled AudioCraft, a suite of generative AI models designed to create natural-sounding speech, sound, and music based on prompts. However, AudioCraft is currently only available to researchers and select audio professionals. Google also launched MusicLM a few weeks ago, which allows individuals to generate audio, but it is also limited to researchers.

Published On: Sep 14, 2023

