Stable Audio is Coming — Use Artificial Intelligence to Create Music (for free)

Today, Stability AI, the company known for open-source AI tools and models such as Stable Diffusion and StableLM, launched its first music and sound generation AI product: StableAudio. The music industry is notoriously difficult to break into. Even if you have the talent and drive, you still need the skills and resources to create and produce music. But what if you didn’t need any of that? What if you could create music with just creativity and a good AI prompt?

StableAudio is an AI tool that can generate music from scratch. You just need to provide some simple instructions and the AI will do the rest. The official link is here: https://stableaudio.com/

What is StableAudio?

StableAudio is an AI tool that uses generative AI technology to create high-quality music and sound effects. To use it, you simply provide a descriptive text prompt and the desired audio length. For example, you can enter "post-rock, guitar, drum kit, bass, strings, upbeat, uplifting, melancholy, smooth, raw, epic, sentimental, 125 BPM" and set the length to 95 seconds to generate a post-rock style track. StableAudio is well suited to musicians who want to create samples for their own music. You can also use it to create sound effects, background music, or even your own original compositions.
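As a rough illustration of how such a prompt is put together, here is a minimal Python sketch that assembles the example above from its parts. The `build_prompt` helper and the `request` dictionary are hypothetical names used only for illustration; they are not part of any StableAudio API, and the web interface only needs the final text and a duration.

```python
def build_prompt(genre, instruments, moods, bpm):
    """Join genre, instruments, moods, and tempo into one comma-separated prompt.

    Hypothetical helper for illustration; not a StableAudio function.
    """
    parts = [genre, *instruments, *moods, f"{bpm} BPM"]
    return ", ".join(parts)


prompt = build_prompt(
    genre="post-rock",
    instruments=["guitar", "drum kit", "bass", "strings"],
    moods=["upbeat", "uplifting", "melancholy", "smooth", "raw", "epic", "sentimental"],
    bpm=125,
)

# The two inputs the tool asks for: a text prompt and a length in seconds.
request = {"prompt": prompt, "duration_seconds": 95}
print(request["prompt"])
print(request["duration_seconds"])
```

Keeping genre, instruments, mood, and tempo as separate lists makes it easy to swap a single descriptor and compare the generated results.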

Try it yourself

Go to the StableAudio dashboard and register:

[Image: StableAudio]

Then, go to the Generate Music dashboard to start generating your own music:

[Image: StableAudio]

Enter your prompt and set the duration. Please note that the maximum audio length for free subscriptions is 20 seconds.

Click the right arrow button to start audio generation.

[Image: StableAudio]

In the meantime, you can explore the provided examples in StableAudio's User Guide section:

[Image: StableAudio]

How it works

Here are some key technical details of how StableAudio works:

[Figure: StableAudio technical background]

  • A variational autoencoder (VAE) compresses stereo audio into a compact, noise-resistant, and invertible lossy latent encoding, which makes generation and training faster than working directly on raw audio samples.

  • A text encoder extracts features from the text prompt. These features are then used to condition the diffusion model.

  • The diffusion model is a U-Net-based model that uses a combination of residual, self-attention, and cross-attention layers to denoise the input and reconstruct the desired audio.
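To make the cross-attention step more concrete, here is a toy sketch in plain Python of how latent audio frames can attend to text features. It only illustrates the general mechanism: the sizes, the random weights, and all function names are assumptions for this example, and it is not StableAudio's actual implementation.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(latents, text_features, w_q, w_k, w_v):
    """Latent audio frames are the queries; text features supply keys and values."""
    q = matmul(latents, w_q)
    k = matmul(text_features, w_k)
    v = matmul(text_features, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Matrix of Gaussian noise, standing in for learned weights or features."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
latents = random_matrix(8, 4, rng)        # 8 latent audio frames, 4 channels (toy sizes)
text_features = random_matrix(5, 6, rng)  # 5 prompt tokens, 6 features each (toy sizes)
w_q = random_matrix(4, 4, rng)
w_k = random_matrix(6, 4, rng)
w_v = random_matrix(6, 4, rng)

attended, weights = cross_attention(latents, text_features, w_q, w_k, w_v)
print(len(attended), len(attended[0]))  # one output row per latent frame
```

Each row of `weights` sums to 1 and describes how strongly one latent frame draws on each prompt token, which is how the text steers the denoising.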

Another important detail is that the StableAudio model was trained on a dataset of over 800,000 audio files, including music, sound effects, and single-instrument tracks, totaling more than 19,500 hours of audio.
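For a sense of scale, the two figures above imply an average file length of roughly a minute and a half. The short calculation below treats both numbers as exact, which is an assumption since the source only gives them as lower bounds.

```python
total_hours = 19_500   # reported total audio, in hours (a lower bound)
num_files = 800_000    # reported number of audio files (a lower bound)

avg_seconds = total_hours * 3600 / num_files
print(f"Average length per file: about {avg_seconds:.2f} seconds")
```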

Final thoughts

Overall, I'm very impressed with this new AI tool. The quality of the audio is comparable to audio created by human professionals. StableAudio is a game-changing tool that could disrupt the entire music and sound industry.

This article is for learning and communication only. If there is any infringement, please contact the author to delete it.

Origin blog.csdn.net/weixin_38739735/article/details/134680372