MusicGen one-click music style migration

"  Imagine that you can create brisk country music, lingering blues, epic orchestral music as you like... There will be no obstacles on the road of video BGM creation!. "

01

What is MusicGen

      Based on the powerful Transformer model, Meta MusicGen follows in the footsteps of language models such as ChatGPT and adopts cutting-edge AI technology to predict and generate music clips. Just like a language model predicts the next letter in a sentence, MusicGen predicts the next piece of music given a piece of music.

To accomplish this feat, Meta's researchers leveraged the EnCodec audio tokenizer, which breaks down audio data into smaller units for efficient processing. The brilliance of MusicGen is its ability to handle both textual descriptions and musical cues, allowing for a seamless blend of artistic expression.

Training MusicGen involved using a massive dataset consisting of 20,000 hours of licensed music. The team draws on an internal collection of 10,000 high-quality recordings, supplemented with music data from reputable sources such as Shutterstock and Pond5. This meticulous training process ensures that MusicGen has the ability to create music that resonates with the listener.

Trial address: MusicGen - a Hugging Face Space by facebook

02


MusicGen Online Experience

        First of all, we prepare some BGM, such as some passionate, melancholy, and quiet music, and then we open the above link

cc5069a6908e462c25b0e7f0ee62154d.png

Then drag in your own music and transform into a passionate style

71732802d018bef9adf08a1954af889b.png

    The sound effect is quite shocking, with a taste of original music, but the generated music is more powerful and passionate

03


MusicGen Local Deployment

If you're not satisfied with the 15-second duration of a huggingface link, you can try a local deployment. Of course, local deployment has relatively high requirements for graphics cards. The official requirement is 16GB of video memory.

5dd6b7cfa7e90a18346b3fc61cb599f0.png

首先我们打开git仓库facebookresearch/audiocraft: Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning. (github.com)

Then follow the installation guide to install the corresponding environment and dependent packages

# Best to make sure you have torch installed first, in particular before installing xformers.
# Don't run this if you already have PyTorch installed.
pip install 'torch>=2.0'
# Then proceed to one of the following
pip install -U audiocraft  # stable release
pip install -U git+https://[email protected]/facebookresearch/audiocraft#egg=audiocraft  # bleeding edge
pip install -e .  # or if you cloned the repo locally

After starting, enter the following interface, you can adjust the duration and style

985122a8c2478b0980e5ab6832fc8320.png

Then we use the same sound source to generate a desired style of music

39eecb06cb54a9f0748b603c8ebcf39b.png

The graphics card is a bit slow, and the time is relatively long. For this effect, anyway, I don’t have the courage to listen to the music after it is generated. I feel really depressed after listening to it.

MusicGen has a powerful music learning ability, it has studied tens of thousands of musical instruments, and is well versed in music theory and musical form. The musical works produced are like the hands of human musicians. What are you waiting for? Hurry up and experience MusicGen, let the joy of creation come back to your heart!

If there is a problem with the environment configuration, you can follow the official account and reply to AudioCraft to get the local one-click start integration package

Guess you like

Origin blog.csdn.net/wutao22/article/details/131799393