"Imagine creating brisk country music, lingering blues, or epic orchestral pieces whenever you like. Nothing will stand in the way of your video BGM creation!"
01
—
What is MusicGen
Built on the Transformer architecture, Meta's MusicGen follows the approach of language models such as ChatGPT to predict and generate music clips. Just as a language model predicts the next token in a sentence, MusicGen predicts the next chunk of audio from the audio that came before it.
To make this possible, Meta's researchers use the EnCodec audio tokenizer, which breaks audio into smaller discrete units (tokens) for efficient processing. MusicGen's strength is that it can be conditioned on both text descriptions and melodic cues, so you can steer the output with words, a reference melody, or both.
MusicGen was trained on a large dataset of 20,000 hours of licensed music. The team drew on an internal collection of 10,000 high-quality recordings, supplemented with music from Shutterstock and Pond5. This large, licensed training set is what allows MusicGen to create music that resonates with the listener.
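To make the "predict the next piece" idea concrete, here is a toy sketch of an autoregressive loop over discrete tokens. It is not MusicGen's code: the bigram counter is a deliberately tiny stand-in for the Transformer, the integer ids are hypothetical placeholders for EnCodec tokens, and the names `fit_bigram` and `continue_sequence` are made up for this illustration.

```python
import random
from collections import Counter, defaultdict


def fit_bigram(tokens):
    """Count which token follows which: a toy stand-in for the Transformer."""
    table = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev][nxt] += 1
    return table


def continue_sequence(table, prompt, n_new, seed=0):
    """Autoregressive loop: sample one token, append it, repeat."""
    rng = random.Random(seed)
    out = list(prompt)
    for _ in range(n_new):
        followers = table.get(out[-1])
        if not followers:  # context never seen in training: stop early
            break
        choices, weights = zip(*followers.items())
        out.append(rng.choices(choices, weights=weights)[0])
    return out


# Hypothetical token ids standing in for the EnCodec codes of a short clip.
training_clip = [3, 7, 7, 2, 3, 7, 2, 3, 7, 7, 2]
table = fit_bigram(training_clip)
continuation = continue_sequence(table, prompt=[3, 7], n_new=6)
print(continuation)
```

The real model replaces the lookup table with a Transformer that looks at the whole history (plus the text or melody conditioning), but the generate-append-repeat loop has the same shape.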
Trial address: MusicGen - a Hugging Face Space by facebook
02
—
MusicGen Online Experience
First, prepare some BGM clips, for example something passionate, something melancholy, and something quiet, then open the link above.
Then drag in your own music and transform it into a passionate style.
The result is quite striking: it keeps the flavor of the original music, but the generated version is more powerful and passionate.
03
—
MusicGen Local Deployment
If the 15-second limit of the Hugging Face demo is not enough for you, you can try a local deployment. Note that local deployment places relatively high demands on the graphics card: the official guidance is a GPU with 16 GB of video memory.
First, open the Git repository facebookresearch/audiocraft: Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning. (github.com)
Then follow the installation guide to set up the environment and install the dependencies:
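Once PyTorch is installed, a quick way to see how your card compares with the 16 GB figure is to ask PyTorch how much memory it can see. This is a minimal sketch: `bytes_to_gib` and `report_gpu_memory` are hypothetical helper names, and it only inspects the first CUDA device.

```python
def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


def report_gpu_memory() -> str:
    """Describe the first CUDA device, or say that none is visible."""
    import torch  # imported lazily so bytes_to_gib works without PyTorch

    if not torch.cuda.is_available():
        return "No CUDA GPU visible to PyTorch"
    props = torch.cuda.get_device_properties(0)
    return f"{props.name}: {bytes_to_gib(props.total_memory):.1f} GiB"
```

Calling `print(report_gpu_memory())` prints the device name and its total memory. A card sold as 16 GB may report slightly less than 16.0 GiB, so treat the number as a rough guide rather than a strict pass/fail check.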
# Best to make sure you have torch installed first, in particular before installing xformers.
# Don't run this if you already have PyTorch installed.
pip install 'torch>=2.0'
# Then proceed to one of the following
pip install -U audiocraft # stable release
pip install -U git+https://git@github.com/facebookresearch/audiocraft#egg=audiocraft # bleeding edge
pip install -e . # or if you cloned the repo locally
After launching, you will see the following interface, where you can adjust the duration and style.
Then we use the same audio source to generate music in the desired style.
The graphics card is a bit slow, so generation takes a relatively long time. As for the result this time, I honestly did not have the courage to listen to it again; it left me feeling rather depressed.
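The interface steps above can also be scripted. The sketch below follows the usage pattern shown in the audiocraft README (`MusicGen.get_pretrained`, `set_generation_params`, `generate_with_chroma`, `audio_write`). The wrapper name `restyle_melody`, the file paths, and the checkpoint name `facebook/musicgen-melody` are assumptions; older audiocraft releases used shorter checkpoint names, so check the README of the version you installed.

```python
from pathlib import Path


def restyle_melody(melody_path: str, description: str,
                   duration: int = 30, out_stem: str = "restyled") -> Path:
    """Generate a clip that follows the melody in `melody_path`
    in the style given by `description`, and save it as a WAV file."""
    # Heavy imports live inside the function so that merely importing
    # this module does not require a GPU or the audiocraft package.
    import torchaudio
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    # Assumed checkpoint name; the melody model accepts an audio prompt.
    model = MusicGen.get_pretrained("facebook/musicgen-melody")
    model.set_generation_params(duration=duration)  # clip length in seconds

    melody, sr = torchaudio.load(melody_path)  # tensor of shape [channels, samples]
    # A batch of one description paired with one melody: [1, channels, samples].
    wav = model.generate_with_chroma([description], melody[None], sr)

    # audio_write adds the ".wav" suffix and normalizes loudness.
    audio_write(out_stem, wav[0].cpu(), model.sample_rate, strategy="loudness")
    return Path(f"{out_stem}.wav")
```

A call such as `restyle_melody("my_bgm.mp3", "passionate, energetic rock", duration=30)` (with a hypothetical input file) would download the model weights on first use and then write `restyled.wav` to the current directory.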
MusicGen has a strong capacity for learning music: having been trained on tens of thousands of recordings, it picks up on instrumentation, music theory, and musical form. At its best, the music it produces sounds as if it came from the hands of human musicians. What are you waiting for? Try MusicGen and let the joy of creation come back to your heart!
If you run into problems configuring the environment, you can follow the official account and reply "AudioCraft" to get a one-click local launch package.