Overnight, "AI Stefanie Sun" became popular all over the Internet!

 Datawhale dry goods 

Latest: AI applications, editor: Xinzhiyuan

[Introduction] Recently, an "unpopular singer" went viral across the Internet thanks to an AI stand-in covering Chinese pop songs.

Overnight, "AI Stefanie Sun" became popular all over the Internet.

On Bilibili (Station B), AI Stefanie Sun has covered JJ Lin's "She Said", Jay Chou's "Love in BC", Zhao Lei's "Chengdu", and more, and many netizens are hooked.

"Unpopular singer" Stefanie Sun has suddenly become one of the hottest singers of 2023, setting off a fan frenzy.

A netizen said, "After listening to AI Stefanie Sun all night, I can't pull myself away..."

These covers were produced with open-source projects and uploaded by Bilibili uploaders ("UP masters") such as Eternity丨L and Roster_x.

(The uploader seems to have deliberately padded "Peninsula Iron Box" with a second of silence so that it runs exactly 5 minutes and 20 seconds.)

UP Master: Eternity丨L

In addition to AI Stefanie Sun, there are also AI Jay Chou, AI Wang Xinling, AI Lin Zhixuan...

Perhaps few people ever imagined that the Chinese music scene would be revived in this form in 2023.

"AI Stefanie Sun" takes the mic online

Some time ago, a TikTok user used AI to create the song "Heart on My Sleeve", which quickly went viral and drew more than 10 million viewers.

Netizens who heard the song said it astonished them: it was crazy!

The song was made with the voices of two Canadian pop musicians, Drake and The Weeknd. An AI model is first trained on the singers' voices and then used to generate the vocals.

In China, Chinese pop songs sung by AI on Bilibili have gradually drawn wide attention. Stars such as Stefanie Sun, Wang Xinling, and Jay Chou have "made a comeback".

The most popular is Stefanie Sun, who has become AI's new darling and earned the title "Queen of Voice".

UP master: Roster_x

Someone even made a Cantonese version of "Love Comes Too Late" sung by AI Stefanie Sun.

AI music production, however, is nothing new in the music industry. The popularity of generative AI has simply lowered the barrier to making AI covers once again.

For example, at the beginning of the year Google launched MusicLM, a text-to-music model that treats music generation as a hierarchical sequence-to-sequence modeling task and generates high-fidelity music at 24 kHz.

For many fans, AI covers fulfill many of their fantasies to some extent.

Some fans have also trained AI models on the voices of beloved late singers, including Ah Sang, Leslie Cheung, Yao Beina, and Teresa Teng.

This may be a kind of digital immortality, a way to bring long-lost voices back to people's hearts.

Midjourney's remarkable ability to produce realistic images made people exclaim that painters were about to lose their jobs. With AI covers, are singers going to be replaced too?

After uploader @阿张Rayzhang had an AI trained on his own voice sing "Killer Queen", he was shaken for a moment.

He hurriedly recorded a video and titled it "Will AI singers put the whole cover section out of work? I was killed by the AI version of me!".

Some netizens said that, as painters, they were among the first victims of AI, and that no profession seems able to escape.

Others said that some parts of the covers do not sound like the singer at all.

AI covers also need plenty of training data for a specific artist's timbre before the generated results sound realistic.

With current technology, a singer's delivery, technique, and style cannot be fully imitated, but the timbre can be reproduced almost completely.

Even so, true masters cannot be replaced.

Although AI covers are popular, the flip side of AI-created music is a pressing copyright problem.

After the AI-created "Heart on My Sleeve" went viral on TikTok, the full version was uploaded to Apple Music, Spotify, YouTube, and other platforms.

In response, the Canadian rapper Drake voiced his displeasure on Instagram: "This is the last straw." The song has since been removed over copyright infringement.

According to the Financial Times, Universal Music Group, which owns the copyrights of superstars such as Taylor Swift and Bob Dylan, is urging Spotify and Apple to stop AI tools from scraping lyrics and melodies from its artists' copyrighted songs.

But some artists are generous with their own voices. Grimes, Musk's ex-girlfriend, said online that anyone may use her voice to generate songs with AI, provided she receives 50% of the royalties.

The author of the original "so-vits-svc" project behind this wave of AI covers is said to have deleted it because too many people abused it.

SoVitsSvc: Singing voice conversion

Project address: https://github.com/svc-develop-team/so-vits-svc

The singing voice conversion model uses the SoftVC content encoder to extract speech features from the source audio, then feeds these vectors directly into VITS instead of converting them to a text-based intermediate representation. As a result, both pitch and intonation are preserved.

In addition, the project developers solved the problem of sound interruptions by switching the vocoder to NSF HiFiGAN.

· Feature input is changed to ContentVec.

· The sampling rate is unified at 44100 Hz.

· Due to parameter changes and a simplified model structure, the GPU memory required for inference is significantly reduced.

· Added option 1: automatic pitch prediction in vc mode, which means there is no need to enter the pitch key manually when converting speech, and the pitch of male and female voices can be converted automatically. However, this mode causes a pitch shift when converting songs.

· Added option 2: reduce timbre leakage through a k-means clustering scheme, making the timbre more similar to the target timbre.

· Added option 3: an NSF-HIFIGAN enhancer, which can improve the sound quality of some models trained on small datasets but has a negative effect on well-trained models, so it is turned off by default.
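The clustering option can be pictured with a small sketch. This is an illustration under stated assumptions, not the project's code: content features are treated as plain lists of floats, the cluster centers are assumed to have been fitted beforehand on the target speaker's features, and the names `nearest_center`, `blend_with_cluster`, and `ratio` are hypothetical. Each frame is pulled toward its nearest center by a linear blend.

```python
import math

def nearest_center(frame, centers):
    """Return the cluster center closest to `frame` (Euclidean distance)."""
    return min(centers, key=lambda c: math.dist(frame, c))

def blend_with_cluster(frames, centers, ratio):
    """Pull each feature frame toward its nearest center.

    ratio = 0.0 keeps the original features; ratio = 1.0 uses only the centers.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    blended = []
    for frame in frames:
        center = nearest_center(frame, centers)
        blended.append([ratio * c + (1.0 - ratio) * x for c, x in zip(center, frame)])
    return blended

# Toy 2-D features; real content features have far more dimensions.
centers = [[0.0, 0.0], [10.0, 10.0]]
frames = [[1.0, 1.0], [8.0, 9.0]]
print(blend_with_cluster(frames, centers, ratio=0.5))  # [[0.5, 0.5], [9.0, 9.5]]
```

A higher `ratio` moves the features closer to the target speaker's clusters at the cost of detail from the source performance, which is the trade-off the option is meant to control.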

Pretrained model files

Put checkpoint_best_legacy_500.pt in the hubert directory.

Put G_0.pth and D_0.pth in the logs/44k directory.

Preprocessing

0. Audio Slicing

Use the audio-slicer-GUI or audio-slicer-CLI tool to slice the original audio into clips of 5-15 seconds.

Slightly longer clips are fine, but clips that are too long (such as 30 seconds) may cause "torch.cuda.OutOfMemoryError" during training or even preprocessing, that is, running out of GPU memory.

After slicing, remove clips that are too long or too short.
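The clean-up step after slicing can be sketched with the standard library alone. This is a minimal sketch, assuming the clips are PCM WAV files in one folder; the function names are hypothetical, and the 5 and 15 second limits are simply the range quoted above, so adjust them as needed.

```python
import wave
from pathlib import Path

MIN_SECONDS = 5.0   # lower bound suggested in the text
MAX_SECONDS = 15.0  # upper bound suggested in the text

def wav_duration(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()

def split_by_duration(folder, min_s=MIN_SECONDS, max_s=MAX_SECONDS):
    """Return (keep, drop) lists of WAV paths based on clip length."""
    keep, drop = [], []
    for path in sorted(Path(folder).glob("*.wav")):
        (keep if min_s <= wav_duration(path) <= max_s else drop).append(path)
    return keep, drop
```

Calling `split_by_duration` on a speaker folder returns the clips worth keeping and the ones to review or delete; nothing is removed automatically.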

1. Resample to 44100 Hz and mono

 
  
python resample.py
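To make "44100 Hz and mono" concrete, here is a toy sketch of the two operations. It is not what resample.py does internally: linear interpolation is swapped in here for simplicity and has no anti-aliasing filter, so a real pipeline should use a proper audio library. The function names are hypothetical.

```python
TARGET_RATE = 44100  # sampling rate named in the text

def to_mono(samples, channels):
    """Average interleaved multi-channel samples into one channel."""
    return [
        sum(samples[i:i + channels]) / channels
        for i in range(0, len(samples) - channels + 1, channels)
    ]

def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Resample a mono signal by linear interpolation (illustrative only)."""
    if not samples:
        return []
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    out = []
    for k in range(n_out):
        pos = k * src_rate / dst_rate  # position in the source signal
        i = int(pos)
        frac = pos - i
        left = samples[min(i, len(samples) - 1)]
        right = samples[min(i + 1, len(samples) - 1)]
        out.append(left + frac * (right - left))
    return out

stereo = [1, 3, 2, 4]              # two interleaved stereo frames
print(to_mono(stereo, channels=2))  # [2.0, 3.0]
```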

2. Automatically split the dataset into training and validation sets, and generate configuration files

 
  
python preprocess_flist_config.py
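The idea behind this step can be sketched as a reproducible shuffle followed by a hold-out. This is only an illustration: the validation size, the seed, and the example paths are assumptions, and the real script also writes the configuration files.

```python
import random

def split_dataset(paths, val_count=2, seed=1234):
    """Shuffle `paths` reproducibly and hold out `val_count` items for validation."""
    if val_count >= len(paths):
        raise ValueError("need more clips than validation items")
    shuffled = sorted(paths)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    return shuffled[val_count:], shuffled[:val_count]

clips = [f"dataset/44k/speaker0/{i:03d}.wav" for i in range(10)]  # hypothetical paths
train, val = split_dataset(clips)
print(len(train), len(val))  # 8 2
```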

3. Generate hubert and f0

 
  
python preprocess_hubert_f0.py

After completing the above steps, the dataset directory will contain preprocessed data and the dataset_raw folder can be deleted.

You can now modify some parameters in the generated config.json:

keep_ckpts: The number of most recent checkpoints to keep during training. Setting it to 0 keeps all of them; the default is 3.

all_in_mem: Load the entire dataset into RAM. This can be enabled when disk I/O is too slow on some platforms and system memory is much larger than the dataset.
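These two options can also be changed from a short script instead of by hand. The sketch below assumes both keys live under a `train` section of config.json, which should be checked against the generated file; the helper name `update_train_options` is hypothetical.

```python
import json
from pathlib import Path

def update_train_options(config_path, keep_ckpts=None, all_in_mem=None):
    """Update selected training options in a config file and return the new config."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    train = config.setdefault("train", {})  # assumed location of both options
    if keep_ckpts is not None:
        if keep_ckpts < 0:
            raise ValueError("keep_ckpts must be 0 (keep all) or a positive number")
        train["keep_ckpts"] = keep_ckpts
    if all_in_mem is not None:
        train["all_in_mem"] = bool(all_in_mem)
    path.write_text(json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8")
    return config
```

For example, `update_train_options("configs/config.json", keep_ckpts=5)` would keep the five most recent checkpoints and leave every other setting untouched.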

Training

 
  
python train.py -c configs/config.json -m 44k

Inference

Inference is run with "inference_main.py".

For example:

python inference_main.py -m "logs/44k/G_30400.pth" -c "configs/config.json" -s "nen" -n "君の知らない物語-src.wav" -t 0
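When converting many songs, it can help to build this command from a script. The sketch below only assembles the argument list using the flags shown in the example; the flag descriptions in the comments are an interpretation of that example and should be checked against the script's own help output. The helper name is hypothetical.

```python
import shlex
import sys

def build_inference_command(model, config, speaker, wav_name, transpose=0):
    """Build the argument list for inference_main.py using the flags from the example."""
    return [
        sys.executable, "inference_main.py",
        "-m", model,            # trained generator checkpoint
        "-c", config,           # configuration file
        "-s", speaker,          # target speaker name
        "-n", wav_name,         # source audio file name
        "-t", str(transpose),   # pitch shift (assumed to be in semitones)
    ]

cmd = build_inference_command(
    "logs/44k/G_30400.pth", "configs/config.json", "nen",
    "君の知らない物語-src.wav",
)
print(shlex.join(cmd))
# To actually run it from the project root: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems with file names that contain spaces or non-ASCII characters.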

Although the original project team has stopped maintaining it, many netizens have forked the project and made updates.

For example, this fork provides a graphical interface:

Project address: https://github.com/voicepaw/so-vits-svc-fork

AI "Resurrection"

Beyond AI covers, netizens have built similar projects before. For example, "AI-Talk" let Musk and Jobs hold a conversation across time and space.

In the video, AI not only simulates their voices but also, to some extent, their way of thinking in conversation, so the exchange flows smoothly.

AI makes it possible for us to talk with the dead. Earlier, a Bilibili uploader also used AI to "resurrect" their late grandmother.

To recreate the grandmother's voice, existing audio was used directly; the material came mostly from old phone recordings, videos, and WeChat voice messages.

The audio was then cleaned up in the audio editor AU (Adobe Audition), mainly with noise reduction and vocal enhancement.

The clearer samples were then cut into short clips of a few seconds each for easier annotation. Finally, the processed audio was packaged and fed into a speech synthesis system.

With the speech synthesis system in place, text can be entered and converted to speech.

Netizens witness the power of technology

AI Stefanie Sun's songs have touched the hearts of many netizens.

Lately I have been obsessed with AI "covers", from AI Kanye singing fine wine to Su Xiaoding singing the truth is true. But seriously, AI Stefanie Sun's covers are the best of them all.

I have been hooked on AI Stefanie Sun on Bilibili these days, and I just listened to "A Game, A Dream".

After listening to songs sung by AI, many netizens felt how frighteningly good AI singers are:

The power of technology is truly mind-boggling.

I truly felt what the power of technology means...

This is AI life, a digital ascension!

Some netizens also expressed nostalgia for singers who have passed away.

References:

https://github.com/svc-develop-team/so-vits-svc

https://www.bilibili.com/video/BV1io4y1w73k/?vd_source=eecf800392d116d832e90ad1c9ae70f6

Origin blog.csdn.net/Datawhale/article/details/130591873