Cracking the YouTube Video Recommendation Algorithm in Practice

If you make content for a particular distribution channel (movies, dramas, TV shows, online video), its success or failure depends on how that channel's distribution mechanism works. For example, if you produce a TV show and want it to become popular, you need to know where to place the ad breaks, how to promote the show, which channel to air it on, how many households can receive that channel, and so on.

If your distribution channel is YouTube, the most important thing to understand is how YouTube's algorithm works. As with any algorithm-driven platform, though, figuring this out is extraordinarily difficult.

YouTube does not publish the variables its algorithm uses. To understand how the algorithm works, we have to peer into this big black box with very limited data. For some of the variables the algorithm relies on (thumbnails, title impressions, user viewing history, user behavior, session information, and so on) we cannot get any data at all. If we could, it would be like having YouTube's algorithm stripped bare for us to see, but, haha, no.

It may seem that we have nothing to work with, but we still wanted to use the data we do have to get a rough picture of the algorithm's logic. So my former colleague Jeremy Rosen (why "former"? Because I recently resigned from Frederator, wow) spent half a year analyzing data from the channels Frederator owns and operates, trying to figure out YouTube's algorithm.

Before we start, let's be clear: "the algorithm" in this article covers several YouTube growth algorithms (Recommended, Suggested, Related, Search, Original Rating (MetaScore), etc.). These algorithm products have different focuses, but they share one optimization goal: watch time.

Watch time

To be clear up front, "watch time" does not mean the number of minutes watched. We have discussed this concept before [1]. Watch time is made up of the following metrics:

  • Views
  • View duration
  • Session starts
  • Upload frequency
  • Session duration
  • Session ends

Essentially, each of these reflects how well a channel and its videos perform: whether people come often (starting a viewing session) and whether they stay for a long time.

To build up value on any variable in the algorithm, someone must first view your channel and videos. For a video to be successful (defined here as more than half of the channel's subscribers viewing it in the first 30 days), it needs a large number of views in the first minutes, hours, and days after release. We call this View Velocity.

Views and view velocity

We analyzed view velocity on Frederator's channels and found that a video's cumulative lifetime views are exponentially related to the percentage of subscribers who view it in the first 48 hours.

[Figure: Percentage of subscribers who viewed within 48 hours vs. average number of views received]
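An exponential relationship of the form views ~ a * exp(b * pct) turns into a straight line once you take the logarithm of the views, so one simple way to check for it is ordinary least squares on ln(views). The sketch below does that with the standard library only. The sample numbers are synthetic and made up for illustration; they are not Frederator's data, and the article does not say which fitting method was actually used.

```python
import math

def fit_exponential(xs, ys):
    """Fit y ~ a * exp(b * x) by least squares on ln(y)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx                        # slope of the log-linear fit
    a = math.exp(mean_l - b * mean_x)    # intercept, mapped back from log space
    return a, b

# Synthetic sample (assumption): percent of subscribers who viewed within
# 48 hours, and lifetime views generated from an exact exponential curve.
sub_pct_48h = [1.0, 2.0, 3.0, 4.0, 5.0]
lifetime_views = [20_000 * math.exp(0.5 * p) for p in sub_pct_48h]

a, b = fit_exponential(sub_pct_48h, lifetime_views)
print(f"views ~ {a:.0f} * exp({b:.2f} * pct)")
```

On real channel data the points will scatter around the curve, so it is worth plotting ln(views) against the percentage before trusting the fit.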

Based on this observation, we dug a little deeper and found that a rule based on view velocity predicts whether a video will be successful with 92% accuracy. In fact, there is an even more direct correlation: between the percentage of subscribers who view a video within 72 hours and its cumulative lifetime views.

[Figure: Percentage of subscribers who viewed within 72 hours vs. cumulative lifetime views]
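To make the idea of "predicting success from view velocity" concrete, here is a minimal sketch of how such a rule could be scored. It reads the article's success definition as "first-30-day views exceed half the subscriber count". The 5% cutoff, the field names, and the four sample videos are all assumptions for illustration; the article does not publish the rule behind its 92% figure.

```python
def is_successful(views_30d, subscribers):
    """Success as read from the text: 30-day views exceed half the subscriber count."""
    return views_30d > 0.5 * subscribers

def rule_accuracy(videos, threshold_pct):
    """Share of videos where 'early subscriber % >= threshold' matches the real outcome."""
    hits = 0
    for v in videos:
        predicted = v["sub_pct_48h"] >= threshold_pct
        actual = is_successful(v["views_30d"], v["subscribers"])
        hits += predicted == actual
    return hits / len(videos)

# Invented sample data (assumption), one dict per video.
videos = [
    {"sub_pct_48h": 8.0, "views_30d": 70_000, "subscribers": 100_000},
    {"sub_pct_48h": 6.5, "views_30d": 55_000, "subscribers": 100_000},
    {"sub_pct_48h": 3.0, "views_30d": 20_000, "subscribers": 100_000},
    {"sub_pct_48h": 5.5, "views_30d": 40_000, "subscribers": 100_000},
]

# Hypothetical cutoff: predict success when at least 5% of subscribers view in 48 hours.
accuracy = rule_accuracy(videos, threshold_pct=5.0)
print(accuracy)  # 0.75: the last video clears the cutoff but misses the 30-day target
```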

These two graphs and their correlation coefficients show that views and view velocity have a direct and important impact on videos and channels. We also have evidence that the reverse holds: poor view velocity hurts not only the video itself but also the videos uploaded before and after it.

The figure below shows that if the previous Frederator video had poor view velocity in its first 48 hours (fewer than 5% of subscribers viewed it), the next upload suffers as well.

[Figure: Percentage of subscribers who viewed the next video vs. the average percentage of subscribers who viewed the previous two videos]

This data supports Matthew Patrick's theory: if one of your videos gets a poor click-through, YouTube gives your next upload less weight when deciding whether to show it to your subscribers. [2]

It could also be that, because the last video performed poorly, fewer people visit your channel, which naturally means fewer subscribers find the next video organically. Whatever the "why", the result is the same.

Poor view velocity on new uploads has another negative effect: there is evidence that it can also hurt your entire video library. The first chart below plots the 7-day average of the percentage of subscribers who viewed each upload within 48 hours (Translator's note: for the videos uploaded in the past 7 days, record the percentage of subscribers who viewed each one within 48 hours of upload, then average those percentages) against the channel's total views (Translator's note: this reflects the performance of the entire video library). The second chart plots the percentage of all subscribers who viewed videos on a given day against total views on that day.

[Figure: Seven-day average percentage of subscribers who viewed within 48 hours vs. the channel's total daily views]

[Figure: Seven-day average subscriber views vs. total views]
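The translator's note describes the seven-day metric as an average over the videos uploaded in the trailing seven days. A minimal sketch of that calculation follows; the function name, the `(upload_day, pct)` tuple layout, and the sample dates are assumptions made for illustration.

```python
from datetime import date, timedelta

def trailing_avg_sub_pct(uploads, day, window_days=7):
    """Average 48-hour subscriber % over videos uploaded in the window ending on `day`.

    `uploads` is a list of (upload_day, pct_of_subscribers_within_48h) tuples.
    Returns None when nothing was uploaded in the window.
    """
    start = day - timedelta(days=window_days - 1)
    pcts = [pct for upload_day, pct in uploads if start <= upload_day <= day]
    return sum(pcts) / len(pcts) if pcts else None

# Invented sample uploads (assumption).
uploads = [
    (date(2024, 1, 1), 6.0),
    (date(2024, 1, 3), 4.0),
    (date(2024, 1, 6), 5.0),
    (date(2024, 1, 9), 2.0),
]

print(trailing_avg_sub_pct(uploads, date(2024, 1, 7)))  # window Jan 1 to Jan 7: 5.0
print(trailing_avg_sub_pct(uploads, date(2024, 1, 9)))  # window Jan 3 to Jan 9
```

Computing this value for every day and plotting it against that day's total channel views would reproduce the shape of the first chart.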

These charts all say the same thing: once the percentage of subscribers viewing new uploads and the video library goes down, the channel's overall views go down too. The takeaway for us: YouTube's algorithm rewards channels that reach their core audience and penalizes those that don't.

View duration

Another metric the algorithm weights heavily is View Duration.

View duration is how long users stay on a single video's page. This variable carries a lot of weight, and a clear tipping point is visible in our data. On one of Frederator's channels, videos with an average view duration of 8 minutes received 350% more views in their first 30 days than videos with an average view duration of 5 minutes. The figure below shows the relationship between views and average view duration on a Frederator channel.

[Figure: Average view duration vs. average lifetime views. Data with a view duration over eight minutes is not included.]

We also found that the longer the view duration, the better the video performs. The chart below shows seven-day views for videos in three duration buckets: under 5 minutes (1), between 5 and 10 minutes (5), and over 10 minutes (10).

[Figure: Average views vs. average view duration within seven days]
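The bucketed comparison above can be reproduced with a small grouping step. The sketch below uses the bucket labels from the text (1, 5, 10) and, following the figure caption, buckets on average view duration. The sample videos are invented, and treating exactly 10 minutes as part of the middle bucket is an assumption, since the text does not say where the boundaries fall.

```python
from collections import defaultdict

def duration_bucket(avg_view_minutes):
    """Bucket labels follow the text: 1 (<5 min), 5 (5 to 10 min), 10 (>10 min)."""
    if avg_view_minutes < 5:
        return 1
    if avg_view_minutes <= 10:
        return 5
    return 10

def avg_views_by_bucket(videos):
    """Average 7-day views per bucket for (avg_view_minutes, views_7d) pairs."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [sum of views, video count]
    for avg_minutes, views_7d in videos:
        entry = totals[duration_bucket(avg_minutes)]
        entry[0] += views_7d
        entry[1] += 1
    return {bucket: total / count for bucket, (total, count) in sorted(totals.items())}

# Invented sample (assumption): (average view duration in minutes, views in 7 days).
videos = [(3.2, 10_000), (4.5, 14_000), (6.0, 30_000), (8.5, 50_000), (12.0, 90_000)]

result = avg_views_by_bucket(videos)
print(result)  # {1: 12000.0, 5: 40000.0, 10: 90000.0}
```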

The next chart shows the same thing, extended from 7 days to the entire lifetime.

[Figure: Average views vs. average view duration over the entire lifetime]

From these findings we can draw a simple conclusion: publishing long videos can increase views. Frederator has a children's playground channel that uploaded three to four videos of different lengths (3 minutes, 10 minutes, 30 minutes, 70 minutes) every week. We found that in the 48 hours after posting, the 70-minute video got far more views than the videos of other lengths, even when it was just old videos rehashed. In addition, the 70-minute video had the same average view duration as the other versions.

So we recommended uploading only the 70-minute video each week. With this strategy, the channel's average daily views increased by 500,000 over the past 6 weeks, while the number of videos uploaded fell by 75%. Okay, okay, I know you're excited; no need to worship me.

Session starts, session duration, session ends

This research was all possible thanks to my previous article: "WTF is WatchTime?" [1]

As a quick refresher, Session Starts count how many times users begin a YouTube session with your video. This shows how important it is for subscribers to view your video in the first 72 hours. Subscribers are the first people to see your video after it is published, and they are also the most likely to click on your channel icon because they already know your brand.

Session Duration is how long your content keeps users on the YouTube platform, both while they watch your video and after they finish it. Beyond Average View Duration and Unique Views, we have no better data for this.

Session Ends measure how often users leave the YouTube platform after watching your video. The algorithm uses this as a negative signal, but we simply have no data on it.

A theory of the algorithm

YouTube's algorithm is designed to focus on channel performance rather than individual video performance. But it uses a single video to improve channel performance.

The algorithm combines individual video-specific data with aggregated data from the channel to decide which video to recommend. The ultimate goal is still to gather the target audience for the channel.

YouTube does this in order to:

1. Get users to return to the YouTube platform frequently

2. Keep users on the platform as long as possible

Here are three charts that support this theory.

The first chart plots the proportion of subscribers who viewed within 48 hours against total views over 7 days. It suggests that if many users start their platform session with your video, the video gets a lot of traffic, and past a certain threshold the traffic grows exponentially.

[Figure: Total views within 7 days vs. percentage of subscribers who viewed within 48 hours]

The second chart plots the channel's average daily views against the percentage of subscribers who viewed within 5 days.

[Figure: Average daily views vs. percentage of subscribers who viewed within 5 days]

This means that if you consistently get a large number of users to start their YouTube visit with you (averaged over the past 5 days), the algorithm tilts daily views toward your channel's entire video library.

The last chart plots average daily subscriber views against the percentage of subscribers who viewed within 5 days.

[Figure: Average daily subscriber views vs. percentage of subscribers who viewed within 5 days]

We believe all of this shows a correlation between how consistently a channel performs and how many views it gets. Consistency shows up in the percentage of subscribers viewing, and YouTube tilts traffic toward you accordingly.

Suppose you have a gaming channel with 100,000 subscribers. You upload 6 videos every day, and each video is viewed by 5% of subscribers, so your average subscriber share per video holds steady at 5%. That means you generate subscriber views equal to 30% of your subscriber count every day (30,000/day, 600,000/month). Now suppose you have 1 million subscribers: daily views become 300,000 and monthly views 6 million.
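The arithmetic in this example is easy to check in a few lines. One caveat: the monthly figures in the text are exactly 20 times the daily ones, so the sketch below assumes 20 upload days per month. That number is our reading, not something the article states; with a 30-day month the first figure would be 900,000 instead.

```python
def subscriber_views(subscribers, videos_per_day, sub_pct_per_video, upload_days_per_month):
    """Return subscriber views per video, per day, and per month."""
    per_video = subscribers * sub_pct_per_video / 100
    per_day = per_video * videos_per_day
    per_month = per_day * upload_days_per_month
    return per_video, per_day, per_month

# Figures from the text; upload_days_per_month=20 is an assumption (see above).
small = subscriber_views(100_000, videos_per_day=6, sub_pct_per_video=5, upload_days_per_month=20)
large = subscriber_views(1_000_000, videos_per_day=6, sub_pct_per_video=5, upload_days_per_month=20)

print(small)  # (5000.0, 30000.0, 600000.0)
print(large)  # (50000.0, 300000.0, 6000000.0)
```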

We don't think this bit of math is misleading. It means YouTube selects channels to recommend based on certain metrics, and then the algorithm only has to help those channels grow their traffic.

But hold on, brave reader: everything above is just theory!

A scoring algorithm

Here we set out to reverse-engineer YouTube's algorithm and rebuild one of our own. We reconstructed a scoring algorithm from 15 signals and our estimated weights. The signals are listed below:

[Figure: Signals/factors used to develop the scoring algorithm]
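The article does not reproduce the 15 signals or their weights in text, so the following is only a sketch of the general shape a weighted scoring function can take: a normalized weighted sum. The three signal names and all weights are hypothetical placeholders borrowed from metrics discussed earlier; they are not the article's actual list, and nothing here is YouTube's real formula.

```python
def algorithm_score(signals, weights):
    """Weighted average of signals, each assumed to be pre-scaled to the 0..1 range."""
    missing = set(weights) - set(signals)
    if missing:
        raise KeyError(f"missing signals: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[name] * signals[name] for name in weights) / total_weight

# Hypothetical weights and signal names (assumption, for illustration only).
weights = {"view_velocity": 0.5, "view_duration": 0.3, "session_starts": 0.2}

# Hypothetical, already-normalized signal values for one video.
video = {"view_velocity": 0.8, "view_duration": 0.6, "session_starts": 0.4}

score = algorithm_score(video, weights)
print(round(score, 2))  # 0.66
```

Dividing by the total weight keeps the score in the 0..1 range even when the weights do not sum to one, which makes scores comparable if weights are re-estimated later.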

The following charts show how these signals perform in practice.

[Figure: Correlation trend between the three-day average algorithm score and views]

[Figure: Correlation trend between algorithm score and views]

The chart below is more detailed.

[Figure: Three-day average algorithm score and daily views]
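The charts compare a three-day average of the score with daily views. A minimal way to reproduce that kind of comparison is a trailing moving average followed by a Pearson correlation, sketched below with the standard library. The daily scores and view counts are invented, and Pearson is just one reasonable choice; the article does not say which correlation measure was used.

```python
import math

def moving_average(values, window=3):
    """Trailing moving average; the first value covers days 0..window-1."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Invented sample (assumption): one score and one view count per day.
daily_scores = [0.50, 0.62, 0.56, 0.70, 0.66, 0.80, 0.74]
daily_views = [100_000, 108_000, 112_000, 121_000, 128_000, 140_000, 147_000]

smoothed = moving_average(daily_scores)   # starts on day index 2
r = pearson(smoothed, daily_views[2:])    # align views with the smoothed series
print(f"r = {r:.3f}")
```

The slice `daily_views[2:]` matters: the smoothed series is two entries shorter, so the views have to be trimmed to the same days before correlating.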

I know you are still curious, so here are the various weights we simulated:

[Figure: Simulated weight distribution across the algorithms]

[Figure: Simulated weight of each signal in the watch-time optimization algorithm]

[Figure: Weight of each signal for related recommendations and other algorithms]

However, we have no other data, so we are not sure which regression method should be used to compute the correlations. All we can say is that most of the signals correlate strongly with the algorithms. That is exactly why we remain so enthusiastic about the YouTube algorithm.

Thoughts on YouTube’s algorithm

According to our data, at least 6 rough conclusions can be drawn:

1. YouTube uses algorithms to determine how many views our videos and channels can receive.

2. Successful channels focus on a specific type of content or creativity.

3. Once a channel has found a type of content that works, it should stick with it.

4. It is impossible for content producers to succeed on the YouTube platform just by relying on money, so rich producers are unlikely to embrace YouTube wholeheartedly.

5. Personalized shows/channels will always be the dominant content type on YouTube because this is the “specific type of content” people are looking for.

6. If a new channel cannot draw traffic from outside YouTube, it will struggle to grow for a long time.

As mentioned earlier, YouTube cares more about growing views at the channel level; this is only our speculation. Channels can upload many videos to win and keep a large target audience. If you want to succeed on YouTube, our advice is: pick a very narrow vertical interest, then keep producing videos longer than 10 minutes, all within that interest.

This is a personal blog. I should point out that YouTube has plenty of algorithmic ammunition in reserve, and I hope they will not take this article as negative press about the algorithm. This research has made me even more grateful to YouTube and its engineers for designing these algorithms with such foresight. After all, they are working to keep one billion users a month watching videos. If you step back and look at all of this as a whole, you will be amazed at how elegantly YouTube's algorithm is designed, and how well it achieves business goals while protecting the health of the platform. Give them 32 likes!

About the Author:

Matt Gielen is the former Vice President of Programming and Audience Development at Frederator Networks.

The team Matt managed ran the Frederator Network, the world's largest animation network.

He also led the team that produced and programmed Frederator Networks' owned-and-operated YouTube channels: Channel Frederator, The Leaderboard, and Cinematica.

You can also follow him on Twitter @mattgielen.

Translation postscript:

I first saw this article shared by @fengyoung on Facebook. I thought the title was interesting, so I read it. After reading it, I felt very inspired, so I decided to translate it so that more people can see it.

This article inspired me in three aspects:

1. From the perspective of the YouTube platform's algorithm designers, the recommendation algorithms are designed to increase a channel's watch time, and increasing watch time brings users back to the platform often. This is win-win thinking. Put bluntly: whoever helps the platform retain users gets the platform's support.

2. The article concludes that only by creating vertical content can you survive on YouTube. There is no doubt that the more diverse a platform's content, the healthier the platform. I agree with the conclusion, but I did not see how the author arrived at it in this article. This is the biggest difference between YouTube and domestic video platforms. Domestic video platforms are converging badly: buying exclusive rights at high prices seems to be their only way forward, and it is a way forward that has been demonized. YouTube, by contrast, uses algorithms to push each channel to specialize in one vertical and then matches it with the most suitable users. This is a far more ambitious game of content chess.

3. The author has shown us something: algorithms are not black boxes, and they can be reverse-engineered. Even though this only uncovers the tip of the iceberg, it is far better than operating blind. The author's method is to first establish what a platform's algorithm is optimizing for (on YouTube, watch time), then observe which metrics that goal depends on, and finally work out how each metric can be improved.

Origin blog.csdn.net/yaxuan88521/article/details/133130640