An intelligent music recommendation system using collaborative filtering and content-based recommendation algorithms, built on MySQL + Vue + Django: deep learning algorithm application (including all project source code) + data set



Foreword

This project is built on data from NetEase Cloud Music and uses collaborative filtering and content-based recommendation as its core methods, with the aim of tailoring music recommendations to each user.

First, we use the user data available from NetEase Cloud Music, including listening history, favorite singers, and favorite songs. With the collaborative filtering algorithm, we analyze the similarity between users and find groups of users with similar music tastes.

Second, we introduce a content-based recommendation algorithm that analyzes music characteristics, genres, and singer styles. This lets the system recommend music that matches a user's preferences and interests more accurately.

By combining the results of collaborative filtering and content-based recommendation, we create a personalized music recommendation list for each user, so that different users receive songs that match their own preferences.

The goal of this project is to apply data analysis and recommendation algorithms to provide NetEase Cloud Music users with more personalized and diversified music recommendations, helping users discover more music and improving user satisfaction with the platform.

Overall design

This part includes the overall structure diagram of the system and the system flow chart.

System overall structure diagram

The overall structure of the system is shown in the figure.

insert image description here

System flow chart

The system flow is shown in the figure.

insert image description here

Operating environment

This section includes the Python environment, the MySQL environment, and the Vue environment.

Python environment

Python 3.6 or higher is required, and the PyCharm IDE is recommended. The required Python packages and their versions are listed in the file MusicRecSys/MusicRec/z-others/files/requirement.txt. The installation command is:

pip install -r requirement.txt

The dependent packages that need to be installed are: Django 2.1, PyMySQL 0.9.2, jieba 0.39, xlrd 1.1.0, gensim 3.6.0

Check the IP address of the machine, then update ALLOWED_HOSTS with the local IP address and update the MySQL configuration in the MusicRecSys/MusicRec/MusicRec/settings.py file.
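As a rough illustration of the settings.py changes described above, the fragment below shows the two settings that usually need editing. The IP address, database user, and password are placeholders, and the layout of the project's own settings.py may differ.

```python
# Hypothetical values: replace the IP address and credentials with your own.
LOCAL_IP = "192.168.1.10"  # assumed address of this machine

# Hosts that Django is allowed to serve; include the local IP and loopback.
ALLOWED_HOSTS = [LOCAL_IP, "127.0.0.1", "localhost"]

# Standard Django database settings for a local MySQL instance.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "musicrec",      # database name used by this project
        "USER": "root",          # assumed user
        "PASSWORD": "password",  # assumed password
        "HOST": "127.0.0.1",
        "PORT": "3306",
    }
}
```

Projects that use PyMySQL with Django commonly also call pymysql.install_as_MySQLdb() in the project package's __init__.py; whether this project does so is not shown here.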

Enter MusicRecSys/MusicRec and execute python manage.py runserver 0.0.0.0:8000.

The Django admin site is available at http://127.0.0.1:8000/admin/ (admin, admin).

MySQL environment

Install the latest version of MySQL and the Navicat visualization tool, then create and connect to the local user database from the command line. The download address is https://pan.baidu.com/s/1dYtKQxnoSZywuRfgCOfPRQ and the extraction code is qw8k.

Create a new musicrec database and import the file MusicRecSys/MusicRec/z-others/files/musicrec.sql.

Vue environment

Install Node.js 10.13 or above and the npm package manager (you can install the Taobao mirror cnpm to improve speed). The VS Code IDE is recommended.

Change serverUrl in MusicRecSys/MusicRec-Vue/config/index.js to the local IP address.

Change serverUrl in MusicRecSys/MusicRec-Vue/src/assets/js/linkBase.js to the local IP address.

Enter MusicRecSys/MusicRec-Vue and run npm install and then npm run dev to install the required dependency packages automatically and to package and run the project with webpack.

Enter http://127.0.0.1:8001 in the browser to access the project interface.

Module implementation

This project includes four modules: data request and storage, data processing, data storage and background, and data display. The functions and related codes of each module are introduced below.

1. Data request and storage

All music and user data are obtained through API requests. The NetEase Cloud Music API address is https://api.imjad.cn . Song list (playlist) data is chosen as the starting point because a song list links users, songs, and singers, covers the widest range of data dimensions, and reflects the user's own subjective behavior. Example playlist URLs are as follows.

https://music163.com/playlist?id=2308644764
https://music163.com/playlist?id=2470855457
https://music163.com/playlist?id=2291941158
https://music163.com/playlist?id=2452844647

Extract the song list ID from each URL, request the required data, record the entries for which a request failed at any step, and skip those failed entries during subsequent data processing.
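The URL processing step can be sketched with the standard library as follows. The function name playlist_id_from_url and the malformed sample URL are illustrative assumptions; the project's own code splits the string on "id=" instead.

```python
from urllib.parse import urlparse, parse_qs

def playlist_id_from_url(url):
    """Return the value of the `id` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("id")
    return values[0] if values else None

urls = [
    "https://music163.com/playlist?id=2308644764",
    "https://music163.com/playlist?id=2470855457",
    "https://music163.com/playlist",  # hypothetical malformed URL without an id
]

ids, failed = [], []
for url in urls:
    pid = playlist_id_from_url(url)
    if pid is None:
        failed.append(url)   # remember the failure so later steps can skip it
    else:
        ids.append(pid)
```

Keeping the failed URLs in a separate list mirrors the idea of recording failures and skipping them in later processing.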

1) Song list information

The song list information is shown in the figure.

insert image description here

The song list information includes the song list ID, creator ID, name, creation time, update time, number of songs included, play count, share count, comment count, favorite count, tags, song list cover, and so on.

2) Creator information

Creator information is shown in the figure.

insert image description here

Creator information includes user ID, nickname, birthday, gender, province, city, type, label, avatar link, user status, account status, djStatus, vipStatus, and signature.

3) Song information

The song ID information is shown in the figure.

insert image description here

The song information is shown in the figure.

insert image description here

Song information includes song ID, song title, album ID, publication time, singer information, total number of comments, number of popular comments, size, and song link.

4) The singer information corresponding to the song

Singer information is shown in the figure.

insert image description here

Singer information includes singer ID, song title, number of music works, number of MV works, number of albums, avatar links, etc. The data file structure is shown in the figure.

insert image description here

Finally, the required basic data and the failed request data are obtained, and the relevant codes for obtaining and storing the playlist information are as follows:

import requests
import traceback


# Fetches and stores the information of each playlist
class PlayList:
    def __init__(self):
        self.playlist_file = "./data/playlist_url/playlist_id_name_all.txt"
        # File that stores the IDs of playlists whose request failed
        self.error_id_file = "./data/error_playlist_ids.txt"
        # Playlist creator information
        self.creator_mess = "./data/user_mess/"
        # JSON information of each playlist
        self.playlist_mess = "./data/playlist_mess/"
        # IDs of the songs contained in each playlist
        self.trackid_mess = "./data/song_mess/"
        self.ids_list = self.getIDs()
        self.url = "https://api.imjad.cn/cloudmusic/?type=playlist&id="
        # IDs of playlists whose information could not be fetched
        self.error_id = list()

    # Extract the playlist ID from each playlist URL
    def getIDs(self):
        print("Extracting playlist IDs from playlist URLs ...")
        ids_list = list()
        with open(self.playlist_file, "r", encoding="utf-8") as f:
            for line in f:
                try:
                    pl_id = line.strip().split("\t")[0].split("id=")[1]
                    ids_list.append(pl_id)
                except Exception as e:
                    # Skip lines that do not contain a playlist ID
                    print(e)
        print("Playlist IDs extracted ...")
        return ids_list

    # Fetch the details of each playlist, e.g.
    # https://api.imjad.cn/cloudmusic/?type=playlist&id=2340739428
    def getEveryPlayListMess(self):
        print("Fetching the details of each playlist")
        i = 0
        while self.ids_list:
            i += 1
            pl_id = self.ids_list.pop()
            url = self.url + str(pl_id)
            try:
                print("%s - playlist ID: %s" % (i, pl_id))
                r = requests.get(url)
                # Parse the response
                self.getFormatPlayListMess(r.json())
            except Exception as e:
                # Record the failed ID and continue with the next playlist
                print(e)
                traceback.print_exc()
                print("Playlist ID %s failed, recording it" % pl_id)
                self.error_id.append(pl_id)
        self.writeToFile(self.error_id_file, ",".join(self.error_id))
        print("Playlist information fetched, written to: %s" % self.playlist_mess)

    # Format the content of each playlist and write it to files
    # Information to extract: playlist, creator, and the songs in the playlist
    def getFormatPlayListMess(self, json_line):
        # Creator information: user ID, nickname, birthday, gender, province, city, type, tags,
        # avatar link, user status, account status, djStatus, vipStatus, signature
        creator = json_line["playlist"]["creator"]
        c_list = (
            str(creator["userId"]),
            str(creator["nickname"]),
            str(creator["birthday"]),
            str(creator["gender"]),
            str(creator["province"]),
            str(creator["city"]),
            str(creator["userType"]),
            str(creator["expertTags"]),
            str(creator["avatarUrl"]),
            str(creator["authStatus"]),
            str(creator["accountStatus"]),
            str(creator["djStatus"]),
            str(creator["vipType"]),
            str(creator["signature"]).replace("\n", "")
        )
        self.writeToFile(self.creator_mess + "user_mess_all.txt", " |=| ".join(c_list))
        # Playlist information: playlist ID, creator ID, name, creation time, update time,
        # number of songs, play count, share count, comment count, favorite count, tags,
        # cover, description
        playlist = json_line["playlist"]
        p_list = [
            str(playlist["id"]),
            str(playlist["userId"]),
            str(playlist["name"]).replace("\n", ""),
            str(playlist["createTime"]),
            str(playlist["updateTime"]),
            str(playlist["trackCount"]),
            str(playlist["playCount"]),
            str(playlist["shareCount"]),
            str(playlist["commentCount"]),
            str(playlist["subscribedCount"]),
            str(playlist["tags"]),
            str(playlist["coverImgUrl"]),
            str(playlist["description"]).replace("\n", "")
        ]
        self.writeToFile(self.playlist_mess + "pl_mess_all.txt", " |=| ".join(p_list))
        # IDs of the songs contained in the playlist
        t_list = [str(one["id"]) for one in json_line["playlist"]["trackIds"]]
        self.writeToFile(self.trackid_mess + "ids_all1.txt", str(playlist["id"]) + "\t" + ",".join(t_list))

    # Append one record to a file
    def writeToFile(self, filename, one):
        with open(filename, "a", encoding="utf8") as fw:
            fw.write(str(one) + "\n")


if __name__ == "__main__":  # entry point
    print("Start fetching playlist information ..")
    pl = PlayList()
    pl.getEveryPlayListMess()
    print("Playlist information fetched ... Bye !")
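The crawler above stores each playlist as one line of fields joined by " |=| ". A minimal sketch of reading such a line back is shown below. The field names in PLAYLIST_FIELDS and the sample record are illustrative assumptions that follow the order in which the crawler writes the playlist fields; they are not taken from the project's data.

```python
FIELD_SEP = " |=| "

# Assumed field names, in the order the crawler writes the playlist record.
PLAYLIST_FIELDS = [
    "pl_id", "creator_id", "name", "create_time", "update_time",
    "track_count", "play_count", "share_count", "comment_count",
    "subscribed_count", "tags", "cover_img_url", "description",
]

def parse_playlist_line(line):
    """Split one stored line into a dict, or return None if it is malformed."""
    parts = line.rstrip("\n").split(FIELD_SEP)
    if len(parts) != len(PLAYLIST_FIELDS):
        return None
    return dict(zip(PLAYLIST_FIELDS, parts))

# Made-up sample record for illustration only.
sample = FIELD_SEP.join([
    "100", "42", "Evening mix", "1530000000000", "1530000600000",
    "2", "10", "0", "1", "3", "['hip-hop']", "http://example.com/c.jpg", "",
]) + "\n"

record = parse_playlist_line(sample)
```

Stripping only the trailing newline, rather than all whitespace, keeps an empty final field (such as a missing description) from changing the field count.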

2. Data processing

This section covers calculating the similarity of songs, singers, and users, and calculating each user's recommendation set.

Calculate the similarity of songs, singers and users

When a user specifies a tag while creating a playlist, the system assumes that the user has a preference for that tag. It traverses all playlists created by the user to build a tag vector for the user.

For example, suppose there are 3 tags in the system (Japanese, hip-hop, silence) and the tags used by user Zhang San in his song lists are Japanese and hip-hop. The corresponding tag vector is then [1, 1, 0]. User similarity is calculated from these tag vectors with the Jaccard similarity coefficient (the size of the intersection of two tag sets divided by the size of their union). The similarity of playlists, singers, and songs is calculated with the same logic. The relevant code is as follows:

import json
import os

# Compute user similarity. Storing every user pair takes too much space, so only 20
# users are stored here, and the similarity is required to be greater than 0.8
def getUserSim(self):
    sim = dict()
    if os.path.exists("./data/user_sim.json"):  # reuse the cached result if it exists
        with open("./data/user_sim.json", "r", encoding="utf-8") as f:
            sim = json.load(f)
    else:
        i = 0
        for use1 in self.userTags.keys():
            sim[use1] = dict()
            for use2 in self.userTags.keys():
                if use1 != use2:
                    j_len = len(self.userTags[use1] & self.userTags[use2])
                    if j_len != 0:
                        result = j_len / len(self.userTags[use1] | self.userTags[use2])
                        if len(sim[use1]) < 20 or result > 0.8:
                            sim[use1][use2] = result
                        else:
                            # Find the least similar stored user and delete it
                            # (the current pair itself is not stored in this branch)
                            minkey = min(sim[use1], key=sim[use1].get)
                            del sim[use1][minkey]
            i += 1
            print(str(i) + "\t" + use1)
        with open("./data/user_sim.json", "w", encoding="utf-8") as f:
            json.dump(sim, f)
    print("User similarity computed!")
    return sim

# Convert the computed similarities into a format that can be imported into MySQL
def transform(self):
    with open("./data/user_sim.txt", "a", encoding="utf-8") as fw:
        for u1 in self.sim.keys():
            for u2 in self.sim[u1].keys():
                fw.write(u1 + "," + u2 + "," + str(self.sim[u1][u2]) + "\n")
    print("Over!")
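To make the tag-vector example concrete, the sketch below converts a 0/1 tag vector into a tag set and computes the Jaccard similarity between two users. The helper names and the second user Li Si are hypothetical and used only for illustration.

```python
ALL_TAGS = ["Japanese", "hip-hop", "silence"]

def vector_to_tags(vector):
    """Turn a 0/1 tag vector into the set of tags it marks."""
    return {tag for tag, flag in zip(ALL_TAGS, vector) if flag}

def jaccard_similarity(tags_a, tags_b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = tags_a | tags_b
    if not union:
        return 0.0
    return len(tags_a & tags_b) / len(union)

zhang_san = vector_to_tags([1, 1, 0])   # Japanese, hip-hop
li_si = vector_to_tags([0, 1, 1])       # hypothetical user: hip-hop, silence

sim = jaccard_similarity(zhang_san, li_si)  # one shared tag out of three distinct tags
```

The two users share one tag out of three distinct tags, so their similarity is one third; identical tag sets give 1.0 and disjoint sets give 0.0.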

Calculate user recommendation set

This part describes how user-based collaborative filtering generates song recommendations for users. Recommendations of song lists, users, and singers are generated in a similar way.

1) Create the RecSong class

The relevant code is as follows:

class RecSong:
    # The data files to use are specified in __init__
    def __init__(self):
        self.playlist_mess_file = "../tomysql/data/pl_mess_all.txt"
        self.playlist_song_mess_file = "../tomysql/data/pl_sing_id.txt"
        self.song_mess_file = "../tomysql/data/song_mess_all.txt"

2) Construct the corresponding relationship between users and songs

A user creates playlists, and each playlist contains songs. When a user archives a song into a playlist, the user's score for that song is set to 1; if the same song is archived into several of the user's playlists, the score increases by 1 for each additional archive. The relevant code is as follows:

# Load the data => mapping from users to songs
def load_data(self):
    # All users
    user_list = list()
    # Mapping from playlists to songs
    playlist_song_dict = dict()
    with open(self.playlist_song_mess_file, "r", encoding="utf-8") as f:
        for line in f:
            # Format: playlist ID \t comma-separated song IDs
            playlist_id, song_ids = line.strip().split("\t")
            playlist_song_dict.setdefault(playlist_id, list())
            for song_id in song_ids.split(","):
                playlist_song_dict[playlist_id].append(song_id)
    print("Playlist-to-song mapping loaded!")
    # Mapping from users to songs
    user_song_dict = dict()
    with open(self.playlist_mess_file, "r", encoding="utf-8") as f:
        for line in f:
            pl_mess_list = line.strip().split(" |=| ")
            playlist_id, user_id = pl_mess_list[0], pl_mess_list[1]
            if user_id not in user_list:
                user_list.append(user_id)
            user_song_dict.setdefault(user_id, {})
            # Playlists without a song record contribute nothing
            for song_id in playlist_song_dict.get(playlist_id, []):
                user_song_dict[user_id].setdefault(song_id, 0)
                user_song_dict[user_id][song_id] += 1
    print("User-to-song mapping computed!")
    return user_song_dict, user_list
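The scoring rule described above (one point each time a song appears in one of the user's playlists) can be illustrated on a tiny in-memory data set. The playlist tuples below are made up for illustration, and collections.Counter is used here in place of the nested setdefault calls.

```python
from collections import Counter, defaultdict

# Hypothetical (playlist_id, user_id, song_ids) records.
playlists = [
    ("p1", "u1", ["s1", "s2"]),
    ("p2", "u1", ["s2", "s3"]),   # s2 is archived a second time by u1
    ("p3", "u2", ["s1"]),
]

user_song = defaultdict(Counter)
for _playlist_id, user_id, song_ids in playlists:
    # Counter.update adds 1 to the score of every song in the list.
    user_song[user_id].update(song_ids)
```

Here user u1 ends up with a score of 2 for s2 because the song appears in two of that user's playlists.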

3) Calculate user similarity

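Recommending songs for users is based on the collaborative filtering algorithm, which requires user similarity. The calculation has two steps: building an inverted index (song to users) and building the similarity matrix. In the formula below, N(u) is the set of songs archived by user u and N(i) is the set of users who archived song i, so songs archived by many users contribute less to the similarity. The calculation formula is: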

$$w_{uv}=\frac{\sum_{i \in N(u) \cap N(v)} \frac{1}{\lg (1+|N(i)|)}}{\sqrt{|N(u)||N(v)|}}$$

The relevant code is as follows:

import math

# Compute the similarity between users, penalizing popular items and
# using an inverted index to reduce the computational complexity
def UserSimilarityBest(self):
    # For each item, find the users who have rated it
    tags_users = dict()
    for user_id, tags in self.user_song_dict.items():
        for tag in tags.keys():
            tags_users.setdefault(tag, set())
            if self.user_song_dict[user_id][tag] > 0:
                tags_users[tag].add(user_id)
    # Build the co-occurrence counts from the inverted index
    C = dict()  # C[u][v]: penalized number of items shared by u and v
    N = dict()  # N[u]: number of items rated by u
    for tags, users in tags_users.items():
        for u in users:
            N.setdefault(u, 0)
            N[u] += 1
            C.setdefault(u, {})
            for v in users:
                C[u].setdefault(v, 0)
                if u == v:
                    continue
                C[u][v] += 1 / math.log(1 + len(users))
    # Build the similarity matrix
    W = dict()
    for u, related_users in C.items():
        W.setdefault(u, {})
        for v, cuv in related_users.items():
            if u == v:
                continue
            W[u].setdefault(v, 0.0)
            W[u][v] = cuv / math.sqrt(N[u] * N[v])
    print("User similarity computed!")
    return W
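As a worked example of the two steps (inverted index, then similarity matrix), the standalone sketch below applies the same penalized formula to three made-up users. The function name user_similarity and the sample data are assumptions for illustration. Like the code above, it uses the natural logarithm from math.log, which changes the scores only by a constant factor compared with a base-10 logarithm.

```python
import math
from collections import defaultdict
from itertools import permutations

def user_similarity(user_songs):
    """user_songs maps each user to the set of songs that user archived."""
    # Step 1: inverted index, song -> users who archived it.
    song_users = defaultdict(set)
    for user, songs in user_songs.items():
        for song in songs:
            song_users[song].add(user)

    # Step 2: accumulate penalized co-occurrence, then normalize.
    overlap = defaultdict(lambda: defaultdict(float))
    for users in song_users.values():
        weight = 1 / math.log(1 + len(users))  # popular songs weigh less
        for u, v in permutations(users, 2):
            overlap[u][v] += weight

    return {
        u: {v: c / math.sqrt(len(user_songs[u]) * len(user_songs[v]))
            for v, c in related.items()}
        for u, related in overlap.items()
    }

# Hypothetical data: u1 and u2 share s1, u1 and u3 share s2.
user_songs = {"u1": {"s1", "s2"}, "u2": {"s1", "s3"}, "u3": {"s2"}}
W = user_similarity(user_songs)
```

For u1 and u2 the only shared song s1 has two listeners, so the numerator is 1 / ln(3) and the denominator is sqrt(2 * 2) = 2. Users with no shared song, such as u2 and u3, do not appear in each other's rows.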

4) Calculate the user's possible preference for the song

Traverse all similar users, and calculate the user's preference for songs that have not been archived. The relevant code is as follows:

import json
import os

# Recommend songs for each user
def recommend_song(self):
    # Predicted score of each user for each song
    user_song_score_dict = dict()
    if os.path.exists("./data/user_song_prefer.json"):
        with open("./data/user_song_prefer.json", "r", encoding="utf-8") as f:
            user_song_score_dict = json.load(f)
        print("User song preferences loaded from file!")
        return user_song_score_dict
    for user in self.user_song_dict.keys():
        print(user)
        user_song_score_dict.setdefault(user, {})
        # Traverse all similar users (users without neighbors get no scores)
        for user_sim in self.user_sim.get(user, {}).keys():
            if user_sim == user:
                continue
            for song in self.user_song_dict[user_sim].keys():
                # Only score songs that the user has not archived yet
                if song in self.user_song_dict[user]:
                    continue
                user_song_score_dict[user].setdefault(song, 0.0)
                user_song_score_dict[user][song] += self.user_sim[user][user_sim] * self.user_song_dict[user_sim][song]
    with open("./data/user_song_prefer.json", "w", encoding="utf-8") as f:
        json.dump(user_song_score_dict, f)
    print("User song preferences computed!")
    return user_song_score_dict
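The preference calculation can be checked by hand on a small example. In the sketch below, the score of a candidate song is the sum, over similar users, of the similarity multiplied by that neighbor's score for the song. The function name, the sample similarities, and the choice to skip songs the user has already archived are assumptions based on the description above.

```python
from collections import defaultdict

def score_candidates(user, user_sim, user_songs):
    """Return predicted scores for songs the user has not archived."""
    scores = defaultdict(float)
    own = user_songs.get(user, {})
    for neighbor, sim in user_sim.get(user, {}).items():
        for song, count in user_songs.get(neighbor, {}).items():
            if song in own:
                continue  # already archived by the user
            scores[song] += sim * count
    return dict(scores)

# Hypothetical archive counts and similarities.
user_songs = {
    "u1": {"s1": 1},
    "u2": {"s1": 1, "s2": 2},
    "u3": {"s2": 1, "s3": 1},
}
user_sim = {"u1": {"u2": 0.5, "u3": 0.2}}

scores = score_candidates("u1", user_sim, user_songs)
```

For u1, song s2 receives 0.5 * 2 from u2 plus 0.2 * 1 from u3, and s3 receives 0.2 * 1 from u3; s1 is skipped because u1 has already archived it.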

5) Write to file

Sort each user's song preference scores and write the top 100 songs that the user is most likely to archive into a file, which can then be imported into the database for use by the system. The relevant code is as follows:

# Write the top 100 songs of each user to a file
def write_to_file(self):
    with open("./data/user_song_prefer.txt", "a", encoding="utf-8") as fw:
        for user in self.user_song_score_dict.keys():
            # Sort by predicted score, highest first
            sort_user_song_prefer = sorted(self.user_song_score_dict[user].items(), key=lambda one: one[1], reverse=True)
            for one in sort_user_song_prefer[:100]:
                fw.write(user + ',' + one[0] + ',' + str(one[1]) + '\n')
    print("File written")
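When only the top entries are needed, a full sort can be replaced by heapq.nlargest from the standard library. The sketch below is an alternative to the sorted-then-slice approach above, not the project's own code; the function name top_n and the sample scores are assumptions.

```python
import heapq

def top_n(scores, n=100):
    """Return the n (song, score) pairs with the highest scores, best first."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])

# Hypothetical scores for one user.
scores = {"s1": 0.4, "s2": 1.2, "s3": 0.9}

# Same "user,song,score" line format as the file written above.
rows = ["u1,%s,%s" % (song, score) for song, score in top_n(scores, n=2)]
```

For a user with many candidate songs, this avoids sorting the whole score dictionary just to keep 100 rows.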

The contents of the user_song_prefer.txt file are shown in the figure.

insert image description here

Similarly, the recommendation results of playlists, singers, and users are also calculated in a similar way.

3. Data storage and background

Create a new Django project and 5 templates in PyCharm, namely Homepage, Playlist, Artist, Song and User. Templates are text files used to separate presentation and content. The following uses the song list template as an example to introduce the purpose of each file. The Django project structure is shown in the figure.

insert image description here

Some of the data files are large and are imported with Navicat; the rest are imported by Python code that connects to the database. Taking the import of song list information as an example, the relevant code of the Model layer created in Django, together with the import function, is as follows:

from django.db import models

# Playlist information: playlist ID, creator ID, name, creation time, update time, number of songs,
# play count, share count, comment count, favorite count, tags, cover, description
class PlayList(models.Model):
    pl_id = models.CharField(blank=False, max_length=64, verbose_name="ID", unique=True)
    # on_delete must be a callable such as models.CASCADE (the original passed False)
    pl_creator = models.ForeignKey(User, related_name="创建者信息", on_delete=models.CASCADE)
    pl_name = models.CharField(blank=False, max_length=64, verbose_name="歌单名字")
    pl_create_time = models.DateTimeField(blank=True, verbose_name="创建时间")
    pl_update_time = models.DateTimeField(blank=True, verbose_name="更新时间")
    pl_songs_num = models.IntegerField(blank=True, verbose_name="包含音乐数")
    pl_listen_num = models.IntegerField(blank=True, verbose_name="播放次数")
    pl_share_num = models.IntegerField(blank=True, verbose_name="分享次数")
    pl_comment_num = models.IntegerField(blank=True, verbose_name="评论次数")
    pl_follow_num = models.IntegerField(blank=True, verbose_name="收藏次数")
    pl_tags = models.CharField(blank=True, max_length=1000, verbose_name="歌单标签")
    pl_img_url = models.CharField(blank=True, max_length=1000, verbose_name="歌单封面")
    pl_desc = models.TextField(blank=True, verbose_name="歌单描述")

    def __str__(self):
        return self.pl_id

    class Meta:
        db_table = 'playList'
        verbose_name_plural = "歌单信息"

# Write the playlist information to the database
def playListMessToMysql(self):
    i = 0
    with open("./data/pl_mess_all.txt", "r", encoding="utf-8") as f:
        for line in f:
            # Strip only the newline so that an empty description is kept as a field
            pl_id, pl_creator, pl_name, pl_create_time, pl_update_time, pl_songs_num, pl_listen_num, \
            pl_share_num, pl_comment_num, pl_follow_num, pl_tags, pl_img_url, pl_desc = line.rstrip("\n").split(" |=| ")
            user = User.objects.filter(u_id=pl_creator).first()
            if user is None:
                # Skip playlists whose creator has not been imported
                continue
            pl = PlayList(
                pl_id=pl_id,
                pl_creator=user,
                pl_name=pl_name,
                # Timestamps are stored in milliseconds
                pl_create_time=self.TransFormTime(int(pl_create_time) / 1000),
                pl_update_time=self.TransFormTime(int(pl_update_time) / 1000),
                pl_songs_num=int(pl_songs_num),
                pl_listen_num=int(pl_listen_num),
                pl_share_num=int(pl_share_num),
                pl_comment_num=int(pl_comment_num),
                pl_follow_num=int(pl_follow_num),
                pl_tags=str(pl_tags).replace("[", "").replace("]", "").replace("\'", ""),
                pl_img_url=pl_img_url,
                pl_desc=pl_desc
            )
            pl.save()
            i += 1
            print(i)
After the import finishes, the corresponding data table can be viewed both in the database management tool Navicat and in the Django admin site, as shown in Figure 1 and Figure 2.

insert image description here

Figure 1 Song list information table in Navicat

insert image description here

Figure 2 Song list information table in Django background management

Finally, all the data tables are obtained, as shown in the following two figures.

insert image description here

Figure 3 Django background management page 1

insert image description here

Figure 4 Django background management page 2

4. Data display

The functions implemented by the front end include: user login and selection of preferred songs and singers; "Recommended for you" (different user behaviors lead to different recommendations); on entering each page, the content-based recommendation algorithm recommends song lists and the collaborative filtering algorithm recommends songs and singers; clicking an item shows its details and provides recommendations for that individual playlist, song, singer, or user; personalized rankings (sorted by similarity from high to low); and "My footprints", which shows the user's behavior in the site (one record per click).

(1) The Django back end processes front-end requests for recommended tags. The relevant View-layer code is as follows:

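# Recommended tags for the home page
"""
    Because the number of tags is limited, and playlists, singers, and songs share one set of tags,
    tag recommendation here is based on:
    1) the choices the user made when entering the system
    2) the click behavior the user generated within the site
    3) popular tags, used to fill the remaining slots
"""
def GetRecTags(request, base_click):
    # Get the singer and song IDs passed in through the interface
    sings = request.session["sings"].split(",")
    songs = request.session["songs"].split(",")
    # Singer tags
    sings_tags = getSingRecTags(sings, base_click)
    # Song tags and playlist tags
    songs_tags, pl_tags = getSongAndPlRecTags(songs, base_click)
    return {
        "code": 1,
        "data": {
            "playlist": {"cateid": 2, "tags": list(pl_tags)},
            "song": {"cateid": 3, "tags": list(songs_tags)},
            "sing": {"cateid": 4, "tags": list(sings_tags)},
        }
    }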
# Get recommended tags for songs and playlists
def getSongAndPlRecTags(songs, base_click):
    song_tags = list()
    pl_tags = list()
    # base_click == 1 means the recommendation is returned after the user has acted within the site,
    # so the tags of the clicked objects are ranked first;
    # otherwise the tags from the user's choices are ranked first
    if base_click == 1:  # the front end entered the "Recommended for you" module through a click
        click_songs = UserBrowse.objects.filter(click_cate="3").values("click_id")
        for one in click_songs:
            filter_one = SongTag.objects.filter(song_id=one["click_id"])
            if len(filter_one) == 0:
                continue  # no tag is recorded for this song
            if filter_one[0].tag not in song_tags:
                song_tags.append(filter_one[0].tag)
            # Playlist tags
            pl_one = PlayListToSongs.objects.filter(song_id=filter_one[0].song_id)
            if len(pl_one) != 0:
                for pl_tag_one in PlayListToTag.objects.filter(pl_id=pl_one[0].song_id):
                    if pl_tag_one.tag not in pl_tags:
                        pl_tags.append(pl_tag_one.tag)
        if len(songs) != 0:  # the front end selected some songs
            for sing in songs:
                choose_one = SongTag.objects.filter(song_id=sing)
                if len(choose_one) != 0 and choose_one[0].tag not in song_tags:
                    song_tags.append(choose_one[0].tag)
                    # Playlist tags
                    pl_one = PlayListToSongs.objects.filter(song_id=choose_one[0].song_id)
                    if len(pl_one) != 0:
                        for pl_tag_one in PlayListToTag.objects.filter(pl_id=pl_one[0].song_id):
                            if pl_tag_one.tag not in pl_tags:
                                pl_tags.append(pl_tag_one.tag)
    else:  # the user is entering the "Recommended for you" module for the first time
        if len(songs) != 0:  # the front end selected some songs
            for sing in songs:
                choose_one = SongTag.objects.filter(song_id=sing)
                if len(choose_one) != 0 and choose_one[0].tag not in song_tags:
                    song_tags.append(choose_one[0].tag)
                    # Playlist tags
                    pl_one = PlayListToSongs.objects.filter(song_id=choose_one[0].song_id)
                    if len(pl_one) != 0:
                        for pl_tag_one in PlayListToTag.objects.filter(pl_id=pl_one[0].song_id):
                            if pl_tag_one.tag not in pl_tags:
                                pl_tags.append(pl_tag_one.tag)
    # If the click and choice tags are not enough, fill with popular song tags
    if len(song_tags) < 15:
        hot_tag_dict = dict()
        for one in SongTag.objects.all():
            hot_tag_dict.setdefault(one.tag, 0)
            hot_tag_dict[one.tag] += 1
        tag_dict_song = sorted(hot_tag_dict.items(), key=lambda k: k[1], reverse=True)[:15 - len(song_tags)]
        for one in tag_dict_song:
            if one[0] not in song_tags:
                song_tags.append(one[0])
    # If the click and choice tags are not enough, fill with popular playlist tags
    if len(pl_tags) < 15:
        hot_tag_dict = dict()
        for one in PlayListToTag.objects.all():
            hot_tag_dict.setdefault(one.tag, 0)
            hot_tag_dict[one.tag] += 1
        tag_dict_pl = sorted(hot_tag_dict.items(), key=lambda k: k[1], reverse=True)[:15 - len(pl_tags)]
        for one in tag_dict_pl:
            if one[0] not in pl_tags:
                pl_tags.append(one[0])
    return song_tags, pl_tags
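The last step of the function above, filling the tag list with popular tags when clicks and choices do not supply enough, can be sketched without the database as follows. The function name, the sample tags, and the use of collections.Counter are assumptions. Unlike the code above, which takes a fixed number of popular tags and then drops duplicates, this variant keeps adding popular tags until the target size is reached.

```python
from collections import Counter

def fill_with_hot_tags(tags, all_tag_rows, target=15):
    """Append the most frequent tags not yet present until `target` tags are collected."""
    result = list(tags)
    for tag, _count in Counter(all_tag_rows).most_common():
        if len(result) >= target:
            break
        if tag not in result:
            result.append(tag)
    return result

# Hypothetical data: one tag from user behavior, plus the tag column of all songs.
picked = ["rock"]
rows = ["pop", "pop", "rock", "rock", "rock", "jazz", "folk", "folk", "folk", "folk"]

filled = fill_with_hot_tags(picked, rows, target=3)
```

Tags derived from the user's own behavior stay at the front of the list, and popular tags only fill the remaining slots.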

(2) On entering each page, the content-based recommendation algorithm recommends song lists to the user, and the collaborative filtering algorithm recommends songs and singers. Here is an example for song lists:

def rec_right_playlist(request):  # recommend playlists
    user = request.GET.get("username")
    u_id = User.objects.filter(u_name=user)[0].u_id
    # The 12 playlists with the highest similarity for this user
    rec_all = UserPlayListRec.objects.filter(user=u_id).order_by("-sim")[:12]
    _list = list()
    for rec in rec_all:
        one = PlayList.objects.filter(pl_id=rec.related)[0]
        _list.append({
            "pl_id": one.pl_id,
            "pl_creator": one.pl_creator.u_name,
            "pl_name": one.pl_name,
            "pl_img_url": one.pl_img_url
        })
    return {
        "code": 1,
        "data": {
            "recplaylist": _list
        }
    }

(3) Get detailed information when clicking, and recommend individual playlists, songs, singers, and users based on tags. Here is an example for users:

def all(request):
    # tag parameter passed in through the interface
    tag = request.GET.get("tag")
    # page parameter passed in through the interface
    _page_id = int(request.GET.get("page"))
    print("Tag : %s, page_id: %s" % (tag, _page_id))
    _list = list()
    # All users
    if tag == "all":
        sLists = User.objects.all().order_by("-u_id")
        # Assemble the user information for the requested page (30 per page)
        for one in sLists[(_page_id - 1) * 30:_page_id * 30]:
            _list.append({
                "u_id": one.u_id,
                "u_name": one.u_name,
                "u_img_url": one.u_img_url
            })
    # Users under the specified tag
    else:
        sLists = UserTag.objects.filter(tag=tag).values("user_id").order_by("user_id")
        for sid in sLists[(_page_id - 1) * 30:_page_id * 30]:
            one = User.objects.filter(u_id=sid["user_id"])
            if len(one) == 1:
                one = one[0]
            else:
                continue
            _list.append({
                "u_id": one.u_id,
                "u_name": one.u_name,
                "u_img_url": one.u_img_url
            })
    total = len(sLists)
    return {
        "code": 1,
        "data": {
            "total": total,
            "sings": _list,
            "tags": getAllUserTags()
        }
    }

# Get all user tags
def getAllUserTags():
    tags = set()
    for one in UserTag.objects.all().values("tag").distinct().order_by("user_id"):
        tags.add(one["tag"])
    return list(tags)
from django.http import JsonResponse

def one(request):  # handle a request for a single user
    u_id = request.GET.get("id")
    one = User.objects.filter(u_id=u_id)[0]
    # Record this view in the user's browsing history
    wirteBrowse(user_name=request.GET.get("username"), click_id=u_id, click_cate="5", user_click_time=getLocalTime(), desc="查看用户")
    return JsonResponse({
        "code": 1,
        "data": [
            {
                "u_id": one.u_id,
                "u_name": one.u_name,
                "u_birthday": one.u_birthday,
                "u_gender": one.u_gender,
                "u_province": one.u_province,
                "u_city": one.u_city,
                "u_tags": one.u_tags,
                "u_img_url": one.u_img_url,
                "u_sign": one.u_sign,
                "u_rec": getRecBasedOne(u_id),
                "u_playlist": getUserCreatePL(u_id)
            }
        ]
    })
# Get the recommendations for a single user (the 10 most similar users)
def getRecBasedOne(u_id):
    result = list()
    sim_users = UserSim.objects.filter(user_id=u_id).order_by("-sim").values("sim_user_id")[:10]
    for user in sim_users:
        one = User.objects.filter(u_id=user["sim_user_id"])[0]
        result.append({
            "id": one.u_id,
            "name": one.u_name,
            "img_url": one.u_img_url,
            "cate": "5"
        })
    return result

# Get the playlists created by a user
def getUserCreatePL(uid):
    pls = PlayList.objects.filter(pl_creator__u_id=uid)
    result = list()
    for one in pls:
        result.append({
            "pl_id": one.pl_id,
            "pl_name": one.pl_name,
            "pl_creator": one.pl_creator.u_name,
            "pl_create_time": one.pl_create_time,
            "pl_img_url": one.pl_img_url,
            "pl_desc": one.pl_desc
        })
    return result
#用户浏览信息进行记录
"""
    user_name = models.CharField(blank=False, max_length=64, verbose_name="用户名")
    click_id = models.CharField(blank=False, max_length=64, verbose_name="ID")
    click_cate = models.CharField(blank=False, max_length=64, verbose_name="类别")
    user_click_time = models.DateTimeField(blank=False, verbose_name="浏览时间")
    desc = models.CharField(blank=False, max_length=1000, verbose_name="备注",default="Are you ready!")
"""
def wirteBrowse(user_name="",click_id="",click_cate="",user_click_time="",desc=""):
    # Special case kept from the original code: collapse any id containing "12797496" to that bare id
    if "12797496" in click_id:
        click_id = "12797496"
    UserBrowse(user_name=user_name,
               click_id=click_id,
               click_cate=click_cate,
               user_click_time=user_click_time,
               desc=desc).save()
    print("User [ %s ]: behavior record [ %s ] written to the database" % (user_name, desc))
# Get the current system time as a formatted string
def getLocalTime():
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
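
The lookup in getRecBasedOne can be tried without a database. The following is a minimal sketch, assuming the UserSim rows are available as plain dictionaries; the function name top_similar_users and the sample rows are hypothetical and are not part of the project source.

```python
from operator import itemgetter


def top_similar_users(sim_rows, user_id, n=10):
    """Return the ids of the n users most similar to user_id.

    Mirrors the ORM query in getRecBasedOne: filter by user_id,
    order by similarity descending, keep the first n rows.
    """
    rows = [r for r in sim_rows if r["user_id"] == user_id]
    rows.sort(key=itemgetter("sim"), reverse=True)
    return [r["sim_user_id"] for r in rows[:n]]


# Hypothetical rows standing in for the UserSim table
sim_rows = [
    {"user_id": "u1", "sim_user_id": "u2", "sim": 0.4},
    {"user_id": "u1", "sim_user_id": "u3", "sim": 0.9},
    {"user_id": "u2", "sim_user_id": "u1", "sim": 0.4},
    {"user_id": "u1", "sim_user_id": "u4", "sim": 0.7},
]

print(top_similar_users(sim_rows, "u1", n=2))
```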

System test

This section includes system flow and test results.

(1) Select a user to enter the system. Each time, a random subset of users is returned from the database to serve as system users; different users are used to distinguish behavioral preferences, as shown in the figure.

insert image description here

(2) Select singers and songs (three or more; this step can be skipped). This interaction between the user and the system addresses the cold-start problem. Users can also skip this step without selecting a singer; in that case, the "Recommended singer tags for you" section shows popular tag data. The interface is shown in the following two figures.

insert image description here

insert image description here
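
The cold-start fallback in step (2) can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: the function name singer_tags_for_user, the singer_tags mapping, and the rule that selected singers contribute their own tags are all hypothetical. Only the fallback to popular tags when nothing is selected comes from the description above.

```python
from collections import Counter


def singer_tags_for_user(selected_singers, singer_tags, k=3):
    """Return up to k tags to show under "Recommended singer tags for you".

    If the user selected singers, count the tags of those singers
    (an assumption). If the user skipped the step, fall back to the
    most popular tags across all singers.
    """
    if selected_singers:
        pool = [t for s in selected_singers for t in singer_tags.get(s, [])]
    else:
        pool = [t for tags in singer_tags.values() for t in tags]
    return [tag for tag, _ in Counter(pool).most_common(k)]


# Hypothetical singer-to-tags mapping
singer_tags = {
    "A": ["pop", "rock"],
    "B": ["pop", "folk"],
    "C": ["pop", "rock"],
}

print(singer_tags_for_user([], singer_tags, k=2))     # user skipped the step
print(singer_tags_for_user(["B"], singer_tags, k=2))  # user picked singer B
```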

(3) Based on the preferences reflected in the playlists the user has created, the system recommends playlists and songs the user is likely to prefer. Click a tag to view all songs under that tag, or enter the playlist, song, and singer recommendation pages, as shown in the figure.

insert image description here

(4) Playlist recommendation page. The left side shows all playlists classified by tag, and the right side shows the playlists recommended for the user by the content-based recommendation algorithm, as shown in the figure.

insert image description here

(5) Playlist details page. It contains the playlist's details and the songs in the playlist. The right side shows playlist recommendations based on tag similarity, as shown in the figure.

insert image description here
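
Tag-similarity recommendation, as used on the details pages, can be illustrated with a small sketch. The text above does not state which similarity measure the project uses; Jaccard similarity over tag sets is assumed here purely for illustration, and the names jaccard, similar_playlists, and playlist_tags are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_playlists(target_id, playlist_tags, n=5):
    """Return up to n playlist ids whose tags overlap most with the target's."""
    target = playlist_tags[target_id]
    scored = [
        (jaccard(target, tags), pid)
        for pid, tags in playlist_tags.items()
        if pid != target_id
    ]
    # Drop playlists that share no tag, then sort by score (ties broken by id)
    scored = [(score, pid) for score, pid in scored if score > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pid for _, pid in scored[:n]]


# Hypothetical playlist-to-tags mapping
playlist_tags = {
    "p1": ["rock", "indie"],
    "p2": ["rock", "indie", "live"],
    "p3": ["rock", "pop"],
    "p4": ["jazz"],
}

print(similar_playlists("p1", playlist_tags))
```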

(6) Song recommendation page. The left side shows all songs classified by tag, and the right side shows the songs recommended for the user by the collaborative filtering algorithm, as shown in the figure.

insert image description here
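
The idea behind the collaborative filtering recommendations can be sketched as user-based filtering: find users with similar listening behavior, then score the songs they like that the target user has not heard. This is a simplified sketch, not the project's code. Cosine similarity, the implicit preference value of 1 per song, and the names cosine, recommend_songs, and ratings are all assumptions made for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two {song_id: preference} dictionaries."""
    common = set(u) & set(v)
    dot = sum(u[s] * v[s] for s in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend_songs(target, ratings, n=3):
    """Score unseen songs by the similarity-weighted preferences of other users."""
    scores = {}
    for other, prefs in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], prefs)
        if sim <= 0:
            continue  # users with nothing in common contribute nothing
        for song, pref in prefs.items():
            if song not in ratings[target]:
                scores[song] = scores.get(song, 0.0) + sim * pref
    # Highest score first; ties broken by song id for a stable order
    return sorted(scores, key=lambda s: (-scores[s], s))[:n]


# Hypothetical implicit-feedback data: 1 means the user listened to the song
ratings = {
    "alice": {"s1": 1, "s2": 1},
    "bob": {"s1": 1, "s2": 1, "s3": 1},
    "carol": {"s2": 1, "s4": 1},
    "dave": {"s5": 1},
}

print(recommend_songs("alice", ratings))
```

In this sample, bob shares two songs with alice and carol shares one, so bob's extra song ranks above carol's, and dave's song is never suggested because he has nothing in common with alice.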

(7) Song details page. It contains the song's information and lyrics. The right side shows song list recommendations based on tag similarity, as shown in the figure.

insert image description here

(8) Singer recommendation page. The left side shows all singers classified by tag, and the right side shows the songs recommended for the user by the collaborative filtering algorithm, as shown in the figure.

insert image description here

(9) Singer details page. It contains the singer's information and songs. The right side shows singer recommendations based on tag similarity, as shown in the figure.

insert image description here

(10) User recommendation page. The left side shows all users classified by tag, and the right side shows the users recommended for the current user by the collaborative filtering algorithm, as shown in the figure.

insert image description here

(11) User details page. It contains the user's information and the playlists the user created. The right side shows recommendations based on tag similarity, as shown in the figure.

insert image description here

(12) Personalized leaderboards. Items are sorted by the user's degree of preference (the result computed by the collaborative filtering algorithm), so different users see different rankings, as shown in Figures 5 to 8.

insert image description here

Figure 5 Personalized recommendation leaderboard page

insert image description here

Figure 6 Playlist recommendation list page

insert image description here

Figure 7 Song recommendation list page

insert image description here

Figure 8 Singer recommendation list page

(13) My Footprints. The behavior records generated when a user browses playlists, songs, and singers in the system are shown in the figure.

insert image description here

Project source code download

See my blog resource download page for details


Other information download

If you want to keep learning about artificial intelligence learning routes and knowledge systems, you are welcome to read my other blog post, "Heavy | Complete artificial intelligence AI learning-basic knowledge learning route, all materials can be downloaded directly from the network disk without paying attention to routines".
That post draws on well-known open-source platforms on GitHub, AI technology platforms, and experts in related fields, including Datawhale, ApacheCN, AI Youdao, and Dr. Huang Haiguang. It gathers about 100 GB of related materials, and I hope it helps.

Origin blog.csdn.net/qq_31136513/article/details/132335950