Study Notes: Kaggle Learn Embeddings

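An embedding is a map f: X (high-dimensional) -> Y (low-dimensional). It reduces the dimensionality of the data, which makes computation cheaper and can improve accuracy.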
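
To make the map concrete, here is a minimal sketch in plain Python, assuming the high-dimensional input is a one-hot encoded id. The names (make_embedding_table, one_hot, embed) and the sizes are hypothetical and chosen only for illustration; in a real model the table values would be learned during training rather than left random.

```python
import random

def make_embedding_table(num_ids, dim, seed=0):
    """One small random vector per id; training would adjust these values."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in range(num_ids)]

def one_hot(index, num_ids):
    """High-dimensional representation X: all zeros except a single 1."""
    vec = [0.0] * num_ids
    vec[index] = 1.0
    return vec

def embed(table, index):
    """The map f: look up the low-dimensional vector Y for an id."""
    return table[index]

num_users, embedding_size = 1000, 8      # hypothetical sizes
table = make_embedding_table(num_users, embedding_size)

x = one_hot(42, num_users)               # 1000 numbers, almost all zero
y = embed(table, 42)                     # 8 numbers

# The lookup gives the same result as multiplying the one-hot vector by the table
y_via_matmul = [sum(x[i] * table[i][j] for i in range(num_users))
                for j in range(embedding_size)]
```

The lookup and the one-hot matrix product agree, which is why an embedding layer can be stored as a simple table indexed by id.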

See Kaggle Learn: https://www.kaggle.com/learn/embeddings

Official DNN example:

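from tensorflow import keras

# df is assumed to be the ratings DataFrame (userId, movieId columns); sizes as in the Matrix Factorization example
movie_embedding_size = user_embedding_size = 8
user_id_input = keras.Input(shape=(1,), name='user_id')
movie_id_input = keras.Input(shape=(1,), name='movie_id')
user_embedded = keras.layers.Embedding(df.userId.max()+1, user_embedding_size,
                                       input_length=1, name='user_embedding')(user_id_input)
movie_embedded = keras.layers.Embedding(df.movieId.max()+1, movie_embedding_size,
                                        input_length=1, name='movie_embedding')(movie_id_input)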
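
The excerpt above stops at the two embedding layers. One common way to build a deep model on top of such embeddings is to concatenate the user and movie vectors and feed them through a few fully connected layers ending in a single output. The sketch below shows that forward pass in plain Python. It is an assumption about the general pattern, not a reproduction of the official code: the helper names (dense, random_layer), the layer widths, and the random untrained weights are all hypothetical.

```python
import random

def dense(inputs, weights, biases, relu=True):
    """One fully connected layer; weights holds one row per output unit."""
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(z, 0.0) if relu else z)
    return outputs

def random_layer(rng, n_in, n_out):
    """Untrained weights and zero biases for a layer with n_in inputs."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
embedding_size = 8
hidden_units = (32, 4)                   # hypothetical layer widths

# Untrained stand-ins for one user's and one movie's embedding vectors
user_vec = [rng.uniform(-0.05, 0.05) for _ in range(embedding_size)]
movie_vec = [rng.uniform(-0.05, 0.05) for _ in range(embedding_size)]

# Concatenate the two embeddings into one feature vector
features = user_vec + movie_vec

# Pass the features through the hidden layers (ReLU activations)
out = features
n_in = len(features)
for n_hidden in hidden_units:
    w, b = random_layer(rng, n_in, n_hidden)
    out = dense(out, w, b)
    n_in = n_hidden

# A single linear output unit: the predicted rating
w, b = random_layer(rng, n_in, 1)
predicted_rating = dense(out, w, b, relu=False)[0]
```

With untrained weights the predicted value is meaningless; the point is only the data flow from two ids to one number.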

Official Matrix Factorization example:

movie_embedding_size = user_embedding_size = 8

# Each instance consists of two inputs: a single user id, and a single movie id
user_id_input = keras.Input(shape=(1,), name='user_id')
movie_id_input = keras.Input(shape=(1,), name='movie_id')
user_embedded = keras.layers.Embedding(df.userId.max()+1, user_embedding_size, 
                                       input_length=1, name='user_embedding')(user_id_input)
movie_embedded = keras.layers.Embedding(df.movieId.max()+1, movie_embedding_size, 
                                        input_length=1, name='movie_embedding')(movie_id_input)

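# Dot product of the two embedding vectors along the embedding axis (axis 2)
dotted = keras.layers.Dot(2)([user_embedded, movie_embedded])
# Drop the leftover singleton dimension so each instance yields one predicted rating
out = keras.layers.Flatten()(dotted)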
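
The dot product is the whole model here: the predicted rating for a (user, movie) pair is the inner product of their two embedding vectors, so the full rating matrix is approximated by the product of a user matrix and a movie matrix. A minimal sketch in plain Python, using made-up 2-dimensional vectors rather than learned ones (predict_rating and the values are hypothetical):

```python
def predict_rating(user_vec, movie_vec):
    """Predicted rating = dot product of the user and movie embeddings."""
    return sum(u * m for u, m in zip(user_vec, movie_vec))

# Made-up embeddings: 2 users and 3 movies, embedding size 2
user_vectors = [[1.0, 0.5],
                [0.0, 2.0]]
movie_vectors = [[2.0, 4.0],
                 [1.0, 1.0],
                 [0.5, 0.0]]

# Every user paired with every movie: a 2 x 3 matrix of predicted ratings,
# i.e. the user matrix times the transpose of the movie matrix
ratings = [[predict_rating(u, m) for m in movie_vectors] for u in user_vectors]
```

Because there are no hidden layers, the only trainable parameters in this model are the embedding tables themselves.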

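Comparing the two model types: the simpler model (the blue curve in the original plot) also performs quite well, and both models overfit noticeably.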

Reposted from www.cnblogs.com/xbit/p/10184472.html