DeepFM: A Factorization-Machine based Neural Network for CTR Prediction (2017): Key Points

Paper link: https://arxiv.org/pdf/1703.04247.pdf

 

FM background references:

Notes on "Factorization Machines with libFM": https://www.cnblogs.com/yaoyaohust/p/10225055.html

Derivations of GBDT, FM, and FFM: https://www.cnblogs.com/yaoyaohust/p/7865379.html

 

Categorical features are one-hot encoded; continuous features are either used directly or discretized and then one-hot encoded.
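A minimal sketch of this encoding with scikit-learn; the field names, values, and bin count are hypothetical:

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer, OneHotEncoder

# Hypothetical raw data: one categorical field, one continuous field.
site_category = np.array([["news"], ["ads"], ["video"], ["news"]])
daily_clicks = np.array([[0.5], [3.2], [7.8], [1.1]])

# Categorical features: one-hot encode.
cat_encoded = OneHotEncoder().fit_transform(site_category).toarray()

# Continuous features, option 1: use the raw value directly.
cont_direct = daily_clicks

# Continuous features, option 2: discretize, then one-hot encode the bins.
cont_binned = KBinsDiscretizer(
    n_bins=3, encode="onehot-dense", strategy="quantile"
).fit_transform(daily_clicks)

x = np.hstack([cat_encoded, cont_direct])  # one possible model input
```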

The core idea is to reuse the FM latent vectors (the weights of its second-order cross terms) as the feature embeddings, so the FM and Deep components share a single embedding layer.

Consequently: no pre-training is needed (the whole model is trained end to end), no manual feature engineering is needed (the FM component learns feature interactions), and both low-order and high-order interactions are captured (by the FM and NN components respectively).
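A minimal PyTorch sketch of this shared-embedding design (not the authors' code); it assumes every field has exactly one active feature, and all sizes are hypothetical:

```python
import torch
import torch.nn as nn

class DeepFM(nn.Module):
    """Minimal DeepFM sketch: FM and Deep parts share one embedding table."""
    def __init__(self, num_features, num_fields, embed_dim=10):
        super().__init__()
        self.w = nn.Embedding(num_features, 1)          # first-order weights
        self.v = nn.Embedding(num_features, embed_dim)  # shared latent vectors / embeddings
        self.bias = nn.Parameter(torch.zeros(1))
        self.deep = nn.Sequential(                      # the Deep component
            nn.Linear(num_fields * embed_dim, 200), nn.ReLU(),
            nn.Linear(200, 200), nn.ReLU(),
            nn.Linear(200, 1),
        )

    def forward(self, feat_idx):
        # feat_idx: (batch, num_fields) index of the active feature per field
        emb = self.v(feat_idx)                          # (batch, num_fields, embed_dim)

        # FM component: bias + first-order + pairwise interactions, using the
        # identity sum_{i<j} <v_i, v_j> = 0.5 * ((sum_i v_i)^2 - sum_i v_i^2).
        first_order = self.w(feat_idx).sum(dim=(1, 2))
        square_of_sum = emb.sum(dim=1).pow(2)
        sum_of_square = emb.pow(2).sum(dim=1)
        second_order = 0.5 * (square_of_sum - sum_of_square).sum(dim=1)
        y_fm = self.bias + first_order + second_order

        # Deep component: the same embeddings, concatenated and fed to an MLP.
        y_deep = self.deep(emb.flatten(start_dim=1)).squeeze(1)

        return torch.sigmoid(y_fm + y_deep)
```

Note that `self.v` is read twice: the FM interaction term and the MLP input both draw on the same table, which is exactly why no separately pre-trained embeddings are needed.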

 

Evaluation: AUC, LogLoss (cross entropy).
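Both metrics are standard in scikit-learn; a toy check with hypothetical labels and predictions:

```python
from sklearn.metrics import log_loss, roc_auc_score

y_true = [0, 1, 1, 0, 1]            # hypothetical click labels
y_pred = [0.1, 0.8, 0.6, 0.3, 0.9]  # hypothetical predicted CTRs

print("AUC:", roc_auc_score(y_true, y_pred))  # higher is better
print("LogLoss:", log_loss(y_true, y_pred))   # cross entropy; lower is better
```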

Training is fast.

Activation function: relu and tanh work better than sigmoid, and relu outperforms tanh (likely because relu induces sparsity).

Dropout (keep probability): 0.6-0.9

Neurons per layer: 200-400

Optimal number of hidden layers: 3

Network shape: constant (all hidden layers the same width) works best.
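Putting the findings above together, a hypothetical builder for the deep component (the constants and function name are mine, chosen to match the reported ranges):

```python
import torch.nn as nn

HIDDEN_LAYERS = [300, 300, 300]  # 3 hidden layers, constant shape, width in 200-400
DROPOUT_KEEP = 0.8               # keep probability in the reported 0.6-0.9 range

def build_deep_part(input_dim: int) -> nn.Sequential:
    """Build the deep component with the hyperparameters reported as best."""
    layers, dim = [], input_dim
    for width in HIDDEN_LAYERS:
        # relu activation; nn.Dropout takes a drop probability, hence 1 - keep.
        layers += [nn.Linear(dim, width), nn.ReLU(), nn.Dropout(1 - DROPOUT_KEEP)]
        dim = width
    layers.append(nn.Linear(dim, 1))  # final CTR logit
    return nn.Sequential(*layers)
```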

 
