[Pytorch Basic Tutorial 40] DLRM Recommendation Algorithm Model Deployment


1. DLRM model

DLRM (Deep Learning Recommendation Model) is an industrial recommendation model proposed by Meta (then Facebook) in 2019. The model structure is deliberately simple — no attention mechanism is used — and it focuses on handling the sparse-feature scenarios common in recommender systems.

1. Feature engineering and embedding layer

  • The model's features are divided into two parts: a sparse part formed by one-hot encoding discrete attributes such as categories, and a dense part formed from continuous numeric attributes.
  • Sparse features: discrete categorical features are converted to dense continuous vectors through an embedding layer. Suppose the one-hot encoded vector is $e_i$, in which only position $i$ is 1. The embedding vector $w_i$ obtained from the lookup is $w_i^T = e_i^T W$, where $W \in \mathbb{R}^{m \times d}$.
    • Feature crossing: similar to the FM-layer feature crossing in DeepFM.
  • Dense features: DLRM passes all continuous features through an MLP (multi-layer perceptron) to obtain an embedding vector of the same dimension as the sparse-feature embeddings, as shown in the yellow part of the figure below.
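As a minimal check of the lookup formula above (my own illustration, not from the tutorial): an embedding lookup is just selecting row $i$ of $W$, which is exactly the one-hot matrix product $e_i^T W$:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
m, d = 10, 4                      # vocabulary size m, embedding dim d
emb = nn.Embedding(m, d)          # weight matrix W has shape (m, d)

i = 3
e_i = torch.zeros(m)
e_i[i] = 1.0                      # one-hot vector e_i

w_via_matmul = e_i @ emb.weight          # e_i^T W
w_via_lookup = emb(torch.tensor(i))      # embedding lookup of index i

print(torch.allclose(w_via_matmul, w_via_lookup))  # True
```

This is why frameworks never materialize the one-hot vector: the lookup is a direct row read.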

(figure: DLRM model architecture; the yellow blocks are the bottom MLP that embeds the dense features)

2. butterfly shuffle

To improve the parallelism of the MLPs and store the embedding tables efficiently, DLRM shards the embedding tables across devices and exchanges their lookup results with an all-to-all communication primitive, the butterfly shuffle.
(figure: butterfly shuffle — all-to-all exchange of sharded embedding lookups across devices)
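A toy sketch of what the all-to-all achieves (my own illustration, not TorchRec code): each of $n$ workers owns one sharded table and performs lookups for the whole batch; the butterfly shuffle then transposes the exchange matrix so each worker ends up holding every table's embeddings for its own slice of the batch:

```python
def all_to_all(sent):
    """Pure-Python stand-in for the butterfly-shuffle all-to-all:
    recv[b][w] = sent[w][b], i.e. a transpose of the exchange matrix."""
    n = len(sent)
    return [[sent[w][b] for w in range(n)] for b in range(n)]

# 3 workers: sent[w][b] labels the chunk "table-w embeddings for batch slice b".
sent = [[f"T{w}/B{b}" for b in range(3)] for w in range(3)]
recv = all_to_all(sent)

# After the shuffle, worker 0 holds slice-0 embeddings from every table.
print(recv[0])  # ['T0/B0', 'T1/B0', 'T2/B0']
```

In practice this is `torch.distributed.all_to_all` over NCCL; the pure-Python version only shows the data movement pattern.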

3. Model structure

eb_configs = [
    EmbeddingBagConfig(
        name=f"t_{feature_name}",
        embedding_dim=model_config.embedding_dim,
        num_embeddings=model_config.num_embeddings_per_feature[feature_idx],
        feature_names=[feature_name],
    )
    for feature_idx, feature_name in enumerate(
        model_config.id_list_features_keys
    )
]
# Creates an EmbeddingBagCollection without allocating any memory
ebc = EmbeddingBagCollection(tables=eb_configs, device=torch.device("meta"))
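The `torch.device("meta")` trick above can be seen with a plain `nn.EmbeddingBag` as well (a sketch, not from the tutorial): parameters created on the meta device carry only shape and dtype, so even a multi-billion-parameter table costs no memory until it is materialized or sharded onto real devices:

```python
import torch
import torch.nn as nn

# A 10M x 128 fp32 table would need ~5 GB; on the meta device it is free.
table = nn.EmbeddingBag(num_embeddings=10_000_000, embedding_dim=128,
                        device=torch.device("meta"))

print(table.weight.is_meta)   # True: no storage allocated
print(table.weight.shape)     # torch.Size([10000000, 128])
```

TorchRec uses exactly this pattern so the sharding planner can place the real storage later.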

module = DLRM(
    embedding_bag_collection=ebc,
    dense_in_features=model_config.dense_in_features,
    dense_arch_layer_sizes=model_config.dense_arch_layer_sizes,
    over_arch_layer_sizes=model_config.over_arch_layer_sizes,
    dense_device=device,
)
summary(module)
The model structure can be displayed via torchinfo.summary:
======================================================================
Layer (type:depth-idx)                        Param #
======================================================================
DLRM                                          --
├─SparseArch: 1-1                             --
│    └─EmbeddingBagCollection: 2-1            --
│    │    └─ModuleDict: 3-1                   11,388,433,600
├─DenseArch: 1-2                              --
│    └─MLP: 2-2                               --
│    │    └─Sequential: 3-2                   154,944
├─InteractionArch: 1-3                        --
├─OverArch: 1-4                               --
│    └─Sequential: 2-3                        --
│    │    └─MLP: 3-3                          606,976
│    │    └─Linear: 3-4                       257
======================================================================
Total params: 11,389,195,777
Trainable params: 11,389,195,777
Non-trainable params: 0
======================================================================
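The per-layer counts in the summary do add up to the reported total — the embedding tables dominate, with the dense/over MLPs contributing well under a million parameters. A quick arithmetic check (values copied from the table above):

```python
embedding_params = 11_388_433_600   # EmbeddingBagCollection / ModuleDict
dense_arch_params = 154_944         # DenseArch MLP
over_arch_params = 606_976 + 257    # OverArch MLP + final Linear

total = embedding_params + dense_arch_params + over_arch_params
print(total)  # 11389195777, matching "Total params" in the summary
```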

2. Model deployment

#!/bin/bash
torch-model-archiver --model-name dlrm \
                     --version 1.0 \
                     --serialized-file "/root/test/torchrec_dlrm/dlrm.pt" \
                     --model-file dlrm_factory.py \
                     --extra-files dlrm_model_config.py \
                     --handler dlrm_handler.py \
                     --force

# Package the recommendation model into a .mar archive
python create_dlrm_mar.py
mkdir model_store
mv dlrm.mar model_store
# Start the TorchServe service
torchserve --start --model-store model_store --models dlrm=dlrm.mar
# Test the model endpoint with curl
curl -H "Content-Type: application/json" --data @sample_data.json http://127.0.0.1:8080/predictions/dlrm
# {
#   "score": -0.05748695507645607
# }
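The same request the curl line sends can also be issued from Python using only the standard library (a sketch: the endpoint name `dlrm` comes from the commands above, but the payload shape is hypothetical and depends on what your handler expects — here 13 dense features, as in the Criteo setup):

```python
import json
import urllib.request

def build_prediction_request(url, payload):
    """Build (but do not send) a JSON POST request, mirroring the curl call."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )

req = build_prediction_request(
    "http://127.0.0.1:8080/predictions/dlrm",
    {"float_features": [0.5] * 13},   # hypothetical payload, handler-dependent
)
print(req.get_header("Content-type"))  # application/json
# urllib.request.urlopen(req) would then return a JSON body like {"score": ...}
```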



Origin blog.csdn.net/qq_35812205/article/details/130935439