Deploying a TensorFlow model server and calling it over HTTP

The project needs to serve a trained model in a production environment for prediction; this post pieces together the full setup and usage from the relevant documentation.

We deploy with TensorFlow Serving, the serving system officially provided by TensorFlow. To avoid troublesome environment configuration, we run it inside a Docker container.

1. Install Docker

2. Pull the TensorFlow Serving image through Docker

  • Adjust the tag to match your own TensorFlow version (I use 2.2.0)
docker pull tensorflow/serving:2.2.0
  • Check that the pull succeeded
docker images


3. Export the TensorFlow model in SavedModel (.pb) format

  • Save the model with
tf.saved_model.save(model, './model/')
  • Place the exported files as shown in the figure [1 is the version number]
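TensorFlow Serving discovers models through version-numbered subdirectories, which is why the exported files sit under a directory named 1. A minimal sketch of that layout, using the ./model path from the save call above (the helper function here is purely illustrative):

```python
import os

# TensorFlow Serving expects:
#   <model_base>/<version>/saved_model.pb
#   <model_base>/<version>/variables/
def export_path(model_base, version):
    """Directory that tf.saved_model.save should write to for a given version."""
    return os.path.join(model_base, str(version))

print(export_path("./model", 1))  # ./model/1
```

Serving always loads the highest-numbered version it finds under the model base path, so bumping the version directory is enough to roll out a new model.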

4. Upload the whole model directory to the server and deploy the model

  • Run the container [note: change source to the path where the model was uploaded on the server; -e MODEL_NAME must come before the image name so it is passed as an environment variable]
docker run -p 8501:8501 --mount type=bind,source=/home/hl/model,target=/models/hadoop -e MODEL_NAME=hadoop -t tensorflow/serving:2.2.0
  • Screenshot of successful operation
  • Test the port [the model name in the URL must match MODEL_NAME]

http://192.168.152.111:8501/v1/models/hadoop

So far the model has been deployed successfully!
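The status endpoint above returns JSON describing each loaded version (shape per TensorFlow Serving's REST ModelStatus API; the values below are illustrative). A small sketch of checking readiness from such a response:

```python
import json

# Example response body from GET /v1/models/<name>, following the
# TensorFlow Serving REST ModelStatus API; values are illustrative.
sample_status = json.loads("""
{
  "model_version_status": [
    {"version": "1",
     "state": "AVAILABLE",
     "status": {"error_code": "OK", "error_message": ""}}
  ]
}
""")

def is_available(status):
    # The model is ready when any loaded version reports state AVAILABLE.
    return any(v["state"] == "AVAILABLE"
               for v in status["model_version_status"])

print(is_available(sample_status))  # True
```

Polling this endpoint until it reports AVAILABLE is a simple way to gate traffic after (re)deploying a container.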

5. Write a simple program to call the model service through HTTP

  • code
import requests
import json
import numpy as np

# Resolve JSON serialization of NumPy types
class NpEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return super(NpEncoder, self).default(obj)


if __name__ == '__main__':
    features = [0.012133333333, 0.983450847726511, 0.738351259436398, 690, 59, 2.3292634]
    features = np.array(features).reshape(1, 6)
    data = json.dumps({"instances": features}, cls=NpEncoder)
    headers = {"content-type": "application/json"}
    response = requests.post("http://192.168.152.111:8501/v1/models/hadoop:predict",
                             data=data, headers=headers)
    print(response.text)
    predictions = json.loads(response.text)['predictions']
    print(predictions[0][0])
  • run the program

Successfully obtained the prediction results of the model!
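As an aside, the custom encoder above can be avoided entirely by converting the array with tolist() before serialization; a minimal sketch with the same feature vector:

```python
import json
import numpy as np

features = np.array([0.012133333333, 0.983450847726511, 0.738351259436398,
                     690, 59, 2.3292634]).reshape(1, 6)

# tolist() converts NumPy scalars to plain Python numbers,
# which json.dumps serializes without a custom encoder.
payload = json.dumps({"instances": features.tolist()})
print(len(json.loads(payload)["instances"][0]))  # 6
```

Either approach produces the same request body; the custom encoder is more convenient when the payload mixes many NumPy types.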

Origin blog.csdn.net/lafsca5/article/details/126596340