Embedding machine learning models into web systems using Flask

This article shows how to embed a machine learning model into a web application. It covers three topics:

1. Building a simple web application with Flask
2. Embedding the machine learning model into the web application
3. Updating the model based on user feedback

The application has three pages: a comment submission page, a classification result page, and a thank-you page. When the user submits a comment, the backend uses the existing model to predict whether the comment is positive or negative, and the result page shows the predicted class together with its probability. The result page also offers two feedback buttons. If the user clicks the "correct" button, the predicted label is kept; otherwise the label is flipped. The comment and its final label are then saved to the SQLite database, and the user is taken to the thank-you page.

1. Project structure

db: Directory that stores the SQLite database file.

pkl: Directory that stores the pickled files. stopwords.pkl is the stop word file, and classifier.pkl is the model file.

static: Static file directory, mainly for js and css files.

templates: Template file directory, used to store html files.

app.py: The main file, which handles page navigation and model prediction.

updatePkl.py: Script that updates the model.

vectorizer.py: Converts reviews into feature vectors for prediction.

2. Interface description

The interface is deliberately simple, with minimal styling; the focus is on functionality.

1. Comment submission page

Users enter their comments on this page and submit them.

2. Classification result page

On this page, users see the classification result for their comment and can give feedback on it. If the user does not confirm whether the result is correct, the comment and its category are not stored in the SQLite database.

3. Thank-you page

From this page, the user can jump back to the comment submission page. The saved comments can be viewed in the database with SQLiteStudio.
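
As an alternative to a GUI tool, the saved comments can also be inspected from Python with the standard sqlite3 module. The following is a minimal sketch and not part of the original project: it assumes the move_review table layout used in the implementation section, and to stay self-contained it builds an in-memory database with two made-up rows instead of opening db/move_review.db. The helper name fetch_reviews is hypothetical.

```python
import sqlite3

def fetch_reviews(conn, limit=10):
    """Return the most recent reviews as (review_id, review, sentiment) tuples."""
    cur = conn.execute(
        "SELECT review_id, review, sentiment FROM move_review "
        "ORDER BY review_id DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()

# In-memory stand-in for db/move_review.db, filled with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE move_review (review_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    "review TEXT, sentiment INTEGER, review_date TEXT)"
)
conn.executemany(
    "INSERT INTO move_review (review, sentiment, review_date) "
    "VALUES (?, ?, DATETIME('now'))",
    [("I loved this movie", 1), ("A waste of time", 0)],
)
conn.commit()

rows = fetch_reviews(conn)
for review_id, review, sentiment in rows:
    print(review_id, sentiment, review)
conn.close()
```

To look at the real data, the connection would point at db/move_review.db instead of ":memory:", and the table creation and sample inserts would be dropped.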

3. Implementation

1. Convert comments into feature vectors

import re
import pickle
from sklearn.feature_extraction.text import HashingVectorizer
from nltk.stem.porter import PorterStemmer
import warnings
warnings.filterwarnings("ignore")

#Load the stop words
with open("pkl/stopwords.pkl","rb") as f:
    stop = pickle.load(f)

#Create the stemmer once instead of on every call
porter = PorterStemmer()

#Remove HTML tags and punctuation, remove stop words, and stem the remaining words
def tokenizer(text):
    #Remove HTML tags
    text = re.sub(r"<[^>]*>","",text)
    #Find all emoticons, e.g. :-) or ;(
    emoticons = re.findall(r"(?::|;|=)(?:-)?(?:\)|\(|D|P)", text.lower())
    #Replace punctuation with spaces, then append the emoticons without the "-"
    text = re.sub(r"[\W]+"," ",text.lower())+" ".join(emoticons).replace("-","")
    #Remove stop words
    tokenized = [word for word in text.split() if word not in stop]
    #Return the list of stemmed words
    return [porter.stem(word) for word in tokenized]

#Get the feature vector of comments through HashingVectorizer
vect = HashingVectorizer(decode_error="ignore",n_features=2**21,preprocessor=None,tokenizer=tokenizer)
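
To make the cleaning steps easier to follow, here is a small self-contained sketch of the same regular expressions applied to one sample sentence. It is only an illustration: the function name clean_and_split and the tiny STOP_WORDS set are made up for this example, and the stemming step is left out so that the snippet needs nothing beyond the standard library.

```python
import re

# Hypothetical, tiny stop word set; the project loads the real one from pkl/stopwords.pkl.
STOP_WORDS = {"this", "is", "a"}

def clean_and_split(text):
    """Strip HTML tags, keep emoticons, drop punctuation and stop words."""
    # Remove HTML tags such as <p> or <br />
    text = re.sub(r"<[^>]*>", "", text)
    # Collect emoticons such as :-) or ;( before punctuation is removed
    emoticons = re.findall(r"(?::|;|=)(?:-)?(?:\)|\(|D|P)", text.lower())
    # Replace every run of non-word characters with one space,
    # then append the emoticons without the "-"
    text = re.sub(r"[\W]+", " ", text.lower()) + " ".join(emoticons).replace("-", "")
    # Split on whitespace and drop the stop words
    return [word for word in text.split() if word not in STOP_WORDS]

tokens = clean_and_split("<p>This is a great movie :-)</p>")
print(tokens)  # ['great', 'movie', ':)']
```

In the project itself, the resulting words are additionally stemmed with PorterStemmer before HashingVectorizer hashes them into the feature vector.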

2. Main functions

import pickle
import sqlite3
import numpy as np
from flask import Flask,render_template,request
from wtforms import Form,TextAreaField,validators
from flask_web.vectorizer import vect

#Create the Flask object
app = Flask(__name__)
#Load the classification model
with open("pkl/classifier.pkl","rb") as f:
    clf = pickle.load(f)

#Create the review database; run this function once before starting app.py
def create_review_db():
    conn = sqlite3.connect("db/move_review.db")
    c = conn.cursor()
    #move_review mainly includes four fields, review_id (review ID, primary key auto-increment), review (review content), sentiment (review category), review_date (review date)
    c.execute("CREATE TABLE move_review (review_id INTEGER PRIMARY KEY AUTOINCREMENT,review TEXT"
              ",sentiment INTEGER,review_date TEXT)")
    conn.commit()
    conn.close()

# save the comments to the database
def save_review(review,label):
    conn = sqlite3.connect("db/move_review.db")
    c = conn.cursor()
    #Insert comments into database
    c.execute("INSERT INTO move_review (review,sentiment,review_date) VALUES "
              "(?,?,DATETIME('now'))",(review,label))
    conn.commit()
    conn.close()

#Get the classification result of the comment
def classify_review(review):
    label = {0:"negative",1:"positive"}
    #Convert comments into feature vectors
    X = vect.transform(review)
    #Get comment integer class label
    Y = clf.predict(X)[0]
    #Get the string class label of the comment
    label_Y = label[Y]
    #Get the probability of the predicted category
    proba = np.max(clf.predict_proba(X))
    return Y,label_Y,proba


#Jump to the user submit comment interface
@app.route("/")
def index():
    #Create the form that is used to validate the text entered by the user
    form = ReviewForm(request.form)
    return render_template("index.html",form=form)

#Jump to the comment classification result interface
@app.route("/main",methods=["POST"])
def main():
    form = ReviewForm(request.form)
    if request.method == "POST" and form.validate():
        #Get the comments submitted by the form
        review_text = request.form["review"]
        #Get the integer class label, the string class label, and the probability
        Y,label_Y,proba = classify_review([review_text])
        #Convert the probability to a percentage rounded to two decimal places
        proba = round(float(proba) * 100, 2)
        # Return the classification results to the interface for display
        return render_template("reviewform.html",review=review_text,Y=Y,label=label_Y,probability=proba)
    return render_template("index.html",form=form)

#user thank you interface
@app.route("/tanks",methods=["POST"])
def tanks():
    # Determine whether the user clicks the correct button or the wrong button
    btn_value = request.form["feedback_btn"]
    #get comments
    review = request.form["review"]
    #Get the category label to which the comment belongs
    label_temp = int(request.form["Y"])
    #If correct, the class label is unchanged
    if btn_value == "Correct":
        label = label_temp
    else:
        #If wrong, the class label is opposite
        label = 1 - label_temp

    save_review(review,label)
    return render_template("tanks.html")

class ReviewForm(Form):
    review = TextAreaField("",[validators.DataRequired()])

if __name__ == "__main__":
    #Start service
    app.run()
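
The feedback handling and the percentage formatting are the two pieces of plain logic in this file, so they can be tried without starting Flask. The sketch below isolates them under a few assumptions: the helper names resolve_label and to_percent are made up for illustration, and "Incorrect" stands for whatever value the second feedback button sends (any value other than "Correct" flips the label).

```python
def resolve_label(predicted_label, feedback):
    """Return the label to store: keep the prediction if confirmed, flip it otherwise."""
    if feedback == "Correct":
        return predicted_label
    # For binary labels 0/1, subtracting from 1 gives the opposite class.
    return 1 - predicted_label

def to_percent(probability, digits=2):
    """Convert a probability in [0, 1] to a rounded percentage."""
    return round(float(probability) * 100, digits)

print(resolve_label(1, "Correct"))    # 1
print(resolve_label(1, "Incorrect"))  # 0
print(to_percent(0.81237))            # 81.24
```

Rounding after multiplying by 100 keeps the displayed value to a fixed number of decimal places.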

3. Model update

import pickle
import sqlite3
import numpy as np
from flask_web.vectorizer import vect

#Update the model from the stored reviews, processing batch_size reviews at a time
def update_pkl(db_path,clf,batch_size=10000):
    conn = sqlite3.connect(db_path)
    c = conn.cursor()
    c.execute("SELECT * from move_review")
    #Fetch the first batch of reviews
    results = c.fetchmany(batch_size)
    while results:
        data = np.array(results)
        #Get the review texts
        X = data[:,1]
        #Get the class labels as integers
        Y = data[:,2].astype(int)
        classes = np.array([0,1])
        #Convert comments into feature vectors
        x_train = vect.transform(X)
        #update model
        clf.partial_fit(x_train,Y,classes=classes)
        results = c.fetchmany(batch_size)
    conn.close()
    return None

if __name__ == "__main__":
    #Load the model
    with open("pkl/classifier.pkl", "rb") as f:
        clf = pickle.load(f)
    #Update the model
    update_pkl("db/move_review.db",clf)
    #Save the updated model
    with open("pkl/classifier.pkl","wb") as f:
        pickle.dump(clf,f,protocol=4)
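
The batch loop above relies on fetchmany returning an empty list once all rows have been read. The following sketch shows only that reading pattern, using an in-memory database with made-up rows so that it runs on its own. The generator name iter_batches is hypothetical, and the table layout is assumed to match move_review.

```python
import sqlite3

def iter_batches(conn, batch_size):
    """Yield (texts, labels) pairs, reading at most batch_size rows per step."""
    cur = conn.execute("SELECT review, sentiment FROM move_review")
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            # fetchmany returns an empty list when no rows are left
            break
        texts = [row[0] for row in rows]
        labels = [int(row[1]) for row in rows]
        yield texts, labels

# In-memory stand-in for db/move_review.db with five made-up reviews.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE move_review (review_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    "review TEXT, sentiment INTEGER, review_date TEXT)"
)
conn.executemany(
    "INSERT INTO move_review (review, sentiment) VALUES (?, ?)",
    [("good", 1), ("bad", 0), ("great", 1), ("awful", 0), ("fine", 1)],
)
conn.commit()

batches = list(iter_batches(conn, batch_size=2))
conn.close()
print([len(texts) for texts, _ in batches])  # [2, 2, 1]
```

Each (texts, labels) pair corresponds to one pass of the loop in update_pkl, where the texts are vectorized and handed to partial_fit together with the labels.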

Why update the model from a separate script instead of updating it directly after the user submits feedback?

If many users submit feedback at the same time, updating the model inside the request handler means that several requests may try to rewrite the model file concurrently, which can corrupt it. It is safer to update the model file offline (for example, locally) and then upload it to the server.
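
Wherever the update runs, one common way to lower the risk of a half-written model file is to write the pickle to a temporary file first and then move it over the old file in a single step. The sketch below illustrates that idea; it is an addition to the approach described above, not something the original project does. The function name save_model_atomically is hypothetical, and a plain dict stands in for the classifier.

```python
import os
import pickle
import tempfile

def save_model_atomically(model, path):
    """Pickle model to a temporary file, then move it over path in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so the final move
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(model, f, protocol=4)
        # Replace the old file only after the new one is completely written.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

# Demo with a plain dict standing in for the classifier.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "classifier.pkl")
save_model_atomically({"weights": [0.1, 0.2]}, demo_path)
with open(demo_path, "rb") as f:
    restored = pickle.load(f)
print(restored)
```

With this pattern, a process that loads classifier.pkl should see either the old file or the complete new one, assuming the platform performs the replace as a single operation.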