Using ChatGPT to develop a book recommendation WeChat Mini Program (3)

1 Introduction

1.1 Implementation principle

The principle is unchanged: it amounts to interactive Q&A with ChatGPT, whose answers are then mapped onto the Mini Program. The following are the three major elements:

  • Database modeling

First, a database needs to be established to store book information and user information. Information about each book may include title, author, publisher, ISBN number, publication date, price, etc. User information may include user name, password, email address, delivery address, etc.
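As a minimal sketch of such a schema (the table and column names here are illustrative assumptions, not a fixed design), the two tables could be created with SQLite:

import sqlite3

# Illustrative schema only; table and column names are assumptions for this sketch
conn = sqlite3.connect("bookstore.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS books (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    title TEXT NOT NULL,
    author TEXT,
    publisher TEXT,
    isbn TEXT UNIQUE,
    publication_date TEXT,
    price REAL
);
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    username TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL, -- store a hash, never the plain password
    email TEXT,
    delivery_address TEXT
);
""")
conn.commit()
conn.close()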

  • Data acquisition and processing

To make the recommendation system more accurate, we need to collect users' reading history, purchase records, and even search records. At the same time, the classification labels of new books need to be compiled and organized. Then, based on the user's historical behavior and preferences, a recommendation algorithm generates a personalized book recommendation list. Common recommendation algorithms include content-based recommendation, collaborative filtering, and deep-learning-based recommendation.
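To illustrate the collaborative-filtering idea, here is a toy sketch with a made-up user-item rating matrix (the numbers are assumptions, not real data): books that users with similar tastes rated highly are recommended first.

import numpy as np

# Toy user-item rating matrix (rows: users, columns: books); values are made up
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
], dtype=float)

def cosine_sim(a, b):
    # Cosine similarity between two rating vectors
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom else 0.0

def recommend_for(user, ratings, top_k=2):
    # Score unrated books by similarity-weighted ratings from other users
    sims = np.array([
        cosine_sim(ratings[user], ratings[u]) if u != user else 0.0
        for u in range(ratings.shape[0])
    ])
    scores = sims @ ratings
    scores[ratings[user] > 0] = -np.inf  # never re-recommend already-rated books
    return np.argsort(scores)[::-1][:top_k]

print(recommend_for(0, ratings, top_k=1))  # index of the best unrated book for user 0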

  • UI optimization

The book recommendation list is presented to the user, who can browse and purchase books through the interface. Based on the user's actions, the backend updates the relevant data and tunes the recommendation algorithm to improve recommendation quality and accuracy.
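As one minimal sketch of this feedback loop (the events table and its fields are assumptions for illustration), each view, click, or purchase can be logged for later retraining:

import sqlite3, time

def log_event(user_id, book_id, action):
    # Record a user action ("view", "click", "purchase") for later retraining
    conn = sqlite3.connect("bookstore.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(user_id INTEGER, book_id INTEGER, action TEXT, ts REAL)"
    )
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?)",
        (user_id, book_id, action, time.time()),
    )
    conn.commit()
    conn.close()

log_event(user_id=1, book_id=42, action="click")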

1.2 How to wire this into the Mini Program

  1. Determine requirements: clarify the project goals, including the types of books to recommend, the user personas, and the recommendation algorithms.

  2. UI design: design the Mini Program's interface layout, color scheme, and interaction patterns according to the project requirements and user habits.

  3. Database modeling: design the database according to the requirements and build it on the backend. Book information, user information, historical data, etc. are stored there for querying and analysis in later computation.

  4. User authorization: when the Mini Program starts, obtain the user's information and authorization to enable the personalized recommendations that follow (see the backend login sketch after this list).

  5. Recommendation algorithm: generate recommendation results with different algorithms according to user profiles and historical behavior.

  6. Page layout and rendering: implement the page layout and rendering on the front end following the UI design, and display the data returned by the backend.

  7. Optimization and improvement: continuously optimize the algorithm and the interactive experience based on user feedback to improve recommendation accuracy and user satisfaction.

  8. Release and launch: run testing, integration debugging, and similar steps to ensure the Mini Program's stability and security, then release it for operation and promotion.
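To make step 4 concrete, here is a minimal sketch of a backend login endpoint using Flask. The front end would call wx.login() and send the temporary code here; the backend exchanges it for an openid via WeChat's jscode2session interface. APPID and SECRET are placeholders, and error handling is abbreviated:

import requests
from flask import Flask, request, jsonify

app = Flask(__name__)
APPID = "your-mini-program-appid"    # placeholder
SECRET = "your-mini-program-secret"  # placeholder

@app.route("/login", methods=["POST"])
def login():
    # The front end calls wx.login() and POSTs the temporary code here
    code = request.json.get("code")
    resp = requests.get(
        "https://api.weixin.qq.com/sns/jscode2session",
        params={
            "appid": APPID,
            "secret": SECRET,
            "js_code": code,
            "grant_type": "authorization_code",
        },
    ).json()
    # On success resp contains openid and session_key; map openid to a user record
    return jsonify({"openid": resp.get("openid")})

if __name__ == "__main__":
    app.run()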

1.3 Technical Architecture

The following is the technology stack to be used, including languages and frameworks as well as some key points:

| Technology stack | Language / framework | Technical points |
| --- | --- | --- |
| Mini Program development framework | WeChat's official Mini Program framework | Quickly build the front-end interface and implement the Mini Program's interaction and display effects |
| Front-end language | WXML, WXSS, JavaScript; Vue 2.0 + uni-app | Implement the Mini Program's front-end interaction logic and style design |
| Backend development | Node.js + Express.js, or Python + Flask | Provide web API interfaces, process requests from the front end, and interact with the database |
| Database | MySQL, MongoDB | Store book information, user information, recommendation data, etc. |
| User authentication and authorization | WeChat open interfaces | Implement user login, authorization, and retrieval of personal information |
| Recommendation algorithm | Content-based recommendation, collaborative filtering, deep learning, etc.; Python + requests | Compute which books a user likes and recommend suitable books based on crawled data |
| Third-party API integration | Douban Books API, library APIs, etc. | Fetch book data from other sites or services |
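As one sketch of the third-party API row, the call below targets Douban's legacy v2 book search API. Treat the endpoint and response fields as assumptions to verify: the API has been restricted over the years and may now require an API key.

import requests

def search_douban_books(keyword, count=5):
    # Assumed endpoint: Douban's legacy v2 book search API (may require an apikey today)
    resp = requests.get(
        "https://api.douban.com/v2/book/search",
        params={"q": keyword, "count": count},
        timeout=10,
    )
    resp.raise_for_status()
    # Assumed response shape: {"books": [{"title": ..., "author": [...]}, ...]}
    return [(b.get("title"), b.get("author")) for b in resp.json().get("books", [])]

print(search_douban_books("Python"))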

2 Crawling data

2.1 Crawling information by book title

Use the requests library to send an HTTP GET request to the site's search page, get the page containing the search results, and then use the BeautifulSoup library to parse the HTML and extract the book information. The sample code below queries Amazon; as Section 3 notes, you can swap in a site such as Douban. The specific implementation process is as follows:

  1. Construct a search link: use the input book title as a parameter to construct a search link. (On Douban, the cat parameter in the link restricts results to books, with 1001 indicating literature; the Amazon code below uses the /s?k= search instead.) Use the requests library to send an HTTP GET request to the link to get the page containing the search results.

  2. Parse the search results page: use the BeautifulSoup library to parse the page and extract the link of the first search result, then send an HTTP GET request to that link to get the page containing the book details.

  3. Parse the book details page: use the BeautifulSoup library to parse the details page and extract the book's title, author, genre, publisher, publication date, and similar fields.

  4. Return the book information: save the extracted fields into a dictionary and return it from the function.

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def get_book_info(book_name):
    # Construct the Amazon search URL for the given title
    url = f"https://www.amazon.com/s?k={book_name}"

    # A browser-like User-Agent makes it less likely the request is blocked
    headers = {"User-Agent": "Mozilla/5.0"}

    # Send an HTTP request to fetch the search results page
    response = requests.get(url, headers=headers)

    # Parse the page content with BeautifulSoup
    soup = BeautifulSoup(response.content, 'html.parser')

    # Grab the link of the first search result (the href is relative, so join it)
    result_link = soup.find('a', {'class': 'a-link-normal s-no-outline'})

    # Send an HTTP request to fetch the book details page
    response = requests.get(urljoin(url, result_link['href']), headers=headers)

    # Parse the details page with BeautifulSoup
    soup = BeautifulSoup(response.content, 'html.parser')

    # Extract the book information (these selectors may need adjusting as Amazon's markup changes)
    book_title = soup.find('span', {'id': 'productTitle'}).text.strip()
    book_author = soup.find('span', {'class': 'author'}).find('a').text.strip()
    book_genre = soup.find('a', {'class': 'a-link-normal a-color-tertiary'}).text.strip()
    book_publisher = soup.find('span', {'class': 'publisher'}).find('a').text.strip()
    book_publication_date = soup.find('span', {'class': 'a-text-normal'}).text.strip()

    # Return the book information as a dictionary
    return {
        'title': book_title,
        'author': book_author,
        'genre': book_genre,
        'publisher': book_publisher,
        'publication_date': book_publication_date
    }

Try running it:

book_name = "Python编程:从入门到实践"  # Chinese book title used as the search query
book_info = get_book_info(book_name)

print("Title:", book_info['title'])
print("Author:", book_info['author'])
print("Genre:", book_info['genre'])
print("Publisher:", book_info['publisher'])
print("Publication date:", book_info['publication_date'])

2.2 Crawling information by author

The same approach works for an author search. This version queries Amazon China (amazon.cn), grabs the author's information block, and picks out titles containing "经典" ("classic"):

import requests
from bs4 import BeautifulSoup

def get_author_books_info(author):
    # Construct the Amazon China book search URL for the given author
    url = f"https://www.amazon.cn/s?k={author}&i=stripbooks"

    # Send a GET request to fetch the page content
    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"})
    response.encoding = "utf-8"

    # Parse the page content with BeautifulSoup
    soup = BeautifulSoup(response.text, "html.parser")

    # Grab the author's information block (class names may change with Amazon's markup)
    author_info = soup.find("div", {"class": "a-section a-text-left s-align-children-center"})

    # Grab all book entries from the search results
    book_items = soup.find_all("div", {"class": "s-result-item s-asin sg-col-0-of-12 sg-col-16-of-20 sg-col sg-col-12-of-16"})

    # Walk through the entries and collect titles containing "经典" ("classic")
    classic_books = []
    for item in book_items:
        book_title = item.find("span", {"class": "a-size-medium a-color-base a-text-normal"})
        if book_title:
            book_title = book_title.text.strip()
            if "经典" in book_title:
                classic_books.append(book_title)

    # Return the result as a dictionary (guard against a missing author block)
    result = {
        "author_info": author_info.text.strip() if author_info else "",
        "classic_books": classic_books,
    }
    return result

Run it:

author = "村上春树"  # Haruki Murakami
result = get_author_books_info(author)
print("Author info:", result["author_info"])
print("Classic novels:", result["classic_books"])

2.3 Crawling information by genre

Finally, the same pattern applied to a genre search:

import requests
from bs4 import BeautifulSoup

def get_book_info(book_type):
    """
    Fetch classic novels of the given genre, and their authors, from Amazon China

    :param book_type: str, the genre to search for
    :return: tuple, (list of classic novels, list of related authors)
    """
    # Construct the query URL
    url = f"https://www.amazon.cn/s?k={book_type}&i=stripbooks&rh=n%3A658390051&page=1&qid=1620160259&ref=sr_pg_1"

    # Send a GET request to fetch the page content
    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"})
    html = response.content

    # Parse the page and collect classic novels and related authors
    soup = BeautifulSoup(html, 'html.parser')
    books = soup.select(".s-result-item")
    classic_books = []
    related_authors = []
    for book in books:
        # Skip entries that lack the expected title/author/label elements
        titles = book.select(".a-link-normal")
        authors = book.select(".a-size-base.a-link-normal")
        labels = book.select(".a-size-base.a-color-secondary")
        if not (titles and authors and labels):
            continue
        book_title = titles[0].get_text().strip()
        book_author = authors[0].get_text().strip()
        if "经典" in labels[0].get_text():  # "经典" means "classic"
            classic_books.append(book_title)
        related_authors.append(book_author)

    # Return the classic novels and related authors
    return (classic_books, related_authors)

Run it:

# Fetch classic sci-fi novels and their authors
book_type = "科幻"  # "science fiction"
classic_books, related_authors = get_book_info(book_type)

# Print the results
print("Classic novels:")
for book in classic_books:
    print(book)
print("\nRelated authors:")
for author in related_authors:
    print(author)

3 Discussion

The main point of this part was to write crawlers that fetch information for a specific query and return it. Depending on your network conditions, you can replace Amazon with a site such as Douban.
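For instance, here is a minimal sketch of a Douban variant (assuming the site search at www.douban.com/search accepts cat=1001 for books, as mentioned in Section 2.1; the result markup is an assumption, so inspect the real page and adjust the selectors):

import requests
from bs4 import BeautifulSoup

def search_douban(book_name):
    # Douban site search; cat=1001 restricts results to books (see Section 2.1)
    resp = requests.get(
        "https://www.douban.com/search",
        params={"cat": "1001", "q": book_name},
        headers={"User-Agent": "Mozilla/5.0"},
    )
    soup = BeautifulSoup(resp.text, "html.parser")
    # The result structure is an assumption; verify it in the browser and adjust
    first = soup.find("div", {"class": "result"})
    return first.get_text(strip=True) if first else None

print(search_douban("Python编程:从入门到实践"))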
