Downloading Tomato Novel content with Python

Lately I've been struggling to find novels to read.

Today I will use Python to download novels from Tomato Novel and save them locally.

Preparation

Environment

  • Python 3.8
  • PyCharm 2023

Modules used

  • requests
  • re
  • parsel
  • prettytable
  • tqdm

requests, parsel, prettytable, and tqdm are third-party modules. Press Win + R, enter cmd, and then run pip install requests parsel prettytable tqdm to install them. re is a built-in module and does not need to be installed.
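
Before running the script, it can help to confirm that every module it imports is actually available. The following is a minimal sketch, not part of the original script; `missing_modules` and `REQUIRED` are hypothetical names chosen for illustration, and the module list is taken from the imports in the source code.

```python
import importlib.util

# Modules imported by the download script (re ships with Python)
REQUIRED = ['requests', 're', 'parsel', 'prettytable', 'tqdm']


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print('Install these first: pip install ' + ' '.join(missing))
else:
    print('All required modules are available.')
```

If anything is reported as missing, installing it with pip and running the check again should be enough.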

Source code

import requests
import re
import parsel
from prettytable import PrettyTable
from tqdm import tqdm
 
 
while True:
    # Send a browser-like User-Agent with every request
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36'
    }
    key = input('Enter the novel you want to download (enter 00 to quit): ')
    if key == '00':
        break
    tb = PrettyTable()
    tb.field_names = ['No.', 'Title', 'Author', 'Category', 'Latest chapter', 'ID']
    num = 0
    info = []
    print('Searching, please wait.....')
    for page in tqdm(range(30)):
        # Placeholder host: replace it with the real search API address
        search_url = 'https://大家自己替换一下地址.com/api/author/search/search_book/v1'
        search_params = {
            'filter': '127,127,127,127',
            'page_count': '10',
            'page_index': page,
            'query_type': '0',
            'query_word': key,
        }

        search_data = requests.get(url=search_url, params=search_params, headers=headers).json()
        for i in search_data['data']['search_book_data_list']:
            book_name = i['book_name']
            author = i['author']
            book_id = i['book_id']
            category = i['category']
            last_chapter_title = i['last_chapter_title']
            dit = {
                'book_name': book_name,
                'author': author,
                'category': category,
                'last_chapter_title': last_chapter_title,
                'book_id': book_id,
            }
            info.append(dit)
            tb.add_row([num, book_name, author, category, last_chapter_title, book_id])
            num += 1
 
    print(tb)
    book = input('Enter the number of the novel you want to download: ')
    # Placeholder host: replace it with the real site address
    url = f'https://大家自己替换一下.com/page/{info[int(book)]["book_id"]}'
    response = requests.get(url=url, headers=headers)
    html_data = response.text

    name = re.findall('<div class="info-name"><h1>(.*?)</h1', html_data)[0]
    selector = parsel.Selector(html_data)
    css_name = selector.css('.info-name h1::text').get()
    href = selector.css('.chapter-item a::attr(href)').getall()
    print(f'{name}: downloading, please wait....')
    for index in tqdm(href):
        # The chapter ID is the last path segment of the chapter link
        chapter_id = index.split('/')[-1]
        # Placeholder host: replace it with the real reader API address
        link = f'https://替换掉了.com/api/novel/book/reader/full/v1/?device_platform=android&parent_enterfrom=novel_channel_search.tab.&aid=2329&platform_id=1&group_id={chapter_id}&item_id={chapter_id}'
        json_data = requests.get(url=link, headers=headers).json()['data']['content']
        title = re.findall('<div class="tt-title">(.*?)</div>', json_data)[0]
        content = '\n'.join(re.findall('<p>(.*?)</p>', json_data))
        # Append each chapter to a single text file named after the book
        with open(f'{name}.txt', mode='a', encoding='utf-8') as f:
            f.write(title)
            f.write('\n')
            f.write(content)
            f.write('\n')
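
The chapter loop in the source pulls the title and paragraphs out of the returned HTML with two regular expressions, and indexing `[0]` raises an IndexError if a chapter has no title element. A minimal sketch of the same extraction as a standalone function follows. `parse_chapter`, `fallback_title`, and the sample string are hypothetical, and the entity unescaping and the `re.S` flag are additions for illustration rather than something the original script does.

```python
import html
import re


def parse_chapter(raw_html, fallback_title='Untitled'):
    """Extract (title, text) from a chapter's HTML using the same patterns as the script."""
    titles = re.findall(r'<div class="tt-title">(.*?)</div>', raw_html)
    # Fall back to a default instead of failing when no title is found
    title = html.unescape(titles[0]) if titles else fallback_title
    # re.S lets a paragraph span several lines (an assumption about the markup)
    paragraphs = re.findall(r'<p>(.*?)</p>', raw_html, flags=re.S)
    text = '\n'.join(html.unescape(p).strip() for p in paragraphs)
    return title, text


# Made-up sample input, not real site output
sample = '<div class="tt-title">Chapter 1</div><p>First line.</p><p>Second &amp; last.</p>'
title, text = parse_chapter(sample)
print(title)
print(text)
```

Keeping the parsing in its own function makes it possible to try it on a saved response without sending any requests.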

Result

Search and download
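
In the search-and-download flow, the script indexes `info` with whatever the user types, so a non-numeric or out-of-range entry stops the program with an exception. A minimal sketch of a guard is given here; `pick_book` is a hypothetical helper and the sample entries are made up for illustration.

```python
def pick_book(info, raw_choice):
    """Return the chosen entry from info, or None if the input is not a valid row number."""
    try:
        index = int(raw_choice)
    except ValueError:
        return None
    # Reject negative numbers and numbers past the end of the table
    if 0 <= index < len(info):
        return info[index]
    return None


# Made-up search results in the same shape as the script's info list
info = [
    {'book_name': 'Example A', 'book_id': '111'},
    {'book_name': 'Example B', 'book_id': '222'},
]
print(pick_book(info, '1'))
print(pick_book(info, '7'))
print(pick_book(info, 'abc'))
```

In the script, a `None` result could be used to ask the user for the number again instead of exiting.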


As you can see, the whole thing is still quite simple.

That's all for this post. See you next time~

Origin blog.csdn.net/ooowwq/article/details/133860021