Demonstration video (bilibili): using Python to judge from a company's annual report whether it is an import/export business. The complete code and data are available in the video's comment section. The approach reads the text of each PDF and matches keywords to decide whether the company is an import/export company. This is only a test; the same code can be run over many annual reports, but the PDFs are large, so the judgment is slow and laborious.
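The step described above (read the PDF text, then match keywords) could be sketched roughly as follows. This is a minimal illustration, not the code from the video: the keyword list `TRADE_KEYWORDS`, the threshold `min_hits`, and all function names are hypothetical, and `extract_pdf_text` assumes a PyPDF2 version that provides `PdfReader` (2.x or later).

```python
import re

# Hypothetical keyword list; the original post does not show which keywords it uses.
TRADE_KEYWORDS = ["进出口", "出口", "进口"]


def keep_chinese(text):
    """Return only the Chinese characters in text, joined into one string."""
    return "".join(re.findall(r"[\u4e00-\u9fa5]", str(text)))


def count_keywords(text, keywords=TRADE_KEYWORDS):
    """Count non-overlapping occurrences of each keyword in text."""
    return {kw: text.count(kw) for kw in keywords}


def looks_like_trade_company(text, keywords=TRADE_KEYWORDS, min_hits=1):
    """Return True if the cleaned text contains at least min_hits keyword matches."""
    cleaned = keep_chinese(text)
    return sum(count_keywords(cleaned, keywords).values()) >= min_hits


def extract_pdf_text(path):
    """Concatenate the text of every page in a PDF (assumes PyPDF2 >= 2.0)."""
    from PyPDF2 import PdfReader  # imported lazily so the helpers above work without it

    reader = PdfReader(path)
    # extract_text() can return an empty result for image-only pages, hence the "or".
    return "".join(page.extract_text() or "" for page in reader.pages)
```

Usage would be along the lines of `looks_like_trade_company(extract_pdf_text("report.pdf"))`, where `report.pdf` is a placeholder path. Note that "出口" is a substring of "进出口", so the per-keyword counts overlap. A real classifier would probably need a higher threshold than one hit, since most annual reports mention imports or exports somewhere.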
Data Display:
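# Import/export judgment analysis of company annual reports in Python
# Step: read the text in each PDF and match keywords to decide whether the company is an import/export company
# This is only a test; it can be run over many reports. Annual-report PDFs are large, so the judgment is slow and laborious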
import os
import PyPDF2
import re
import pandas as pd
from tqdm import tqdm
import jieba
# pip install PyPDF2 -i https://pypi.tuna.tsinghua.edu.cn/simple
def re_pipei_word(text):
    res = re.findall('[\u4e00-\u9fa5]', str(text))
    # print(res)  # ['也', '像', '疼']
    # print("".join(res))  # 也像疼
    return "&