Qt in Practice, Case 1: Writing an OCR Tool from Scratch (8): PDF Display, Page Capture, and Text Recognition

I. Loading and Displaying a PDF

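Qt provides the Pdf and PdfWidgets modules. Install them first, then add the CMake lines listed at the end of this section to the project's CMakeLists.txt so the project can use them.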

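Qt ships complete PDF viewer examples. Example 1 is a widget-based PDF viewer that lets you scroll through the pages. Example 2 is a Qt Quick PDF viewer that also lets you scroll through the pages.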

https://doc.qt.io/qt-6/qtpdf-examples.html

CMakeLists.txt:

find_package(Qt6 REQUIRED COMPONENTS Pdf)
find_package(Qt6 REQUIRED COMPONENTS PdfWidgets)

target_link_libraries(QtFFmpegApp2 PRIVATE Qt6::Pdf)
target_link_libraries(QtFFmpegApp2 PRIVATE Qt6::PdfWidgets)

II. Initial Feature Design

1. Overview

The left side of the window displays the PDF; the right side shows the OCR results.

2. The QPdfView widget

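QPdfView does not appear in the widget box on the left side of the designer, so place a widget on the form and promote it to QPdfView. Then create the document and attach it to the view: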

    m_document = new QPdfDocument(this);

    // Bookmarks (table of contents). Not needed here, so they are not shown.
    //QPdfBookmarkModel *bookmarkModel = new QPdfBookmarkModel(this);
    //bookmarkModel->setDocument(m_document);
    //ui->bookmarkView->setModel(bookmarkModel);

    ui->pdfView->setZoomMode(QPdfView::FitToWidth);
    //ui->pdfView->setZoomFactor(0.8);
    ui->pdfView->setDocument(m_document);

3. Opening a PDF

Clicking the "choose PDF" button opens a file dialog. Once a PDF has been selected, open() is called to load it.

void PdfViewWindow::on_pushButton_clicked()
{
    QUrl toOpen = QFileDialog::getOpenFileUrl(this, tr("Choose a PDF"), QUrl(), "Portable Documents (*.pdf)");
    if (toOpen.isValid())
        open(toOpen);
}

4. Displaying the PDF

Passing the path of the local PDF file to QPdfDocument is all it takes to open the file.

void PdfViewWindow::open(const QUrl &docLocation)
{
    if (docLocation.isLocalFile()) {
        m_document->load(docLocation.toLocalFile());
        const auto documentTitle = m_document->metaData(QPdfDocument::Title).toString();
        setWindowTitle(!documentTitle.isEmpty() ? documentTitle : QStringLiteral("PDF Viewer"));
        ui->label->setText(tr("%1 pages").arg(ui->pdfView->pageNavigation()->pageCount()));
        ui->lineEdit->setText(QString::number(1));
    } else {
        QMessageBox::critical(this, tr("Failed to open"), tr("%1 is not a valid local file").arg(docLocation.toString()));
    }
}
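
After a document is opened, the line edit is reset to "1", which suggests it holds a 1-based page number, while Qt PDF addresses pages by 0-based index. A minimal sketch of validating such input, assuming the line edit is meant for page numbers; `pageIndexFromInput` is a hypothetical helper written in plain C++, not part of Qt or of the project:

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: converts a user-entered 1-based page number into a
// 0-based page index. Returns std::nullopt when the text is not a plain
// integer or the number lies outside [1, pageCount].
std::optional<int> pageIndexFromInput(std::string_view text, int pageCount)
{
    int pageNumber = 0;
    const char *first = text.data();
    const char *last = text.data() + text.size();

    // from_chars must consume the whole input, otherwise "3a" would pass.
    const auto result = std::from_chars(first, last, pageNumber);
    if (result.ec != std::errc{} || result.ptr != last)
        return std::nullopt;

    if (pageNumber < 1 || pageNumber > pageCount)
        return std::nullopt;

    return pageNumber - 1;
}
```

In a slot, the text of the line edit could be converted with toStdString() and passed to this helper together with the document's page count; an empty result would simply leave the current page unchanged.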

5. Page capture and text recognition

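QPdfDocument provides a render() method that returns a page as an image, which serves as the screenshot. Once the image has been saved, a new thread is started to run OCR on it. This is the same thread class used earlier for image OCR, and the handling is identical.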

/**
 * @brief PdfViewWindow::on_pushButton_3_clicked
 * Extracts the text of the current page.
 */
void PdfViewWindow::on_pushButton_3_clicked()
{
    QString filePath = QString("%1\\screen.jpg").arg(qApp->applicationDirPath().replace("/", "\\"));

    // Render the current page to an image at twice the page size
    //size = m_document->pageSize(1);//QSize(600, 800)
    int page = ui->pdfView->pageNavigation()->currentPage();
    QSize origin = m_document->pageSize(page).toSize();
    const QSize renderSize = origin * 2;  // stack value, no heap allocation needed
    QImage image = m_document->render(page, renderSize);
    image.save(filePath, "jpg");

    // Clear the text box on the right
    ui->textEdit->setText("");

    // Create and show the loading dialog
    loading = new LoadingDialog(this);
    loading->setVisible(true);

    // Start the recognition thread
    m_thread  =  new MyThreadForTextRecognition;
    m_thread->init(filePath.toStdString(), QString("%1\\screen_action.jpg").arg(qApp->applicationDirPath().replace("/", "\\")).toStdString(), ui->comboBox->currentIndex(), ui->checkBox_3->isChecked(), ui->comboBox_2->currentIndex());
    connect(m_thread, &MyThreadForTextRecognition::getRecognitionText,this,&PdfViewWindow::getRecognitionText);
    connect(m_thread, &MyThreadForTextRecognition::recognitionFinish,this,&PdfViewWindow::recognitionFinish);
    m_thread->start();
}
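
The handler above renders the page at twice its reported size. QPdfDocument reports page sizes in points (1/72 inch), so a factor of 2 corresponds to roughly 144 DPI. A minimal sketch of deriving the pixel size for an arbitrary target DPI; `PixelSize` and `renderSizeForDpi` are hypothetical names written in plain C++, not part of Qt:

```cpp
#include <cmath>

// Hypothetical value type standing in for QSize in this sketch.
struct PixelSize {
    int width;
    int height;
};

// Converts a page size in points (1/72 inch) to the pixel size needed to
// render it at the requested DPI, rounding to the nearest pixel.
PixelSize renderSizeForDpi(double widthPt, double heightPt, double dpi)
{
    return {
        static_cast<int>(std::lround(widthPt * dpi / 72.0)),
        static_cast<int>(std::lround(heightPt * dpi / 72.0)),
    };
}
```

For an A4 page of about 595 x 842 points, a target of 144 DPI gives 1190 x 1684 pixels, matching the factor of 2 used in the handler. A higher DPI produces a larger image, at the cost of more memory and longer recognition time.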