After reading this summary of working with PPT in Python, you can use Python to handle the whole Office suite with no pressure!

One, guide

Hello everyone. Today's article is another installment in the Python office automation basics series, building on what earlier articles in the series explained in detail.

This article uses the third-party library pptx to explain in detail how to use Python to work with the last member of the Office suite: PPT.

Two, installation

pptx is not part of the standard library and needs to be installed from the command line:

pip install python-pptx

Note that the package is installed under the name python-pptx but imported in code as pptx. The docx module works the same way (installed as python-docx, imported as docx).

Three, prerequisites

1. Basic structure

Looking at how a PPT is put together, its structure is considerably more complicated than that of a Word document. This is partly because PPT is so highly customizable.

In simple terms, a PPT file is a presentation, and its basic structure is file (presentation) - slide page (slide) - shape (shape). Shapes need to be divided into those that contain text and those that do not (plain pictures, etc.).

If a shape contains text, you can get its internal text box. A text box can be regarded as a small Word document, made up of paragraphs (paragraph) and text blocks (run).

Here is a short summary of the structure of the three Office formats:

image

2. Templates and placeholders

image

As shown in the figure above, we can preset various layouts through the slide master. When creating a new slide later, just click the layout to generate the required basic format with one click.

Next, placeholders (Placeholder). A placeholder carries its own style settings, including font, font size, color, and so on, so text entered into a specific placeholder automatically takes on that placeholder's style.

3. Basic ideas for creating PPT files

  • Create a PPT
  • Determine a layout from the slide master
  • Fill in different content in different placeholders
  • Add additional content such as pictures and tables
  • Make changes to the style

Four, Python reads PPT

1. Open the PPT file

from pptx import Presentation
# Path of the file to open
file_path = r'...'
pptx = Presentation(file_path)

2. Get the slides

pptx.slides gives an iterable collection of all the slide objects (slide):

for slide in pptx.slides: 
    print(slide)

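3. Get the shapes

Once you are familiar with the multi-level structure of Excel and Word files, the structure of a PPT is easy to understand. Every slide holds a collection of shapes (shape), typically one or more: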

for slide in pptx.slides: 
    for shape in slide.shapes: 
        print(shape)

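4. Get the text box content

To get the text, it is natural to look for it one level below the shape. From what we learned with Word, we can also infer that text is carried by paragraphs (paragraph) and text blocks (run).

So it is tempting to try to get the text with code like the following: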

for slide in pptx.slides: 
    for shape in slide.shapes: 
        for paragraph in shape.paragraphs: 
            print(paragraph.text)

or

for slide in pptx.slides:
    for shape in slide.shapes:
        for paragraph in shape.paragraphs:
            for run in paragraph.runs:
                print(run.text)

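But this raises a question: does every shape necessarily contain text?

image

As the figure above shows, the blue ellipse contains no text at all, while the large dashed box in the middle does.

Whether a shape contains text depends on whether it has a text frame (text_frame). The operations related to the text frame are:

  • shape.has_text_frame: checks whether the shape has a text frame
  • shape.text_frame: gets the text frame, whose .text attribute holds the text content

In a PPT the text frame is what actually carries the text. A shape itself has no paragraphs attribute, so the two snippets above fail with an AttributeError. The working code for getting the text is as follows: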

for slide in pptx.slides: 
    for shape in slide.shapes: 
        if shape.has_text_frame: 
            text_frame = shape.text_frame 
            print(text_frame.text)

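At this point we need to revise our earlier picture of the PPT structure: the text frame (text_frame) sits between the shape and its text, giving presentation - slide - shape - text frame - paragraph - run.

5. Get the paragraphs and text blocks

Every text frame can be regarded as a small Word document with two levels inside it, paragraphs and text blocks: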

for slide in pptx.slides: 
    for shape in slide.shapes: 
        if shape.has_text_frame: 
            text_frame = shape.text_frame 
            for paragraph in text_frame.paragraphs: 
                for run in paragraph.runs: 
                    print(run.text)

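Five, writing PPT

The code for creating a brand-new PPT is analogous to the code for creating a Word file: if no path is given when instantiating Presentation, a blank file is created.

1. Create a slide

image

Here the placeholder number is what distinguishes one placeholder from another, and it is also what you use when writing content.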

2. Fill in the placeholders

Specifying the placeholder number lets you write specific content at a specific position:

slide.placeholders[placeholder_idx].text = '...'

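Six, modifying PPT styles

1. Modifying paragraph styles

The imports needed for paragraph styling are analogous to those of the python-docx module:

image

The specific methods of the two modules are also very similar:

  • .add_run(): adds a new text block
  • .line_spacing: line spacing within the paragraph
  • .runs: all the text blocks in the paragraph
  • .space_after: spacing after the paragraph
  • .space_before: spacing before the paragraph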

2. Modifying text styles

Text styles are set the same way as in Word:

  • .font.name: font name
  • .font.bold: bold or not
  • .font.italic: italic or not
  • .font.color: font color
  • .font.size: font size

One point needs to be distinguished, though: in python-pptx these font settings can be applied to the paragraph as a whole, i.e. paragraph.font.xxxx (run.font is available as well), whereas in python-docx they are applied to text blocks (runs).

Final words

That concludes this summary of the common python-pptx methods for working with PPT files. They should be enough for most everyday office tasks; for more detailed code, consult the official documentation.


Origin blog.51cto.com/15064626/2597996