iText7 Advanced Tutorial, Building Blocks, Part 1: Introducing fonts

    In this chapter, we start with some examples that use different fonts to display a title and an author. Along the way we introduce a few classes, such as FontProgram and PdfFont.

This chapter is rather long, so please read it patiently. For topics related to Chinese fonts, please refer to my blog and personal website.

The code, fonts and operating system used in the examples on the official website may differ from your local environment, so your results may differ; rely on what you actually observe.

1. Create a PdfFont object

    As shown in Figure 1 below, three different fonts are used to create a PDF document with a title and an author: Helvetica, Times-Bold and Times-Roman. In Adobe Acrobat Pro, these fonts are replaced with ArialMT, TimesNewRomanPS-BoldMT and TimesNewRomanPSMT.

itext7-h-1-1

Figure 1. Standard Type 1 font

    The names of the actual fonts contain MT, which refers to Monotype Imaging Holdings, Inc., the font vendor. These fonts are shipped with Windows. If you open the same file on a Linux machine, other fonts will be used as the actual fonts. This happens whenever fonts are not embedded: the viewer searches the operating system for the fonts it needs to display the document, and if it cannot find the declared font, it uses a replacement.

Traditionally, every PDF viewer is expected to recognize 14 fonts: four Helvetica fonts (normal, bold, oblique and bold-oblique), four Times-Roman fonts (normal, bold, italic and bold-italic), four Courier fonts (normal, bold, oblique and bold-oblique), Symbol and ZapfDingbats (left untranslated here, since it is a proper name). Italic and oblique both produce slanted text; the difference is that italic is a separately designed slanted face, while oblique is used for fonts that have no italic design and simply slants the upright glyphs. These 14 fonts are known as the standard Type 1 fonts. A viewer does not have to use exactly the declared font; it may use a font that looks almost identical.
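
To make the list above concrete, here is a minimal sketch that groups the 14 standard Type 1 fonts by family, using their PostScript names. The class and method names are illustrative only and are not part of iText; it is just a way to check that the four families plus the two symbol fonts add up to 14.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StandardType1Fonts {

    // PostScript names of the 14 standard Type 1 fonts, grouped by family.
    static Map<String, List<String>> byFamily() {
        Map<String, List<String>> fonts = new LinkedHashMap<>();
        fonts.put("Helvetica", Arrays.asList("Helvetica", "Helvetica-Bold",
                "Helvetica-Oblique", "Helvetica-BoldOblique"));
        fonts.put("Times", Arrays.asList("Times-Roman", "Times-Bold",
                "Times-Italic", "Times-BoldItalic"));
        fonts.put("Courier", Arrays.asList("Courier", "Courier-Bold",
                "Courier-Oblique", "Courier-BoldOblique"));
        fonts.put("Symbol", Arrays.asList("Symbol"));
        fonts.put("ZapfDingbats", Arrays.asList("ZapfDingbats"));
        return fonts;
    }

    // Total number of fonts across all families.
    static int count() {
        int total = 0;
        for (List<String> family : byFamily().values()) {
            total += family.size();
        }
        return total;
    }

    public static void main(String[] args) {
        byFamily().forEach((family, names) -> System.out.println(family + ": " + names));
        System.out.println("total: " + count());
    }
}
```

Note that Helvetica and Courier use the Oblique suffix while Times uses Italic, which matches the distinction described above.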

    To create the PDF above, we need three fonts: two are declared explicitly and one is used implicitly. The code is as follows:

PdfDocument pdf = new PdfDocument(new PdfWriter(dest));
Document document = new Document(pdf);
PdfFont font = PdfFontFactory.createFont(FontConstants.TIMES_ROMAN);
PdfFont bold = PdfFontFactory.createFont(FontConstants.TIMES_BOLD);
Text title =
    new Text("The Strange Case of Dr. Jekyll and Mr. Hyde").setFont(bold);
Text author = new Text("Robert Louis Stevenson").setFont(font);
Paragraph p = new Paragraph().add(title).add(" by ").add(author);
document.add(p);
document.close();

    In line 1, we create a PdfDocument using a PdfWriter. These are low-level objects that create the PDF file based on your content. In line 2, we create a Document instance, a high-level object that lets you create a document without worrying about PDF syntax.

    In lines 3 and 4, we use PdfFontFactory to create PdfFont objects. The FontConstants class contains constants for the 14 standard Type 1 fonts. In lines 5 and 6, we create a Text object containing the title of the story and set its font to TIMES_BOLD. In line 7, we create a Text object containing the author's name and set its font to TIMES_ROMAN. We cannot add these Text objects to the document directly, but we can add them to a BlockElement, here the Paragraph in line 8.

Between the title and the author, we add a String. Since we have not defined a font for this String, the default font of the Paragraph is used. In iText, the default font is Helvetica. This explains why Helvetica appears in the font list in the figure above.

    In line 9, we add the Paragraph to the document; in line 10, we close the document.

    We have now created a PDF without embedding fonts. As a result, only a limited set of fonts is available when the file is rendered; by embedding fonts we can use many more.

2. Create a font program

    iText supports the standard Type 1 fonts because the io jar contains the Adobe Font Metrics (AFM) files of these 14 fonts. These files contain the metrics needed to calculate the width and height of words and lines, which is necessary when laying out the document.

    If we want to embed a font, we need a font program. For standard Type 1 fonts, font programs are stored in PostScript Font Binary (PFB) files. For the 14 standard Type 1 fonts, these PFB files are proprietary; they are not shipped with iText7 because iText has no license to distribute them. iText can only ship the metrics files.

    For this licensing reason iText cannot embed these 14 fonts, but that does not mean iText cannot embed fonts at all. In the following example, we embed subsets of three fonts from the Cardo family in the PDF file. The Cardo font programs are released under the Summer Institute of Linguistics (SIL) Open Font License (OFL).

    The result is shown in Figure 2 below:

itext7-h-1-2

Figure 2. Embedding fonts

    First, we need the paths to three font files: Cardo-Regular.ttf, Cardo-Bold.ttf and Cardo-Italic.ttf, as shown below:

public static final String REGULAR =
    "src/main/resources/fonts/Cardo-Regular.ttf";
public static final String BOLD =
    "src/main/resources/fonts/Cardo-Bold.ttf";
public static final String ITALIC =
    "src/main/resources/fonts/Cardo-Italic.ttf";

    Then we obtain a FontProgram object from the FontProgramFactory:

FontProgram fontProgram =
    FontProgramFactory.createFont(REGULAR);

    Using this FontProgram instance, we can create a PdfFont object:

PdfFont font = PdfFontFactory.createFont(
    fontProgram, PdfEncodings.WINANSI, true);

    Here we pass an encoding (PdfEncodings.WINANSI) and indicate that the font must be embedded (true). Alternatively, we can pass the font path directly to PdfFontFactory to create the font, as shown below:

PdfFont bold = PdfFontFactory.createFont(BOLD, true);
PdfFont italic = PdfFontFactory.createFont(ITALIC, true);

    Now we can use the three fonts in our Paragraph object:

Text title =
    new Text("The Strange Case of Dr. Jekyll and Mr. Hyde").setFont(bold);
Text author = new Text("Robert Louis Stevenson").setFont(font);
Paragraph p = new Paragraph().setFont(italic)
    .add(title).add(" by ").add(author);
document.add(p);

    The Helvetica font no longer appears in the figure above because we changed the default font of the Paragraph.

3. The difference between FontProgram and PdfFont

    In the following examples, we will always use PdfFontFactory to create PdfFont objects. Internally, PdfFontFactory uses a FontProgram instance, but we need to be clear about an important difference between PdfFont and FontProgram:

  • One FontProgram object can be used to create different PdfFont objects for different PDF documents.
  • A PdfFont object can only be used for one PdfDocument.

    A PdfFont object can only be used for a single document because it keeps track of all the glyphs used in that document. This way, the entire font program does not need to be added to the PDF file; only a subset of the font is added, which reduces the size of the PDF file.
    Let's take a look at the code:

protected PdfFont font;
protected PdfFont bold;
protected PdfFont italic;
public static void main(String args[]) throws IOException {
    File file = new File(DEST);
    file.getParentFile().mkdirs();
    C01E02_Text_Paragraph_Cardo2 app =
        new C01E02_Text_Paragraph_Cardo2();
    FontProgram fontProgram =
        FontProgramFactory.createFont(REGULAR);
    FontProgram boldProgram =
        FontProgramFactory.createFont(BOLD);
    FontProgram italicProgram =
        FontProgramFactory.createFont(ITALIC);
    for (int i = 0; i < 3; ) {
        app.font = PdfFontFactory.createFont(
            fontProgram, PdfEncodings.WINANSI, true);
        app.bold = PdfFontFactory.createFont(
            boldProgram, PdfEncodings.WINANSI, true);
        app.italic = PdfFontFactory.createFont(
            italicProgram, PdfEncodings.WINANSI, true);
        app.createPdf(String.format(DEST, ++i));
    }
}
public void createPdf(String dest) throws IOException {
    PdfDocument pdf = new PdfDocument(new PdfWriter(dest));
    Document document = new Document(pdf);
    Text title =
        new Text("The Strange Case of Dr. Jekyll and Mr. Hyde")
            .setFont(bold);
    Text author = new Text("Robert Louis Stevenson")
        .setFont(font);
    Paragraph p = new Paragraph()
        .setFont(italic).add(title).add(" by ").add(author);
    document.add(p);
    document.close();
}

    In this example, we create three FontProgram instances: fontProgram, boldProgram and italicProgram. We reuse these instances three times to create three PDF documents, and for each PDF document we create new PdfFont instances.

    The following code is wrong because it tries to reuse PdfFont instances across different PDF documents:

public static void main(String args[]) throws IOException {
    File file = new File(DEST);
    file.getParentFile().mkdirs();
    C01E02_Text_Paragraph_Cardo2 app =
        new C01E02_Text_Paragraph_Cardo2();
    app.font = PdfFontFactory.createFont(REGULAR, true);
    app.bold = PdfFontFactory.createFont(BOLD, true);
    app.italic = PdfFontFactory.createFont(ITALIC, true);
    for (int i = 0; i < 3; ) {
        app.createPdf(String.format(DEST, ++i));
    }
}

    If you try this code, the following error will be thrown:

com.itextpdf.kernel.PdfException: Pdf indirect object belongs to other PDF document. Copy object to current pdf document.

    This exception is thrown during the second call to createPdf(), because we try to reuse the PdfFont instances that were already used in the first call to createPdf().
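
The ownership rule can be pictured with a toy model. The sketch below uses no iText classes: ToyFontProgram, ToyDocument and ToyFont are hypothetical stand-ins that I made up for illustration, and the real library tracks glyphs and PDF objects in a more involved way. It only shows why a shared font program can be reused freely while a per-document font, which records what it has drawn, cannot.

```java
import java.util.Set;
import java.util.TreeSet;

public class FontOwnershipSketch {

    /** Hypothetical stand-in for a font program: shareable, never modified. */
    static final class ToyFontProgram {
        final String name;
        ToyFontProgram(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for a PDF document; only its identity matters. */
    static final class ToyDocument {
        final String name;
        ToyDocument(String name) { this.name = name; }
    }

    /** Hypothetical per-document font that records the characters it has drawn. */
    static final class ToyFont {
        private final ToyFontProgram program;
        private final Set<Integer> usedCodePoints = new TreeSet<>();
        private ToyDocument owner; // set on first use

        ToyFont(ToyFontProgram program) { this.program = program; }

        void write(ToyDocument doc, String text) {
            if (owner == null) {
                owner = doc;
            } else if (owner != doc) {
                throw new IllegalStateException(
                        program.name + " font already belongs to " + owner.name);
            }
            text.codePoints().forEach(usedCodePoints::add);
        }

        /** Number of distinct characters a subset would have to contain. */
        int subsetSize() { return usedCodePoints.size(); }
    }

    static int subsetSizeFor(String text) {
        ToyFont font = new ToyFont(new ToyFontProgram("Cardo-Regular"));
        font.write(new ToyDocument("doc1"), text);
        return font.subsetSize();
    }

    static boolean reuseAcrossDocumentsFails() {
        ToyFont font = new ToyFont(new ToyFontProgram("Cardo-Regular"));
        font.write(new ToyDocument("doc1"), "first");
        try {
            font.write(new ToyDocument("doc2"), "second");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        ToyFontProgram program = new ToyFontProgram("Cardo-Regular"); // created once
        for (int i = 1; i <= 3; i++) {
            ToyFont font = new ToyFont(program); // a fresh font per document
            font.write(new ToyDocument("doc" + i), "Jekyll and Hyde");
            System.out.println("doc" + i + " uses " + font.subsetSize() + " distinct characters");
        }
        System.out.println("reuse rejected: " + reuseAcrossDocumentsFails());
    }
}
```

The loop in main mirrors the correct example above (one program, a new font per document), and reuseAcrossDocumentsFails() mirrors the incorrect one.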

4. The importance of embedded fonts

    Figure 3 below shows what happens if we don't embed fonts:

itext7-h-1-3

Figure 3. Ugly font substitution

    In this example, we have defined three fonts:

PdfFont font = PdfFontFactory.createFont(REGULAR);
PdfFont bold = PdfFontFactory.createFont(BOLD);
PdfFont italic = PdfFontFactory.createFont(ITALIC);

    The constants REGULAR, BOLD and ITALIC all point to the correct Cardo .ttf files, but we did not tell iText to embed the fonts. The Cardo fonts are not installed on the machine used to view the PDF, so Adobe Reader replaces them with Adobe Sans MM. As we can see, the result does not look good. If you are not using one of the standard Type 1 fonts, you should always embed the font.

    If you try to create a PDF in a different language, the result gets even worse. As shown in Figure 4 below, we tried to add some text in Czech, Russian and Korean. The Czech text looks almost right, but one character is missing (I don't read Czech, so I could not spot it myself). The Russian and Korean text is not displayed at all.

itext7-h-1-4

Figure 4. Incorrect rendering of Czech, Russian and Korean fonts

    Embedding the fonts is necessary, but we also need to define the correct encoding.

5. Choose the right encoding

    The text that should have been rendered in the figure above is the following:

Podivný případ Dr. Jekylla a pana Hyda by Robert Louis Stevenson
Странная история доктора Джекила и мистера Хайда by Robert Louis Stevenson
하이드, 지킬, 나 by Robert Louis Stevenson

    The first line is the Czech translation of "The Strange Case of Dr. Jekyll and Mr. Hyde". If you look closely at the figure, you can see that the letter ř has disappeared, because ř cannot be represented in WinAnsi encoding. WinAnsi, the default code page on Western Windows systems, is also known as code page 1252 (CP-1252) or Windows-1252 (the default code page on Chinese systems is not 1252; we usually use 936, GBK). It is a superset of ISO 8859-1, also called Latin-1.

    For Czech, we need another encoding. One option is code page 1250, an encoding for Central and Eastern European languages that use Latin script. The second line is Strannaya istoriya doktora Dzhekila i mistera Khayda. For this text we can use code page 1251, an encoding designed to cover languages that use Cyrillic script. Both Cp1250 and Cp1251 are 8-bit character encodings. The third line is Hyde, Jekyll, Me, the title of a Korean drama loosely based on the story of Jekyll and Hyde. Korean cannot be displayed with an 8-bit encoding; to render this text we need Unicode. Unicode is a computing industry standard for encoding, representing and processing text in most of the world's writing systems.
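
Which code page can hold which text can be checked without iText. The following sketch uses the JDK's own charsets for these Windows code pages (assuming your JDK ships windows-1250, windows-1251 and windows-1252, which a full JDK normally does). It says nothing about how iText's PdfEncodings work internally; it only confirms that ř is missing from code page 1252 and that no 8-bit code page here covers Korean. The class and method names are my own.

```java
import java.nio.charset.Charset;

public class CodePageCheck {

    static final String CZECH = "Podivn\u00fd p\u0159\u00edpad Dr. Jekylla a pana Hyda";
    static final String RUSSIAN =
            "\u0421\u0442\u0440\u0430\u043d\u043d\u0430\u044f "
            + "\u0438\u0441\u0442\u043e\u0440\u0438\u044f";
    static final String KOREAN = "\ud558\uc774\ub4dc, \uc9c0\ud0ac, \ub098";

    // True when every character of text can be encoded in the named Java charset.
    static boolean fits(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String[] charsets = {"windows-1252", "windows-1250", "windows-1251", "UTF-16BE"};
        for (String cs : charsets) {
            System.out.printf("%-12s czech=%b russian=%b korean=%b%n", cs,
                    fits(cs, CZECH), fits(cs, RUSSIAN), fits(cs, KOREAN));
        }
    }
}
```

Only the Unicode charset in the last row accepts all three strings, which is why Korean forces us to use Unicode.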

When you create a font using an 8-bit encoding, iText creates a simple font for the PDF. A simple font consists of at most 256 characters, mapped to at most 256 glyphs. When you create a font using Unicode (in PDF terms: Identity-H for horizontal writing systems or Identity-V for vertical writing systems), iText creates a composite font. A composite font can contain 65,536 characters. This is less than the total number of code points available in Unicode (1,114,112), which means that no single font can contain all possible characters in all possible languages.
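
The numbers in this paragraph follow directly from the code widths. As a small worked example (plain arithmetic, with class and method names of my own choosing), 8 bits give 256 codes, 16 bits give 65,536 codes, and Unicode defines code points from U+0000 to U+10FFFF:

```java
public class FontCapacity {

    static final int SIMPLE_FONT_CODES = 1 << 8;      // 8-bit codes
    static final int COMPOSITE_FONT_CODES = 1 << 16;  // 16-bit codes
    static final int UNICODE_CODE_POINTS = Character.MAX_CODE_POINT + 1;

    // Minimum number of fonts of the given capacity needed to address n codes.
    static int fontsNeeded(int codes, int capacity) {
        return (codes + capacity - 1) / capacity;
    }

    public static void main(String[] args) {
        System.out.println("simple font:    " + SIMPLE_FONT_CODES);
        System.out.println("composite font: " + COMPOSITE_FONT_CODES);
        System.out.println("Unicode:        " + UNICODE_CODE_POINTS);
        System.out.println("composite fonts to span Unicode: "
                + fontsNeeded(UNICODE_CODE_POINTS, COMPOSITE_FONT_CODES));
    }
}
```

This is only an upper-bound calculation over the whole code space; many code points are unassigned, so real font sets need to cover far fewer characters.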

    Besides Cp1250 and Cp1251, we can also use Unicode for Czech and Russian. In fact, when we hard-code text in source code, it is safer to store it as Unicode escape values, as follows:

public static final String CZECH =
        "Podivn\u00fd p\u0159\u00edpad Dr. Jekylla a pana Hyda";
public static final String RUSSIAN =
        "\u0421\u0442\u0440\u0430\u043d\u043d\u0430\u044f "
        + "\u0438\u0441\u0442\u043e\u0440\u0438\u044f "
        + "\u0434\u043e\u043a\u0442\u043e\u0440\u0430 "
        + "\u0414\u0436\u0435\u043a\u0438\u043b\u0430 \u0438 "
        + "\u043c\u0438\u0441\u0442\u0435\u0440\u0430 "
        + "\u0425\u0430\u0439\u0434\u0430";
public static final String KOREAN =
        "\ud558\uc774\ub4dc, \uc9c0\ud0ac, \ub098";

    The next examples use CZECH, RUSSIAN and KOREAN.

Why should we use Unicode to represent special characters?

    When source code files are stored on disk, committed to a version control system or transferred in any way, there is always a risk that the encoding gets lost. If a Unicode file is stored as plain text, a two-byte character turns into two single-byte characters. For example, the character 킬 with the Unicode value \ud0ac turns into two characters with the codes d0 and ac. When this happens, the Hangul syllable 킬 (pronounced "kil") becomes Ð¬ and the text becomes illegible. It is good practice to use the Unicode notation shown in the snippet above; it helps you avoid encoding problems in source code.
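
The d0/ac example can be reproduced with the JDK alone. The sketch below (class and method names are mine) encodes a string as two-byte UTF-16 units and then deliberately misreads every byte as a Latin-1 character, which is one way the kind of damage described above can arise:

```java
import java.nio.charset.StandardCharsets;

public class EncodingLossDemo {

    // Encode as big-endian UTF-16, then misread each byte as one Latin-1 character.
    static String misreadAsLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_16BE);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String kil = "\ud0ac"; // the Hangul syllable "kil", written as an escape
        String damaged = misreadAsLatin1(kil);
        System.out.println(kil.length() + " char -> " + damaged.length() + " chars");
        for (char c : damaged.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c); // U+00D0, then U+00AC
        }
    }
}
```

Because the source file itself contains only ASCII characters (the escape \ud0ac), it survives being saved or transferred with the wrong encoding, which is the point of the advice above.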

    Choosing the correct encoding does not solve every font problem, as the following code shows:

PdfFont font = PdfFontFactory.createFont(FontConstants.TIMES_ROMAN);
document.add(new Paragraph().setFont(font)
        .add(CZECH).add(" by Robert Louis Stevenson"));
document.add(new Paragraph().setFont(font)
        .add(RUSSIAN).add(" by Robert Louis Stevenson"));
document.add(new Paragraph().setFont(font)
        .add(KOREAN).add(" by Robert Louis Stevenson"));

    The PDF generated by this code is clearly incorrect, not only because the correct encoding is not used, but also because the font we chose does not support Russian and Korean. To fix this, we embed the FreeSans font for Czech and Russian, and the HCR Batang font for Korean. We start by using Cp1250 and Cp1251 for Czech and Russian.

public static final String FONT = "src/main/resources/fonts/FreeSans.ttf";
public static final String HCRBATANG = "src/main/resources/fonts/HANBatang.ttf";
.......
PdfFont font1250 = PdfFontFactory.createFont(FONT, PdfEncodings.CP1250, true);
document.add(new Paragraph().setFont(font1250)
        .add(CZECH).add(" by Robert Louis Stevenson"));
PdfFont font1251 = PdfFontFactory.createFont(FONT, "Cp1251", true);
document.add(new Paragraph().setFont(font1251)
        .add(RUSSIAN).add(" by Robert Louis Stevenson"));
PdfFont fontUnicode =
    PdfFontFactory.createFont(HCRBATANG, PdfEncodings.IDENTITY_H, true);
document.add(new Paragraph().setFont(fontUnicode)
        .add(KOREAN).add(" by Robert Louis Stevenson"));

    The final result is shown in Figure 5:

itext7-h-1-5

Figure 5. Correct rendering of Czech, Russian and Korean fonts

    When we look at the font properties of the document, we see that the FreeSans font is listed twice. This is correct: the font was added once with Cp1250 encoding and once with Cp1251 encoding. Next, we render both Czech and Russian with freeUnicode, that is, the FreeSans font with Identity-H encoding, as shown in the following code:

PdfFont freeUnicode =
    PdfFontFactory.createFont(FONT, PdfEncodings.IDENTITY_H, true);
document.add(new Paragraph().setFont(freeUnicode)
    .add(CZECH).add(" by Robert Louis Stevenson"));
document.add(new Paragraph().setFont(freeUnicode)
    .add(RUSSIAN).add(" by Robert Louis Stevenson"));
PdfFont fontUnicode =
    PdfFontFactory.createFont(HCRBATANG, PdfEncodings.IDENTITY_H, true);
document.add(new Paragraph().setFont(fontUnicode)
    .add(KOREAN).add(" by Robert Louis Stevenson"));

    Figure 6 below shows the result. The page looks the same as in the previous figure, but the font properties show that both Czech and Russian now use the FreeSans font with Identity-H encoding.

itext7-h-1-6

Figure 6. The correct rendering of Czech, Russian and Korean (Unicode)

    For reasons of accessibility (see Chapter 7), using Unicode is one of the requirements of PDF/UA and of certain PDF/A standards. With a custom encoding, it is not always possible to know which glyph each character represents.
    In the next example, we will change font properties such as font size, font color and rendering mode.

6. Font attributes

    Figure 7 below is a screenshot of a PDF. This PDF uses the default font Helvetica, but we have defined different font sizes.

itext7-h-1-7

Figure 7. Different font sizes

    We can use setFontSize() to set the font size. This method is defined in the abstract class ElementPropertyContainer, which means we can use it on many different objects. In the following code, we use it on both Text and Paragraph objects.

Text title1 = new Text("The Strange Case of ").setFontSize(12);
Text title2 = new Text("Dr. Jekyll and Mr. Hyde").setFontSize(16);
Text author = new Text("Robert Louis Stevenson");
Paragraph p = new Paragraph().setFontSize(8)
        .add(title1).add(title2).add(" by ").add(author);
document.add(p);

    We set the font size of the newly created Paragraph to 8pt. This font size is inherited by all objects added to the Paragraph, unless an object overrides it. For example, title1 and title2 define their own font sizes, whereas the String " by " and the author Text inherit the 8pt font size from the Paragraph.

In iText5, when we wanted a font in different sizes and colors, we had to create different Font objects. In iText7 this is no longer needed; a single PdfFont object is enough. Font size and color are defined on the building blocks, and font, font size and other properties can also be inherited from the parent object.
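
The inheritance rule described above can be sketched as a lookup that walks up the parent chain. This is a toy model and not iText's implementation: ToyElement is a hypothetical class, the parent is passed in the constructor for brevity, and the 12pt fallback is an assumption made for this sketch.

```java
public class PropertyInheritanceSketch {

    static final float DEFAULT_FONT_SIZE = 12f; // assumed fallback for this sketch

    /** Hypothetical element: an optional own font size plus an optional parent. */
    static final class ToyElement {
        private final ToyElement parent;
        private Float fontSize; // null means "not set on this element"

        ToyElement(ToyElement parent) { this.parent = parent; }

        ToyElement setFontSize(float size) {
            this.fontSize = size;
            return this;
        }

        // Walk up the parent chain until some element defines a size.
        float resolveFontSize() {
            for (ToyElement e = this; e != null; e = e.parent) {
                if (e.fontSize != null) {
                    return e.fontSize;
                }
            }
            return DEFAULT_FONT_SIZE;
        }
    }

    public static void main(String[] args) {
        ToyElement paragraph = new ToyElement(null).setFontSize(8);
        ToyElement title1 = new ToyElement(paragraph).setFontSize(12);
        ToyElement title2 = new ToyElement(paragraph).setFontSize(16);
        ToyElement author = new ToyElement(paragraph); // no size of its own
        System.out.println(title1.resolveFontSize());
        System.out.println(title2.resolveFontSize());
        System.out.println(author.resolveFontSize()); // falls back to the paragraph
    }
}
```

The element that sets its own size wins; everything else takes the nearest ancestor's value, which is the behavior we saw for " by " and the author name.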

    In the previous examples, we used different fonts from the same family. For example, we created a document containing three fonts from the Cardo family: Cardo-Regular, Cardo-Bold and Cardo-Italic. For most Western fonts, you can find at least a regular, a bold, an italic and a bold-italic font. Finding bold, italic and bold-italic fonts for Eastern and Semitic languages is more difficult. In that case, you can use the approach shown in Figure 8. If you look closely, you will see that different styles are used, although only a single font is defined in the PDF.

itext7-h-1-8

Figure 8. Imitating different font styles

    The code is as follows:

Text title1 = new Text("The Strange Case of ").setItalic();
Text title2 = new Text("Dr. Jekyll and Mr. Hyde").setBold();
Text author = new Text("Robert Louis Stevenson").setItalic().setBold();
Paragraph p = new Paragraph()
        .add(title1).add(title2).add(" by ").add(author);
document.add(p);

    In lines 1-3, we use the setItalic() and setBold() methods. setItalic() does not switch to an italic font; it skews the glyphs of the regular font so that they appear italic. setBold() changes the text rendering mode and increases the stroke width. Next, let's change the color and rendering mode of the text, as shown in Figure 9:

itext7-h-1-9

Figure 9. Different font colors and rendering modes

    The code is as follows:

Text title1 = new Text("The Strange Case of ").setFontColor(Color.BLUE);
Text title2 = new Text("Dr. Jekyll")
        .setStrokeColor(Color.GREEN)
        .setTextRenderingMode(PdfCanvasConstants.TextRenderingMode.FILL_STROKE);
Text title3 = new Text(" and ");
Text title4 = new Text("Mr. Hyde")
        .setStrokeColor(Color.RED).setStrokeWidth(0.5f)
        .setTextRenderingMode(PdfCanvasConstants.TextRenderingMode.STROKE);
Paragraph p = new Paragraph().setFontSize(24)
        .add(title1).add(title2).add(title3).add(title4);
document.add(p);

    The font program contains the syntax for constructing the path of each glyph. By default, these paths are painted with a fill operator rather than a stroke operator, but we can change this default.

  • Line 1: We use the setFontColor() method to set the font color to blue. This changes the fill color used to paint the paths.
  • Lines 2-4: We do not define a font color, so the text is filled in black. Instead, we use setStrokeColor() to define the stroke color, and setTextRenderingMode() to change the rendering mode to FILL_STROKE. As a result, the outline of each glyph is green, and inside the outline we see the default fill color, black.
  • Line 5: Everything uses default values. The Text object simply inherits the font size of the Paragraph.
  • Lines 6-8: We change the stroke color to red and use the setStrokeWidth() method to set the stroke width to 0.5 user units. The default stroke width is 1 user unit, and by default there are 72 user units in 1 inch. We also change the text rendering mode to STROKE, which means the text is not filled, so we only see the outlines.
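
To get a feeling for how thin a 0.5 user unit stroke is, we can convert user units to physical lengths. The helper below is a sketch of mine, not an iText API, and it assumes the default user space of 72 units per inch mentioned above:

```java
public class UserUnits {

    static final double UNITS_PER_INCH = 72.0; // default user space, as stated above
    static final double MM_PER_INCH = 25.4;

    static double toInches(double userUnits) {
        return userUnits / UNITS_PER_INCH;
    }

    static double toMillimetres(double userUnits) {
        return toInches(userUnits) * MM_PER_INCH;
    }

    public static void main(String[] args) {
        System.out.printf("0.5 user units = %.4f in = %.4f mm%n",
                toInches(0.5), toMillimetres(0.5));
        System.out.printf("24 user units  = %.4f in = %.4f mm%n",
                toInches(24), toMillimetres(24));
    }
}
```

So the red outline in the example is roughly 0.18 mm wide, while the 24pt font size of the Paragraph corresponds to one third of an inch.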

    Imitating bold is done by changing the text rendering mode to FILL_STROKE and increasing the stroke width. Imitating italic is done with the setSkew() method, which will be discussed in Chapter 3. Although the result can look quite good, setBold() and setItalic() are not a good choice in general; use them only when no font with the required style is available. One disadvantage is that the bold or italic style cannot be recovered when text is extracted from the PDF.

7. Reuse styles

    If you have to construct many building blocks, defining the same style for each object one at a time is cumbersome. For example, in Figure 10 below, part of the text (the title of the story) is written in 14pt Times-Roman, while another part (the names of the main characters) is written in 12pt Courier, in red on a light gray background.

itext7-h-1-10

Figure 10. Reuse style

    The following example uses Style objects to define each style once:

Style normal = new Style();
PdfFont font = PdfFontFactory.createFont(FontConstants.TIMES_ROMAN);
normal.setFont(font).setFontSize(14);
Style code = new Style();
PdfFont monospace = PdfFontFactory.createFont(FontConstants.COURIER);
code.setFont(monospace).setFontColor(Color.RED)
        .setBackgroundColor(Color.LIGHT_GRAY);
Paragraph p = new Paragraph();
p.add(new Text("The Strange Case of ").addStyle(normal));
p.add(new Text("Dr. Jekyll").addStyle(code));
p.add(new Text(" and ").addStyle(normal));
p.add(new Text("Mr. Hyde").addStyle(code));
p.add(new Text(".").addStyle(normal));
document.add(p);

    In lines 1-3, we define a normal style; in lines 4-7, we define a code style (Courier is often used to represent code). In lines 8-14, we add different Text objects to a Paragraph and set each Text object to either the normal or the code style.

    The Style class is a subclass of ElementPropertyContainer, which is the base class of all building blocks. It contains a series of setters and getters for properties such as font, color, border, size and position. You can use the addStyle() method on every AbstractElement subclass to set these properties in one go.

Combining multiple properties in one class is a new feature in iText7; it lets you write much less code than in iText5.

With the Style class you can set more than fonts: you can even set the padding and margins of a BlockElement. BlockElement will be discussed in Chapter 4 and Chapter 5.

8. Summary

    In this chapter, we introduced the PdfFont class and discussed font programs, font embedding and different encodings. We used English, Czech, Russian and Korean to display the same title. Then we set font properties such as font size, font color and rendering mode. Finally, we imitated bold and italic styles.

    Of course, there is a lot more to say about fonts, which we will cover in the next tutorial. In the next chapter, we will discuss the RootElement implementations Document and Canvas.

Origin blog.csdn.net/u012397189/article/details/79888252