State of Text Rendering

Original: http://behdad.org/text/

When will deepin's font rendering be this good too?

By Behdad Esfahbod <behdad behdad org> 
Last major update: January 18, 2010 
Last minor update: December 18, 2012

Disclaimer

At the time of writing the initial version of this paper, the author was working for Red Hat's Desktop team and has been involved with GNOME and Fedora for a long time. He has been a developer and/or maintainer of many modules discussed in this paper at various times, including fribidi, fontconfig, harfbuzz, pango, cairo, and gnome-terminal.

Introduction

Text is the primary means of communication in computers, and is bound to be so for the decades to come. With the widespread adoption of Unicode as the canonical character set for representing text, a whole new domain has opened up in desktop system software design. Gone are the days when one would need to input, render, print, search, spell-check, ... one language at a time. The whole concept of internationalization (i18n) on which Unicode is based is all languages, all the time.

The Free Software desktop has been rather late to the Unicode bandwagon, but in the past ten years all the major pieces have gathered together and nowadays, on a modern GNU/Linux distribution like Fedora, one cannot easily get anything other than Unicode working.

Internationalization and Unicode text processing are about more than just rendering text on the screen. However, in this paper we focus on the specific problem of text rendering, i.e. from input Unicode text to pixels lit on the screen. We will discuss the current architecture, identify problems that have limited progress in recent years, and propose actions to be taken to remedy them.

While there are multiple text rendering stacks available in the Free Software world and even on a single GNU/Linux desktop, in this document we focus on the GNOME text rendering stack, and on the Fedora Project when it comes to distro-specific issues. Fedora and Red Hat have been showing leadership in advancing the text stack for years, and other distributions have been quick to adopt these new technologies. We expect that to remain the case for the years to come, although it would be nice to see other distributions / communities start contributing more closely to the parts of the stack we all share.

This document is a draft working-copy paper. It is a roadmap of where we are now and where we want to be, and will be updated as we get there.

Status Quo

When we talk about the text rendering stack, we really mean a collection of separate modules sitting on top of each other:

FreeType

Performs font rasterization. Given font data (file or data in memory), it does simple (non-complex) mapping of Unicode characters to glyph indices and rendering glyphs to images.

Fontconfig

Performs font selection based on a pattern of desired font characteristics. These characteristics typically include a family name, style, weight, slant, size, as well as language. Font configuration happens by way of a set of very expressive XML rules. Fontconfig uses FreeType to inspect fonts and caches the results in an mmap()able architecture-specific binary cache.
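To make the idea of pattern-based selection concrete, here is a minimal sketch in Python. It is not fontconfig's actual matching algorithm or API: the Font record, the match_font helper, the priority order of the criteria, and the installed font list are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Font:
    family: str
    weight: int        # 400 = regular, 700 = bold (scale chosen for this sketch)
    slant: str         # "roman" or "italic"
    langs: frozenset   # language tags the font claims to cover


def match_font(fonts, family=None, weight=400, slant="roman", lang=None):
    """Return the installed font closest to the requested pattern."""
    def distance(font):
        # Tuples compare element by element, so earlier entries dominate.
        # The order (language, family, slant, weight) is an assumption.
        return (
            0 if lang is None or lang in font.langs else 1,
            0 if family is None or font.family.lower() == family.lower() else 1,
            0 if font.slant == slant else 1,
            abs(font.weight - weight),
        )
    return min(fonts, key=distance)


# Hypothetical installed fonts; the coverage data is illustrative only.
INSTALLED = [
    Font("DejaVu Sans", 400, "roman", frozenset({"en", "fa", "ar"})),
    Font("DejaVu Sans", 700, "roman", frozenset({"en", "fa", "ar"})),
    Font("Liberation Serif", 400, "italic", frozenset({"en"})),
]
```

With this ordering, a request for "Liberation Serif" in Persian (lang="fa") falls back to a DejaVu Sans face, because language coverage is ranked above the family name. Whether that is the right trade-off is exactly the kind of policy that configuration rules are meant to express.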

FriBidi

GNU FriBidi is an implementation of the Unicode Bidirectional Algorithm. Pango uses FriBidi and has an internal copy of it. AbiWord is the other major user of FriBidi. Many other projects use FriBidi as the simplest route to add support for Hebrew and Arabic scripts without adding support for a full complex text rendering engine.
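As a rough illustration of the raw material the Bidirectional Algorithm works with, the following Python sketch uses the standard unicodedata module to read each character's bidi class and group text into directional runs. It is a drastic simplification and does not reflect FriBidi's API: the helper names are made up, and weak and neutral characters simply inherit the direction of the preceding strong character, whereas the real algorithm resolves them through embedding levels and many more rules.

```python
import unicodedata
from itertools import groupby


def strong_direction(ch):
    """Map a character to 'LTR', 'RTL', or None for weak/neutral classes."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "LTR"
    if bidi_class in ("R", "AL"):   # Hebrew-style and Arabic-style letters
        return "RTL"
    return None


def directional_runs(text, base="LTR"):
    """Split text into (direction, substring) runs in logical order."""
    current = base
    tagged = []
    for ch in text:
        # Simplification: non-strong characters keep the current direction.
        current = strong_direction(ch) or current
        tagged.append((current, ch))
    return [
        (direction, "".join(ch for _, ch in group))
        for direction, group in groupby(tagged, key=lambda pair: pair[0])
    ]
```

Even this toy version shows why a full implementation is needed: the space after a Hebrew word ends up in the right-to-left run here, which is not always what the actual algorithm would decide.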

HarfBuzz

HarfBuzz is the meat of the modern GNU/Linux text rendering stack. With OpenType emerging as the universal font format supporting complex text rendering, HarfBuzz, as an OpenType Layout engine, is where all the magic happens. In fact it is of such importance to the stack that it deserves an entire section of its own in this document.

Pango

Pango is, for the most part, the roof of the text rendering stack. Components sitting on top of Pango (e.g. GTK+) need not know about the complexities of i18n text and are expected to simply use opaque objects called PangoLayouts. Pango has been designed to satisfy GTK+'s needs for i18n text. However, Pango still provides a low-level API on which one can build their own layout engine. This is what Firefox, Webkit-GTK, etc. do, but it has proved to be a cumbersome practice. We will expand on that later.

There are other modules that are not immediately relevant to text rendering but facilitate getting the text on the screen. The X render extension provides the basic support for caching client-side rendered glyph shapes in the X server and showing them on the screen. Glyphs are rendered by the client (i.e. the application) and uploaded to the X server, which will then hash them and keep only one copy of each image, but each client has to go through the render+upload phase regardless. There are various higher-level wrappers around the text-rendering functionality of X render, the old and semi-obsolete one being Xft. These days, however, cairo does that job for the GNOME stack and Qt does it for KDE.
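The hash-and-deduplicate behaviour described above can be sketched as a small content-addressed store. This Python sketch is not the X render protocol: the GlyphStore class, its method names, and the use of SHA-256 as the hash are all assumptions for illustration.

```python
import hashlib


class GlyphStore:
    """Toy server-side store keeping one copy of each distinct glyph image."""

    def __init__(self):
        self._images = {}   # digest -> bitmap bytes
        self.uploads = 0    # number of bitmaps received from clients

    def upload(self, bitmap: bytes) -> str:
        """Accept a client-rendered bitmap and return a handle for it."""
        self.uploads += 1
        digest = hashlib.sha256(bitmap).hexdigest()
        # Identical bitmaps from different clients share one stored copy.
        self._images.setdefault(digest, bitmap)
        return digest

    def stored_count(self) -> int:
        return len(self._images)
```

If two clients each upload the same rendered glyph, the store ends up holding a single image, yet both clients still paid for rasterizing and uploading it. That is the inefficiency the paragraph points out.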

HarfBuzz

Traditionally fonts were a collection of glyphs and a simple one-to-one mapping between characters and glyphs. Rudimentary support for simple ligatures was available in some font formats. With Unicode, however, there was a need for formats allowing complex transformation of glyphs (substitution and positioning). Two technologies were developed to achieve that: OpenType Layout from Microsoft and Adobe, and AAT from Apple. These two technologies, plus the TrueType and Type 1 font formats, were all combined in what is called OpenType.

There are fundamental differences in how AAT and OpenType Layout work. In AAT the font contains all the logic required to perform complex text shaping (the process of converting Unicode text to glyph indices and positions). In OpenType, by contrast, the script-specific logic (say, Arabic cursive joining, etc.) is part of the standard and implemented by the layout engine, with fonts providing only the font-specific data that the layout engine can use to perform complex shaping.
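To illustrate what "script-specific logic in the engine" means for Arabic cursive joining, here is a minimal Python sketch. The engine side decides which contextual form each letter takes (named here after the OpenType feature tags isol, init, medi, fina); the font would then supply the actual glyph for that form. The joining-type table is a tiny subset of the Unicode data, and the function name is made up. A real shaper also handles transparent marks, joiner characters, and much more, so this is a sketch under those stated assumptions rather than how HarfBuzz does it.

```python
# Joining types for a handful of letters (a small subset of the Unicode data):
# "D" = dual-joining, "R" = right-joining (joins only to the preceding letter).
JOINING_TYPE = {
    "\u0628": "D",  # BEH
    "\u0644": "D",  # LAM
    "\u0645": "D",  # MEEM
    "\u0627": "R",  # ALEF
    "\u062F": "R",  # DAL
}


def joining_forms(word):
    """Return the contextual form of each letter, in logical order."""
    forms = []
    for i, ch in enumerate(word):
        this = JOINING_TYPE.get(ch)
        prev = JOINING_TYPE.get(word[i - 1]) if i > 0 else None
        nxt = JOINING_TYPE.get(word[i + 1]) if i + 1 < len(word) else None
        # A letter connects backwards only if the previous letter is dual-joining.
        joins_prev = this in ("D", "R") and prev == "D"
        # Only dual-joining letters connect forwards, and only to a joining letter.
        joins_next = this == "D" and nxt in ("D", "R")
        if joins_prev and joins_next:
            forms.append("medi")
        elif joins_prev:
            forms.append("fina")
        elif joins_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms
```

For the word BEH ALEF BEH, the first BEH takes the initial form, ALEF the final form, and the last BEH stays isolated because ALEF never connects to the letter after it. None of this depends on a particular font, which is why it can live in the engine.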

The Free Software text stack is based on the OpenType Layout technology. HarfBuzz is an implementation of the OpenType Layout engine (aka layout engine) and the script-specific logic (aka shaping engine).

History

Originally the FreeType project implemented the OpenType Layout engine as part of the FreeType 2 project; however, it was dropped from FreeType at the last moment when it was decided that OpenType shaping is not involved in rasterizing glyphs and hence is out of the scope of FreeType. The FreeType Layout (FTL) code was salvaged by Pango and Qt developers and kept in house for quite a few years. Owen Taylor developed an abstract buffer on top of the layout engine, making it much easier to use.

Around 2006 Pango and Qt developers cooperated to reunify the layout engine, and HarfBuzz was born as a freedesktop.org project. Initially it was just merging back the existing code and renaming it, but after various meetings, the plan to make HarfBuzz a unified shaping engine was born and has been the goal since. HarfBuzz was relicensed (thanks to FreeType developers) to the old MIT license to rid it of the FTL advertisement clause.

In 2007 (?) TrollTech donated the Qt shapers to HarfBuzz under the same license as the layout engine code. This is the current state of HarfBuzz. At this time Qt ships with its own copy of HarfBuzz which is identical to the upstream HarfBuzz. Pango ships with its own copy also, but only uses the layout engine, and not the HarfBuzz shapers.

Since 2008 the author has been working on rewriting the layout engine to be more robust and use mmap()ed fonts efficiently, and that work is mostly done now. The next step is to design a user-friendly high-level API for the shaping engine, merge the Pango and Qt shapers, and put them under the new API. This is a work in progress by Red Hat and Mozilla.

HarfBuzz is currently being used by Pango, Qt, the Linux port of Google's Chromium browser, as well as some smaller projects. The grand plan is for it to be used directly by any code needing direct access to a portable and robust complex shaping engine. That would include toolkits, browsers, word processors, and design applications. We will expand on that in a later section.

Other Free Software Shaping Engines

ICU

ICU is the International Components for Unicode, a library developed by IBM with existing ports in C, C++, and Java. It does a lot more than shaping, and is a huge library. That's perhaps the main reason why it is not used widely for shaping. The most notable users of ICU are the OpenOffice.org suite and Sun's Java implementation. It is highly probable that ICU will be ported to using HarfBuzz when HarfBuzz gets to production stage.

m17n

Mostly of academic importance, m17n is an internationalization framework that includes a shaping engine. Its most notable characteristic is that it is based on language- and script-specific shaping rules expressed as Lisp code. Latest versions of Emacs use m17n for complex text rendering.

SIL Graphite

SIL Graphite is a complex/smart-font technology parallel to OpenType Layout. In this framework, the font itself contains all the shaping logic and the engine has no language- or script-specific knowledge. This allows for developing fonts for minority scripts and languages without having to update the engine first. For established scripts though, there is not much reason to prefer Graphite over OpenType.

Consumers

One can loosely divide the consumers of the text rendering stack based on their varying demands and requirements:

  • GUI Toolkits like GTK+ and Qt need the least flexibility when it comes to text rendering. Indeed, all the user cares about is that the text is rendered to the screen and is legible. Pango has been specifically designed with this use case in mind. Qt is even worse in that it pretty much does not support any other use case. Ultimately most (all?) other use cases should be made to use HarfBuzz directly, freeing Pango to do what it's really good at: Providing an easy-to-use API for GUI toolkits.

  • Web browsers have two unique requirements that make it hard to use the native text stack in full:

    1. They have to abide by the very strict CSS font selection rules (as opposed to, say, fontconfig's),
    2. They need to support web fonts, that is, fonts that are not installed on the system and are downloaded from the web on demand.

    It is worthwhile to review what the various web browser engines currently use for their complex text support:

    • Firefox uses Pango. Firefox 2 was hacked to use the PangoLayout API. That was very abusive and inherently inefficient. Firefox 3 has got a new layout engine that is completely based on cairo. The Linux port subclasses PangoFcFontMap to be able to support both CSS font selection and web fonts. By doing that it is essentially reimplementing most of Pango and only using the shaping logic. It makes much more sense to use HarfBuzz directly, and Mozilla is now working on getting HarfBuzz ready for that.

    • Webkit-GTK uses PangoCairo. They use Pango the same way that Firefox 2 used to do. At the end of the day, it's at best a hack. Moving to HarfBuzz when the time is right should fix that.

    • Webkit/Android is the webkit engine as used by Google in Android and Chromium. It uses a system called Skia for 2D graphics rendering. The team at Google has released an alpha Linux port of Chromium that is using HarfBuzz directly.

  • Word processors' biggest unique demand from a shaping engine is good support for, and lots of control over, line hyphenation and justification. Also important to them is choosing fonts as closely and robustly as possible to the font requested by the document. Device-independent metrics as well as metric-compatibility with other word processors is another requirement (required by all kinds of office suite applications, really). OpenOffice.org currently uses ICU and AbiWord uses Pango. Both will have a better time using HarfBuzz shaping directly.

  • Designer tools demand much of what word processors do, but also access to advanced font features (alternate glyphs, etc.), being able to correctly handle fonts with many various (and non-standard) styles, things like manual kerning, as well as using SVG fonts and embedded subsetted fonts. Inkscape and The GIMP use Pango currently, and Scribus is in transition to / has been ported to using HarfBuzz directly.

  • Font designer tools can use HarfBuzz directly to generate previews. Other than that there is not much else to share really, even though they both deal with the very same objects (fonts): font tools need to be able to generate font tables, which is out of the scope of a shaping engine. Fontforge has the option to use Pango currently.

  • Terminal emulators with support for complex text are very weird hybrids. On the one hand terminal emulators have to lay text out in a predefined grid in a predefined way, which is in conflict with many aspects and requirements of complex text, on the other hand users demand support for complex text in their terminals. It gets uglier when you think about bidirectional text, say, inside a console text editor. Nonetheless, it is fair to say that such hybrids do not put any new demands on the shaping engine. gnome-terminal currently has no support for complex text other than combining marks. Konsole has bidirectional text support. Apple's Terminal App has at least bidi support as well as Arabic shaping support, not sure about other complex text. Update (Jan 18, 2010): The terminal mode (term and ansi-term) in recent versions of Emacs can render complex text, including Indic.

  • Batch document processors have no requirements other than what's required by, say, browsers or word processors. However, so far a decent internationalized batch document processor has pretty much been nonexistent. The reason historically has been that shaping engines were always developed in the context of GUI frameworks, and batch processors typically do not rely on one, and hence are designed without the i18n models present in all modern GUIs in mind. XMLFO / Docbook processors, etc. fit in this category and should use HarfBuzz and the rest of the stack for full complex text support.

  • TeX engines are batch document processors but worth looking into separately. Historically TeX had no shaping engines and basic shaping was done using macro packages and a variety of hacks. More recently though, XeTeX was invented. XeTeX simply outsources the shaping to an external library, ICU or Apple's ATSUI currently. XeTeX is a separate branch of the TeX evolutionary hierarchy from the mainstream pdfTeX though. The XeTeX creator is working on HarfBuzz on behalf of Mozilla now, and plans to port XeTeX to HarfBuzz eventually. In the long term though, pdfTeX's successor LuaTeX should be made to do the same thing. There is more to Unicode support than just shaping, and in those areas the TeX engines can gain a lot by building on top of existing libraries.

The Problem

Over the past few years the Free Software text stack has made a lot of progress. When one looks at each piece, technical excellence is evident. For example:

  • FreeType has the widest range of supported font formats in the world.

  • Fontconfig has the most expressive font configuration language. In fact, other text rendering stacks simply don't have much of a configuration mechanism.

  • Fontconfig-based text stacks have been the first to support font fallback based on glyph coverage transparently. This feature appeared in Mozilla around the same time (early 2000s) and only recently in Windows Vista as far as I know, though I've been pointed to this which suggests that per-script font fallback was supported since Windows 2000, but certainly not per-codepoint.

  • FriBidi has been the most standards-compliant implementation of the Unicode Bidirectional Algorithm for years. All algorithm bugs found in FriBidi over the years have revealed bugs in the spec itself as well as its reference implementations.

  • Pango has been the first text stack to support some of the minority scripts encoded in Unicode, even before that version of Unicode was officially released.
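The coverage-based font fallback mentioned in the list above, where the font is chosen per codepoint rather than per script, can be sketched as follows. This is an illustrative Python sketch, not the logic of fontconfig or Pango: the font names and coverage sets are hypothetical, and a real implementation would also consider script, language, and style when picking among candidates.

```python
from itertools import groupby


def pick_font(ch, fonts):
    """Return the name of the first font, in preference order, that covers ch."""
    for name, coverage in fonts:
        if ch in coverage:
            return name
    return None  # no font covers this character


def fallback_runs(text, fonts):
    """Split text into (font name, substring) runs, deciding per codepoint."""
    return [
        (name, "".join(chars))
        for name, chars in groupby(text, key=lambda ch: pick_font(ch, fonts))
    ]


# Hypothetical fonts with made-up coverage, listed in preference order.
FONTS = [
    ("LatinFont", set("abcdefghijklmnopqrstuvwxyz ")),
    ("HebrewFont", {chr(cp) for cp in range(0x05D0, 0x05EB)} | {" "}),
]
```

Because the decision is made for every character, mixed-script text is split into runs automatically, and a character missing from the preferred font is picked up by the next font that has it without the application having to ask.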

However, when one stands back and looks at the stack as a whole, it is not something to envy. As a whole, we have not been making ground-breaking progress for quite a while. The last major progress was the move to client-side fonts itself which fueled a renaissance. Since then, it has mostly been bug fixing, cleanup, polish, small features here and there. Pretty similar to the GNOME2 status one would say. Indeed, the client-side fonts were first introduced in early GNOME2. What we need is the GNOME3 of text rendering, in time for GNOME3.

To those familiar with the text stack, it is hard to not see what is wrong. I believe there are two problems: 1) the current stack is good enough, so improving it stays low-priority for parties involved, and 2) what I like to call segregated efforts. By that I mean, for example:

  • Fontconfig exposes a dangerously expressive configuration language in XML, and from there any misconfiguration is the font packager's fault and not fontconfig's, even though it is painfully hard to express sane preferences using fontconfig rules.

  • Fontconfig rightly chose XML as its configuration language, the idea being that font configuration GUIs can be developed to output configuration XML based on user preferences. This has yet to be realized though. Update (Jan 18, 2010): There is a small GUI project called Fontik that allows per-font configuration. This is a step in the right direction IMO.

  • The Pango maintainer adds new features, making it possible to implement advanced features in more demanding applications like designer tools, but leaves it to others to actually implement such features in the tools higher in the stack. All the new features get is a mention in the release notes and, if lucky, a blog post.

  • The Text Layout Summit has been a fairly successful meeting bringing all interested parties together since 2006 (it has been held at GNOME Boston Summit 2006, Akademy 2007, and Libre Graphics Meeting 2008). However, those same parties do not work as closely outside of the meeting. For example, while Scribus and HarfBuzz developers have been discussing various issues at length during these meetings, there has not been much email exchange between the projects over the years.

  • unifont.org is a website that has many of the right aspirations for what an ideal font and text ecosystem should look like, but the project does not work with the HarfBuzz and other library designers as closely as I would like to see.

  • SIL Graphite and m17n are both reinventing the wheel in one sense. They each definitely have their own justifications for why a new wheel may be needed, but the fact remains that their efforts do not advance the mainstream GNU/Linux text rendering pipeline.

One may even argue that the extremely modular design of GNU/Linux systems makes it painfully hard to expose a truly integrated solution, in many areas including text rendering. For example, the X architecture combined with client-side font rendering makes it close to impossible to optimize the pipeline to take advantage of all the possibilities exposed by modern GPUs, like Microsoft does for example. However, that excuse is irrelevant as it may be part of the problem statement, but it hardly is the answer.

Recent Advances

Only recently have the Desktop Team at Red Hat and the Fedora Font SIG started working on features that extend across the stack (vertically or horizontally):

Reposted from blog.csdn.net/qq_32768743/article/details/88831109