Developing OpenType Fonts for Lao Script (2 of 3): Shaping Engine

The Uniscribe Lao shaping engine processes text in stages. The stages are:

  1. Analyze characters for valid diacritic combinations
  2. Shape (substitute) glyphs with OTLS (OpenType Library Services)
  3. Position glyphs with OTLS

The following sections will help font developers understand the rationale for the Lao feature encoding model, and help application developers better understand how layout clients can divide responsibilities with operating system functions.

Analyze Characters

The shaping engine receives a sequence of Unicode characters as its unit of shaping. The contextual analysis engine checks that sequence for valid diacritic combinations. For additional information, see Invalid Combining Marks.

The AM character (U+0EB3) receives special handling in the analysis phase. If the preceding base consonant carries no above mark, the 'ccmp' feature is used to decompose the AM into the NIGGAHITA and AA glyphs, which allows the NIGGAHITA glyph to be positioned correctly above that consonant. If the base consonant already carries a tone mark, the analysis engine itself decomposes the AM and reorders the NIGGAHITA to sit between the base consonant and the tone mark. The NIGGAHITA glyph can then be positioned correctly above the base consonant, and the tone mark above the NIGGAHITA. This behavior cannot be tested in VOLT, because VOLT does not implement this logic.
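The decomposition and reordering described above can be sketched at the character level. This is only an illustration, not Uniscribe's implementation: the function name `decompose_am` is hypothetical, the sketch treats both cases (with and without a tone mark) in one pass rather than splitting the work between the analysis engine and 'ccmp', and it assumes the tone marks are the ABOVE2 class listed under Invalid Combining Marks.

```python
SARA_AM = "\u0EB3"    # LAO VOWEL SIGN AM
NIGGAHITA = "\u0ECD"  # LAO NIGGAHITA
SARA_AA = "\u0EB2"    # LAO VOWEL SIGN AA

# Assumption: "tone mark" means the ABOVE2 class from this document's table.
TONE_MARKS = {"\u0EC8", "\u0EC9", "\u0ECA", "\u0ECB", "\u0ECC"}


def decompose_am(text: str) -> str:
    """Split each AM into NIGGAHITA + AA, placing NIGGAHITA before any
    tone marks that directly precede the AM (hypothetical helper)."""
    out = []
    for ch in text:
        if ch != SARA_AM:
            out.append(ch)
            continue
        # Walk back over tone marks already emitted for this base.
        i = len(out)
        while i > 0 and out[i - 1] in TONE_MARKS:
            i -= 1
        out.insert(i, NIGGAHITA)  # NIGGAHITA goes between base and tone mark
        out.append(SARA_AA)       # AA stays at the end
    return "".join(out)
```

For example, the sequence base consonant, tone mark, AM becomes base consonant, NIGGAHITA, tone mark, AA, which is the order that lets the tone mark be positioned above the NIGGAHITA.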

Shape Glyphs with OTLS

The first step Uniscribe takes in shaping the character string is to map all characters to their nominal form glyphs.

Next, Uniscribe calls OTLS to apply the features. All OTL processing is divided into a set of predefined features (described and illustrated in the Features section). Each feature is applied, one by one, to the appropriate glyphs in the syllable, and OTLS processes them. Uniscribe makes one call to OTLS per feature, which ensures that the features are executed in the desired order.

The steps of the shaping process are outlined below.

Shaping features:

  1. Language forms
    1. Apply feature 'ccmp' to preprocess any glyphs that require composition or decomposition

Position Glyphs with OTLS

Uniscribe next applies features concerned with positioning, calling functions of OTLS to position glyphs.

Positioning features:

  1. Kerning
    1. Apply feature 'kern' to provide pair kerning between base glyphs requiring adjustment for better typographical quality
  2. Mark to base
    1. Apply feature 'mark' to position diacritic glyphs relative to the base glyph
  3. Mark to Mark
    1. Apply feature 'mkmk' to position diacritic glyphs relative to other diacritic glyphs
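The one-call-per-feature ordering used for the shaping and positioning passes can be sketched as a simple loop. Everything named here is hypothetical: `apply_feature` stands in for whatever OTLS entry point the client calls, and the sketch passes a single glyph list through every step, whereas a real engine tracks glyph positions separately from glyph IDs.

```python
from typing import Callable, List

# Feature order taken from the lists in this document.
SHAPING_FEATURES = ["ccmp"]
POSITIONING_FEATURES = ["kern", "mark", "mkmk"]


def apply_features_in_order(
    glyphs: List[int],
    apply_feature: Callable[[str, List[int]], List[int]],
) -> List[int]:
    """Call the (hypothetical) OTLS callback once per feature, in order."""
    for tag in SHAPING_FEATURES + POSITIONING_FEATURES:
        glyphs = apply_feature(tag, glyphs)
    return glyphs
```

Making a separate call for each feature, rather than handing the whole list over at once, is what lets the caller fix the execution order.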

Invalid Combining Marks

Combining marks and signs that appear in text not in conjunction with a valid consonant base are considered invalid. Uniscribe displays these marks using the fallback rendering mechanism defined in the Unicode Standard (section 5.12, 'Rendering Non-Spacing Marks' of the Unicode Standard 3.0), i.e. positioned on a dotted circle.

For the fallback mechanism to work properly, a Lao OTL font should contain a glyph for the dotted circle (U+25CC). If this glyph is missing from the font, the invalid signs will be displayed on the missing glyph shape (white box).
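The choice of fallback base can be sketched as a lookup in the font's character map. This is a minimal illustration under stated assumptions: `cmap` is a plain dictionary from code point to glyph ID supplied by the caller, and `fallback_base_glyph` is a hypothetical name, not a Uniscribe or OTLS function.

```python
NOTDEF_GID = 0             # glyph ID 0 is the missing glyph (.notdef)
DOTTED_CIRCLE_CP = 0x25CC  # U+25CC DOTTED CIRCLE


def fallback_base_glyph(cmap: dict) -> int:
    """Return the dotted circle glyph if the font maps U+25CC,
    otherwise the missing glyph (hypothetical helper)."""
    return cmap.get(DOTTED_CIRCLE_CP, NOTDEF_GID)
```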

In addition to the dotted circle, another Unicode code point recommended for inclusion in any Lao font is the ZWSP (zero width space; U+200B). Lao words are not separated by spaces, so the ZWSP can be used to mark word boundaries, which allows word wrapping at the end of a line. Some applications instead use a lexical lookup to wrap words and do not need ZWSP characters.
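How ZWSP characters provide wrap opportunities can be sketched as follows. This is a simplified illustration, not how any particular application wraps text: `wrap_at_zwsp` is a hypothetical function, and it measures line width in code points, whereas real layout measures glyph advance widths (combining marks, for instance, add no width).

```python
ZWSP = "\u200B"  # ZERO WIDTH SPACE


def wrap_at_zwsp(text: str, width: int) -> list:
    """Break text into lines, breaking only where a ZWSP occurs.
    The ZWSP characters themselves are dropped from the output."""
    lines, current = [], ""
    for word in text.split(ZWSP):
        # Start a new line if adding this word would exceed the width.
        if current and len(current) + len(word) > width:
            lines.append(current)
            current = word
        else:
            current += word
    if current:
        lines.append(current)
    return lines
```

A word longer than `width` is kept whole in this sketch, since no break opportunity exists inside it.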

If an invalid combination is found, the diacritic that causes the invalid state is placed on a dotted circle to show the user that the combination is invalid. The shaping engine for non-OpenType fonts causes invalid mark combinations to overstrike; inserting the dotted circle as a base for the invalid mark avoids this problem. Note that the dotted circle is not inserted into the application's backing store; it is a run-time insertion into the glyph array that is returned from the ScriptShape function.

The invalid diacritic logic for Lao is based on the classes listed below. The engine checks that no more than one mark of a given class is placed on the same base.

Class  | Description                                                                                            | Code points
ABOVE1 | Above mark closest to base                                                                             | U+0EB1, U+0EB4, U+0EB5, U+0EB6, U+0EB7, U+0EBB, U+0ECD
ABOVE2 | Second level above mark                                                                                | U+0EC8, U+0EC9, U+0ECA, U+0ECB, U+0ECC
BELOW1 | Below mark closest to base                                                                             | U+0EBC
BELOW2 | Second level below mark                                                                                | U+0EB8, U+0EB9
AM     | The AM character is decomposed into two glyphs (NIGGAHITA and AA). The NIGGAHITA is of class ABOVE1.  | U+0EB3
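The class check and dotted circle insertion can be sketched as below. This is a hedged illustration only: `insert_dotted_circles` and `is_consonant` are hypothetical names, the consonant ranges are an assumption (the document does not list valid bases), the AM is assumed to have been decomposed beforehand, and the sketch implements only the one-mark-per-class rule stated above. It returns a new sequence and leaves the input untouched, mirroring the fact that the backing store is not modified.

```python
DOTTED_CIRCLE = "\u25CC"

# Mark classes copied from the table above.
MARK_CLASSES = {
    "ABOVE1": "\u0EB1\u0EB4\u0EB5\u0EB6\u0EB7\u0EBB\u0ECD",
    "ABOVE2": "\u0EC8\u0EC9\u0ECA\u0ECB\u0ECC",
    "BELOW1": "\u0EBC",
    "BELOW2": "\u0EB8\u0EB9",
}
CLASS_OF = {ch: name for name, chars in MARK_CLASSES.items() for ch in chars}


def is_consonant(ch: str) -> bool:
    # Assumption: U+0E81..U+0EAE and U+0EDC..U+0EDF count as valid bases.
    return "\u0E81" <= ch <= "\u0EAE" or "\u0EDC" <= ch <= "\u0EDF"


def insert_dotted_circles(text: str) -> str:
    """Return a copy of text with a dotted circle placed under every mark
    that has no valid base or repeats a class already on its base."""
    out = []
    seen = None  # classes used on the current base; None means no valid base
    for ch in text:
        cls = CLASS_OF.get(ch)
        if cls is None:
            out.append(ch)
            seen = set() if is_consonant(ch) else None
            continue
        if seen is None or cls in seen:
            out.append(DOTTED_CIRCLE)  # the dotted circle becomes the new base
            seen = set()
        seen.add(cls)
        out.append(ch)
    return "".join(out)
```

For instance, a consonant followed by two ABOVE2 marks yields the consonant with the first mark, then a dotted circle carrying the second mark.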
