Developing OpenType Fonts for Gurmukhi Script (1 of 3): Introduction

Microsoft Typography
Last updated: August 2008

Please note: This document reflects the changes made in 2005 to the recommendations for Indic-script OpenType font and shaping-engine implementations. Indic fonts made according to the earlier recommendations will still function properly in new versions of Uniscribe, but font developers may wish to update their fonts, particularly to avoid certain limitations of the earlier implementation.

This document presents information that will help font developers create or support OpenType fonts for the Gurmukhi script covered by the Unicode Standard. Gurmukhi is used to write the Punjabi language in the Punjab in India.

This is a multi-page specification. To access specific pages, use the Contents section below, or the navigation bar at the bottom of each page.

Introduction

This document targets developers implementing Indic shaping behavior compatible with the Microsoft OpenType specification for Indic scripts. It contains information about terminology, font features and the behavior of the Indic shaping engine with regard to the Gurmukhi script. While it does not contain instructions for creating Gurmukhi fonts, it will help font developers understand how the Indic shaping engine processes Indic text. In addition, registered features of the Gurmukhi script are defined and illustrated with examples.

The new Indic shaping engine allows for variations in typographic conventions, giving a font developer control over shaping through the choice of which glyphs are assigned to certain OpenType features. For example, the location where pre-pended matras and the reph (in scripts like Devanagari) are re-ordered within a syllable cluster is affected by the presence of a half form. While there are no half forms in Gurmukhi, the 'half' feature is made available for typographic preferences. See illustrations below.

Indic shaping

The Indic shaping engine always re-orders a pre-pended matra immediately in front of the previous base consonant (or half form, if there is one).

Option 1 (default result): Because the Ka is not listed in the 'half' feature, the shaping engine treats the Ka as the first main consonant and re-orders the I-matra immediately in front of the previous base consonant.

Option 2: While the Ka does not have a true half form in Gurmukhi, it can be listed in the 'half' feature lookup substituting the 'halant form' of Ka. Thus, the shaping engine will treat it as a half form and the I-matra will be positioned immediately in front of the "half-form" K(a).
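The two options can be sketched in code. The following Python model is a simplified illustration only, not the Uniscribe implementation: the glyph kinds ("full", "half", "halant", "matra"), the function names and the half_lookup set standing in for the font's 'half' feature lookup are all hypothetical. It assumes one syllable with at least one full-form consonant, treats the last full-form consonant as the base, and moves the I-matra in front of the base or in front of any run of half forms that precedes it.

```python
# Illustrative model only: glyph kinds and half_lookup are hypothetical names.
KA = "\u0A15"       # GURMUKHI LETTER KA
TA = "\u0A24"       # GURMUKHI LETTER TA
HALANT = "\u0A4D"   # GURMUKHI SIGN VIRAMA
I_MATRA = "\u0A3F"  # GURMUKHI VOWEL SIGN I


def apply_half(chars, half_lookup):
    """Turn consonant + halant into a 'half' glyph when the font lists it."""
    glyphs = []
    i = 0
    while i < len(chars):
        ch = chars[i]
        # A half form needs a halant and at least one character after it.
        has_halant = i + 2 < len(chars) and chars[i + 1] == HALANT
        if ch in half_lookup and has_halant:
            glyphs.append(("half", ch))
            i += 2
        elif ch == HALANT:
            glyphs.append(("halant", ch))
            i += 1
        elif ch == I_MATRA:
            glyphs.append(("matra", ch))
            i += 1
        else:
            glyphs.append(("full", ch))
            i += 1
    return glyphs


def reorder_pre_base_matra(glyphs):
    """Move the matra in front of the base, or in front of its half forms."""
    matra_at = [i for i, (kind, _) in enumerate(glyphs) if kind == "matra"]
    if not matra_at:
        return list(glyphs)
    out = list(glyphs)
    matra = out.pop(matra_at[0])
    # Assumption: the last full-form consonant is the base.
    base = max(i for i, (kind, _) in enumerate(out) if kind == "full")
    target = base
    while target > 0 and out[target - 1][0] == "half":
        target -= 1
    out.insert(target, matra)
    return out


def shape(chars, half_lookup):
    return reorder_pre_base_matra(apply_half(chars, half_lookup))


syllable = [KA, HALANT, TA, I_MATRA]
option_1 = shape(syllable, half_lookup=set())   # Ka not in 'half'
option_2 = shape(syllable, half_lookup={KA})    # Ka listed in 'half'
print([kind for kind, _ in option_1])
print([kind for kind, _ in option_2])
```

In this model, Option 1 leaves the Ka and its explicit halant in place and puts the I-matra directly in front of the base, while Option 2 puts the I-matra in front of the "half-form" Ka.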

Glossary

The following terms are useful for understanding the layout features and script rules discussed in this document.

Above-base form of consonants - A variant form of a consonant that appears above the base glyph.

Akhand ligatures - Required consonant ligatures that may appear anywhere in the syllable, and may or may not involve the base glyph. Akhand ligatures have the highest priority and are formed first; some languages include them in their alphabets. Akhand ligatures may be displayed in either half- or full-form.

Base glyph - The only consonant or consonant conjunct in the orthographic syllable that is written in its "full" (nominal) form. In Gurmukhi, the last consonant of the syllable usually forms the base glyph. In "degenerate" syllables that have no vowel (last letter of a word), the last consonant in halant form serves as the base consonant and is mapped as the base glyph. Layout operations are defined in terms of a base glyph, not a base character, since the base can often be a ligature.

Below-base form of consonants - A variant form of a consonant that appears below the base glyph. In the glyph sequence, the below-base form comes after the consonant(s) that form the base glyph. Below-base forms are represented by a non-spacing mark glyph.

Cluster - A group of characters that form an integral unit in Indic scripts, often a syllable.

Consonant - Each consonant represents a single consonant sound. Consonants may exist in different contextual forms and have an inherent vowel (usually, the short vowel "a"). For example, "Ka" and "Ta", rather than just "K" or "T."

Consonant conjuncts (aka "conjuncts") - Ligatures of two or more consonants. Consonant conjuncts may have both full and half forms, or only full forms.

Gurmukhi syllable - Effective orthographic "unit" of Gurmukhi writing systems. Syllables are composed of consonant letters, independent vowels, and dependent vowels. In a text sequence, these characters are stored in phonetic order (although they may not be represented in phonetic order when displayed). Once a syllable is shaped, it is indivisible. The cursor cannot be positioned within the syllable. Transformations discussed in this document do not cross syllable boundaries.

Halant (Virama) - The character used after a consonant to "strip" it of its inherent vowel. A Halant follows all but the last consonant in every Gurmukhi syllable.

NOTE: A syllable containing halant characters may be shaped with no visible halant signs by using different consonant forms or conjuncts instead.
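As a small illustration of the halant rule above, the following Python sketch builds the stored (logical-order) character sequence for a syllable from a list of consonants. The function name encode_syllable and its parameters are hypothetical; the code points are the Unicode values for the Gurmukhi characters named in the comments.

```python
HALANT = "\u0A4D"   # GURMUKHI SIGN VIRAMA
KA = "\u0A15"       # GURMUKHI LETTER KA
TA = "\u0A24"       # GURMUKHI LETTER TA
I_MATRA = "\u0A3F"  # GURMUKHI VOWEL SIGN I


def encode_syllable(consonants, matra=""):
    """Join consonants with halants; the last consonant gets no halant."""
    if not consonants:
        raise ValueError("a syllable needs at least one consonant")
    return HALANT.join(consonants) + matra


text = encode_syllable([KA, TA], I_MATRA)
# One halant per consonant except the last.
print(text.count(HALANT) == len([KA, TA]) - 1)
print([f"U+{ord(ch):04X}" for ch in text])
```

The sequence is stored in phonetic order, so the I-matra comes last here even though it is displayed to the left of the consonant it follows.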

Halant form of consonants - The form produced by adding the halant (virama) to the nominal shape. The Halant form is used in syllables that have no vowel or as the half form when no distinct shape for the half form exists.

Half form of consonants (pre-base form) - A variant form of consonants that appears to the left of the base consonant, if they do not participate in a ligature. Consonants in their half form precede the ones forming the base glyph. Some Indic scripts, like Devanagari, have distinctly shaped half forms for most of the consonants. If no distinct shape exists, the full form will display with an explicit Virama (same shape as the halant form).

Matra (Dependent Vowel) - Used to represent a vowel sound that is not inherent to the consonant. Dependent vowels are referred to as "matras" in Sanskrit. They are always depicted in combination with a single consonant, or with a consonant cluster. The greatest variation among different Indian scripts is found in the rules for attaching dependent vowels to base characters.

New shaping behavior - Shaping behavior defined in this version of the Indic OpenType Font Specification. Information in this document relates primarily to the new implementation model. Old behavior may be mentioned in comments about compatibility.

Nukta - A combining character that alters the way a preceding consonant (or matra) is pronounced.

Old shaping behavior - Shaping behavior defined in previous versions of the Indic OpenType Font Specification.

OpenType layout engine - Library responsible for executing OpenType layout features in a font. In the Microsoft text formatting stack, it is named OTLS (OpenType layout services).

OpenType tag - 4-byte identifier for script, language system or feature in the font.
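A minimal sketch of how such a tag maps to its 4-byte value, assuming the usual OpenType convention that tags are ASCII and that tags shorter than four characters are padded with trailing spaces. The helper name tag_to_uint32 is hypothetical. 'half' and 'pref' are the feature tags mentioned in this document; 'PAN' is used here as the registered language system tag for Punjabi.

```python
def tag_to_uint32(tag):
    """Pack an OpenType tag into a big-endian 32-bit integer."""
    if not 1 <= len(tag) <= 4:
        raise ValueError("a tag has between 1 and 4 characters")
    padded = tag.ljust(4)            # pad short tags with trailing spaces
    raw = padded.encode("ascii")     # tags are restricted to ASCII
    return int.from_bytes(raw, "big")


for tag in ("half", "pref", "PAN"):
    print(f"{tag!r:8} -> 0x{tag_to_uint32(tag):08X}")
```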

Post-base form of consonants - A variant form of a consonant that appears to the right of the base glyph. A consonant that takes a post-base form is preceded by the consonant(s) forming the base glyph plus a halant (virama). Post-base forms are usually spacing glyphs.

Pre-base form of consonants - A variant form of a consonant that appears to the left of the base glyph. Note that most pre-base consonant forms are logically as well as visually before the base consonant. Half forms are examples of this kind of pre-base form. In some scripts, though, a pre-base Ra may logically follow the base consonant (that is, it follows it phonetically and in the character sequence of the text), even though it is presented visually before the base. The shaping engine detects such cases dynamically using the <pref> feature and re-orders the pre-base-form glyph as needed.

Reph - The above-base form of the letter "Ra" that is used in the Devanagari script, when "Ra" is the first consonant in the syllable and is not the base consonant.

Shaping Engine - Code responsible for shaping input, classified to a particular script.

Split Matra - A matra that is decomposed into pieces for rendering. Usually the different pieces appear in different positions relative to the base. For instance, part of the matra may be placed at the beginning of the cluster and another part at the end of the cluster.

Syllable - A single unit of Indic text processing. Shaping of Indic text is performed independently for each syllable. The process of identifying the boundaries of each syllable is described below.

Vattu - A below-base form of a consonant.

Indic forms

1. Pre-base form
2. The base consonant
3. Above-base form (reph)
4. Post-base (matra)
5. Below-base form (vattu/rakaar)
