Developing OpenType Fonts for Gujarati Script (1 of 3): Introduction

Microsoft Typography
March 2002

Please note: Microsoft's OpenType font specifications for Indic scripts have been revised. This version is for reference purposes only. An updated version of the Gujarati specification can be found here.

This document presents information that will help font developers create or support OpenType fonts for the Gujarati script covered by the Unicode Standard. Gujarati is closely related to Devanagari and is used to write the Gujarati language of western India.

This is a multi-page specification. To access specific pages, use the Contents section below, or the navigation bar at the bottom of each page.

Introduction

In this specification, font developers will learn how to encode complex script features in their fonts, choose character sets, organize font information, and use existing tools to produce Gujarati fonts. Registered features of Gujarati script are defined and illustrated, encodings are listed, and templates are included for compiling Gujarati layout tables for OpenType fonts.

This document also presents information about the Gujarati OpenType shaping engine of Uniscribe, the Windows component responsible for text layout.

In addition to being a primer and specification for the creation and support of Gujarati fonts, this document is intended to more broadly illustrate the OpenType Layout architecture, feature schemes, and operating system support for shaping and positioning text.

Glossary

The following terms are useful for understanding the layout features and script rules discussed in this document.

Akhand ligatures - Required consonant ligatures that may appear anywhere in the syllable, and may or may not involve the base glyph. Akhand ligatures have the highest priority and are formed first; some languages include them in their alphabets. Akhand ligatures may take a half, full, or other form.

Base glyph - The only consonant or consonant conjunct in the syllable that is written in its "full" (nominal) form. The last consonant of the syllable (except for syllables ending with letter "Ra") usually forms the base glyph. Layout operations are defined in terms of a base glyph, not a base character, since the base can often be a ligature.

Below-base form of consonants - The form in which consonants appear below the base glyph. Consonants in below-base form appear in Gujarati syllables after the ones that form the base glyph. Below-base forms are represented by non-spacing mark glyphs.

Consonant - A character that represents a single consonant sound. Consonants may exist in different contextual forms, and have an inherent vowel (usually the short vowel "a"). Therefore, those illustrated in the examples below are named, for example, "Ka" and "Ta", rather than just "K" or "T".

Consonant conjuncts (aka 'conjuncts') - Ligatures of two or more consonants. Consonant conjuncts may have both full and half forms, or only full forms.

Gujarati syllable - The effective orthographic "unit" of the Gujarati writing system, consisting of a consonant and vowel core, optionally preceded by one or more consonants. Syllables are composed of consonant letters, independent vowels, and dependent vowels. In a text sequence, these characters are stored in phonetic order (although they may not be represented in phonetic order when displayed). Once a syllable is shaped, it is indivisible: the cursor cannot be positioned within the syllable. Transformations discussed in this document do not cross syllable boundaries.
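As a small illustration of phonetic storage order, the sketch below lists the code points of a syllable made of Ka and the dependent vowel I. The matra is stored after the consonant even though it is drawn to the left of it. The helper name `describe` is an assumption made for this example and is not part of the specification; the character names come from the Unicode character database through Python's `unicodedata` module.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs in stored order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# Ka followed by the dependent vowel sign I, in stored (phonetic) order.
syllable = "\u0A95\u0ABF"

for code_point, name in describe(syllable):
    print(code_point, name)
```

The stored sequence stays consonant-first; any visual reordering is the job of the shaping engine, not of the text encoding.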

Halant (Virama) - The character used after a consonant to "strip" it of its inherent vowel. A Virama follows all but the last consonant in every Gujarati syllable.

NOTE: A syllable containing halant characters may be shaped with no visible halant signs by using different consonant forms or conjuncts instead.
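The rule that a Virama follows all but the last consonant can be sketched as a simple join. This is only an illustration of the encoding, assuming plain Unicode code points; the function name `consonant_cluster` is hypothetical and not taken from the specification.

```python
VIRAMA = "\u0ACD"  # GUJARATI SIGN VIRAMA (halant)

def consonant_cluster(consonants):
    """Encode a cluster: a virama follows every consonant except the last."""
    return VIRAMA.join(consonants)

KA, TA, SSA = "\u0A95", "\u0AA4", "\u0AB7"

# Ka + halant + Ta: two consonants, one stored halant.
kta = consonant_cluster([KA, TA])
print([f"U+{ord(ch):04X}" for ch in kta])

# Ka + halant + Ssa: the halant is stored in the text, but a font may
# display the cluster as a ligature with no visible halant sign.
kssa = consonant_cluster([KA, SSA])
print([f"U+{ord(ch):04X}" for ch in kssa])
```

Whether the halant is visible in the rendered syllable depends on the consonant forms and ligatures the font provides, as the note above describes.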

Halant form of consonants - The form produced by adding the Virama to the nominal shape. The Halant form is used in syllables that have no vowel or as the half form when no distinct shape for the half form exists.

Half form of consonants (pre-base form) - The form in which consonants appear to the left of the base consonant, if they do not participate in a ligature. Consonants in their half form precede the ones forming the base glyph. Gujarati has distinctly shaped half forms for most of the consonants. If a consonant does not have a distinct shape for the half form, and does not form any ligature, it will be displayed with an explicit Virama; that is, in this case, the half form and the halant form have the same shape.

Matra (Dependent Vowel) - Used to represent a vowel sound that is not inherent to the consonant. Dependent vowels are referred to as "matras" in Sanskrit. They are always depicted in combination with a single consonant, or with a consonant cluster. The greatest variation among different Indian scripts is found in the rules for attaching dependent vowels to base characters.

Nukta - Character that alters the way a preceding consonant is pronounced.

Post-base form of consonants - Form in which consonants appear to the right of the base glyph. Post-base forms are usually spacing glyphs.

Pre-base form of consonants - Form in which consonants appear to the left of the base glyph.

Reph - Above-base form of the letter "Ra" that is used in most scripts if "Ra" is the first consonant in the syllable and is not the base consonant.
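The condition in the Reph definition can be approximated on the character sequence as follows. This is a minimal sketch under simplifying assumptions: it treats every code point in U+0A95..U+0AB9 as a consonant (a few positions in that range are unassigned), it ignores joiner characters, and it does not reproduce the shaping engine's actual logic. The function names are hypothetical.

```python
RA = "\u0AB0"      # GUJARATI LETTER RA
VIRAMA = "\u0ACD"  # GUJARATI SIGN VIRAMA (halant)

def is_consonant(ch):
    """Simplified test: code points in the range U+0A95..U+0AB9."""
    return "\u0A95" <= ch <= "\u0AB9"

def is_reph_candidate(syllable):
    """True if the syllable starts with Ra + halant followed by a consonant."""
    return (
        len(syllable) >= 3
        and syllable[0] == RA
        and syllable[1] == VIRAMA
        and is_consonant(syllable[2])
    )

print(is_reph_candidate("\u0AB0\u0ACD\u0A95"))  # Ra + halant + Ka
print(is_reph_candidate("\u0A95\u0ACD\u0AB0"))  # Ka + halant + Ra
```

In the second sequence Ra is not the first consonant, so the check fails; that position corresponds to the Vattu case described in the next entry.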

Vattu (Rakar) - Below-base form of letter "Ra". Vattu requires exceptional treatment for two reasons: 1) it may become a below-base form to half form glyphs, as well as to full form glyphs; 2) it often assumes different shapes depending on what consonant it follows. Consonants will form required ligatures with Vattu, producing vattu ligatures that can either be in half or full form.

Notation

The following notation is used in this document to illustrate layout operations:

C - Consonant character

K - Generalized consonant, including:

  • Akhand ligatures
  • Nukta forms of consonants (ligatures with Nukta)

Generalized consonants are produced by composing akhands and composing ligatures that contain the nukta sign.

Ah, Af - Akhand half and full form respectively

V - Vattu glyph

Reph - Reph glyph

Cf, Kf - Glyph representing a full form of a consonant/generalized consonant

Ch, Kh - Half form of a (generalized) consonant

Cs, Ks - Below-base form of a (generalized) consonant

Cp, Kp - Post-base form of a (generalized) consonant

H - Halant character or glyph

M - Matra character or glyph

VO - Independent vowel

Lf - Consonant conjunct

Lh - Half form ligature (except Akhand), e.g. a vattu ligature

VM - Vowel modifier character or glyph

SM - Stress or tone mark (e.g. Udatta, Anudatta, Acute, Grave)

LM - Length mark

{ } - Indicates 0, 1 or multiple occurrences

[ ] - Indicates 0 or 1 occurrence
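The repetition notation maps naturally onto regular expressions: { } corresponds to `*` and [ ] corresponds to `?`. The sketch below applies that mapping to a deliberately simplified, hypothetical pattern, {C + H} + C + [M] + [VM], which is not one of the syllable patterns defined by this specification. The character ranges are approximations chosen for the example.

```python
import re

# Approximate character classes for the notation symbols (assumptions).
C = "[\u0A95-\u0AB9]"   # consonant letters
H = "\u0ACD"            # halant (virama)
M = "[\u0ABE-\u0ACC]"   # dependent vowel signs (matras)
VM = "[\u0A81-\u0A83]"  # candrabindu, anusvara, visarga

# {C + H} + C + [M] + [VM]: { } becomes *, [ ] becomes ?
SIMPLE_SYLLABLE = re.compile(f"(?:{C}{H})*{C}{M}?{VM}?")

def is_simple_syllable(text):
    """True if the whole string matches the simplified pattern."""
    return SIMPLE_SYLLABLE.fullmatch(text) is not None

print(is_simple_syllable("\u0A95\u0ACD\u0AA4\u0ABF"))  # Ka + H + Ta + matra I
print(is_simple_syllable("\u0A95\u0ACD"))              # Ka + H, no final consonant
```

A real syllable recognizer would need further classes from the notation list (nukta, stress marks, independent vowels), which are omitted here to keep the mapping visible.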

Next section:  Shaping Engine

introduction | shaping engine | features | appendices

