Microsoft Typography | Developer information | Specifications | OpenType font development
Indic OpenType Specification | Terms | Shaping | Features | Other | Appendix


The following terms are useful for understanding the layout features and script rules discussed in this document.

Akhand ligatures - required consonant ligatures that may appear anywhere in the syllable and may or may not involve the base glyph. Akhand ligatures have the highest priority and are formed first; some languages include them in their alphabets. An Akhand ligature may take a half form, a full form, or another form.

Base glyph - the only consonant or consonant conjunct in the syllable that is written in its "full" (nominal) form. In Devanagari, the last consonant of the syllable (except for syllables ending with letter "Ra") usually forms the base glyph; in other scripts the first consonant or conjunct (e.g. Telugu), or others, may form the base glyph. In "degenerate" syllables that have no vowel (last letter of a word), the last consonant in halant form serves as the base consonant and is mapped as the base glyph. Layout operations are defined in terms of a base glyph, not a base character, since the base can often be a ligature.

Below-base form of consonants - the form in which consonants appear below the base glyph. Consonants in below-base form appear in Indic syllables after the consonants that form the base glyph. Below-base forms are represented by non-spacing mark glyphs.

Consonant - a character that represents a single consonant sound. Consonants may exist in different contextual forms and have an inherent vowel (usually the short vowel "a"). For this reason, the consonants illustrated in the examples below are named, for example, "Ka" and "Ta", rather than just "K" or "T".

Consonant conjuncts - (also referred to as simply conjuncts) ligatures of two or more consonants. Consonant conjuncts may have both full and half forms, or only full forms.

Halant (Virama) - the character used after a consonant to "strip" it of its inherent vowel. A Virama follows all but the last consonant in every Indic syllable; in languages such as Sanskrit, Tamil, and Malayalam, the last consonant may also have a Virama.

Note: a syllable containing halant characters may be shaped with no visible halant signs by using different consonant forms or conjuncts instead.
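
As a small illustration of how a halant sits in the stored text, the following Python sketch lists the characters of the Devanagari sequence Ka + Halant + Ssa, which many fonts display as a single conjunct with no visible halant sign. The example sequence and the helper name describe are assumptions chosen for illustration; they are not taken from this specification.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (the halant)

def describe(text):
    """Return (code point, Unicode name) pairs in stored order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# Ka + Halant + Ssa: three stored characters, chosen here as an example.
kssa = "\u0915" + VIRAMA + "\u0937"

for code_point, name in describe(kssa):
    print(code_point, name)
```

The halant remains a separate character in the text sequence; whether it is drawn as a visible sign is decided by the font and the shaping engine.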

Halant form of consonants - form produced by adding the Virama to the nominal shape. The Halant form is used in syllables that have no vowel, or as the half form when no distinct shape for the half form exists. In some scripts (Bengali, Tamil) the half and halant forms are always the same. In other scripts, such as Malayalam, some consonants have more than one way of representing the Halant form; these alternate forms have distinct shapes that do not show a visible Halant.

Half form of consonants (pre-base form) - form in which a consonant appears to the left of the base consonant if it does not participate in a ligature. Consonants in their half form precede the ones forming the base glyph. Devanagari has distinctly shaped half forms for most of the consonants. If a consonant does not have a distinct shape for the half form and does not form any ligature, it is displayed with an explicit Virama; that is, in this case the half form and the halant form have the same shape.

Indic syllable - the effective orthographic "unit" of Indic writing systems, consisting of a consonant and vowel core that is optionally preceded by one or more consonants. Syllables are composed of consonant letters, independent vowels, and dependent vowels. In a text sequence, these characters are stored in phonetic order (although they may not be displayed in phonetic order). Once a syllable is shaped, it is indivisible: the cursor cannot be positioned within the syllable. Transformations discussed in this document do not cross syllable boundaries.
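
To make the idea of a syllable boundary concrete, here is a deliberately simplified Python sketch that splits Devanagari text into syllable-like units. It is not the segmentation algorithm of any shaping engine. The rule it applies (a consonant or independent vowel starts a new unit unless the consonant directly follows a halant) and the function names are assumptions made for illustration, and the input is assumed to contain only Devanagari letters and signs.

```python
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (halant)

def is_consonant(ch):
    # Devanagari consonants Ka..Ha, plus the precomposed nukta consonants.
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"

def is_independent_vowel(ch):
    return "\u0904" <= ch <= "\u0914"

def split_syllables(text):
    """Split text into syllable-like units (simplified, hypothetical rule)."""
    syllables = []
    for ch in text:
        starts_new = is_consonant(ch) or is_independent_vowel(ch)
        # A consonant right after a halant stays in the current cluster.
        joined = (bool(syllables) and syllables[-1].endswith(VIRAMA)
                  and is_consonant(ch))
        if not syllables or (starts_new and not joined):
            syllables.append(ch)
        else:
            syllables[-1] += ch
    return syllables

# Ha + matra I, then Na + Halant + Da + matra Ii (stored in phonetic order).
word = "\u0939\u093F\u0928\u094D\u0926\u0940"
print(split_syllables(word))
```

Note that the matra I (U+093F) is stored after Ha in the text sequence even though it is drawn to the left of the consonant, which is the difference between stored order and displayed order described above.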

Matra (Dependent Vowel) - used to represent a vowel sound that is not inherent to the consonant. Dependent vowels are referred to as "matras" in Sanskrit. They are always depicted in combination with a single consonant, or with a consonant cluster. The greatest variation among different Indian scripts is found in the rules for attaching dependent vowels to base characters.

Nukta - a character that alters the way a preceding consonant is pronounced.
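
As an illustration in one script, the Python sketch below uses the standard unicodedata module to show that the precomposed Devanagari letter Qa decomposes canonically into Ka followed by the nukta sign. The choice of Devanagari and of these particular letters is an assumption made for the example, as is the helper name decompose.

```python
import unicodedata

KA = "\u0915"     # DEVANAGARI LETTER KA
NUKTA = "\u093C"  # DEVANAGARI SIGN NUKTA
QA = "\u0958"     # DEVANAGARI LETTER QA (precomposed Ka + Nukta)

def decompose(text):
    """Return the canonical decomposition (NFD) of text."""
    return unicodedata.normalize("NFD", text)

# The nukta follows the consonant it modifies in the stored sequence.
print([f"U+{ord(ch):04X}" for ch in decompose(QA)])
print(decompose(QA) == KA + NUKTA)
```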

Post-base form of consonants - form in which consonants appear to the right of the base glyph. Examples include Oriya "Ya" and Malayalam "Ya" and "Va". Post-base forms are usually spacing glyphs.

Pre-base form of consonants - form in which consonants appear to the left of the base glyph.

Reph - the above-base form of the letter "Ra" (in Kannada, the post-base form) that is used in most scripts if "Ra" is the first consonant in the syllable and is not the base consonant. In Indic scripts, except for Tibetan uses, Reph is an above-base form only.

Vattu (Rakar) - the below-base form of the letter "Ra". Vattu requires exceptional treatment for two reasons. First, it may become a below-base form to half form glyphs as well as to full form glyphs. Second, it often assumes different shapes depending on which consonant it follows. Consonants form required ligatures with Vattu, producing vattu ligatures that can be in either half or full form. The example below uses the Devanagari script.

Indic forms

1. Pre-base form
2. The base consonant
3. Above-base form (reph)
4. Post-base (matra)
5. Below-base form (vattu)


The following notation is used in this document to illustrate layout operations:

C - a consonant character

K - a "generalized" consonant, including:

  • Akhand ligatures, and
  • Nukta forms of consonants (ligatures with Nukta)

"Generalized" consonants are produced by composing akhands and composing ligatures that contain the nukta sign.

Ah, Af - the Akhand half and full form respectively

V - the Vattu glyph

Reph - the Reph glyph

Cf, Kf - a glyph representing a full form of a consonant/generalized consonant

Ch, Kh - a half form of a (generalized) consonant

Cs, Ks - a below-base form of a (generalized) consonant

Cp, Kp - a post-base form of a (generalized) consonant

H - a Halant character or glyph

M - a Matra character or glyph

VO - an independent vowel

Lf - a consonant conjunct

Lh - a half form ligature (except Akhand), e.g. a vattu ligature

VM - a vowel modifier character or glyph

SM - a stress or tone mark (e.g. Udatta, Anudatta, Acute, Grave)

LM - a length mark

{ } - indicates 0 or more occurrences

[ ] - indicates 0 or 1 occurrence
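
The repetition brackets behave like the * and ? operators of regular expressions. The Python sketch below translates a pattern written in this notation into a regular expression and tests symbol sequences against it. The function names and the sample pattern "{C H} C [M] [VM]" are hypothetical and chosen only to exercise the brackets; the sample pattern is not a rule taken from this specification.

```python
import re

def notation_to_regex(notation):
    """Translate { } (zero or more) and [ ] (zero or one) into a regex."""
    for bracket in "{}[]":
        notation = notation.replace(bracket, f" {bracket} ")
    parts = []
    for token in notation.split():
        if token in ("{", "["):
            parts.append("(?:")
        elif token == "}":
            parts.append(")*")
        elif token == "]":
            parts.append(")?")
        else:
            # Each symbol is matched together with a trailing space, so
            # that "M" can never match inside "VM".
            parts.append(re.escape(token) + " ")
    return re.compile("".join(parts))

def matches(notation, symbols):
    """Check a list of symbols such as ["C", "H", "C"] against a pattern."""
    encoded = "".join(symbol + " " for symbol in symbols)
    return notation_to_regex(notation).fullmatch(encoded) is not None

SAMPLE = "{C H} C [M] [VM]"  # hypothetical pattern, for illustration only
print(matches(SAMPLE, ["C", "H", "C", "M"]))
print(matches(SAMPLE, ["C", "M", "M"]))
```

Encoding each symbol with a trailing space keeps multi-letter symbols such as VM, SM, and LM distinct from single-letter ones without needing a separate tokenizer for the input sequence.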

this page was last updated December 2001
© 2001 Microsoft Corporation. All rights reserved. Terms of use.

