Developing OpenType Fonts
Microsoft Typography

Please note: This document reflects the changes made in the 2005 recommendations for Indic-script OpenType font and shaping-engine implementations. While Indic fonts made according to the earlier recommendations will still function properly in the new versions of Uniscribe, font developers may choose to update their fonts, particularly if they wish to avoid certain limitations of the earlier implementation.

This document presents information that will help font developers create or support OpenType fonts for the Malayalam script covered by the Unicode Standard. The Malayalam script is used to write the Malayalam language, spoken in the state of Kerala in South India. While the shapes of the letters resemble those of the Tamil script, Malayalam has a full set of consonant conjuncts.

This is a multi-page specification.
Introduction

This document targets developers implementing Indic shaping behavior compatible with the Microsoft OpenType specification for Indic scripts. It contains information about terminology, font features, and the behavior of the Indic shaping engine with regard to the Malayalam script. While it does not contain instructions for creating Malayalam fonts, it will help font developers understand how the Indic shaping engine processes Indic text. In addition, registered features of the Malayalam script are defined and illustrated with examples.

In this specification, font developers will learn how to encode complex script features in their fonts, choose character sets, organize font information, and use existing tools to produce Malayalam fonts. Registered features of the Malayalam script are defined and illustrated, encodings are listed, and templates are included for compiling Malayalam layout tables for OpenType fonts.

The new Indic shaping engine allows for variations in typographic conventions, giving a font developer control over shaping through the assignment of glyphs to particular OpenType features. For example, the location where the reph and pre-pended matra are re-ordered within a syllable cluster is affected by the presence of a half form. See the example below.
Consider an example in the Devanagari script: in the sequence Ra + halant + Da + halant + Ma + I-matra, Ra + halant forms the reph, but the classification of Da determines both the position of the reph and the location of the pre-pended matra.
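The character sequence in this example can be written out explicitly. The sketch below only spells out the input text in logical order using Python's standard `unicodedata` module; it performs no shaping, and the variable and function names are illustrative, not part of the specification.

```python
import unicodedata

# Logical (stored) order of the example syllable:
# Ra + halant + Da + halant + Ma + I-matra.
RA, DA, MA = "\u0930", "\u0926", "\u092E"
HALANT = "\u094D"    # DEVANAGARI SIGN VIRAMA
I_MATRA = "\u093F"   # DEVANAGARI VOWEL SIGN I

syllable = RA + HALANT + DA + HALANT + MA + I_MATRA


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for code_point, name in describe(syllable):
    print(code_point, name)
```

In the stored text the I-matra comes last, even though it is displayed before the consonants it follows. Moving it, and placing the reph, is the job of the shaping engine, and the result depends on how the font classifies Da.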
The following terms are useful for understanding the layout features and script rules discussed in this document.
Above-base form of consonants - A variant form of a consonant that appears above the base glyph.
Akhand ligatures - Required consonant ligatures that may appear anywhere in the syllable, and may or may not involve the base glyph. Akhand ligatures have the highest priority and are formed first; some languages include them in their alphabets. Malayalam has several Akhand ligatures.
Base glyph - The only consonant or consonant conjunct in the syllable that is written in its "full" (nominal) form. In Malayalam, the last consonant of the syllable (except for syllables ending with the letter "Ra") usually forms the base glyph. In "degenerate" syllables that have no vowel (last letter of a word), the last consonant in halant form serves as the base consonant and is mapped as the base glyph. Layout operations are defined in terms of a base glyph, not a base character, since the base can often be a ligature.
Below-base form of consonants - A variant form of a consonant that appears below the base glyph. The below-base form comes after the consonant(s) that form the base glyph (for example, the Malayalam consonant La). Below-base forms are represented by a non-spacing mark glyph.
Chandrakala (Virama) - The character used after a consonant to "strip" it of its inherent vowel. This is also known as Halant.
Chillaksaram - Consonants that appear in the final position of a syllable and merge with Halant (Virama) are known as Chillaksaram in Malayalam.
Cluster - A group of characters that form an integral unit in Indic scripts, often a syllable.
Consonant - Each consonant represents a single consonant sound. Consonants may exist in different contextual forms and have an inherent vowel (usually the short vowel "a"): for example, "Ka" and "Ta" rather than just "K" or "T".
Consonant conjuncts (aka "conjuncts") - Ligatures of two or more consonants. Consonant conjuncts may have both full and half forms, or only full forms.
Halant (Virama) - The character used after a consonant to "strip" it of its inherent vowel. This is known as chandrakala in Malayalam. NOTE: A syllable containing halant characters may be shaped with no visible halant signs by using different consonant forms or conjuncts instead.
Halant form of consonants - The form produced by adding the halant (virama) to the nominal shape. The halant form is used in syllables that have no vowel, or as the half form when no distinct shape for the half form exists.
Half form of consonants (pre-base form) - A variant form of a consonant that appears to the left of the base consonant when the consonant does not participate in a ligature. Consonants in their half form precede the ones forming the base glyph. Some Indic scripts, like Devanagari, have distinctly shaped half forms for most of the consonants. If no distinct shape exists, the full form is displayed with an explicit virama (the same shape as the halant form).
Matra (Dependent Vowel) - Used to represent a vowel sound that is not inherent to the consonant. Dependent vowels are referred to as "matras" in Sanskrit. They are always depicted in combination with a single consonant or with a consonant cluster. The greatest variation among different Indian scripts is found in the rules for attaching dependent vowels to base characters.
New shaping behavior - Shaping behavior defined in this version of the Indic OpenType Font Specification. Information in this document relates primarily to the new implementation model. Old behavior may be mentioned in comments about compatibility.
Nukta - A combining character that alters the way a preceding consonant (or matra) is pronounced.
Old shaping behavior - Shaping behavior defined in previous versions of the Indic OpenType Font Specification.
OpenType layout engine - The library responsible for executing OpenType layout features in a font. In the Microsoft text formatting stack, it is named OTLS (OpenType Layout Services).
OpenType tag - A 4-byte identifier for a script, language system, or feature in the font.
Post-base form of consonants - A variant form of a consonant that appears to the right of the base glyph. A consonant that takes a post-base form is preceded by the consonant(s) forming the base glyph plus a halant (virama). Post-base forms are usually spacing glyphs.
Pre-base form of consonants - A variant form of a consonant that appears to the left of the base glyph. Note that most pre-base consonant forms are logically as well as visually before the base consonant. Half forms are examples of this kind of pre-base form. In some scripts, though, a pre-base Ra may logically follow the base consonant (that is, it follows it phonetically and in the character sequence of the text), even though it is presented visually before the base. The shaping engine detects such cases dynamically using the <pref> feature and re-orders the pre-base-form glyph as needed.
Reph - The above-base form of the letter "Ra" that is used in Devanagari when "Ra" is the first consonant in the syllable and is not the base consonant.
Shaping Engine - Code responsible for shaping input that has been classified to a particular script.
Split Matra - A matra that is decomposed into pieces for rendering. Usually the different pieces appear in different positions relative to the base. For instance, part of the matra may be placed at the beginning of the cluster and another part at the end of the cluster.
Syllable - A single unit of Indic text processing. Shaping of Indic text is performed independently for each syllable. The process of identifying the boundaries of each syllable is described below.
Vattu - A below-base form of a consonant.

Example in Devanagari script:
1. Pre-base form
2. The base consonant
3. Above-base form (reph)
4. Post-base (matra)
5. Below-base form (vattu/rakaar)
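Two of the terms above can be made concrete with a short sketch. The first function packs a four-character OpenType tag (here `pref`, the feature mentioned under pre-base forms) into a 32-bit big-endian integer, one common way of comparing or storing tags. The second uses Unicode canonical decomposition as a stand-in for split-matra handling: the Malayalam vowel sign O (U+0D4A) decomposes into U+0D46 and U+0D3E. The function names are hypothetical, and a shaping engine's own decomposition logic may differ from plain NFD normalization.

```python
import struct
import unicodedata


def tag_to_uint32(tag):
    """Pack a 4-character OpenType tag into a big-endian 32-bit integer."""
    raw = tag.encode("ascii")
    if len(raw) != 4:
        raise ValueError("an OpenType tag is exactly 4 bytes")
    return struct.unpack(">I", raw)[0]


def split_matra_parts(matra):
    """Return the pieces of a matra under canonical decomposition (NFD).

    Assumption: NFD is used here only to illustrate the idea of a split
    matra; it is not claimed to be what any shaping engine does internally.
    """
    return list(unicodedata.normalize("NFD", matra))


print(hex(tag_to_uint32("pref")))
print([f"U+{ord(p):04X}" for p in split_matra_parts("\u0D4A")])
```

The two parts of U+0D4A end up on opposite sides of the base: U+0D46 is a pre-base vowel sign and U+0D3E follows the base, which matches the description of a split matra given above.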