Developing OpenType Fonts
Microsoft Typography

Please note: This document reflects the changes made in the 2005 recommendations for Indic-script OpenType font and shaping-engine implementations. While Indic fonts made according to the earlier recommendations will still function properly in the new versions of Uniscribe, font developers may choose to update their fonts, particularly if they wish to avoid certain limitations of the earlier implementation.

This document presents information that will help font developers create or support OpenType fonts for all Bengali script languages covered by the Unicode Standard. The Bengali script, closely related to the Devanagari script, is used to write Bengali, Assamese and Manipuri.

Introduction

This document targets developers implementing Indic shaping behavior compatible with the Microsoft OpenType specification for Indic scripts. It contains information about terminology, font features and the behavior of the Indic shaping engine with regard to the Bengali script. While it does not contain instructions for creating Bengali fonts, it will help font developers understand how the Indic shaping engine processes Indic text. In addition, registered features of the Bengali script are defined and illustrated with examples.

The new Indic shaping engine allows for variations in typographic conventions, giving a font developer control over shaping through the choice of which glyphs are assigned to certain OpenType features. For example, the location where the reph and pre-pended matra are re-ordered within a syllable cluster is affected by the presence of a half form. See illustrations below.

In the example below (Ra + halant + Da + halant + Ka + I-matra), Ra + halant will form the reph, but how the Da is classified will determine the position of the reph as well as the location of the pre-pended matra.

![]()

Option 1: The re-ordering behavior of the shaping engine for Bengali where the Da has a half form. The reph will be positioned on the first main consonant, and the I-matra will be positioned immediately in front of the "half-form" D(a).

Option 2: If the Da does not have a half form and is not listed in the half feature, the halant form will display and the shaping engine will treat it as the first main consonant on which to position the reph. The I-matra will be positioned immediately in front of the base (or half form) preceding it, which in this case is the Ka.

The following terms are useful for understanding the layout features and script rules discussed in this document.

Above-base form of consonants - A variant form of a consonant that appears above the base glyph. In Bengali, only the consonant Ra has an above-base form, known as "reph".

Akhand ligatures - Required consonant ligatures that may appear anywhere in the syllable, and may or may not involve the base glyph. Akhand ligatures have the highest priority and are formed first; some languages include them in their alphabets. Akhand ligatures may be displayed in either half or full form.

Base glyph - The only consonant or consonant conjunct in the syllable that is written in its "full" (nominal) form. In Bengali, the last consonant of the syllable (except for syllables ending with the letter "Ra") usually forms the base glyph. In "degenerate" syllables that have no vowel (last letter of a word), the last consonant in halant form serves as the base consonant and is mapped as the base glyph. Layout operations are defined in terms of a base glyph, not a base character, since the base can often be a ligature.

Below-base form of consonants - A variant form of a consonant that appears below the base glyph. In Bengali, the consonants Ra and Ba have below-base forms. In the glyph sequence, the below-base form comes after the consonant(s) that form the base glyph. Below-base forms are represented by a non-spacing mark glyph.

Bengali syllable - Effective orthographic "unit" of Bengali writing systems. Syllables are composed of consonant letters, independent vowels, and dependent vowels. In a text sequence, these characters are stored in phonetic order (although they may not be represented in phonetic order when displayed). Once a syllable is shaped, it is indivisible. The cursor cannot be positioned within the syllable. Transformations discussed in this document do not cross syllable boundaries.

Cluster - A group of characters that form an integral unit in Indic scripts, often a syllable.

Consonant - Each consonant letter represents a single consonant sound. Consonants may exist in different contextual forms and have an inherent vowel (usually, the short vowel "a"). For example, the letters are named "Ka" and "Ta", rather than just "K" or "T".

Consonant conjuncts (aka 'conjuncts') - Ligatures of two or more consonants. Consonant conjuncts may have both full and half forms, or only full forms.

Halant (Virama) - The character used after a consonant to "strip" it of its inherent vowel. A halant follows all but the last consonant in every Bengali syllable; a halant also follows the last consonant if the syllable has no vowel. NOTE: A syllable containing halant characters may be shaped with no visible halant signs by using different consonant forms or conjuncts instead.

Halant form of consonants - The form produced by adding the halant (virama) to the nominal shape. The halant form is used in syllables that have no vowel, or as the half form when no distinct shape for the half form exists.

Half form of consonants (pre-base form) - A variant form of a consonant that appears to the left of the base consonant when the consonant does not participate in a ligature. Consonants in their half form precede the ones forming the base glyph. Bengali has distinctly shaped half forms for most consonants. If a consonant does not have a distinct shape for the half form and does not form any ligature, it will be displayed with an explicit virama (same shape as the halant form).

Matra (Dependent Vowel) - Used to represent a vowel sound that is not inherent to the consonant. Dependent vowels are referred to as "matras" in Sanskrit. They are always depicted in combination with a single consonant, or with a consonant cluster. The greatest variation among different Indian scripts is found in the rules for attaching dependent vowels to base characters.

New shaping behavior - Shaping behavior defined in this version of the Indic OpenType Font Specification. Information in this document relates primarily to the new implementation model. Old behavior may be mentioned in comments about compatibility.

Nukta - A combining character that alters the way a preceding consonant (or matra) is pronounced.

Old shaping behavior - Shaping behavior defined in previous versions of the Indic OpenType Font Specification.

OpenType layout engine - Library responsible for executing OpenType layout features in a font. In the Microsoft text formatting stack, it is named OTLS (OpenType Layout Services).

OpenType tag - 4-byte identifier for a script, language system or feature in the font.

Post-base form of consonants - A variant form of a consonant that appears to the right of the base glyph. A consonant that takes a post-base form is preceded by the consonant(s) forming the base glyph plus a halant (virama). Post-base forms are usually spacing glyphs.

Pre-base form of consonants - A variant form of a consonant that appears to the left of the base glyph. Note that most pre-base consonant forms are logically as well as visually before the base consonant. Half forms are examples of this kind of pre-base form. In some scripts, though, a pre-base Ra may logically follow the base consonant (that is, it follows it phonetically and in the character sequence of the text), even though it is presented visually before the base. The shaping engine detects such cases dynamically using the <pref> feature and re-orders the pre-base-form glyph as needed.

Reph - The above-base form of the letter "Ra" that is used in Bengali when "Ra" is the first consonant in the syllable and is not the base consonant.

Shaping Engine - Code that processes text strings and is aware of language rules.

Split Matra - A matra that is decomposed into pieces for rendering. Usually the different pieces appear in different positions relative to the base. For instance, part of the matra may be placed at the beginning of the cluster and another part at the end of the cluster.

Syllable - A single unit of Indic text processing. Shaping of Indic text is performed independently for each syllable. The process of identifying the boundaries of each syllable is described below.

Vattu - A below-base form of a consonant.

![]()

1. Pre-base form
2. The base consonant
3. Above-base form (reph)
4. Post-base (matra)
5. Below-base form (vattu)
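
Several of the terms defined above can be tied to concrete Unicode code points. The following Python sketch spells out the introduction's example sequence (Ra + halant + Da + halant + Ka + I-matra), checks the glossary rule that a halant follows all but the last consonant, and shows how a two-part matra breaks into pieces. The function names (`classify`, `halant_follows_all_but_last_consonant`, `split_matra_pieces`) are hypothetical and do not come from any shaping engine. The consonant test covers only the main range U+0995..U+09B9, and the use of Unicode canonical decomposition to obtain split-matra pieces is an assumption made for illustration; a real shaping engine may decompose matras by other means.

```python
import unicodedata

# Code points for the characters named in the introduction's example.
RA, DA, KA = "\u09B0", "\u09A6", "\u0995"
HALANT = "\u09CD"   # BENGALI SIGN VIRAMA
NUKTA = "\u09BC"    # BENGALI SIGN NUKTA
I_MATRA = "\u09BF"  # BENGALI VOWEL SIGN I


def classify(ch: str) -> str:
    """Map one character to a glossary term (hypothetical, simplified)."""
    if ch == HALANT:
        return "halant"
    if ch == NUKTA:
        return "nukta"
    name = unicodedata.name(ch, "")
    if name.startswith("BENGALI VOWEL SIGN"):
        return "matra"
    # Assumption: only the main consonant range U+0995..U+09B9 is covered;
    # consonants encoded elsewhere in the Bengali block are ignored here.
    if "\u0995" <= ch <= "\u09B9" and name.startswith("BENGALI LETTER"):
        return "consonant"
    return "other"


def halant_follows_all_but_last_consonant(syllable: str) -> bool:
    """Check that every consonant except the last is directly followed by a halant.

    Simplification: a nukta between a consonant and its halant is not handled.
    """
    positions = [i for i, ch in enumerate(syllable) if classify(ch) == "consonant"]
    return all(syllable[i + 1] == HALANT for i in positions[:-1])


def split_matra_pieces(matra: str) -> list[str]:
    """Return the pieces of a matra under Unicode canonical decomposition (NFD)."""
    return list(unicodedata.normalize("NFD", matra))


# The example from the introduction, in stored (phonetic) order.
example = RA + HALANT + DA + HALANT + KA + I_MATRA
for ch in example:
    print(f"U+{ord(ch):04X}  {classify(ch):9}  {unicodedata.name(ch)}")

print(halant_follows_all_but_last_consonant(example))
print([f"U+{ord(p):04X}" for p in split_matra_pieces("\u09CB")])
```

Under these assumptions, the O-matra (U+09CB) decomposes into U+09C7 followed by U+09BE. This matches the description of a split matra, with one piece shown before the base and the other after it. The I-matra is a single piece, so it is returned unchanged.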