Microsoft Typography | Developer information | Specifications | OpenType font development
Arabic OpenType Specification | Terms | Shaping | Features | Other | Appendix


Features for the Arabic script

The features listed below have been defined to create the basic forms for the languages that are supported on Arabic systems. Regardless of the model an application chooses for supporting layout of complex scripts, Uniscribe requires a fixed order for executing features within a run of text to consistently obtain the proper basic form. This is achieved by calling features one-by-one in the standard order listed below.

The order of the lookups within each feature is also very important. For more information on lookups and defining features in OpenType fonts, see Encoding feature information in the OpenType font development section.

The standard order for applying Arabic features encoded in OpenType fonts:
Not all of the features listed below apply to all Arabic script languages.

Feature  Feature function                                  Layout operation  Required

Language based forms:
ccmp     Character composition/decomposition substitution  GSUB
isol     Isolated character form substitution              GSUB
fina     Final character form substitution                 GSUB              X
medi     Medial character form substitution                GSUB              X
init     Initial character form substitution               GSUB              X
rlig     Required ligature substitution                    GSUB              X
calt     Connection form substitution                      GSUB

Typographical forms:
liga     Standard ligature substitution                    GSUB
dlig     Discretionary ligature substitution               GSUB
cswh     Contextual swashes                                GSUB
mset     Mark positioning via substitution                 GSUB

Positioning features:
curs     Cursive positioning                               GPOS
kern     Pair kerning                                      GPOS
mark     Mark to base positioning                          GPOS
mkmk     Mark to mark positioning                          GPOS

[GSUB = glyph substitution, GPOS = glyph positioning]
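
The fixed order can be pictured as a list that a shaping engine walks through once per run of text. The following is a minimal Python sketch of that idea, not Uniscribe's actual implementation; the names STANDARD_ORDER and order_features are hypothetical and exist only for illustration.

```python
# Standard application order for Arabic features, as listed in the table above.
STANDARD_ORDER = [
    "ccmp", "isol", "fina", "medi", "init", "rlig", "calt",  # language based forms
    "liga", "dlig", "cswh", "mset",                          # typographical forms
    "curs", "kern", "mark", "mkmk",                          # positioning features
]

def order_features(font_features):
    """Return the features a font provides, sorted into the standard order.

    Tags that are not part of the standard Arabic list are ignored here.
    """
    available = set(font_features)
    return [tag for tag in STANDARD_ORDER if tag in available]

# A hypothetical font that implements only some of the features.
print(order_features(["kern", "liga", "init", "fina", "medi", "mark"]))
```

Whatever order the font or the application lists the features in, the result follows the standard sequence, which is the property the text above asks for.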


Descriptions and examples of above features

Many of the registered features described and illustrated in this document are based on the OpenType font Arabic Typesetting. Arabic Typesetting contains layout information and glyphs to support all of the required features for the supported Arabic script and language systems. The Arabic Typesetting font will be available as part of the Visual OpenType Layout Tool (VOLT) and is provided under the terms of the VOLT supplemental files end user license agreement. The Arabic Typesetting font is available for download in the Appendix of this document.


Character composition (and decomposition)

Feature Tag: "ccmp"

The 'ccmp' feature is used to compose a number of glyphs into one glyph, or to decompose one glyph into a number of glyphs. This feature is applied before any other feature because a font vendor may want to control the shaping of certain glyphs before the other features run. An example of using this table is seen below. The 'ccmp' table maps default alphabetic forms both to composed forms (essentially ligatures, GSUB lookup type 4) and to decomposed forms (GSUB lookup type 2).

The rationale for the decomposition illustrated above is to take advantage of the color diacritic feature found in Microsoft applications like Word and Publisher.
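
The two directions of 'ccmp' can be sketched in a few lines of Python. This is only an illustration of many-to-one and one-to-many substitution on a list of glyph names; the glyph names and the COMPOSE and DECOMPOSE tables are hypothetical and are not taken from the Arabic Typesetting font.

```python
# Hypothetical glyph names; a real font defines its own.
COMPOSE = {("alef", "maddaabove"): "alef_madda"}         # many-to-one, like GSUB type 4
DECOMPOSE = {"alef_hamzaabove": ["alef", "hamzaabove"]}  # one-to-many, like GSUB type 2

def apply_ccmp(glyphs):
    """Compose known pairs and decompose known single glyphs, left to right."""
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in COMPOSE:
            out.append(COMPOSE[pair])          # two glyphs become one
            i += 2
        elif glyphs[i] in DECOMPOSE:
            out.extend(DECOMPOSE[glyphs[i]])   # one glyph becomes several
            i += 1
        else:
            out.append(glyphs[i])              # no rule: keep the glyph
            i += 1
    return out

print(apply_ccmp(["alef", "maddaabove", "beh"]))
print(apply_ccmp(["alef_hamzaabove", "beh"]))
```

After decomposition the mark is a separate glyph, which is what allows an application to treat it independently of its base, for example to color it.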


Isolated form

Feature Tag: "isol"

The 'isol' feature is used to map the Unicode character value to its isolated form (GSUB lookup type 1). This is usually the same glyph form. However, Unicode defines Arabic presentation forms as different from the Unicode character form. If a vendor has a good-quality font tool, or a font utility that can edit the 'cmap' table, more than one Unicode character can point to the same glyph ID.


Final form

Feature Tag: "fina"

The 'fina' feature is used to map the Unicode character value to its final form. (GSUB lookup type 1).


Medial form

Feature Tag: "medi"

The 'medi' feature is used to map the Unicode character value to its medial form. (GSUB lookup type 1).


Initial form

Feature Tag: "init"

The 'init' feature is used to map the Unicode character value to its initial form. (GSUB lookup type 1).
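
Which of the four positional features ('isol', 'fina', 'medi', 'init') applies to a letter depends on whether it joins to its neighbours. The Python sketch below shows a simplified version of that decision. It assumes a tiny, illustrative table of joining types and ignores marks and other transparent characters, so it should be read as an outline rather than as the shaping engine's real logic; the function name positional_features is hypothetical.

```python
# Tiny illustrative subset of joining types: D = dual-joining, R = right-joining
# (joins only to the preceding letter), U = non-joining (the default here).
JOINING_TYPE = {"beh": "D", "lam": "D", "alef": "R", "reh": "R"}

def positional_features(letters):
    """Return the feature tag (isol, fina, medi, init) chosen for each letter.

    Letters are given in logical (typing) order. Marks and other transparent
    characters are not handled in this sketch.
    """
    tags = []
    for i, letter in enumerate(letters):
        jt = JOINING_TYPE.get(letter, "U")
        prev_jt = JOINING_TYPE.get(letters[i - 1], "U") if i > 0 else "U"
        next_jt = JOINING_TYPE.get(letters[i + 1], "U") if i + 1 < len(letters) else "U"
        joins_prev = jt in ("D", "R") and prev_jt == "D"
        joins_next = jt == "D" and next_jt in ("D", "R")
        if joins_prev and joins_next:
            tags.append("medi")
        elif joins_prev:
            tags.append("fina")
        elif joins_next:
            tags.append("init")
        else:
            tags.append("isol")
    return tags

print(positional_features(["beh", "alef", "beh"]))  # ['init', 'fina', 'isol']
```

Once a tag has been chosen for each letter, the corresponding feature's single substitution (GSUB lookup type 1) supplies the glyph for that form.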


Required ligatures

Feature Tag: "rlig"

The 'rlig' feature is used to map glyph values to their correct ligated form. Font developers should use this table for all ligatures that must always be formed. Ligatures that should be optional, based on user preferences, should not be included in this table. Optional ligatures are defined in the 'liga' table.

The 'rlig' feature maps sequences of glyphs to corresponding ligatures (GSUB lookup type 4). Ligatures with more components must be stored ahead of those with fewer components in order to be found. See Ordering ligatures in the Encoding Feature Information section. The set of required ligatures will vary by design and script.
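
The ordering requirement follows from the way a ligature list is searched: the first entry that matches wins. The Python sketch below illustrates this with a simplified first-match model; the glyph names and the function apply_ligatures are hypothetical.

```python
def apply_ligatures(glyphs, ligatures):
    """Replace component sequences with ligature glyphs.

    ligatures is an ordered list of (components, ligature_glyph) entries;
    at each position the first matching entry is used.
    """
    out, i = [], 0
    while i < len(glyphs):
        for components, lig in ligatures:
            n = len(components)
            if tuple(glyphs[i:i + n]) == components:
                out.append(lig)
                i += n
                break
        else:
            out.append(glyphs[i])  # no ligature starts here
            i += 1
    return out

text = ["lam", "meem", "jeem"]

# Longer ligature first: the three-component ligature is found.
good = [(("lam", "meem", "jeem"), "lam_meem_jeem"), (("lam", "meem"), "lam_meem")]
# Shorter ligature first: it matches early and hides the longer one.
bad = list(reversed(good))

print(apply_ligatures(text, good))  # ['lam_meem_jeem']
print(apply_ligatures(text, bad))   # ['lam_meem', 'jeem']
```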

NOTE: If you want your fonts to have some level of backward compatibility with Windows 9x/ME system-level support, you will also want to include the items in the 'rlig' feature in the 'liga' feature. This is because older operating systems do not use Uniscribe for shaping and are not aware of the 'rlig' feature.


Connection forms

Feature Tag: "calt"

The 'calt' feature replaces default glyphs, in specified situations, with alternate forms that provide better joining behavior. It is used in script typefaces that are designed to have some or all of their glyphs join. The 'calt' table specifies the context in which each substitution occurs, and maps one or more default glyphs to replacement glyphs (GSUB lookup type 6).


Standard ligatures

Feature Tag: "liga"

The 'liga' feature is used to map glyphs to their optional ligated form. Font developers should use this table for all ligatures that they want the user to be able to control by user preference. Uniscribe has a flag that will allow this type of feature to be deactivated. The 'liga' feature maps sequences of glyphs to corresponding ligatures (GSUB lookup type 4). Ligatures with more components must be stored ahead of those with fewer components in order to be found. See Ordering ligatures in the Encoding Feature Information section. The set of optional ligatures will vary by typeface design and script.

Note: Ligatures that should be formed all of the time should not be included in this feature type. Required ligatures are defined in the 'rlig' table.


Discretionary ligatures

Feature Tag: "dlig"

The 'dlig' feature is also used to map glyphs to their optional ligated form. Font developers should use this table for all ligatures that they want the user to be able to control by user preference. Uniscribe has a flag that will allow this type of feature to be deactivated. The 'dlig' feature maps sequences of glyphs to corresponding ligatures (GSUB lookup type 4). Ligatures with more components must be stored ahead of those with fewer components in order to be found. See Ordering ligatures in the Encoding Feature Information section. The set of optional ligatures will vary by typeface design and script.


Contextual swash

Feature Tag: "cswh"

The 'cswh' feature replaces default character glyphs with corresponding swash glyphs based upon the context surrounding the character. Note that there may be more than one swash alternate for a given character. The 'cswh' table maps glyph IDs for default forms to those for one or more corresponding swash forms. While many of these substitutions are one-to-one (GSUB lookup type 1), others require a selection from a set (GSUB lookup type 3). Font developers may choose to build two tables (one for each lookup type) or only one that uses lookup type 3 for all substitutions. If several styles of swash are present across the font, the set of forms for each character should be ordered consistently.


Mark positioning via substitution

Feature Tag: "mset"

The 'mset' feature is used to position Arabic combining marks in fonts for Windows 95 using glyph substitution. In Arabic, the Hamza is positioned differently when placed above a Yeh Barree than when placed above the Alef.

Windows 95 implementation: In contrast to the 'mark' feature, the 'mset' feature uses glyph substitution to combine marks and base glyphs. It replaces a default mark glyph with a correctly positioned mark glyph. The font designer specifies the position of the mark when describing the mark's contour in the font file. Microsoft's Arabic fonts, created for Windows 95, use a contextual substitution lookup (GSUB LookupType = 5) to implement the 'mset' feature.

Example: the default fatha is positioned high and the 'mset' feature is used to substitute a low form when placed over a Beh.
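
The fatha example can be sketched as a context-dependent substitution. The Python below is a minimal illustration only: the glyph name fatha.low, the table LOW_MARK_AFTER and the function apply_mset are hypothetical, and a real font would express the rule as a contextual substitution lookup rather than as code.

```python
# (base glyph, default mark) -> pre-positioned alternate mark glyph.
# Hypothetical names; the alternate's position is built into its outline.
LOW_MARK_AFTER = {("beh", "fatha"): "fatha.low"}

def apply_mset(glyphs):
    """Swap a default mark for an alternate when it follows a listed base."""
    out = list(glyphs)
    for i in range(1, len(out)):
        key = (out[i - 1], out[i])
        if key in LOW_MARK_AFTER:
            out[i] = LOW_MARK_AFTER[key]
    return out

print(apply_mset(["beh", "fatha", "alef", "fatha"]))
```

Only the fatha that follows the Beh is replaced; the one that follows the Alef keeps its default, higher form.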


Cursive positioning

Feature Tag: "curs"

The 'curs' feature positions cursive characters so that the exit point of the current character matches the entry point of the following character. The 'curs' table maps the connecting points of joining glyphs and may be implemented as a Cursive Attachment (GPOS lookup type 3).
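
The arithmetic behind matching exit and entry points can be sketched as follows. This Python example is a simplified assumption-laden illustration: it computes only the vertical shift of each glyph, leaves horizontal placement to the advance widths, and uses hypothetical anchor values and a hypothetical function name, cursive_y_offsets.

```python
def cursive_y_offsets(glyphs):
    """Return a vertical shift per glyph so each entry point meets the
    previous glyph's exit point.

    glyphs is a list of dicts with 'entry' and 'exit' anchors given as
    (x, y) in glyph coordinates, or None when the glyph has no such anchor.
    """
    if not glyphs:
        return []
    offsets = [0]  # the first glyph stays on the baseline
    for prev, cur in zip(glyphs, glyphs[1:]):
        if prev["exit"] is None or cur["entry"] is None:
            offsets.append(0)  # no connection: back to the baseline
        else:
            # Shifted exit height of the previous glyph minus this entry height.
            offsets.append(offsets[-1] + prev["exit"][1] - cur["entry"][1])
    return offsets

# Hypothetical anchors for three joined glyphs.
run = [
    {"entry": None, "exit": (0, 100)},
    {"entry": (500, 0), "exit": (0, 50)},
    {"entry": (400, 0), "exit": None},
]
print(cursive_y_offsets(run))  # [0, 100, 150]
```

The shifts accumulate along the run, which is how a chain of connected glyphs can climb or descend relative to the baseline.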


Kerning

Feature Tag: "kern"

The 'kern' feature is used to adjust the amount of space between glyphs, generally to provide optically consistent spacing between glyphs. Although a well-designed typeface has consistent inter-glyph spacing overall, some glyph combinations require adjustment for improved legibility. Besides standard adjustment in either the horizontal or vertical direction, this feature can supply size-dependent kerning data via device tables, "cross-stream" kerning in the Y text direction, and adjustment of glyph placement independent of the advance adjustment. Note that this feature would not be used in monospaced fonts.

The font stores a set of adjustments for pairs of glyphs (GPOS lookup type 2 or 8). These may be stored as one or more tables matching left and right classes, and/or as individual pairs. If both forms are used, the classes should be listed last, so as to provide a means to replace any non-ideal values that may result from the class tables. Additional adjustments may be provided for larger sets of glyphs (e.g., triplets, quadruplets, etc.) to overwrite the results of pair kerns in particular combinations. These should precede the pairs.
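
The reason for listing individual pairs ahead of class tables can be shown with a small lookup routine. The Python sketch below is illustrative only; the glyph names, class names, kerning values and the function kern_value are all hypothetical.

```python
# Individual pairs, consulted first so they can override class values.
PAIR_KERN = {("reh", "alef"): -40}

# Class-based kerning, consulted only when no individual pair matches.
LEFT_CLASS = {"reh": "bowl_right", "zain": "bowl_right"}
RIGHT_CLASS = {"alef": "tall_stem", "lam": "tall_stem"}
CLASS_KERN = {("bowl_right", "tall_stem"): -25}

def kern_value(left, right):
    """Return the adjustment for a glyph pair, in font units."""
    if (left, right) in PAIR_KERN:
        return PAIR_KERN[(left, right)]
    key = (LEFT_CLASS.get(left), RIGHT_CLASS.get(right))
    return CLASS_KERN.get(key, 0)

print(kern_value("reh", "alef"))   # -40, the pair overrides the class value
print(kern_value("zain", "lam"))   # -25, taken from the class table
print(kern_value("beh", "beh"))    # 0, no adjustment defined
```

If the class table were consulted first, the pair reh and alef would receive the class value and the more specific correction would never be reached.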

Creating kern table using Microsoft VOLT


Mark to base positioning

Feature Tag: "mark"

The 'mark' feature positions mark glyphs in relation to a base glyph, or a ligature glyph. This feature may be implemented as a MarkToBase Attachment lookup (GPOS LookupType = 4) or a MarkToLigature Attachment lookup (GPOS LookupType = 5).
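
Mark attachment comes down to making two anchor points coincide: one on the base glyph and one on the mark glyph. The Python sketch below shows that calculation under simplifying assumptions (both anchors are measured from their own glyph's origin, and the result is the mark's offset from the base glyph's origin); the anchor coordinates and the function mark_offset are hypothetical.

```python
def mark_offset(base_anchor, mark_anchor):
    """Return the (x, y) shift that places the mark's anchor on the base's anchor.

    Both anchors are (x, y) points in their own glyph's coordinate space.
    """
    bx, by = base_anchor
    mx, my = mark_anchor
    return (bx - mx, by - my)

# Hypothetical anchors: attachment point above a base, and on the underside of a mark.
print(mark_offset((300, 700), (50, -20)))  # (250, 720)
```

The same subtraction applies when the attachment point belongs to one component of a ligature glyph rather than to a simple base glyph.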

Positioning mark to base using Microsoft VOLT

Positioning mark to base (ligature) using Microsoft VOLT


Mark to mark positioning

Feature Tag: "mkmk"

The 'mkmk' feature positions mark glyphs in relation to another mark glyph. This feature may be implemented as a MarkToMark Attachment lookup (GPOS LookupType = 6).

Positioning mark to mark using Microsoft VOLT



this page was last updated 25 February 2002
© 2001 Microsoft Corporation. All rights reserved. Terms of use.
