Microsoft Typography | Developer information | Specifications | OpenType font development
Syriac OpenType Specification | Terms | Shaping | Features | Other | Appendix


Other encoding issues


Handling invalid combining marks

Combining marks and signs that appear in text without a valid consonant base are considered invalid. Uniscribe displays these marks using the fallback rendering mechanism defined in section 5.12, 'Rendering Non-Spacing Marks', of the Unicode Standard 3.1, i.e. positioned on a dotted circle.

Please note that to render a sign standalone (in apparent isolation from any base), it should be applied to a space (see section 2.5, 'Combining Marks', of the Unicode Standard 3.1). Uniscribe requires a ZWJ to be placed between the space and the mark for them to combine into a standalone sign.
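As a small illustration, the standalone sequence described above can be assembled as follows. This is only a sketch: the helper name `standalone_sign` is hypothetical, and treating "mark" as a single character of general category `Mn` is an assumption made for the example.

```python
import unicodedata

SPACE = "\u0020"
ZWJ = "\u200D"  # ZERO WIDTH JOINER

def standalone_sign(mark: str) -> str:
    """Build space + ZWJ + mark, the sequence used to show a sign on its own."""
    # Assumption: a "mark" is a single non-spacing mark (general category Mn).
    if len(mark) != 1 or unicodedata.category(mark) != "Mn":
        raise ValueError("expected a single non-spacing combining mark")
    return SPACE + ZWJ + mark

# Example: SYRIAC PTHAHA ABOVE (U+0730) shown standalone.
sequence = standalone_sign("\u0730")
print(" ".join(f"U+{ord(ch):04X}" for ch in sequence))  # U+0020 U+200D U+0730
```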

For the fallback mechanism to work properly, a Syriac OTL font should contain a glyph for the dotted circle (U+25CC). If this glyph is missing from the font, invalid signs will be displayed on the missing glyph shape (white box).
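The fallback for a mark that has no valid base can be sketched roughly as below. This is not Uniscribe's implementation: it works on characters rather than glyphs, the function names are hypothetical, and the definition of a "valid base" (a Syriac letter of general category `Lo` in the range U+0710 to U+074F) is an assumption. The space + ZWJ + mark standalone sequence is deliberately not handled.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"

def is_syriac_base(ch: str) -> bool:
    # Assumption: valid bases are Syriac letters (category Lo) in the Syriac block.
    return 0x0710 <= ord(ch) <= 0x074F and unicodedata.category(ch) == "Lo"

def is_mark(ch: str) -> bool:
    # Assumption: "mark" means a non-spacing mark (category Mn).
    return unicodedata.category(ch) == "Mn"

def insert_fallback_bases(text: str) -> str:
    """Return a display copy of text with U+25CC before each baseless mark."""
    out = []
    has_base = False
    for ch in text:
        if is_mark(ch):
            if not has_base:
                out.append(DOTTED_CIRCLE)
                # Later marks in the same run attach to this dotted circle.
                has_base = True
            out.append(ch)
        else:
            has_base = is_syriac_base(ch)
            out.append(ch)
    return "".join(out)
```

The function returns a new string and leaves its input alone, which mirrors the idea that the dotted circle belongs to the displayed output rather than to the stored text.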

In addition to the 'dotted circle', other Unicode code points that are recommended for inclusion in any Syriac font are: ZWJ (zero width joiner; U+200D), ZWNJ (zero width non-joiner; U+200C), LTR (left-to-right mark; U+200E), and RTL (right-to-left mark; U+200F). The ZWNJ can be used between two letters to prevent them from forming a cursive connection.
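A font's coverage of these recommended code points could be checked along the following lines. This is a sketch under stated assumptions: `missing_recommended` is a hypothetical helper, and the set of code points the font maps (for example, read from its cmap table with a font tool) is assumed to be obtained elsewhere. The names printed come from Python's `unicodedata` module.

```python
import unicodedata

# Code points recommended for any Syriac font.
RECOMMENDED = (0x25CC, 0x200C, 0x200D, 0x200E, 0x200F)

def missing_recommended(font_codepoints: set) -> list:
    """Return the recommended code points absent from font_codepoints, sorted."""
    return sorted(cp for cp in RECOMMENDED if cp not in font_codepoints)

# Hypothetical font that maps only the dotted circle and ZWJ.
for cp in missing_recommended({0x25CC, 0x200D}):
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")

# ZWNJ between two letters (here two SYRIAC LETTER BETH) to suppress joining.
unjoined = "\u0712" + "\u200C" + "\u0712"
```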

If an invalid combination is found, such as two 'pthahas' on the same base character, the diacritic that causes the invalid state is placed on a dotted circle to signal the invalid combination to the user. The shaping engine for non-OpenType fonts causes invalid mark combinations to overstrike; inserting the dotted circle as a base for the invalid mark solves this problem. Note that the dotted circle is not inserted into the application's backing store. It is a run-time insertion into the glyph array that is returned from the ScriptShape function.

The invalid diacritic logic for Syriac is based on the classes listed below. A check ensures that no more than one mark from any one class is placed on the same base.

Class    Description                 Code points
DIAC1    Syriac above Greek          U+0730, U+0733, U+0736, U+073A, U+073D
DIAC2    Syriac below Greek          U+0731, U+0734, U+0737, U+073B, U+073E
DIAC3    Syriac other                U+0740, U+0749, U+074A
DIAC4    Syriac dotted class above   U+0732, U+0735, U+073F
DIAC5    Syriac dotted class below   U+0738, U+0739, U+073C
DIAC6    Syriac qushshaya            U+0741, U+030A
DIAC7    Syriac rukkakha             U+0742, U+0325
DIAC8    Syriac line type above      U+0747, U+0303, U+0304
DIAC9    Syriac line type below      U+0748, U+032D, U+032E, U+0330, U+0331
DIAC10   Syriac seyame above         U+0308
DIAC11   Syriac seyame below         U+0324
DIAC12   Syriac dot above            U+0307
DIAC13   Syriac dot below            U+0323
DIAC14   Syriac two dots above       U+0743
DIAC15   Syriac two dots below       U+0744
DIAC16   Syriac three dots above     U+0745
DIAC17   Syriac three dots below     U+0746
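The class check can be illustrated with the rough sketch below. It is not the shaping engine's actual code: it operates on code points rather than glyphs, includes only four of the classes from the table for brevity, and treats every code point outside those classes as a new base, which is a simplification. The name `mark_invalid_repeats` is hypothetical.

```python
DOTTED_CIRCLE = 0x25CC

# Subset of the classes in the table (DIAC1, DIAC2, DIAC6, DIAC7).
DIAC_CLASSES = {
    "DIAC1": {0x0730, 0x0733, 0x0736, 0x073A, 0x073D},
    "DIAC2": {0x0731, 0x0734, 0x0737, 0x073B, 0x073E},
    "DIAC6": {0x0741, 0x030A},
    "DIAC7": {0x0742, 0x0325},
}
CLASS_OF = {cp: name for name, cps in DIAC_CLASSES.items() for cp in cps}

def mark_invalid_repeats(codepoints: list) -> list:
    """Return a display copy with U+25CC before any repeated-class mark."""
    out = []
    seen = set()  # classes already used on the current base
    for cp in codepoints:
        cls = CLASS_OF.get(cp)
        if cls is None:
            # Simplification: anything unclassified starts a new base.
            seen = set()
        elif cls in seen:
            # Second mark of the same class: give it a dotted circle base.
            out.append(DOTTED_CIRCLE)
            seen = {cls}
        else:
            seen.add(cls)
        out.append(cp)
    return out

# Two pthahas above (U+0730) on one beth (U+0712).
result = mark_invalid_repeats([0x0712, 0x0730, 0x0730])
print(" ".join(f"U+{cp:04X}" for cp in result))  # U+0712 U+0730 U+25CC U+0730
```

Because the function builds a new list, the input sequence (standing in for the application's backing store) is left unchanged, and only the returned display sequence carries the dotted circle.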



this page was last updated 25 February 2002
© 2001 Microsoft Corporation. All rights reserved. Terms of use.