Microsoft Typography | Developer information | Specifications | OpenType font development
Hebrew OpenType Specification | Terms | Shaping | Features | Other | Appendix

Other encoding issues

Handling invalid combining marks

Combining marks and signs that appear in text without a valid consonant base are considered invalid. Uniscribe displays these marks using the fallback rendering mechanism defined in section 5.12, 'Rendering Non-Spacing Marks', of the Unicode Standard 3.1, i.e. positioned on a dotted circle.
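
The fallback described above can be illustrated with a short Python sketch. It is not Uniscribe's actual algorithm. It assumes that a valid base is any Hebrew letter from U+05D0 to U+05EA, that a mark is any nonspacing character (Unicode category Mn) in the Hebrew block, and that a run of consecutive marks without a base shares a single dotted circle. The function name `show_orphan_marks` is hypothetical.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"

def is_hebrew_base(ch):
    # Assumption: the consonants alef..tav are the valid bases.
    return "\u05D0" <= ch <= "\u05EA"

def is_hebrew_mark(ch):
    # Nonspacing marks (category Mn) inside the Hebrew block.
    return "\u0591" <= ch <= "\u05C7" and unicodedata.category(ch) == "Mn"

def show_orphan_marks(text):
    """Return display text with a dotted circle in front of marks that lack a base."""
    out = []
    has_base = False
    for ch in text:
        if is_hebrew_mark(ch):
            if not has_base:
                out.append(DOTTED_CIRCLE)
                has_base = True  # assumption: later marks in the run reuse the circle
        else:
            has_base = is_hebrew_base(ch)
        out.append(ch)
    return "".join(out)
```

The input string is never modified; the dotted circle exists only in the returned display string.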

For the fallback mechanism to work properly, a Hebrew OTL font should contain a glyph for the dotted circle (U+25CC). If this glyph is missing from the font, invalid signs are displayed on the missing glyph shape (white box).

In addition to the dotted circle, other Unicode code points recommended for inclusion in any Hebrew font are LRM (left-to-right mark; U+200E) and RLM (right-to-left mark; U+200F).
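
A font's coverage of these recommended code points could be checked along the following lines. This is a sketch only: the helper `missing_recommended` is hypothetical and takes the font's supported code points as plain integers, however they were obtained.

```python
# Code points this section recommends for every Hebrew font.
RECOMMENDED = {
    0x25CC: "dotted circle",
    0x200E: "left-to-right mark",
    0x200F: "right-to-left mark",
}

def missing_recommended(cmap_code_points):
    """Return the recommended code points that are absent from the font's character map."""
    present = set(cmap_code_points)
    return {cp: name for cp, name in RECOMMENDED.items() if cp not in present}

# Example: a font that maps alef and the dotted circle but neither directional mark.
for cp, name in missing_recommended([0x05D0, 0x25CC]).items():
    print(f"missing U+{cp:04X} ({name})")
```

With the fontTools library, the keys of `TTFont(path).getBestCmap()` would be one way to supply the code points.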

If an invalid combination is found, such as two nikud marks on the same base character, the diacritic that causes the invalid state is placed on a dotted circle to show the user that the combination is invalid. The shaping engine for non-OpenType fonts causes invalid mark combinations to overstrike; inserting the dotted circle as the base for the invalid mark solves this problem. Note that the dotted circle is not inserted into the application's backing store. It is a run-time insertion into the glyph array returned by the ScriptShape function.

The invalid diacritic logic for Hebrew is based on the classes listed below. The shaping engine checks that no more than one mark of a given class is placed on the same base.

Class    Description                               Code points
DIAC     Hebrew diacritic                          U+05B0 - U+05B6, U+05BB
CANT1    Hebrew cantillation - above left          U+0599, U+05A1, U+05A9, U+05AE
CANT2    Hebrew cantillation - above center left   U+0597, U+05A8, U+05AC
CANT3    Hebrew cantillation - above center        U+0592 - U+0595, U+05A7, U+05AB
CANT4    Hebrew cantillation - above center right  U+0598, U+059C, U+059E, U+059F
CANT5    Hebrew cantillation - above right         U+059D, U+05A0
CANT6    Hebrew cantillation - below left          U+059B, U+05A5
CANT7    Hebrew cantillation - below center left   U+0591, U+05A3, U+05A6
CANT8    Hebrew cantillation - below center right  U+0596, U+05A4, U+05AA
CANT9    Hebrew cantillation - below right         U+059A, U+05AD
CANT10   Hebrew cantillation - Masora circle       U+05AF
DAGESH   Hebrew dagesh/mapiq                       U+05BC
DOTABV   Hebrew upper dot                          U+05C4
HOLAM    Hebrew holam                              U+05B9
METEG    Hebrew meteg/sof pasuq                    U+05BD
PATAH    Hebrew patah                              U+05B7
QAMATS   Hebrew qamats                             U+05B8
RAFE     Hebrew rafe                               U+05BF
SHINSIN  Hebrew shin/sin dot                       U+05C1, U+05C2
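
The class check can be sketched in Python as follows. The class table is transcribed from the list above; everything else is an assumption made for illustration, not the shaping engine's real code. In particular, the function name `insert_dotted_circles` is hypothetical, any code point outside the table is treated as starting a new base, and the inserted dotted circle is treated as the base for the mark that follows it. Marks with no base at all are outside the scope of this sketch.

```python
DOTTED_CIRCLE = 0x25CC

# Mark classes transcribed from the table above.
CLASSES = {
    "DIAC": [*range(0x05B0, 0x05B7), 0x05BB],
    "CANT1": [0x0599, 0x05A1, 0x05A9, 0x05AE],
    "CANT2": [0x0597, 0x05A8, 0x05AC],
    "CANT3": [*range(0x0592, 0x0596), 0x05A7, 0x05AB],
    "CANT4": [0x0598, 0x059C, 0x059E, 0x059F],
    "CANT5": [0x059D, 0x05A0],
    "CANT6": [0x059B, 0x05A5],
    "CANT7": [0x0591, 0x05A3, 0x05A6],
    "CANT8": [0x0596, 0x05A4, 0x05AA],
    "CANT9": [0x059A, 0x05AD],
    "CANT10": [0x05AF],
    "DAGESH": [0x05BC],
    "DOTABV": [0x05C4],
    "HOLAM": [0x05B9],
    "METEG": [0x05BD],
    "PATAH": [0x05B7],
    "QAMATS": [0x05B8],
    "RAFE": [0x05BF],
    "SHINSIN": [0x05C1, 0x05C2],
}
MARK_CLASS = {cp: name for name, cps in CLASSES.items() for cp in cps}

def insert_dotted_circles(code_points):
    """Return a new list; the input (the backing store) is left unchanged."""
    out = []
    seen = set()  # classes already placed on the current base
    for cp in code_points:
        cls = MARK_CLASS.get(cp)
        if cls is None:
            seen = set()  # assumption: any non-mark starts a new base
        elif cls in seen:
            out.append(DOTTED_CIRCLE)  # second mark of the same class is invalid
            seen = {cls}  # assumption: the circle is the base for this mark
        else:
            seen.add(cls)
        out.append(cp)
    return out
```

For example, alef followed by sheva (U+05B0) and hiriq (U+05B4), both in class DIAC, yields alef, sheva, dotted circle, hiriq, while bet with dagesh and patah passes through unchanged because the two marks belong to different classes.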

this page was last updated 13 August 2002
© 2001 Microsoft Corporation. All rights reserved. Terms of use.

