Baseline tags

This section defines the standard OpenType Layout baseline tags that Microsoft supports. A registered baseline tag has a specific meaning when used in the horizontal writing direction (in the 'BASE' table's HorizAxis table), the vertical writing direction (in the 'BASE' table's VertAxis table), or both, and conveys information to font users about a baseline's use. For example, the “romn” baseline tag is commonly used to identify the baseline for laying out Latin text in the horizontal direction, the vertical direction, or both. For compatibility and ease of use, Microsoft encourages font developers to use registered baseline tags.

This version of the Tag Registry identifies the baselines that Microsoft has implemented to date. All baseline tags are 4-byte character strings composed of a limited set of ASCII characters in the 0x20-0x7E range. Baseline tags consist of four lowercase letters.
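
The tag format described above can be checked mechanically. The following is a minimal sketch, not part of the registry; the function names and the `REGISTERED_BASELINE_TAGS` set are illustrative assumptions, with the set contents taken from the table in this section.

```python
import re

# Baseline tags listed in this registry (see the table in this section).
REGISTERED_BASELINE_TAGS = {"hang", "icfb", "icft", "ideo", "idtp", "math", "romn"}

# Exactly four lowercase ASCII letters.
_TAG_PATTERN = re.compile(r"[a-z]{4}")


def is_well_formed_baseline_tag(tag: str) -> bool:
    """Return True if the tag is exactly four lowercase ASCII letters."""
    return _TAG_PATTERN.fullmatch(tag) is not None


def is_registered_baseline_tag(tag: str) -> bool:
    """Return True if the tag is well formed and appears in this registry."""
    return is_well_formed_baseline_tag(tag) and tag in REGISTERED_BASELINE_TAGS
```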

Baseline Tag | Baseline for HorizAxis | Baseline for VertAxis
“hang” | The hanging baseline. This is the horizontal line from which syllables seem to hang in Tibetan script. | The hanging baseline (which now appears vertical) for Tibetan characters rotated 90 degrees clockwise, for vertical writing mode.
“icfb” | Ideographic character face bottom edge baseline. (See section Ideographic Character Face below for usage.) | Ideographic character face left edge baseline. (See section Ideographic Character Face below for usage.)
“icft” | Ideographic character face top edge baseline. (See section Ideographic Character Face below for usage.) | Ideographic character face right edge baseline. (See section Ideographic Character Face below for usage.)
“ideo” | Ideographic em-box bottom edge baseline. (See section Ideographic Em-Box below for usage.) | Ideographic em-box left edge baseline. If this tag is present in the VertAxis, the value must be set to 0. (See section Ideographic Em-Box below for usage.)
“idtp” | Ideographic em-box top edge baseline. (See section Ideographic Em-Box below for usage.) | Ideographic em-box right edge baseline. If this tag is present in the VertAxis, the value is strongly recommended to be set to head.unitsPerEm. (See section Ideographic Em-Box below for usage.)
“math” | The baseline about which mathematical characters are centered. | The baseline about which mathematical characters, when rotated 90 degrees clockwise for vertical writing mode, are centered.
“romn” | The baseline used by simple alphabetic scripts such as Latin, Cyrillic and Greek. | The alphabetic baseline for characters rotated 90 degrees clockwise for vertical writing mode. (This would not apply to alphabetic characters that remain upright in vertical writing mode, since these characters are not rotated.)

Ideographic Em-Box

[ The notation <Axis>.<Baseline Tag> is used in the following description to mean the baseline tag as defined in the specified axis. For example, HorizAxis.ideo means the ideo baseline tag as defined in the HorizAxis of the BASE table. See above for a list of registered baseline tags. ]

A font's ideographic em-box is the rectangle that defines a standard escapement around the full-width ideographic glyphs of the font, for both the horizontal and vertical writing directions. It is usually a square, but may be non-square as in the case of fonts used in Japanese newspaper layout that have a vertically condensed design.

The left, right, top and bottom edges of the ideographic em-box are to be determined as follows:

ideoEmboxLeft = 0

If HorizAxis.ideo defined:
    ideoEmboxBottom = HorizAxis.ideo

    If HorizAxis.idtp defined:
        ideoEmboxTop = HorizAxis.idtp
    Else:
        ideoEmboxTop = HorizAxis.ideo + head.unitsPerEm

    If VertAxis.idtp defined:
        ideoEmboxRight = VertAxis.idtp
    Else:
        ideoEmboxRight = head.unitsPerEm

    If VertAxis.ideo defined and non-zero:
        Warning: Bad VertAxis.ideo value

Else If this is a CJK font:
    ideoEmboxBottom = OS/2.sTypoDescender
    ideoEmboxTop = OS/2.sTypoAscender
    ideoEmboxRight = head.unitsPerEm
Else:
    ideoEmbox cannot be determined for this font

Determining whether a font is CJK (Chinese, Japanese, or Korean) or not, as in the second-last “Else” clause above, can be done by checking the CJK-related bits of the OS/2.ulUnicodeRange fields.
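
The em-box determination above can be sketched in Python as follows. This is an illustrative sketch only: the `EmBox` type, the function name, and the representation of each axis as a dictionary mapping baseline tags to coordinates are assumptions made for the example, and the `is_cjk` flag is assumed to have been derived by the caller from the OS/2.ulUnicodeRange fields.

```python
import warnings
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EmBox:
    """Edges of the ideographic em-box, in font design units (hypothetical type)."""
    left: int
    bottom: int
    right: int
    top: int


def ideographic_em_box(
    horiz_axis: Mapping[str, int],   # baseline tag -> coordinate in HorizAxis
    vert_axis: Mapping[str, int],    # baseline tag -> coordinate in VertAxis
    units_per_em: int,               # head.unitsPerEm
    typo_ascender: int,              # OS/2.sTypoAscender
    typo_descender: int,             # OS/2.sTypoDescender
    is_cjk: bool,                    # assumed to be determined by the caller
) -> Optional[EmBox]:
    """Return the ideographic em-box, or None if it cannot be determined."""
    left = 0
    if "ideo" in horiz_axis:
        bottom = horiz_axis["ideo"]
        # The top edge defaults to one em above the bottom edge.
        top = horiz_axis.get("idtp", bottom + units_per_em)
        # The right edge defaults to one em from the left edge (which is 0).
        right = vert_axis.get("idtp", units_per_em)
        if vert_axis.get("ideo", 0) != 0:
            warnings.warn("Bad VertAxis.ideo value")
        return EmBox(left, bottom, right, top)
    if is_cjk:
        # Fall back to the OS/2 typographic metrics for CJK fonts.
        return EmBox(left, typo_descender, units_per_em, typo_ascender)
    return None
```

Returning `None` for the final case is a design choice of this sketch; an application could equally raise an error or apply its own fallback.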

Note that font designers can specify a HorizAxis.ideo baseline in their non-CJK fonts; this can be used by applications when aligning the font with an ideographic font used on the same line of text, when the user has specified ideographic em-box alignment.

The ideographic em-box center baseline is defined as halfway between the ideographic em-box top and bottom baselines in the horizontal axis, and halfway between the ideographic em-box left and right baselines in the vertical axis. These center baselines are defined in whole character units. The division used in the calculation must round to the character unit nearest 0 if needed. Thus, for maximal precision of center baseline placement, vendors should ensure that opposite edges of the ideographic em-box are an even number of character units apart.
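
The rounding rule matters in languages whose integer division does not truncate toward zero. The sketch below (the function name is an assumption for illustration) computes a center baseline in Python, where the `//` operator rounds toward negative infinity and therefore needs a separate branch for negative sums.

```python
def center_baseline(low_edge: int, high_edge: int) -> int:
    """Midpoint of two baseline coordinates, rounded toward zero."""
    total = low_edge + high_edge
    if total >= 0:
        return total // 2
    # Python's // floors, so negate, divide, and negate again
    # to round a negative midpoint toward zero.
    return -((-total) // 2)
```

For example, edges at -120 and 880 are 1000 units apart and give an exact center of 380, while edges at -121 and 880 are an odd number of units apart, so the center of 379.5 is rounded to 379.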

Example:

The values of the ideographic baseline tags for the Kozuka Mincho font family (designed on a 1000-unit em) are:

HorizAxis.ideo = -120; HorizAxis.idtp = 880.
Since this describes a square ideographic em-box, it is sufficient to record only the following:
HorizAxis.ideo = -120.
If HorizAxis.ideo is not present, then the following will be used for the ideographic em-box bottom and top, since this is a CJK font:
OS/2.sTypoDescender = -120; OS/2.sTypoAscender = 880.

Compatibility notes:

  1. Most applications expect the width of full-width ideographs in a CJK font to be exactly one em, thus it is strongly recommended that VertAxis.idtp, if present, be set to head.unitsPerEm. (The idtp baseline tag was introduced in OpenType 1.3.)
  2. While the OpenType specification allows for CJK fonts' OS/2.sTypoDescender and OS/2.sTypoAscender fields to specify metrics different from the HorizAxis.ideo and HorizAxis.idtp in the 'BASE' table, CJK font developers should be aware that existing applications may not read the 'BASE' table at all but simply use the OS/2.sTypoDescender and OS/2.sTypoAscender fields to describe the bottom and top edges of the ideographic em-box. If developers want their fonts to work correctly with such applications, they should ensure that any ideographic em-box values in the 'BASE' table of their CJK fonts describe the same bottom and top edges as the OS/2.sTypoDescender and OS/2.sTypoAscender fields.
  3. Applications on platforms other than Windows that don't parse the 'OS/2' table won't have access to the OS/2.sTypoDescender and OS/2.sTypoAscender fields, since these metrics are currently exposed only through Windows APIs. Thus, CJK fonts will typically have the same descender value recorded in hhea.Descender, OS/2.sTypoDescender, and HorizAxis.ideo (if present), and the same ascender value recorded in hhea.Ascender, OS/2.sTypoAscender, and HorizAxis.idtp (if present).
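
A font production check based on notes 2 and 3 might look like the following sketch. The function name and parameters are hypothetical; the values are assumed to have been read from the font's 'hhea', 'OS/2', and 'BASE' tables by the caller, with `None` standing for a baseline tag that is not present.

```python
from typing import List, Optional


def em_box_metric_mismatches(
    hhea_ascender: int,
    hhea_descender: int,
    typo_ascender: int,                 # OS/2.sTypoAscender
    typo_descender: int,                # OS/2.sTypoDescender
    horiz_ideo: Optional[int] = None,   # HorizAxis.ideo, if present
    horiz_idtp: Optional[int] = None,   # HorizAxis.idtp, if present
) -> List[str]:
    """List the places where a CJK font's em-box metrics disagree."""
    problems = []
    if hhea_descender != typo_descender:
        problems.append("hhea.Descender differs from OS/2.sTypoDescender")
    if horiz_ideo is not None and horiz_ideo != typo_descender:
        problems.append("HorizAxis.ideo differs from OS/2.sTypoDescender")
    if hhea_ascender != typo_ascender:
        problems.append("hhea.Ascender differs from OS/2.sTypoAscender")
    if horiz_idtp is not None and horiz_idtp != typo_ascender:
        problems.append("HorizAxis.idtp differs from OS/2.sTypoAscender")
    return problems
```

An empty list means the three sources describe the same bottom and top edges, which is the configuration the notes above recommend for compatibility.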

See the section “OpenType CJK Font Guidelines” for more information about constructing CJK fonts.

Ideographic Character Face

[ The notation <Axis>.<Baseline Tag> is used in the following description to mean the baseline tag as defined in the specified axis. For example, HorizAxis.icfb means the icfb baseline tag as defined in the HorizAxis of the BASE table. See above for a list of registered baseline tags. ]

The ideographic character face (ICF), also known as the average character face (ACF), specifies the approximate bounding box of the full-width ideographic and kana glyphs in a CJK font. (This is different from the FontBBox, as described in the PostScript programming language, which is the bounding box of all glyphs in the font.) In Japanese, the term for ICF is heikin jizura.

It is typically expressed as a percentage that represents the ratio of the length of an ICF box edge to the length of an ideographic em-box edge, and is conceptualized as a square centered within the ideographic em-box. However, in OpenType, the ICF box's left, bottom, right, and top edges are specified as the VertAxis.icfb, HorizAxis.icfb, VertAxis.icft, and HorizAxis.icft baselines, respectively, thus giving font designers the flexibility to specify a non-square and/or non-centered ICF box.
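
To relate the two representations, the conventional percentage can be recovered from the baseline values along one axis. This is a minimal sketch with a hypothetical function name; the sample figures in the note after the code are the Kozuka Mincho Extra Light values from the example later in this section.

```python
def icf_percentage(icf_low: int, icf_high: int,
                   embox_low: int, embox_high: int) -> float:
    """ICF edge length as a percentage of the em-box edge length, along one axis."""
    embox_length = embox_high - embox_low
    if embox_length <= 0:
        raise ValueError("em-box edges must be a positive distance apart")
    return 100.0 * (icf_high - icf_low) / embox_length
```

With ICF left and right edges at 41 and 959 in an em-box running from 0 to 1000, the ICF edge is 918 units long, so the function returns 91.8 percent.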

Font designers should set the value of the ICF box edges based on how tight or loose they want the font to appear when text is set with no tracking or kerning (beta gumi in Japanese). Therefore, the left-over boundary of the ideographic em-box around the ICF box is the default escapement of the font.

Applications can use the ICF box as an alignment tool, to ensure that glyphs touch the edges of the text frame and page objects are visually aligned to text edges. It is also useful for aligning glyphs of different sizes on the same line. In Japanese traditional paper-based workflow, the ICF box was often used for these purposes. It provides optically aligned results that are superior to using the ideographic em-box.

HorizAxis.icfb is the minimum piece of information required to define the ICF in a CJK font. First, the ideographic em-box dimensions must be calculated as in the section “Ideographic Em-Box” above. The ICF edges are then calculated in the following order:

If HorizAxis.icfb defined:
    icfBottom = HorizAxis.icfb

    margin = HorizAxis.icfb - ideoEmboxBottom

    If HorizAxis.icft defined:
        icfTop = HorizAxis.icft
    Else:
        icfTop = ideoEmboxTop - margin

    If VertAxis.icfb defined:
        icfLeft = VertAxis.icfb
    Else:
        icfLeft = margin

    If VertAxis.icft defined:
        icfRight = VertAxis.icft
    Else:
        icfRight = ideoEmboxRight - icfLeft
Else:
    ICF cannot be determined for this font

For the last case above, i.e. fonts that don't have ICF information in their 'BASE' table, an application may choose to apply a heuristic such as calculating the bounding box of some or all of the ideographic and kana glyphs, and then averaging its margin with the ideographic em-box.
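
The ICF derivation above can be sketched in Python as follows. The `Box` type, the function name, and the dictionary representation of each axis are assumptions made for this example; the function returns `None` for the last case and leaves any heuristic fallback to the caller.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Box:
    """Left, bottom, right, and top edges in font design units (hypothetical type)."""
    left: int
    bottom: int
    right: int
    top: int


def icf_box(
    horiz_axis: Mapping[str, int],   # baseline tag -> coordinate in HorizAxis
    vert_axis: Mapping[str, int],    # baseline tag -> coordinate in VertAxis
    em_box: Box,                     # previously computed ideographic em-box
) -> Optional[Box]:
    """Return the ICF box, or None if the font has no HorizAxis.icfb."""
    if "icfb" not in horiz_axis:
        return None
    bottom = horiz_axis["icfb"]
    # Distance from the em-box bottom edge to the ICF bottom edge.
    margin = bottom - em_box.bottom
    top = horiz_axis.get("icft", em_box.top - margin)
    left = vert_axis.get("icfb", margin)
    right = vert_axis.get("icft", em_box.right - left)
    return Box(left, bottom, right, top)
```

Note that the default right edge is derived from the resolved left edge rather than from the margin, so an explicit VertAxis.icfb is mirrored on the right when VertAxis.icft is absent.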

The ICF center baseline is defined as halfway between the ICF top and bottom baselines in the horizontal axis, and halfway between the ICF left and right baselines in the vertical axis. These center baselines are defined in whole character units. The division used in the calculation must round to the character unit nearest 0 if needed. Thus, for maximal precision of center baseline placement, vendors should ensure that opposite edges of the ICF box are an even number of character units apart.

Example:

The values of the ICF baselines for the Extra Light and Heavy weights of the Kozuka Mincho font family (designed on a 1000-unit em, with ideographic em-box as given in the example in the previous section) are:

Kozuka Mincho Extra Light:
VertAxis.icfb = 41; HorizAxis.icfb = -79;
VertAxis.icft = 959; HorizAxis.icft = 839.
Since this describes a square ICF centered in a square ideographic em-box, it is sufficient to record only the following:
HorizAxis.icfb = -79.

Kozuka Mincho Heavy:
VertAxis.icfb = 26; HorizAxis.icfb = -94;
VertAxis.icft = 974; HorizAxis.icft = 854.
It is sufficient to record only:
HorizAxis.icfb = -94.

It is strongly recommended that each of the edges of the ICF box be equidistant from the corresponding edge of the ideographic em-box. Following this will result in more predictable results in applications that use these values. That is, for fonts based on a square ideographic em-box, the ICF box should be a centered square.
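
This recommendation can be verified by comparing the four margins between the ICF box and the ideographic em-box. The sketch below uses plain `(left, bottom, right, top)` tuples and hypothetical function names.

```python
from typing import Tuple

# (left, bottom, right, top) in font design units.
Edges = Tuple[int, int, int, int]


def icf_margins(icf: Edges, em_box: Edges) -> Edges:
    """Distance from each ICF edge to the corresponding em-box edge."""
    icf_left, icf_bottom, icf_right, icf_top = icf
    em_left, em_bottom, em_right, em_top = em_box
    return (
        icf_left - em_left,
        icf_bottom - em_bottom,
        em_right - icf_right,
        em_top - icf_top,
    )


def icf_is_equidistant(icf: Edges, em_box: Edges) -> bool:
    """True if all four ICF edges are the same distance inside the em-box."""
    return len(set(icf_margins(icf, em_box))) == 1
```

For the Kozuka Mincho Extra Light values given above, all four margins are 41 units, so the check passes.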

See the section “OpenType CJK Font Guidelines” for more information about constructing CJK fonts.


This page was last updated 14 October 2002.

© 2002 Microsoft Corporation. All rights reserved.
