Arabic Applications Development Tips (Using Microsoft VC++ Ver. 6)

Introduction

This document provides key information for Arabic application development: language settings, code pages, system locales, Arabic-specific date functions, and the most commonly used Arabic APIs. It also gives examples of how to write the code associated with each of these APIs. Finally, it highlights some National Language Support (NLS) issues and shows how to localize your applications for the settings of your specific Arabic country.

If you have further inquiries about any of the subjects covered in this document, please contact us at

vsarabic@microsoft.com

MAKELANGID Macro

The MAKELANGID macro creates a language identifier from a primary language identifier and a secondary language identifier.

WORD MAKELANGID(
  USHORT usPrimaryLanguage,  // primary language identifier
  USHORT usSubLanguage       // sublanguage identifier
);
Parameters

usPrimaryLanguage
Specifies the primary language identifier. This parameter can be one of the following predefined values:

LANG_ARABIC

LANG_ENGLISH

For a user-defined language, usPrimaryLanguage can be a value in the range 0x0200 to 0x03FF. All other values are reserved for system use.

usSubLanguage
Specifies the secondary language identifier. This parameter can be one of the following values:

SUBLANG_ARABIC_SAUDI_ARABIA

SUBLANG_ARABIC_IRAQ

SUBLANG_ARABIC_EGYPT

SUBLANG_ARABIC_LIBYA

SUBLANG_ARABIC_ALGERIA

SUBLANG_ARABIC_MOROCCO

SUBLANG_ARABIC_TUNISIA

SUBLANG_ARABIC_OMAN

SUBLANG_ARABIC_YEMEN

SUBLANG_ARABIC_SYRIA

SUBLANG_ARABIC_JORDAN

SUBLANG_ARABIC_LEBANON

SUBLANG_ARABIC_KUWAIT

SUBLANG_ARABIC_UAE

SUBLANG_ARABIC_BAHRAIN

SUBLANG_ARABIC_QATAR

For a user-defined secondary language, usSubLanguage can be a value in the range 0x20 to 0x3F. All other values are reserved for system use.

Return Values
The return value is a language identifier.
 

Remarks
The following three combinations of usPrimaryLanguage and usSubLanguage have special meaning:

Primary language ID    Secondary language ID    Meaning
LANG_NEUTRAL           SUBLANG_NEUTRAL          Language neutral
LANG_NEUTRAL           SUBLANG_DEFAULT          User default language
LANG_NEUTRAL           SUBLANG_SYS_DEFAULT      System default language

The MAKELANGID macro is defined as follows:
#define MAKELANGID(p, s) ((((WORD) (s)) << 10) | (WORD) (p)) 
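
As a rough illustration of the bit layout, the sketch below builds Arabic language identifiers with the macro. It re-declares WORD, the macro, and a few constants locally so that it compiles without <windows.h>; in a real VC++ project these come from the Windows headers. The helper name arabic_langid is hypothetical.

```c
/* Local stand-ins so the sketch builds without <windows.h>;
   in a Windows project these come from the SDK headers. */
typedef unsigned short WORD;
#define MAKELANGID(p, s) ((((WORD)(s)) << 10) | (WORD)(p))
#define LANG_ARABIC                 0x01
#define SUBLANG_ARABIC_SAUDI_ARABIA 0x01
#define SUBLANG_ARABIC_EGYPT        0x03
#define SUBLANG_ARABIC_QATAR        0x10

/* Hypothetical helper: build an Arabic language identifier.
   The sublanguage occupies the upper 6 bits, the primary
   language the lower 10 bits. */
WORD arabic_langid(WORD sublang)
{
    return (WORD)MAKELANGID(LANG_ARABIC, sublang);
}
```

With these definitions, arabic_langid(SUBLANG_ARABIC_EGYPT) evaluates to 0x0c01 and arabic_langid(SUBLANG_ARABIC_QATAR) to 0x4001.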

Language Identifiers
The following are language identifiers. They are composed of a primary language identifier and a secondary language identifier.

The following identifiers were composed using the MAKELANGID macro.

Identifier    Language
0x0000        Language Neutral
0x0400        Process Default Language
0x0401        Arabic (Saudi Arabia)
0x0801        Arabic (Iraq)
0x0c01        Arabic (Egypt)
0x1001        Arabic (Libya)
0x1401        Arabic (Algeria)
0x1801        Arabic (Morocco)
0x1c01        Arabic (Tunisia)
0x2001        Arabic (Oman)
0x2401        Arabic (Yemen)
0x2801        Arabic (Syria)
0x2c01        Arabic (Jordan)
0x3001        Arabic (Lebanon)
0x3401        Arabic (Kuwait)
0x3801        Arabic (U.A.E.)
0x3c01        Arabic (Bahrain)
0x4001        Arabic (Qatar)

Primary Language Identifiers

The following are the primary language identifiers. They can be combined with secondary language identifiers to form language identifiers.

Identifier    Predefined Symbol    Language
0x00          LANG_NEUTRAL         Neutral
0x01          LANG_ARABIC          Arabic

Secondary Language Identifiers

The following are secondary language identifiers. They can be combined with primary language identifiers to form language identifiers.

Identifier    Predefined Symbol              Language
0x00          SUBLANG_NEUTRAL                Neutral
0x01          SUBLANG_DEFAULT                Default
0x02          SUBLANG_SYS_DEFAULT            System Default
0x01          SUBLANG_ARABIC_SAUDI_ARABIA    Arabic (Saudi Arabia)
0x02          SUBLANG_ARABIC_IRAQ            Arabic (Iraq)
0x03          SUBLANG_ARABIC_EGYPT           Arabic (Egypt)
0x04          SUBLANG_ARABIC_LIBYA           Arabic (Libya)
0x05          SUBLANG_ARABIC_ALGERIA         Arabic (Algeria)
0x06          SUBLANG_ARABIC_MOROCCO         Arabic (Morocco)
0x07          SUBLANG_ARABIC_TUNISIA         Arabic (Tunisia)
0x08          SUBLANG_ARABIC_OMAN            Arabic (Oman)
0x09          SUBLANG_ARABIC_YEMEN           Arabic (Yemen)
0x0a          SUBLANG_ARABIC_SYRIA           Arabic (Syria)
0x0b          SUBLANG_ARABIC_JORDAN          Arabic (Jordan)
0x0c          SUBLANG_ARABIC_LEBANON         Arabic (Lebanon)
0x0d          SUBLANG_ARABIC_KUWAIT          Arabic (Kuwait)
0x0e          SUBLANG_ARABIC_UAE             Arabic (U.A.E.)
0x0f          SUBLANG_ARABIC_BAHRAIN         Arabic (Bahrain)
0x10          SUBLANG_ARABIC_QATAR           Arabic (Qatar)
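
Going the other way, a language identifier can be split back into its two parts. The sketch below is one possible way to do that. It assumes the PRIMARYLANGID and SUBLANGID macros from the Windows headers (not described in this document) and repeats their definitions so that it builds without <windows.h>. The sublanguage values follow from the language identifier list; for example, 0x2801 (Syria) shifted right by 10 bits gives 0x0a. The names kArabicSublangNames and arabic_country_name are hypothetical.

```c
#include <stddef.h>

typedef unsigned short WORD;
/* Repeated from the Windows headers so the sketch builds anywhere. */
#define PRIMARYLANGID(lgid) ((WORD)(lgid) & 0x3ff)
#define SUBLANGID(lgid)     ((WORD)(lgid) >> 10)
#define LANG_ARABIC 0x01

/* Country names indexed by Arabic sublanguage value (0x01 to 0x10). */
const char *const kArabicSublangNames[] = {
    NULL, "Saudi Arabia", "Iraq", "Egypt", "Libya", "Algeria", "Morocco",
    "Tunisia", "Oman", "Yemen", "Syria", "Jordan", "Lebanon", "Kuwait",
    "U.A.E.", "Bahrain", "Qatar"
};

/* Hypothetical helper: return the country name for an Arabic language
   identifier, or NULL if the identifier is not one of the Arabic
   identifiers listed in this document. */
const char *arabic_country_name(WORD langid)
{
    WORD sub = SUBLANGID(langid);
    if (PRIMARYLANGID(langid) != LANG_ARABIC)
        return NULL;
    if (sub < 0x01 || sub > 0x10)
        return NULL;
    return kArabicSublangNames[sub];
}
```

For example, arabic_country_name(0x2801) returns "Syria", while a non-Arabic identifier such as 0x0409 returns NULL.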

 

National Language Support

National language support functions help Win32®-based applications support the differing language- and location-specific needs of users around the world.

This overview describes the national language support functions and explains how to use them in your Win32-based applications.

Code-Page Identifiers

Identifier    Meaning
037           EBCDIC
437           MS-DOS United States
500           EBCDIC "500V1"
708           Arabic (ASMO 708)
709           Arabic (ASMO 449+, BCON V4)
710           Arabic (Transparent Arabic)
720           Arabic (Transparent ASMO)
850           MS-DOS Multilingual (Latin I)
864           Arabic
875           EBCDIC
1026          EBCDIC
1200          Unicode (BMP of ISO 10646)
1252          Windows 3.1 US (ANSI)
1256          Arabic


OEM Code-Page Identifiers

Identifier    Meaning
437           MS-DOS United States
708           Arabic (ASMO 708)
709           Arabic (ASMO 449+, BCON V4)
710           Arabic (Transparent Arabic)
720           Arabic (Transparent ASMO)
850           MS-DOS Multilingual (Latin I)
864           Arabic

IsValidCodePage Function

The IsValidCodePage function determines whether a specified code page is valid.

BOOL IsValidCodePage(
  UINT CodePage   // specifies the code page to check
);
Parameters

CodePage
Specifies the code page to check. Each code page is identified by a unique number.

Return Values
If the code page is valid, the return value is nonzero.
If the code page is not valid, the return value is zero. To get extended error information, call GetLastError.

Remarks

A code page is considered valid only if it is installed in the system.

GetOEMCP Function

The GetOEMCP function retrieves the current OEM code-page identifier for the system. (OEM stands for original equipment manufacturer.)

UINT GetOEMCP(VOID);

Parameters
This function has no parameters.
Return Values
The return value is the current OEM code-page identifier for the system or a default identifier if no code page is current.
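
IsValidCodePage and GetOEMCP are often used together when choosing a code page for Arabic text. The following sketch is illustrative only: on Windows it calls the real functions, and on other platforms it substitutes clearly labeled stand-ins so that it still compiles. The names kArabicPreference, first_installed_code_page, and oem_code_page_is_arabic are hypothetical.

```c
#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins for non-Windows builds ONLY, so the sketch compiles anywhere.
   They are not the real API: this pretend system has code pages 437 and
   1256 installed and reports 437 as its OEM code page. */
typedef unsigned int UINT;
typedef int BOOL;
static BOOL IsValidCodePage(UINT cp) { return cp == 437 || cp == 1256; }
static UINT GetOEMCP(void) { return 437; }
#endif

/* Example preference order: Windows Arabic first, then two OEM pages. */
const UINT kArabicPreference[3] = { 1256, 864, 720 };

/* Hypothetical helper: return the first installed code page from a
   preference list, or 0 if none of them is installed. */
UINT first_installed_code_page(const UINT *candidates, int count)
{
    int i;
    for (i = 0; i < count; ++i) {
        if (IsValidCodePage(candidates[i]))
            return candidates[i];
    }
    return 0;
}

/* Hypothetical helper: nonzero if the current OEM code page is one of
   the Arabic OEM code pages listed in this document. */
int oem_code_page_is_arabic(void)
{
    UINT cp = GetOEMCP();
    return cp == 708 || cp == 709 || cp == 710 || cp == 720 || cp == 864;
}
```

A caller might use first_installed_code_page(kArabicPreference, 3) and fall back to the default code page when the result is 0.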


Locale Information
The following are locale constants.

LOCALE_FONTSIGNATURE

Windows NT 4.0 or later: A bit pattern used to determine the relationship between character coverage needed to support the locale and font contents.

LOCALE_ICALENDARTYPE

Current calendar type. This type can be one of these values:

1     Gregorian (as in United States)
2     Gregorian (English strings always)
6     Hijri (Arabic lunar)
9     Gregorian Middle East French calendar
10    Gregorian Arabic calendar
11    Gregorian Transliterated English calendar
12    Gregorian Transliterated French calendar

LOCALE_ICENTURY

Specifier for full 4-digit century. The maximum number of characters allowed for this string is two. The specifier can be one of the following values:

0     Abbreviated 2-digit century
1     Full 4-digit century

LOCALE_ICOUNTRY

Country code, based on international phone codes, also referred to as IBM country codes. The maximum number of characters allowed for this string is six.

LOCALE_ICURRDIGITS

Number of fractional digits for the local monetary format. The maximum number of characters allowed for this string is three.

LOCALE_ICURRENCY

Positive currency mode. The maximum number of characters allowed for this string is two. The mode can be one of the following values:

0     Prefix, no separation
1     Suffix, no separation
2     Prefix, 1-character separation
3     Suffix, 1-character separation

LOCALE_IDATE

Short date format-ordering specifier. The maximum number of characters allowed for this string is two. The specifier can be one of the following values:

0     Month-Day-Year
1     Day-Month-Year
2     Year-Month-Day

LOCALE_IDAYLZERO

Specifier for leading zeros in day fields. The maximum number of characters allowed for this string is two. The specifier can be one of the following values:

0     No leading zeros for days
1     Leading zeros for days

LOCALE_IDEFAULTANSICODEPAGE

American National Standards Institute (ANSI) code page associated with this locale. The maximum number of characters allowed for this string is six.

LOCALE_IDEFAULTCODEPAGE

Original equipment manufacturer (OEM) code page associated with the country. The maximum number of characters allowed for this string is six.

LOCALE_IDEFAULTCOUNTRY

Country code for the principal country in this locale. This is provided so that partially specified locales can be completed with default values. The maximum number of characters allowed for this string is six.

LOCALE_IDEFAULTEBCDICCODEPAGE

Windows NT 5.0 or later: Default EBCDIC code page associated with the locale. The maximum number of characters allowed for this string is six.

LOCALE_IDEFAULTLANGUAGE

Language identifier for the principal language spoken in this locale. This is provided so partially specified locales can be completed with default values. The maximum number of characters allowed for this string is five.

LOCALE_IDEFAULTMACCODEPAGE

Default Macintosh code page associated with the locale.

LOCALE_IDEFAULTOEMCODEPAGE

Original equipment manufacturer (OEM) code page associated with the locale. The maximum number of characters allowed for this string is six.

LOCALE_IDIGITS

Number of fractional digits. The maximum number of characters allowed for this string is three.

LOCALE_IDIGITSUBSTITUTION

Windows NT 5.0 or later: Determines the shape of the digits. The specifier can be one of these values:

0     Context: the national shape depends on the previous text in the same output.
1     None/Arabic: gives full Unicode compatibility.
2     Native: national shapes determined by LOCALE_SNATIVEDIGITS.

LOCALE_IFIRSTDAYOFWEEK

Specifier for the first day in a week. The specifier can be one of these values:

0     LOCALE_SDAYNAME1
1     LOCALE_SDAYNAME2
2     LOCALE_SDAYNAME3
3     LOCALE_SDAYNAME4
4     LOCALE_SDAYNAME5
5     LOCALE_SDAYNAME6
6     LOCALE_SDAYNAME7

LOCALE_IFIRSTWEEKOFYEAR

Specifier for the first week of the year. The specifier can be one of these values:

0     Week containing 1/1 is the first week of that year.
1     First full week following 1/1 is the first week of that year.
2     First week containing at least four days is the first week of that year.

LOCALE_IINTLCURRDIGITS

Number of fractional digits for the international monetary format. The maximum number of characters allowed for this string is three.

LOCALE_ILANGUAGE

Language identifier indicating the language. The maximum number of characters allowed for this string is five.

LOCALE_ILDATE

Long date format-ordering specifier. The maximum number of characters allowed for this string is two. The specifier can be one of the following values:

0     Month-Day-Year
1     Day-Month-Year
2     Year-Month-Day

LOCALE_ILZERO

Specifier for leading zeros in decimal fields. The maximum number of characters allowed for this string is two. The specifier can be one of the following values:

0     No leading zeros
1     Leading zeros

LOCALE_IMEASURE

System of measurement. This value is zero if the metric system (Système International d'Unités, or S.I.) is used, and 1 if the U.S. system is used. The maximum number of characters allowed for this string is two.

LOCALE_IMONLZERO

Specifier for leading zeros in month fields. The maximum number of characters allowed for this string is two. The specifier can be one of the following values:

0     No leading zeros for months
1     Leading zeros for months

LOCALE_INEGCURR

Negative currency mode. The maximum number of characters allowed for this string is three. The mode can be one of the following values:

0     ($1.1)
1     -$1.1
2     $-1.1
3     $1.1-
4     (1.1$)
5     -1.1$
6     1.1-$
7     1.1$-
8     -1.1 $ (space before $)
9     -$ 1.1 (space after $)
10    1.1 $- (space before $)
11    $ 1.1- (space after $)
12    $ -1.1 (space after $)
13    1.1- $ (space before $)
14    ($ 1.1) (space after $)
15    (1.1 $) (space before $)

LOCALE_INEGNUMBER

Negative number mode. The mode can be one of these values:

0     (1.1)
1     -1.1
2     - 1.1
3     1.1-
4     1.1 -

LOCALE_INEGSEPBYSPACE

Separation of monetary symbol in a negative monetary value. This value is 1 if the monetary symbol is separated by a space from the negative amount, zero if it is not. The maximum number of characters allowed for this string is two.

LOCALE_INEGSIGNPOSN

Formatting index for negative values. This index uses the same values as LOCALE_IPOSSIGNPOSN. The maximum number of characters allowed for this string is two.

LOCALE_INEGSYMPRECEDES

Position of monetary symbol in a negative monetary value. This value is 1 if the monetary symbol precedes the negative amount, zero if it follows it. The maximum number of characters allowed for this string is two.

LOCALE_IOPTIONALCALENDAR

Additional calendar types. This can be a zero-separated list of one or more of these calendar type values:

0     No additional types valid
1     Gregorian (as in United States)
2     Gregorian (English strings always)
6     Hijri (Arabic lunar)
9     Gregorian Middle East French calendar
10    Gregorian Arabic calendar
11    Gregorian Transliterated English calendar
12    Gregorian Transliterated French calendar

LOCALE_IPAPERSIZE

Windows NT 5.0 or later: Default paper size associated with the locale. The specifier can be one of the following values:

0     US Letter
1     A4
2     Legal

LOCALE_IPOSSEPBYSPACE

Separation of monetary symbol in a positive monetary value. This value is 1 if the monetary symbol is separated by a space from a positive amount, zero if it is not. The maximum number of characters allowed for this string is two.

LOCALE_IPOSSIGNPOSN

Formatting index for positive values. The maximum number of characters allowed for this string is two. The index can be one of the following values:

0     Parentheses surround the amount and the monetary symbol.
1     The sign string precedes the amount and the monetary symbol.
2     The sign string succeeds the amount and the monetary symbol.
3     The sign string immediately precedes the monetary symbol.
4     The sign string immediately succeeds the monetary symbol.

LOCALE_IPOSSYMPRECEDES

Position of the monetary symbol in a positive monetary value. This value is 1 if the monetary symbol precedes the positive amount, zero if it follows it. The maximum number of characters allowed for this string is two.

LOCALE_ITIME

Time format specifier. The maximum number of characters allowed for this string is two. The specifier can be one of the following values:

0     AM / PM 12-hour format
1     24-hour format

LOCALE_ITIMEMARKPOSN

Specifier indicating whether the time marker string (AM or PM) precedes or follows the time string. The registry value is iTimePrefix for compatibility with previous Asian versions of Windows. The specifier can take one of the following values:

0     Use as suffix.
1     Use as prefix.

LOCALE_ITLZERO

Specifier for leading zeros in time fields. The maximum number of characters allowed for this string is two. The specifier can be one of the following values:

0     No leading zeros for hours
1     Leading zeros for hours

LOCALE_NOUSEROVERRIDE

Can be combined with other LOCALE values to bypass any user override and return the system default value for the given locale information.

LOCALE_RETURN_NUMBER

Windows NT 4.0 or later: This value may be ORed with any specifier beginning with LOCALE_I to return the value as a number instead of as a string. The buffer to receive the value must be at least the length of a DWORD.

LOCALE_S1159
String for the AM designator.
LOCALE_S2359
String for the PM designator.
LOCALE_SABBREVCTRYNAME
Abbreviated name of the country from the ISO Standard 3166.
LOCALE_SABBREVDAYNAME1
Native abbreviated name for Monday.
LOCALE_SABBREVDAYNAME2
Native abbreviated name for Tuesday.
LOCALE_SABBREVDAYNAME3
Native abbreviated name for Wednesday.
LOCALE_SABBREVDAYNAME4
Native abbreviated name for Thursday.
LOCALE_SABBREVDAYNAME5
Native abbreviated name for Friday.
LOCALE_SABBREVDAYNAME6
Native abbreviated name for Saturday.
LOCALE_SABBREVDAYNAME7
Native abbreviated name for Sunday.
LOCALE_SABBREVLANGNAME
Abbreviated name of the language, created by taking the two-letter language abbreviation from the ISO Standard 639 and adding a third letter, as appropriate, to indicate the sublanguage.
LOCALE_SABBREVMONTHNAME1
Native abbreviated name for January.
LOCALE_SABBREVMONTHNAME2
Native abbreviated name for February.
LOCALE_SABBREVMONTHNAME3
Native abbreviated name for March.
LOCALE_SABBREVMONTHNAME4
Native abbreviated name for April.
LOCALE_SABBREVMONTHNAME5
Native abbreviated name for May.
LOCALE_SABBREVMONTHNAME6
Native abbreviated name for June.
LOCALE_SABBREVMONTHNAME7
Native abbreviated name for July.
LOCALE_SABBREVMONTHNAME8
Native abbreviated name for August.
LOCALE_SABBREVMONTHNAME9
Native abbreviated name for September.
LOCALE_SABBREVMONTHNAME10
Native abbreviated name for October.
LOCALE_SABBREVMONTHNAME11
Native abbreviated name for November.
LOCALE_SABBREVMONTHNAME12
Native abbreviated name for December.
LOCALE_SABBREVMONTHNAME13
Native abbreviated name for 13th month, if it exists.
LOCALE_SCOUNTRY
Full localized name of the country.
LOCALE_SCURRENCY
String used as the local monetary symbol.
LOCALE_SDATE
Character(s) for the date separator
LOCALE_SDAYNAME1
Native long name for Monday
LOCALE_SDAYNAME2
Native long name for Tuesday
LOCALE_SDAYNAME3
Native long name for Wednesday
LOCALE_SDAYNAME4
Native long name for Thursday
LOCALE_SDAYNAME5
Native long name for Friday
LOCALE_SDAYNAME6
Native long name for Saturday
LOCALE_SDAYNAME7
Native long name for Sunday
LOCALE_SDECIMAL
Character(s) used as the decimal separator.
LOCALE_SENGCOUNTRY
Full English name of the country. This is always restricted to characters that can be mapped into the ASCII 127-character subset.
LOCALE_SENGCURRNAME

Windows NT 5.0 or later: The full English name of the currency associated with the locale.

LOCALE_SENGLANGUAGE
Full English name of the language from the International Organization for Standardization (ISO) Standard 639. This is always restricted to characters that can be mapped into the ASCII 127-character subset.

LOCALE_SGROUPING
Sizes for each group of digits to the left of the decimal. An explicit size is needed for each group, and sizes are separated by semicolons. If the last value is zero, the preceding value is repeated. For example, to group thousands, specify 3;0.

LOCALE_SINTLSYMBOL
Three characters of the international monetary symbol specified in ISO 4217, "Codes for the Representation of Currencies and Funds," followed by the character separating this string from the amount.

LOCALE_SISO3166CTRYNAME

Windows NT 4.0 or later: The ISO 3166 code for the country name.

LOCALE_SISO639LANGNAME

Windows NT 4.0 or later: The abbreviated name of the language based on the ISO Standard 639 values.

LOCALE_SLANGUAGE
Full localized name of the language.
LOCALE_SLIST
Character(s) used to separate list items. For example, a comma is used in many locales.
LOCALE_SLONGDATE

Long date formatting string for this locale. The string can consist of a combination of day, month, and year format pictures defined in the Day, Month, Year, and Era Format Pictures table in National Language Support Constants, and any string of characters enclosed in single quotes. Characters in single quotes remain as given.

LOCALE_SMONDECIMALSEP
Character(s) used as the monetary decimal separator.
LOCALE_SMONGROUPING

Sizes for each group of monetary digits to the left of the decimal. An explicit size is needed for each group, and sizes are separated by semicolons. If the last value is zero, the preceding value is repeated. For example, to group thousands, specify 3;0.

LOCALE_SMONTHNAME1
Native long name for January
LOCALE_SMONTHNAME2
Native long name for February
LOCALE_SMONTHNAME3
Native long name for March
LOCALE_SMONTHNAME4
Native long name for April
LOCALE_SMONTHNAME5
Native long name for May
LOCALE_SMONTHNAME6
Native long name for June
LOCALE_SMONTHNAME7
Native long name for July
LOCALE_SMONTHNAME8
Native long name for August
LOCALE_SMONTHNAME9
Native long name for September
LOCALE_SMONTHNAME10
Native long name for October
LOCALE_SMONTHNAME11
Native long name for November
LOCALE_SMONTHNAME12
Native long name for December
LOCALE_SMONTHNAME13
Native name for 13th month, if it exists
LOCALE_SMONTHOUSANDSEP
Character(s) used as the monetary separator between groups of digits to the left of the decimal.
LOCALE_SNATIVECTRYNAME
Native name of the country
LOCALE_SNATIVECURRNAME

Windows NT 5.0 or later: The native name of the currency associated with the locale.

LOCALE_SNATIVEDIGITS
Native equivalents to ASCII 0 through 9.
LOCALE_SNATIVELANGNAME
Native name of the language
LOCALE_SNEGATIVESIGN
String value for the negative sign.
LOCALE_SPOSITIVESIGN
String value for the positive sign.
LOCALE_SSHORTDATE

Short date formatting string for this locale. The string can consist of a combination of day, month, and year format pictures defined in the Day, Month, Year, and Era Format Pictures table in National Language Support Constants.

LOCALE_SSORTNAME

Windows NT 5.0 or later: The full localized name of the sort for the given locale identifier.

LOCALE_STHOUSAND
Character(s) used to separate groups of digits to the left of the decimal.
LOCALE_STIME
Character(s) for the time separator
LOCALE_STIMEFORMAT

Time formatting strings for this locale. The string can consist of a combination of the hour, minute, and second format pictures defined in the Hour, Minute, and Second Format Pictures table in National Language Support Constants.

LOCALE_SYEARMONTH

Windows NT 5.0 or later: The Year/Month formatting string for the locale. This string shows the proper format for a date string that contains only the year and the month.

LOCALE_USE_CP_ACP

This may be ORed with any other LCTYPE in the GetLocaleInfoA call to ensure that the code page used to translate the other LCTYPE from Unicode to ANSI is based on the system ANSI code page, rather than the default ANSI code page for the LCID specified in the GetLocaleInfoA call.
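
To make the LOCALE_SGROUPING and LOCALE_SMONGROUPING rule more concrete, here is a minimal, portable sketch that applies a grouping string such as 3;0 to a run of digits. It does not call any Windows API, and the function name group_digits is hypothetical. The document states only the trailing-zero rule; the sketch additionally assumes that when the string does not end in zero, the digits beyond the listed groups are left ungrouped.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_GROUP_SIZES 8

/* Hypothetical helper: insert `sep` into the digit string `digits`
   according to a grouping string such as "3;0". Writes the result to
   `out` (capacity `outsize`). Returns 0 on success, -1 if a buffer is
   too small. */
int group_digits(const char *digits, const char *grouping, char sep,
                 char *out, size_t outsize)
{
    long sizes[MAX_GROUP_SIZES];
    int nsizes = 0;      /* number of explicit group sizes parsed */
    int repeat = 0;      /* 1 if the grouping string ends in 0 */
    int g = 0;           /* index of the group size currently in effect */
    long in_group = 0;   /* digits emitted so far in the current group */
    size_t n = strlen(digits), i, o = 0;
    char tmp[128];       /* result, built right to left */
    char *end;

    /* Parse e.g. "3;2;0" into sizes {3, 2} plus the repeat flag. */
    while (nsizes < MAX_GROUP_SIZES) {
        long v = strtol(grouping, &end, 10);
        if (end == grouping) break;            /* no more numbers */
        if (v <= 0) { repeat = (v == 0); break; }
        sizes[nsizes++] = v;
        grouping = (*end == ';') ? end + 1 : end;
    }

    if (2 * n > sizeof tmp) return -1;         /* tmp could overflow */

    /* Walk the digits from the right, emitting a separator whenever the
       current group is full and a group size still applies. */
    for (i = n; i > 0; --i) {
        if (g < nsizes && in_group == sizes[g]) {
            tmp[o++] = sep;
            in_group = 0;
            if (g + 1 < nsizes) ++g;           /* move to the next size */
            else if (!repeat) g = nsizes;      /* assumption: stop grouping */
        }
        tmp[o++] = digits[i - 1];
        ++in_group;
    }

    if (o + 1 > outsize) return -1;            /* caller's buffer too small */
    for (i = 0; i < o; ++i) out[i] = tmp[o - 1 - i];
    out[o] = '\0';
    return 0;
}
```

With a grouping string of 3;0 and a comma separator, the digits 1234567 come out as 1,234,567. In practice the grouping string and separator would be read from the locale (for example LOCALE_SGROUPING and LOCALE_STHOUSAND) rather than hard-coded.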

Many of the locale types previously listed are closely related, such that changing one affects the value of the others. The following table shows the relationships between these types:

Constant              Affects
LOCALE_ICURRENCY      LOCALE_IPOSSEPBYSPACE, LOCALE_IPOSSYMPRECEDES
LOCALE_INEGCURR       LOCALE_INEGSEPBYSPACE, LOCALE_INEGSYMPRECEDES, LOCALE_INEGSIGNPOSN, LOCALE_IPOSSIGNPOSN
LOCALE_SSHORTDATE     LOCALE_SDATE, LOCALE_IDATE
LOCALE_SLONGDATE      LOCALE_ILDATE
LOCALE_STIMEFORMAT    LOCALE_STIME, LOCALE_ITIME, LOCALE_ITLZERO
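
As one example of such a relationship, the four LOCALE_ICURRENCY modes listed earlier determine the two related positive-currency settings. The sketch below derives them from the mode value using only the value descriptions given in this document. The type PosCurrencyLayout and the function layout_from_icurrency are hypothetical names, and the system's own bookkeeping may differ.

```c
/* Settings implied by a positive currency mode (LOCALE_ICURRENCY). */
typedef struct {
    int sym_precedes;   /* LOCALE_IPOSSYMPRECEDES: 1 if the symbol comes first */
    int sep_by_space;   /* LOCALE_IPOSSEPBYSPACE: 1 if a space separates them */
} PosCurrencyLayout;

/* Hypothetical helper: map LOCALE_ICURRENCY values 0..3 to the two
   related settings. Returns 0 on success, -1 for an unknown mode. */
int layout_from_icurrency(int mode, PosCurrencyLayout *out)
{
    if (mode < 0 || mode > 3)
        return -1;
    out->sym_precedes = (mode == 0 || mode == 2);  /* the two prefix modes */
    out->sep_by_space = (mode >= 2);               /* 1-character separation */
    return 0;
}
```

For mode 3 (suffix, 1-character separation) this yields sym_precedes = 0 and sep_by_space = 1.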

 

CompareString

The CompareString function compares two character strings, using the locale specified by the given identifier as the basis for the comparison.

The CompareString function ignores Arabic Kashidas during the comparison. Thus, if two strings are identical save for the presence of Kashidas, CompareString returns a value of 2; the strings are considered "equal" in the collation sense, though they are not necessarily identical.

int CompareString(
  LCID Locale,       // locale identifier
  DWORD dwCmpFlags,  // comparison-style options
  LPCTSTR lpString1, // pointer to first string
  int cchCount1,     // size, in bytes or characters, of first string
  LPCTSTR lpString2, // pointer to second string
  int cchCount2      // size, in bytes or characters, of second string
);

Parameters

Locale

Specifies the locale used for the comparison. This parameter can be one of the following predefined locale identifiers:

Value                    Meaning
LOCALE_SYSTEM_DEFAULT    The system's default locale.
LOCALE_USER_DEFAULT      The current user's default locale.

This parameter can also be a locale identifier created by the MAKELCID macro.


dwCmpFlags

A set of flags that indicate how the function compares the two strings. By default, these flags are not set. This parameter can specify zero to get the default behavior, or it can be any combination of the following values:

Value                  Meaning
NORM_IGNORECASE        Ignore case.
NORM_IGNOREKANATYPE    Do not differentiate between Hiragana and Katakana characters. Corresponding Hiragana and Katakana characters compare as equal.
NORM_IGNORENONSPACE    Ignore nonspacing characters.
NORM_IGNORESYMBOLS     Ignore symbols.
NORM_IGNOREWIDTH       Do not differentiate between a single-byte character and the same character as a double-byte character.
SORT_STRINGSORT        Treat punctuation the same as symbols.

lpString1

Pointer to the first string to be compared.

cchCount1

Specifies the size, in bytes (ANSI version) or characters (Unicode version), of the string pointed to by the lpString1 parameter. If this parameter is -1, the string is assumed to be null-terminated and the length is calculated automatically.

lpString2
Pointer to the second string to be compared.

cchCount2

Specifies the size, in bytes (ANSI version) or characters (Unicode version), of the string pointed to by the lpString2 parameter. If this parameter is -1, the string is assumed to be null-terminated and the length is calculated automatically.

Return Values

If the function succeeds, the return value is one of the following values:

Value                Meaning
CSTR_LESS_THAN       The string pointed to by the lpString1 parameter is less in lexical value than the string pointed to by the lpString2 parameter.
CSTR_EQUAL           The string pointed to by lpString1 is equal in lexical value to the string pointed to by lpString2.
CSTR_GREATER_THAN    The string pointed to by lpString1 is greater in lexical value than the string pointed to by lpString2.

If the function fails, the return value is zero. To get extended error information, call GetLastError. GetLastError may return one of the following error codes:

ERROR_INVALID_FLAGS
ERROR_INVALID_PARAMETER

Remarks

Notice that if the return value is 2, the two strings are "equal" in the collation sense, though not necessarily identical.
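
The kashida case mentioned earlier is a typical instance of strings that are equal in the collation sense without being identical. The sketch below only illustrates that idea in portable C; it is not how CompareString is implemented, since real collation is based on sort weights. It assumes the kashida is the Unicode character U+0640 (ARABIC TATWEEL), and the function name equal_ignoring_kashida is hypothetical.

```c
#include <stddef.h>

#define KASHIDA L'\x0640'   /* U+0640 ARABIC TATWEEL */

/* Hypothetical helper: compare two wide strings while skipping every
   kashida. Returns 1 if they match apart from kashidas, 0 otherwise. */
int equal_ignoring_kashida(const wchar_t *a, const wchar_t *b)
{
    for (;;) {
        while (*a == KASHIDA) ++a;   /* skip elongation marks in a */
        while (*b == KASHIDA) ++b;   /* skip elongation marks in b */
        if (*a != *b)
            return 0;                /* first real difference */
        if (*a == L'\0')
            return 1;                /* both strings ended together */
        ++a;
        ++b;
    }
}
```

Two spellings of the same word, one of them elongated with kashidas, compare as matching here even though a character-by-character comparison such as wcscmp would report a difference.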

To maintain the C run-time convention of comparing strings, the value 2 can be subtracted from a nonzero return value. The meaning of < 0, ==0 and > 0 is then consistent with the C run times.
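
The subtraction described above can be wrapped in a small adapter. The sketch below is one possible shape for it; the CSTR_ values are repeated locally so that it builds without <windows.h>, and the function name crt_order_from_cstr is hypothetical. Because a return value of zero means that CompareString failed, the failure case is reported separately instead of being converted.

```c
/* Values returned by CompareString, repeated here so the sketch
   builds without <windows.h>. */
#define CSTR_LESS_THAN    1
#define CSTR_EQUAL        2
#define CSTR_GREATER_THAN 3

/* Hypothetical helper: convert a CompareString result to the C run-time
   convention (< 0, == 0, > 0). Sets *failed to 1 when the call itself
   failed (result 0); the return value is then meaningless. */
int crt_order_from_cstr(int cstr_result, int *failed)
{
    *failed = (cstr_result == 0);
    if (*failed)
        return 0;
    return cstr_result - 2;
}
```

A caller would pass the value returned by CompareString directly and consult GetLastError when failed is set.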

If the two strings are of different lengths, they are compared up to the length of the shortest one. If they are equal to that point, then the return value will indicate that the longer string is greater. For more information about locale identifiers, see Locale Identifiers.

Typically, strings are compared using what is called a "word sort" technique. In a word sort, all punctuation marks and other nonalphanumeric characters, except for the hyphen and the apostrophe, come before any alphanumeric character. The hyphen and the apostrophe are treated differently than the other nonalphanumeric symbols, in order to ensure that words such as "coop" and "co-op" stay together within a sorted list.

If the SORT_STRINGSORT flag is specified, strings are compared using what is called a "string sort" technique. In a string sort, the hyphen and apostrophe are treated just like any other nonalphanumeric symbols: they come before the alphanumeric symbols.

The lstrcmp and lstrcmpi functions use a word sort. The CompareString and LCMapString functions default to using a word sort, but use a string sort if their caller sets the SORT_STRINGSORT flag.

The CompareString function is optimized to run at the highest speed when dwCmpFlags is set to 0 or NORM_IGNORECASE, and cchCount1 and cchCount2 have the value -1.

Windows CE: Windows CE does not support the ANSI version of this function.
Windows CE does not support the following values for the dwCmpFlags parameter:
NORM_IGNOREKANATYPE
NORM_IGNORENONSPACE
NORM_IGNORESYMBOLS
NORM_IGNOREWIDTH

The dwCmpFlags parameter always includes the SORT_STRINGSORT value.



©2009 Microsoft Corporation. All rights reserved.