The Twenty-third International Unicode Conference was held March 24–26, 2003, in Prague, Czech Republic. The following presentations were given by Microsoft representatives:
The advent of a single worldwide source on Windows 2000 enabled Microsoft to support Indic scripts on the Windows platform for the first time. Windows 2000 started with Hindi, Konkani, Marathi, Sanskrit and Tamil; Windows XP added support for Kannada, Telugu, Gujarati and Punjabi. The next major release of Windows will add yet more Indic script support. This presentation will focus on the technical and political challenges particular to enabling Indic scripts on Windows, and the specific technologies that make that Indic support possible.
Participants should come away with a basic understanding of the relationship between a linguistic character in Indic and codepoints in Unicode, and how that relates to the underlying technologies on the platform, including:
• the interaction of input, fonts and rendering engines
• legacy codepages versus Unicode
• the misunderstood need for an IME (versus a keyboard)
• collation
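As a small illustration of the gap between a linguistic character and Unicode codepoints, the following Python sketch lists the codepoints behind one Devanagari conjunct. The conjunct chosen and the helper name `describe` are assumptions made for this example; they are not taken from the presentation.

```python
import unicodedata

# The Devanagari conjunct "kssa" is read as a single unit,
# but it is stored as three Unicode codepoints: KA + VIRAMA + SSA.
conjunct = "\u0915\u094D\u0937"

def describe(text):
    """Return (codepoint, Unicode name) pairs for every codepoint in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for code_point, name in describe(conjunct):
    print(code_point, name)
```

Fonts and rendering engines are what turn this three-codepoint sequence into a single glyph on screen, which is why input, fonts and rendering have to be considered together.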
Indic Script Support on the Windows Platform
(735KB PowerPoint presentation)
Cathy Wissink, Program Manager
People use collation in their daily lives: finding names in a phone book, perusing a library card catalog, reading a book index. As such, people have expectations about where to find information within a structure. What complicates the process is the fact that these expectations vary from culture to culture. In addition, people have implicit knowledge of the correctness of collation (is it right or wrong?), but generally cannot explain what the rules of correctness are.
In a properly globalized product, users will have properly collated data, e.g., in the file system, in a database, or in an e-mail address book. How should implementers go about ensuring culturally correct collation in a product? What are the basic linguistic issues of collation, and how do they manifest themselves in technology?
This presentation will explain the basic tenets of collation in language, debunk some myths about collation in globalized software, show how collation functions are used (using examples from the Win32 API), and touch upon best practices.
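To make the gap between code point order and user expectations concrete, here is a minimal Python sketch. It does not use the Win32 API covered in the presentation; `naive_key` is a hypothetical helper that ignores accents and case, and it stands in for real, culture-aware collation only for the purpose of illustration.

```python
import unicodedata

words = ["zebra", "Apple", "apple", "Äpfel", "banana"]

def naive_key(word):
    """Hypothetical sort key: ignore accents and case, break ties by code point."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), word)

# Binary order: uppercase sorts before lowercase, and "Ä" lands after "z".
by_code_point = sorted(words)

# Accent- and case-insensitive order: "Äpfel" sorts with the other "a" words.
by_naive_key = sorted(words, key=naive_key)

print(by_code_point)
print(by_naive_key)
```

Neither ordering is universally correct. A German user would likely accept the second list, while a Swedish user expects "ä" after "z", which is the kind of cultural variation a real collation API has to handle.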
Sorting It All Out: An Introduction to Collation
(975KB PowerPoint presentation)
Cathy Wissink, Program Manager
Michael Kaplan, Software Design Engineer
The surrogate range is how Unicode supports over 1,000,000 possible characters. Now that supplementary characters are officially defined, people need to be thinking about these characters and how they will be supported. Under Windows, there is some support (in Windows 2000 and XP, in Office, and in Visual Studio .NET). This support will be discussed, as well as how best to use it in your applications and components.
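The arithmetic behind surrogate pairs can be sketched in a few lines of Python. This is the standard UTF-16 mapping; the function names `to_surrogates` and `from_surrogates` are made up for this example and do not correspond to any Windows API.

```python
def to_surrogates(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into a UTF-16 pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low

def from_surrogates(high, low):
    """Recombine a high/low surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

# U+1D11E MUSICAL SYMBOL G CLEF as an example supplementary character.
high, low = to_surrogates(0x1D11E)
print(f"{high:04X} {low:04X}")
```

With 1,024 high surrogates and 1,024 low surrogates, the scheme addresses 1,048,576 supplementary code points beyond the Basic Multilingual Plane.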
Supplementary Character Support in Microsoft Products
(153KB PowerPoint presentation)
Michael Kaplan, Software Design Engineer
Microsoft SQL Server (MS SQLS) has supported Unicode for the last two versions, each time with different ideas of how text is to be stored and collated. The purpose of this paper is to describe what MS SQLS does in order to support the storing, normalizing, and sorting of Unicode data, both at the engine level and at the level where you have some control over the ordering. Many of the specific issues in MS SQLS 2000's new COLLATE keyword will be addressed, including both the features and the limitations of this powerful functionality. In addition, the mixed meaning of SQLS collations and code page conversions will be discussed.
The conclusion of the paper is that MS SQL Server can be a very compelling multilingual platform for its Unicode support, but with some important caveats that must be understood in order to properly make use of its capabilities.
Unicode and Collation Support in Microsoft SQL Server
(107KB PowerPoint presentation)
Michael Kaplan, Software Design Engineer
Getting data into applications by keyboards seems like one of the fundamentally simple features on Windows, but once you add additional issues like fonts/rendering engines it does not seem so simple anymore. It also turns out to be a bit more complicated when you add many different keyboard layouts on top of over 100 languages. Once you add the ability to define your own keyboard layouts (whether by using Microsoft interfaces or third party products) where all of Unicode can be supported, it becomes downright complex!
This presentation will talk about the interaction between input, fonts, and rendering engines, the many features that keyboard layouts support such as dead keys and ligatures, the issues with code pages vs. Unicode, when IMEs are preferred and when they are not, the collation issues that enter into the equation, and finally tools to make it all a bit easier.
Unicode and Keyboards on Windows
(261KB PowerPoint presentation)
Michael Kaplan, Software Design Engineer
Cathy Wissink, Program Manager
Windows XP builds upon the international functionality of the Windows 2000 platform. It provides improved globalization features related to Unicode such as support for new scripts, languages and locales, a wider variety of input locales and rendering features, enhanced support for surrogates and improved Multilingual User Interface (MUI) support.
The primary purpose of this paper is to outline the international support in Windows XP, describe this support within the historical context of Unicode on the Windows NT platform, discuss how language support is integrated into a single worldwide source, and finally, demonstrate that Unicode support is essential to building a fully globalized operating system.
In addition, globalization trends in upcoming releases of Windows (.NET Server 2003 and beyond) will be discussed.
Unicode and Windows XP
(404KB PowerPoint presentation)
Cathy Wissink, Program Manager
The Domain Name System (DNS) is one of the fundamental concepts that make the Internet a reality by providing a worldwide recognizable way to reach Internet sites. However, its character syntax is currently restricted to the ASCII repertoire. DNS is finally embracing Unicode, and this is having an effect on a related concept that is also widely used on the Web: the Uniform Resource Identifier (URI). This talk describes Internationalizing Domain Names in Applications (IDNA), Internationalized Resource Identifiers (IRI), and how Unicode is a key component of these specifications.
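As a rough illustration of how a Unicode domain name is carried over the ASCII-only DNS syntax, the Python sketch below uses the standard library's built-in `idna` codec, which implements the 2003 IDNA specification (RFC 3490). The domain name is an arbitrary example and is not taken from the talk.

```python
# An arbitrary example name with a non-ASCII label.
unicode_name = "bücher.example"

# ToASCII: each non-ASCII label becomes an "xn--" label encoded with Punycode.
ascii_name = unicode_name.encode("idna")

# ToUnicode: the ASCII form maps back to the original name.
round_trip = ascii_name.decode("idna")

print(ascii_name)
print(round_trip)
```

The ASCII form is what is actually sent to DNS servers, so the existing infrastructure does not need to change; the conversion happens in the application.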
IDN and IRI, Finally an Internationalized Solution to Domain Name and Resource Identifiers
(154KB PDF Document)
Michel Suignard, Program Manager
This talk provides a detailed overview of advanced globalization topics and shows how the .NET Framework deals with these topics.
Advanced globalization topics include:
• Text handling
• Interoperability with Windows native code and SQL Server
• Encodings (GB18030 support, Unicode encodings)
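To give a feel for the encodings topic, the following Python sketch encodes the same text in several Unicode encodings and in GB18030, then checks that each one round-trips. It uses Python's standard codecs rather than the .NET Framework classes covered in the talk, and the sample text is an arbitrary choice for this example.

```python
# Arbitrary sample: ASCII, three CJK ideographs, and one supplementary character.
text = "Unicode 统一码 \U00020000"

encodings = ["utf-8", "utf-16-le", "utf-32-le", "gb18030"]

# Byte length of the same text in each encoding.
sizes = {name: len(text.encode(name)) for name in encodings}

# Every encoding listed here covers all of Unicode, so decoding restores the text.
round_trips = {name: text.encode(name).decode(name) == text for name in encodings}

for name in encodings:
    print(name, sizes[name], round_trips[name])
```

GB18030 differs from older Chinese code pages in that it can represent every Unicode code point, which is why it round-trips here even for the supplementary character.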
Globalization with Microsoft .NET Framework
(280KB PowerPoint presentation)
François Liger, Program Manager
This talk is a practical presentation, during which a fully multi-lingual, multi-cultural application is built. The goal of the presentation is to demonstrate globalization best practices in building real web applications (including database and web service support) with the .NET Framework. This talk builds on the other proposals I submitted by putting all the "theoretical" pieces presented in "Globalization with Microsoft .NET Framework" and "Advanced Globalization with .NET Framework" together in a live application.
Building a Multi-lingual, Multi-cultural Web Site with the Microsoft .NET Framework
(133KB PowerPoint presentation)
François Liger, Program Manager
As mobile data devices move into the mainstream, developing applications for them becomes a unique market opportunity. This talk will explore considerations that have to be taken into account when writing international applications for ASP.NET mobile controls, the .NET Compact Framework and Windows CE .NET. It will explain how to deal with different character encodings, differences in locale support, fonts and localized user interfaces while still maintaining a unified code base.
Developing International Applications for Mobile Devices with .NET
(391KB PowerPoint presentation)
Achim Ruopp, International Program Manager
Microsoft offers an E-Business Server 2002 line of products (BizTalk® Server 2002, Commerce Server® 2002, and Content Management Server® 2002) that enable companies to integrate enterprise applications, exchange information with business partners, and develop comprehensive solutions for managing online business and web content.
International support is built into these three servers. The paper will give an architecture overview and explain the features of each product. In particular, the paper will highlight the globalization and localization aspects such as Unicode compliance, legacy code page support for Enterprise Application Integration (EAI), and infrastructure for multi-lingual web sites. We will demonstrate how business to consumer (B2C), business to business (B2B), and enterprise portal scenarios with partners located across the world can be implemented.
These products can easily be deployed worldwide and satisfy the needs of international customers. At the end of the talk we will preview future globalization features (MUI and more languages) in the coming releases.
Going Global with Microsoft E-Business Servers
(1305KB PowerPoint presentation)
Laurence Kancherla, Lead Program Manager
Gwyneth Marshall, International Program Manager
Vincent Célié, Program Manager