Using different language formats in Notepad

Notepad allows you to create and open documents in several different formats: ANSI, Unicode, big-endian Unicode, or UTF-8. These formats allow you to work with documents that use different character sets.

By default, Notepad saves your documents as standard ANSI text, which uses the Windows code page that matches your system's language settings.
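To see why a format other than ANSI is sometimes needed, the sketch below checks whether a piece of text fits in a single code page. It assumes the Western European code page (named cp1252 in Python); the code page on your own system may differ, and the helper name fits_in_ansi is made up for this illustration.

```python
def fits_in_ansi(text: str, code_page: str = "cp1252") -> bool:
    """Return True if every character of text exists in the given code page.

    cp1252 is only an assumed default; other systems use other code pages.
    """
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        # At least one character has no byte value in this code page.
        return False


print(fits_in_ansi("café"))   # Latin letters with accents are in cp1252
print(fits_in_ansi("שלום"))   # Hebrew letters are not
```

Text that fails this check would lose characters if saved as ANSI on such a system, so one of the Unicode formats is the safer choice for it.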

Unicode is a character set that covers all the major scripts of the world, including the character sets common to business and computer use. When you save a document in Unicode, you can use Unicode control characters to help with text flow and direction for languages such as Arabic and Hebrew.

Some fonts cannot display all of the Unicode characters. If any characters appear to be missing from your text file, you can change to a font that includes them. Generally, Microsoft Sans Serif is a good choice for Unicode characters.

The bytes (units of storage) in each 16-bit word of a Unicode document created on a big-endian processor, such as the PowerPC processors in older Macintosh computers, are arranged in the opposite order from the bytes in a document created on an Intel processor. In big-endian order, the most significant byte has the lowest address, so the word is stored big end first. To make your documents accessible to users on these types of computers, save your Notepad file in the big-endian Unicode format.
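The difference in byte order can be shown with a short Python sketch. It assumes that the Unicode and big-endian Unicode formats correspond to Python's utf-16-le and utf-16-be codecs and that the file starts with a byte order mark; the helper name to_big_endian_unicode is made up for this illustration.

```python
import codecs

text = "A"  # code point U+0041

# The same 16-bit word, stored in the two byte orders.
little = text.encode("utf-16-le")  # least significant byte first (Intel order)
big = text.encode("utf-16-be")     # most significant byte first
print(little.hex())
print(big.hex())


def to_big_endian_unicode(s: str) -> bytes:
    """Encode s big end first, preceded by the big-endian byte order mark."""
    return codecs.BOM_UTF16_BE + s.encode("utf-16-be")


data = to_big_endian_unicode("Notepad")
print(data[:2].hex())        # the byte order mark itself
print(data.decode("utf-16"))  # the generic codec reads the mark to pick the order
```

The byte order mark at the start is what lets a reader on either kind of processor work out which order the rest of the file uses.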

UTF stands for Universal Character Set Transformation Format. UTF-8 is the form of Unicode built from 8-bit units: each character is stored as one to four bytes, and plain ASCII characters take one byte each. Save your document in UTF-8 if you are using older transmission media that support only 8 bits of significant data within individual bytes.
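As a rough illustration of how UTF-8 stores characters in 8-bit units, the following Python sketch counts the bytes used for a few sample characters. The sample characters are arbitrary choices for this example.

```python
# Sample characters from different parts of Unicode (arbitrary picks).
samples = ["A", "é", "€", "😀"]

# Number of 8-bit bytes UTF-8 needs for each character.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}
for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} takes {n} byte(s)")

# Text made only of ASCII characters is byte-for-byte the same in UTF-8.
same_as_ascii = "Notepad".encode("utf-8") == "Notepad".encode("ascii")
print(same_as_ascii)
```

Because every unit is a single byte, UTF-8 has no byte order to choose, unlike the two 16-bit Unicode formats.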

See the Unicode Consortium Web site for more information on these formats.

Note

Web addresses can change, so you might be unable to connect to the Web site or sites mentioned here.

Related Topics

Choosing a program to write a document

Change the font style and size



© 2015 Microsoft Corporation. All rights reserved.