
 


White Paper - (Microsoft Windows Vista Arabic Search)


Executive Summary:


We live in an age where nearly everything is digital. Documents, music, video, photos, and even daily correspondence (including e-mail, faxes, and voice mail) are increasingly created, stored, and accessed in electronic form on personal computers. This fact, together with the huge increase in hard disk storage capacity, has made it increasingly difficult to stay on top of the information stored on our PCs. The enhanced desktop search and organization features in all Windows Vista editions help you readily locate files, e-mail messages, and other items on your PC. If you remember anything about a file, Windows Vista can find it for you quickly. Windows Vista goes beyond desktop search: it can also help you "see" your files in multiple ways. Want to see all of your documents arranged by date? How about by author? No problem. The system can auto-organize your content using basic properties that are often automatically saved with your files. Even better, you can tag your files with relevant properties, enabling the system to bring together your documents, photos, music, and videos in whatever way you think about them.




Overview & Scope


The Arabic search functionality in Windows Vista is enhanced by an Arabic language-specific word breaker, a stemmer, and a Named Entity (NE) detection tool, which together increase the relevance of search results. Word breakers are an essential part of any search engine, since they define the elements of a search query that will be matched against the document index. Many search engines use the simple language-neutral technique of breaking on white space, which is insufficient. The new Arabic word breaker in Vista, which benefits from linguistic and statistical information, will significantly enhance the user's search experience.
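To illustrate why breaking on white space alone is insufficient, here is a minimal sketch in Python. It is not the Vista word breaker or stemmer; the prefix list and the `light_stem` helper are hypothetical and deliberately naive. It only shows that a word written with the definite article attached ("المشروع", "the project") is missed by a white-space split but found once common prefixes are removed.

```python
# Hypothetical, deliberately naive prefix list (longest first); an assumption
# for illustration, not the rule set used by Windows Vista.
PREFIXES = ["وبال", "وال", "بال", "ال"]

def light_stem(word):
    """Strip one common attached prefix if at least three letters remain."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            return word[len(prefix):]
    return word

# "I read about the project"
tokens = "قرأت عن المشروع".split()
stems = [light_stem(token) for token in tokens]

print("مشروع" in tokens)  # white-space tokens only
print("مشروع" in stems)   # after naive prefix stripping
```

A real stemmer would rely on linguistic and statistical information, as the text above notes, rather than on a fixed prefix list.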




Goals


The main goal of the Arabic language-specific word breaker is to extend language coverage and improve word breaking behavior, and so improve the search experience. The new word breaker will improve the user experience when using Arabic in a search context. In Vista the search engine starts showing results on each keystroke. As a result, the typed string is effectively prefix matched (wild-carded), so that the search returns any words that begin with the character or characters typed so far. The effect should be that, as you type, the number of matching items is reduced (although, depending on how the typed string is word-broken, the returned result count could in reality go up or down).
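The keystroke-by-keystroke prefix matching described above can be sketched as follows. This is an illustration only: the `index` dictionary, the file names, and the `prefix_search` function are assumptions made for the example, not part of the Windows Search API.

```python
def prefix_search(index, query):
    """Return items in which every query term is a prefix of some indexed word."""
    terms = query.split()
    return sorted(
        name
        for name, words in index.items()
        if all(any(word.startswith(term) for word in words) for term in terms)
    )

# Hypothetical index: item name -> words emitted by a word breaker.
index = {
    "report.docx": ["مصر", "مشروع"],
    "notes.txt": ["مصنع", "مدينة"],
    "photo.jpg": ["منزل"],
}

# Simulate typing one character at a time.
for typed in ["م", "مص", "مصر"]:
    print(typed, prefix_search(index, typed))
```

With this sample index the result list shrinks from three items to two and then to one as more characters are typed.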




Breaking and Non-Breaking Characters


The determination of word breaking characters is essential, as it establishes which characters will be treated as word separators. Breaking characters include white space characters, punctuation marks, quotation marks, parentheses, symbols, and more. Any character that is not explicitly listed as a breaking character is not a breaking character. One way of categorizing word breaking characters from a linguistic point of view is to assign them to two main groups:

  • Special Cases

    There are many special cases in word breaking which override standard word breaking behavior. These special cases typically arise because a normally word breaking character does not break words in certain contexts, or because a particular language uses a punctuation token or a special symbol in a way that combines with the form of a word and therefore requires special treatment. Common examples include abbreviations and acronyms.

  • Named Entities

    Named Entities are sequences of tokens which we want to recognize as a single token and link to a standardized format. Some of these sequences may contain normally breaking characters. Using Named Entities enables us to identify different representations of the same information as equivalent, thus extending search coverage. Named Entities include numbers, currencies, times, dates, e-mail addresses, URLs, file paths, and file names.
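The interaction between breaking characters, special cases, and Named Entities can be sketched with a small regular-expression tokenizer. This is a simplified illustration, not the Vista implementation: the `SPECIAL_WORDS` list and the e-mail and URL patterns are assumptions, and the sketch defines word characters rather than listing breaking characters. It covers only two Named Entity types and does not normalize them to a standardized format.

```python
import re

# Hypothetical special-case list: terms kept whole although they contain
# a character that normally breaks words.
SPECIAL_WORDS = ["C#"]

# Simplified patterns for two Named Entity types.
URL = r"https?://[^\s\"<>]+"
EMAIL = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"

# A plain word: letters, digits, underscore, plus Arabic diacritics
# (U+064B to U+0652), which \w alone does not match.
WORD = r"[\w\u064B-\u0652]+"

# Alternatives are tried in order, so entities and special cases win over
# plain words; every other character acts as a breaking character.
TOKEN = re.compile(
    "|".join([URL, EMAIL] + [re.escape(w) for w in SPECIAL_WORDS] + [WORD])
)

def break_words(text):
    """Split text into tokens, keeping entities and special words whole."""
    return TOKEN.findall(text)

print(break_words("Send C# notes to user@example.com today"))
print(break_words("مشروع مصر"))
```

Placing the entity and special-word patterns before the generic word pattern is what lets "C#" and the e-mail address survive as single tokens.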




Additional Features for Search


This section groups together a number of additional features related to the Vista Arabic search engine. These features include:

Pass-through Feature: when a query is enclosed in quotation marks, the word or words in the search query are matched without change against the index.

Special Word List: the word breaker has rules to ensure that phrases containing characters that are normally breaking characters (e.g., "#") are not broken in some frequent lexical contexts (e.g., "C#").

Diacritics: the Arabic word breaker and search engine preserve diacritics, emitting the form with the diacritic. Diacritics are marks added to a letter or phoneme to indicate a special phonetic value. Diacritics distinguish words that are otherwise graphically identical, such as "اليُمنُ" and "لُبنانِ". For some languages the index is diacritically sensitive by default, and for other languages it is not. For Arabic, sensitivity must be explicitly configured. When the index is configured to "treat similar words with diacritics as different words" (معاملة الكلمات المتشابهة بعلامات تشكيل على أنها كلمات مختلفة), i.e. to be "diacritically sensitive", a search for "لبنانِ" will not return items that contain the word "لُبنانِ" (and vice versa). Conversely, if the index is configured to be diacritically insensitive (the default in English and in Arabic builds), then a search for "لبنانِ" will return items that contain the word "لُبنانِ". This setting is configurable from the advanced options of Indexing Options in Control Panel, and changing it requires a rebuild of the index (Figure 1).


Figure 1: Advanced Indexing Options
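The difference between a diacritically sensitive and a diacritically insensitive index can be sketched as below. This is an illustration under stated assumptions: the `strip_diacritics` and `matches` helpers are hypothetical, and the diacritic set is limited to the Arabic marks U+064B through U+0652 (fathatan to sukun). It does not describe how Windows Search stores or compares terms internally.

```python
# Arabic short-vowel and related marks, U+064B (fathatan) to U+0652 (sukun).
ARABIC_DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)}

def strip_diacritics(text):
    """Remove Arabic diacritics, leaving the base letters untouched."""
    return "".join(ch for ch in text if ch not in ARABIC_DIACRITICS)

def matches(query, word, diacritic_sensitive=False):
    """Compare a query with an indexed word under the chosen setting."""
    if diacritic_sensitive:
        return query == word
    return strip_diacritics(query) == strip_diacritics(word)

query = "\u0644\u0628\u0646\u0627\u0646\u0650"          # "لبنانِ"
indexed = "\u0644\u064F\u0628\u0646\u0627\u0646\u0650"  # "لُبنانِ"

print(matches(query, indexed))                            # insensitive (default)
print(matches(query, indexed, diacritic_sensitive=True))  # sensitive
```

In the insensitive case the two forms compare equal; in the sensitive case they do not, which mirrors the behavior described for the two index settings.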




Vista Search Engine Features


Windows Vista goes beyond desktop search. It can also help you "see" your files in multiple ways. Want to see all of your documents arranged by date? How about by author? No problem. The system can auto-organize your content using basic properties that are often automatically saved with your files. Even better, with Windows Photo Gallery and Windows Media Player 11 or with third-party applications, you can tag your files with relevant properties, enabling the system to bring together your documents, photos, music, and videos in whatever way you think about them.

  • Instant Search

    Instantly find what you need with Windows Vista which introduces the new Instant Search, an enhanced desktop search and organization tool that helps you locate files and e-mail messages on your PC. If you remember anything about a file (the type of file, when it was created, or even what it contains), Windows Vista can quickly find it for you. With Instant Search, you are never more than a few keystrokes away from whatever you're looking for. This feature, which is available almost anywhere you are in Windows Vista, enables you to search for a file name, a property, or even text contained within a file, and it returns pinpointed results. It's fast and easy. Instant Search is also contextual, optimizing its results based on your current activity whether it's searching Control Panel applets, looking for music files in Windows Media Player, or looking over all your files and applications on the Start menu.

  • Start-Menu Search

    With its "fast as you can type" search performance, the newly redesigned Start menu is your portal to virtually anything on your PC. To find a specific file, application, or Internet Favorite, just open the Start menu (or press the Windows key on the keyboard) and start typing in the embedded Instant Search box. As you type, Windows Vista instantly searches file and application names, metadata, and the full text of all files, and groups your results by category: Programs; Favorites/Internet History; Files, including documents and media; and Communications, including e-mail, events, tasks, and contacts.

    The screen shots below show the effect of typing on the displayed search results: Figure 2 shows the results after one character, "م", is typed; Figure 3 shows the results reduced after the second character, "مص"; and Figure 4 shows the results reduced further after the third character, "مصر" (Egypt).


    Figure 2: Start Menu – Instant Search 1



    Figure 3: Start Menu – Instant Search 2



    Figure 4: Start Menu – Instant Search 3


    The screen shots below show more features of the newly redesigned Start menu. Moving the mouse cursor over any item in the search results displays more information about that item (Figure 5), and right-clicking the item shows more details and available actions (Figure 6).


    Figure 5: Start Menu – Instant Search – Item Description



    Figure 6: Start Menu – Instant Search – Item properties


  • Windows Vista Explorer Showcases Search

    The new Windows Vista Explorer showcases Instant Search in the top-left corner. It's always with you when you're using the Documents Explorer, Music Explorer, Pictures Explorer, and the new Search Explorer. As in the Start menu, you only have to type a few letters before you start seeing the most relevant results. If the results aren't what you're looking for, you have easy access to tools that can help you refine your search or search across the Internet using your favorite search engine (Figure 7). For advanced search options, click "بحث متقدم" (Advanced Search) (Figure 8).

    Figure 7: Windows Vista Explorer Showcases Search


    Figure 8: Windows Vista Explorer Showcases Advanced Search


    Note: in advanced search mode, you can use the Hijri or Umm al-Qura calendar in the date search field.

  • Search Folders

    Windows Vista introduces Search Folders, a powerful new tool that makes it easy to find and organize your files. A Search Folder is simply a search that you save. Opening a Search Folder runs your saved search, displaying up-to-date results quickly. For example, you could design a search for all documents that are authored by John and that contain the word "مشروع" (project). You would save this search, titled "المشاريع" (Projects), as a Search Folder. When you open this Search Folder, the search runs and you see the results right away. As you add more files to your computer that contain the word "مشروع", those files will appear in the Search Folder alongside other matching files, no matter where you physically saved them on your PC. It's simple and fast. Being able to view content on your computer sorted into saved Search Folders adds a lot of flexibility to the ways you can work with your files. In addition, Windows Vista still supports traditional, location-based folders. Folders are useful because they foster easy migration from one computer to another, and because your existing programs would break without them. In Windows Vista, you'll still save content in folders, but it's easier to use those folders because of tools such as Instant Search and enhanced column header controls.

  • Organization

    Although the new desktop search capabilities in Windows Vista fulfill many search needs, they are not designed to address every information management need. For instance, they do not readily help you find collections of similar files, such as files from the same project or author, and then share those files with other people, organize them, or move them around on your hard disk. That's where the powerful Explorers extend the benefits of the new Windows Vista desktop search capabilities to the next level by combining Instant Search with the ability to auto-organize content across your PC based on file properties. Rather than having to remember specific locations or folder names to find your documents, music, pictures, and e-mail, you can rely on the ability of Windows Vista to search file properties known as "metadata."

  • Tagging your files

    Powerful new search and organization features in Windows Vista make extensive use of file properties, or metadata, to give you even more dynamic ways to interact with your information. Many of your files already contain useful metadata. For example, Microsoft Office automatically records certain document properties, such as author and date created. And music ripped from CDs often has properties such as song, album, and artist name. But Windows Vista also gives you ways to apply custom properties to your files. You can quickly and easily apply properties to any file or group of files in:

    Details Pane: The easiest way to add a property to a file is to select the file and change it in the Details Pane at the bottom of the Explorer. Many of the entry fields support AutoComplete, making it even easier to add properties, for one file or across many files. Selecting multiple files and adding a property via the Details Pane adds that property to all selected files.

    Properties window: You can still go to the familiar Properties window by right-clicking a file and selecting Properties. In the Details tab you have quick access to a file's metadata. One handy feature is the ability to remove all properties of a file with a single click, which can help you prepare a file for sharing with others by removing details such as the author's name.
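The Search Folder idea described above, a saved search over file properties that is re-run each time it is opened, can be sketched as follows. The `FileItem` and `SearchFolder` classes, the file paths, and the author "Sara" are hypothetical names introduced for illustration; the sketch follows the John / "مشروع" example from the text and is not how Windows Vista stores saved searches.

```python
from dataclasses import dataclass

@dataclass
class FileItem:
    path: str
    author: str   # a file property (metadata)
    text: str     # the file's full-text content

@dataclass
class SearchFolder:
    title: str
    author: str
    keyword: str

    def open(self, files):
        """Re-run the saved query against the current set of files."""
        return [
            f.path
            for f in files
            if f.author == self.author and self.keyword in f.text
        ]

files = [
    FileItem("C:/docs/plan.docx", "John", "خطة مشروع جديد"),
    FileItem("C:/work/memo.docx", "Sara", "مشروع قديم"),
]
folder = SearchFolder(title="المشاريع", author="John", keyword="مشروع")
print(folder.open(files))

# A new matching file saved elsewhere appears the next time the folder opens.
files.append(FileItem("D:/archive/report.docx", "John", "تقرير مشروع"))
print(folder.open(files))
```

Because the folder stores only the query, its contents depend on file properties and text rather than on where each file is physically saved.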




Additional Resources

To learn more about Microsoft Windows Vista Desktop Search Engine, please refer to the following list of related links for additional resources and information.




Disclaimer

This white paper will discuss the Arabic support in Windows Vista including the changes from Windows XP and the new added features related to the Arabic language.

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, email address, logo, person, place, or event is intended or should be inferred.

© 2007 Microsoft Corporation. All rights reserved.

Windows XP and Windows Vista are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

The names of actual companies and products mentioned herein may be the trademarks of their respective owners.





    Last updated: Tuesday, May 31, 2007

