Developing OpenType Fonts for Arabic Script. 2/8/2018.

This document presents information that will help font developers create or support OpenType fonts for all Arabic script languages covered by the Unicode Standard.

Introduction

Font developers will learn how to encode complex script features in their fonts, choose character sets, organize font information, and use existing tools to produce Arabic fonts.
Registered features of the Arabic script are defined and illustrated, encodings are listed, and templates are included for compiling Arabic layout tables for OpenType fonts. This document also presents information about the Arabic OpenType shaping engine of Uniscribe, the Windows component responsible for text layout.
In addition to being a primer and specification for the creation and support of Arabic fonts, this document is intended to illustrate more broadly the OpenType Layout architecture, feature schemes, and operating system support for shaping and positioning text.

Glossary

The following terms are useful for understanding the layout features and script rules discussed in this document.

Base Glyph - Any glyph that can have a diacritic mark above or below it. Layout operations are defined in terms of a base glyph, not a base character, as a ligature may act as the base.

Character - Each character represents a Unicode character code point. For example, the 'lam' character is U+0644. A character may have multiple glyph forms.

Diacritic Marks - Characters positioned above or below another character to provide pronunciation guidance (e.g., acute accent, grave accent, tilde).

Glyph - A glyph represents a form of one or more characters. For example, the final, initial and medial 'lam' glyphs (U+FEDE, U+FEDF and U+FEE0) are all forms of the 'lam' character (U+0644).

Kashida - Also known as the 'tatweel' character (U+0640). This character is used for elongation between connecting characters and is used for justification.
Ligature - A combination of glyphs that join to form a single glyph. For example, the 'lam alef' combinations of glyphs are mandatory ligatures for Arabic. Other ligatures, like 'lam meem initial', are optional.

Shaping Engine

The Uniscribe Arabic shaping engine processes text in stages. The stages are:

1. Analyzing the characters for contextual shape.
2. Shaping (substituting) glyphs with OTLS (OpenType Library Services).
3. Positioning glyphs with OTLS.

The descriptions which follow will help font developers understand the rationale for the Arabic feature encoding model, and help application developers better understand how layout clients can divide responsibilities with operating system functions.

Analyzing the Characters

The unit that the shaping engine receives for the purpose of shaping is a string of Unicode characters, in a sequence. The contextual analysis engine determines the correct contextual form each character should take, based on the characters before and after it. The contextual shape maps to an OTL feature for that form (isol, init, medi, fina).
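The contextual analysis described above can be sketched in a few lines of Python. This is a simplified illustration, not Uniscribe's actual implementation: the `DUAL` and `RIGHT` sets are a small hand-picked subset of Unicode joining types, and the function names are hypothetical. Combining marks are skipped when looking for neighbours, since they do not interrupt joining.

```python
import unicodedata
from typing import List, Optional

# Assumption: a tiny illustrative subset of joining types, not a full table.
DUAL = set("\u0628\u062A\u0644\u0645\u064A\u0646\u0633")  # beh, teh, lam, meem, yeh, noon, seen
RIGHT = set("\u0627\u062F\u0631\u0648")                   # alef, dal, reh, waw (join only to the previous letter)
TATWEEL = "\u0640"


def joins_to_next(ch: str) -> bool:
    """True if ch can connect to the letter that follows it."""
    return ch in DUAL or ch == TATWEEL


def joins_to_prev(ch: str) -> bool:
    """True if ch can connect to the letter that precedes it."""
    return ch in DUAL or ch in RIGHT or ch == TATWEEL


def neighbour(text: str, i: int, step: int) -> Optional[str]:
    """Nearest character in the given direction, skipping combining marks."""
    j = i + step
    while 0 <= j < len(text) and unicodedata.combining(text[j]):
        j += step
    return text[j] if 0 <= j < len(text) else None


def contextual_forms(text: str) -> List[Optional[str]]:
    """Return isol/init/medi/fina per letter; None for marks and other characters."""
    forms: List[Optional[str]] = []
    for i, ch in enumerate(text):
        if ch not in DUAL and ch not in RIGHT:
            forms.append(None)
            continue
        prev, nxt = neighbour(text, i, -1), neighbour(text, i, +1)
        from_prev = prev is not None and joins_to_next(prev)
        to_next = ch in DUAL and nxt is not None and joins_to_prev(nxt)
        if from_prev and to_next:
            forms.append("medi")
        elif from_prev:
            forms.append("fina")
        elif to_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms
```

For the sequence lam, alef, meem (U+0644 U+0627 U+0645), the alef takes the final form and, because alef does not join forward, the meem that follows it stays isolated.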
Additionally, during the analysis process, the engine verifies valid diacritic combinations. For additional information, see the section.

Shaping with OTLS

The first step Uniscribe takes in shaping the character string is to map all characters to their nominal form glyphs (e.g., the glyph for U+0627). Then, Uniscribe applies contextual shape features to the glyph string. Next, Uniscribe calls OTLS to apply the features. All OTL processing is divided into a set of predefined features (described and illustrated in the Features section of this document).
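As a rough sketch of this sequence, the Python below maps characters to nominal glyphs and then makes one pass over the glyph run per feature. Everything here is an assumption made for illustration: the glyph names, the `RLIG` table, the `FEATURE_ORDER` list, and the function names are hypothetical, and none of this is the OTLS API.

```python
from typing import Dict, List, Tuple

# Hypothetical required-ligature table: (first glyph, second glyph) -> ligature glyph.
RLIG: Dict[Tuple[str, ...], str] = {
    ("uni0644.init", "uni0627.fina"): "lam_alef.isol",
}

# Assumed feature order for this sketch only.
FEATURE_ORDER = ["isol", "fina", "medi", "init", "rlig"]


def to_nominal_glyphs(text: str) -> List[str]:
    """Stand-in for a cmap lookup: name each glyph after its code point."""
    return [f"uni{ord(ch):04X}" for ch in text]


def apply_feature(tag: str, glyphs: List[str], forms: List[str]) -> Tuple[List[str], List[str]]:
    """Apply a single feature to the whole glyph run (one call per feature)."""
    if tag != "rlig":
        # Contextual form feature: substitute only glyphs flagged with this form.
        return [f"{g}.{tag}" if f == tag else g for g, f in zip(glyphs, forms)], forms
    out_glyphs: List[str] = []
    out_forms: List[str] = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in RLIG:
            out_glyphs.append(RLIG[pair])
            out_forms.append("liga")
            i += 2
        else:
            out_glyphs.append(glyphs[i])
            out_forms.append(forms[i])
            i += 1
    return out_glyphs, out_forms


def shape(text: str, forms: List[str]) -> List[str]:
    """Nominal glyphs first, then every feature in a fixed order."""
    glyphs = to_nominal_glyphs(text)
    for tag in FEATURE_ORDER:
        glyphs, forms = apply_feature(tag, glyphs, forms)
    return glyphs
```

Calling `shape("\u0644\u0627", ["init", "fina"])` first produces the initial lam and final alef glyphs and only then forms the ligature. If the ligature pass ran before the form passes, its input pair would not match, which is why a fixed order matters.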
Each feature is applied, one by one, to the appropriate glyphs in the syllable, and OTLS processes them. Uniscribe makes as many calls to the OTL Services as there are features. This ensures that the features are executed in the desired order. The steps of the shaping process are outlined below. Not all of the features listed apply to all Arabic script languages.

Appendix A: Writing System Tags

Features are encoded according to both a designated script and language system.
The language system tag specifies a typographic convention associated with a language or linguistic subgroup. For example, there are different language systems defined for the Arabic script: Arabic, Baluchi, Ladakhi, Pashto, etc. Other typographic systems could be defined for Moroccan Arabic or the Wahabi tradition of Qur'anic typography. Currently, the Uniscribe engine only supports the 'default' language for each script. However, font developers may want to build language-specific features, which are supported in other applications and will be supported in future Microsoft OpenType implementations.

NOTE: It is strongly recommended to include the 'dflt' language tag in all OpenType fonts because it defines the basic script handling for a font. The 'dflt' language system is used as the default if no other language-specific features are defined or if the application does not support that particular language.
If the 'dflt' tag is not present for the script being used, the font may not work in some applications. The following tables list the registered tag names for scripts and language systems.
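The 'dflt' fallback described in the note can be illustrated with a short lookup. This is a minimal sketch under stated assumptions: the nested dictionary is a hypothetical stand-in for a font's script and language system tables, and `select_features` is not a real API. The tags 'arab', 'URD ' and 'SND ' are used only as sample values.

```python
from typing import Dict, List, Optional

# Hypothetical layout data: script tag -> language system tag -> feature tags.
FontLayout = Dict[str, Dict[str, List[str]]]

SAMPLE_FONT: FontLayout = {
    "arab": {
        "dflt": ["isol", "init", "medi", "fina", "rlig"],
        "URD ": ["isol", "init", "medi", "fina", "rlig", "locl"],
    },
}


def select_features(layout: FontLayout, script: str, language: str) -> Optional[List[str]]:
    """Pick the requested language system, falling back to 'dflt' for the script."""
    language_systems = layout.get(script)
    if language_systems is None:
        return None  # the font does not cover this script at all
    if language in language_systems:
        return language_systems[language]
    # No language-specific entry: use the default, if the font provides one.
    return language_systems.get("dflt")
```

A request for a language the font does not list, such as `select_features(SAMPLE_FONT, "arab", "SND ")`, receives the 'dflt' feature list. If the font had no 'dflt' entry, the same request would return `None`, which mirrors the warning that such a font may not work in some applications.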
Script and Font Support in Windows

Since before Windows 2000, text-display support for new scripts has been added in each major release of Windows. This article describes changes made in each major release. Note that support for a script may require certain changes to text stack components as well as changes to fonts. The Windows operating system has many text stack components: DirectWrite, GDI, Uniscribe, GDI+, WPF, RichEdit, ComCtl32, and others.
The information provided here pertains primarily to GDI and DirectWrite. It is also generally applicable to UI frameworks such as RichEdit or the MSHTML rendering agent used for Windows apps and for rendering Web content, though those components may exhibit certain differences. Comments on language usage are included in cases in which associations between scripts and languages may not be well known.
The list of languages for any given script is not necessarily exhaustive.

Windows 10

Windows 10 converges the Windows platform for use across multiple device categories. The description above of previous releases applies to Windows Client (desktop) and Server editions. This section on Windows 10 covers all Windows 10 editions, including Desktop, Server and Mobile. All Windows 10 editions support the same set of scripts.
In addition to the scripts supported in earlier Windows releases, Windows 10 adds support for several additional, historic scripts.
The SIL Arabic script fonts are encoded according to Unicode, so your application must support Unicode text in order to access letters other than the standard ANSI characters. Most applications now provide basic Unicode support. You will, however, need some way of entering Unicode text into your document.

Keyboarding

The Arabic script font packages do not include any keyboarding helps or utilities.
If you cannot use the built-in keyboards of the operating system, you will need to install the appropriate keyboard and input method for the characters of the language you wish to use. If you want to enter characters that are not supported by any system keyboard, the can be helpful on Windows systems. Also available for Windows is. For other platforms, or can be helpful.
If you want to enter characters that are not supported by any system keyboard, and to access the full Unicode range, we suggest you use gucharmap or kcharselect on Ubuntu, or similar software. Another method of entering some symbols is provided by a few applications such as Adobe InDesign. They can display a glyph palette that shows all the glyphs (symbols) in a font and allow you to enter them by clicking on the glyph you want.
Rendering

SIL’s Arabic script fonts are designed to work with two advanced font technologies, Graphite and OpenType. To take advantage of the advanced typographic capabilities of these fonts, you must be using applications that provide an adequate level of support for Graphite or OpenType.
Conversion

One common type of data conversion is from Roman script to Arabic script. Cross-script conversion is often very language specific. TECkit is one program that can be used for character encoding conversion. TECkit allows users to write their own custom conversion mappings. The TECkit package is available for download from SIL’s Web site. The software will be an important tool in data conversion.
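To make the idea of a custom conversion mapping concrete, here is a small Python sketch of a table-driven Roman-to-Arabic conversion. It is not TECkit and does not use TECkit's mapping syntax; the `MAPPING` table is a made-up toy example, and a real mapping would have to be designed for the specific language and orthography.

```python
from typing import Dict

# Hypothetical toy mapping; real cross-script mappings are language specific.
MAPPING: Dict[str, str] = {
    "sh": "\u0634",  # sheen
    "s": "\u0633",   # seen
    "b": "\u0628",   # beh
    "a": "\u0627",   # alef
    "l": "\u0644",   # lam
    "m": "\u0645",   # meem
}


def convert(text: str, mapping: Dict[str, str] = MAPPING) -> str:
    """Replace Roman sequences using the longest matching key at each position."""
    keys = sorted(mapping, key=len, reverse=True)  # try "sh" before "s"
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(mapping[key])
                i += len(key)
                break
        else:
            # No rule applies: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Trying longer keys first keeps a digraph such as "sh" from being split into two single-letter rules. The output is plain Unicode text in logical order, so contextual shaping is still left to the font and rendering engine.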
1 INTRODUCTION

Like all multi-lingual computing, Arabic computing is now firmly in the domain of Unicode. Unicode is an industrial protocol with the status of international agreement. It is designed to encode the elements of all known script systems in such a way that they become interchangeable between programs and operating systems.
Its implementation is well underway. Unicode eliminates the need to tamper with fonts to get special characters, but it is not a font. For legible text on screen and paper, Unicode depends on compatible fonts with the required characters, where necessary with additional dedicated font technology.