The @dir attribute
Bidirectional text is text that contains runs in both directions: right-to-left (RTL) and left-to-right (LTR). For example, languages such as Arabic, Hebrew, Farsi, Urdu, and Yiddish are written from right to left; however, numbers and embedded sections of Western-language text are written from left to right. Some multilingual documents also contain a mixture of text segments in two directions.
DITA contains the following attributes that have an effect on bidirectional text processing:
- @xml:lang: Identifies the language and locale, and so can be used to identify text that requires bidirectional rendering.
- @dir: Identifies or overrides the text directionality. It can be set to "ltr", "rtl", "lro", or "rlo".
In general, properly written mixed text does not need any special markers; the Unicode bidirectional algorithm positions the punctuation correctly for a given language. The processor is responsible for displaying the text properly. However, some rendering systems might need explicit directions to display bidirectional text, such as Arabic, properly. For example, Apache FOP might not render Arabic properly unless the left-to-right and right-to-left indicators are used.
The use of the @dir attribute and the Unicode algorithm is explained in the article Specifying the direction of text and tables: the dir attribute (http://www.w3.org/TR/html4/struct/dirlang.html#h-8.2). This article contains several examples of how to use the @dir attribute set to either "ltr" or "rtl". There is no example of setting the @dir attribute to either "lro" or "rlo", although it can be inferred from the example that uses the <bdo> element, a now-deprecated W3C mechanism for overriding the entire Unicode bidirectional algorithm.
Recommended usage
The @dir attribute, together with the @xml:lang attribute, is essential for rendering table columns and definition lists in the proper order.
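As a minimal sketch of this, the following hypothetical topic sets both attributes on its root element and contains a two-column simple table. The @id value and the Hebrew cell content are illustrative assumptions, not taken from the specification. A processor that honors the attributes would be expected to lay out the first column on the right-hand side.

```xml
<!-- Hypothetical topic: the id and the content are illustrative only -->
<topic id="rtl-table-example" xml:lang="he" dir="rtl">
  <title>דוגמה</title>
  <body>
    <simpletable>
      <sthead>
        <stentry>שם</stentry>
        <stentry>ערך</stentry>
      </sthead>
      <strow>
        <stentry>שלום</stentry>
        <stentry>123</stentry>
      </strow>
    </simpletable>
  </body>
</topic>
```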
In general text, the Unicode Bidirectional algorithm, as specified by the @xml:lang attribute together with the @dir attribute, provides for various levels of bidirectionality:
- Directionality is either explicitly specified via the @xml:lang attribute in combination with the @dir attribute on the highest level element (topic or derived peer for topics, map for ditamaps) or assumed by the processing application. If used, the @dir attribute SHOULD be specified on the highest level element in the topic or document element of the map.
- When embedding a right-to-left text run inside a left-to-right text run (or vice-versa), the default direction might provide incorrect results based on the rendering mechanism, especially if the embedded text run includes punctuation that is located at one end of the embedded text run. Unicode defines spaces and punctuation as having neutral directionality and defines directionality for these neutral characters when they appear between characters having a strong directionality (most characters that are not spaces or punctuation). While the default direction is often sufficient to determine the correct directionality of the language, sometimes it renders the characters incorrectly (for example, a question mark at the end of a Hebrew question might appear at the beginning of the question instead of at the end, or a parenthesis might render incorrectly). To control this behavior, the @dir attribute is set to "ltr" or "rtl" as needed, to ensure that the desired direction is applied to the characters that have neutral bidirectionality. The "ltr" and "rtl" values override only the neutral characters (for example, spaces and punctuation), not all Unicode characters.
Note: Problems with Unicode rendering can be caused by the rendering mechanism. The problems are not due to the XML markup itself.
- Sometimes you might want to override the default directionality for characters that have strong directionality. Overrides are done using the "lro" and "rlo" values, which override the Unicode Bidirectional algorithm. This override forces a direction on the contents of the element. These override values give the author a brute-force way of setting the directionality independent of the Unicode Bidirectional algorithm. The gentler "ltr" and "rtl" values have a less radical effect, affecting only punctuation and other so-called neutral characters.
For most authoring needs, the "ltr" and "rtl" values are sufficient. Use the override values only when you cannot achieve the desired effect using the "ltr" and "rtl" values.
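The levels described above can be sketched in a single hypothetical topic. The @id value, the sample text, and the choice of the <ph> element as the carrier for @dir are assumptions made for illustration. The rendering described in the comments is what a conforming processor would be expected to produce, not a guaranteed result.

```xml
<!-- Level 1: base direction and language set on the root element -->
<topic id="bidi-levels-example" xml:lang="en" dir="ltr">
  <title>Bidirectional text example</title>
  <body>
    <!-- Level 2: an embedded right-to-left run. Setting dir="rtl" on the
         phrase is intended to keep the neutral question mark at the end
         of the Hebrew question rather than at its beginning. -->
    <p>The question <ph xml:lang="he" dir="rtl">מה שלומך?</ph> means "How are you?"</p>
    <!-- Level 3: a brute-force override. With dir="rlo", even strongly
         left-to-right characters are forced right-to-left, so this run
         would typically appear in reversed visual order. -->
    <p>Forced order: <ph dir="rlo">ABC 123</ph></p>
  </body>
</topic>
```

The second paragraph uses the gentler "rtl" value because only the punctuation needs help; the third paragraph shows the override value, which is rarely needed in ordinary authoring.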
Processing expectations
Applications that process DITA documents, whether at the authoring, translation, publishing, or any other stage, SHOULD fully support the Unicode bidirectional algorithm to correctly implement the script and directionality for each language that is used in the document.
Applications SHOULD ensure that the root element in every topic document and the root element in the root map have values for the @dir and @xml:lang attributes.
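As an illustration of this expectation, a root map and one of its topics might carry the two attributes as sketched below. The file name intro.dita, the @id value, and the Arabic titles are placeholders assumed for this example.

```xml
<!-- Hypothetical root map: both attributes are set on the root element -->
<map xml:lang="ar" dir="rtl">
  <title>مرحبا</title>
  <topicref href="intro.dita"/>
</map>
```

```xml
<!-- Hypothetical content of intro.dita: the topic root carries the same attributes -->
<topic id="intro" xml:lang="ar" dir="rtl">
  <title>مرحبا</title>
  <body>
    <p>مرحبا</p>
  </body>
</topic>
```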