The dir attribute
Bidirectional text processing is controlled by several factors:
- The xml:lang attribute may be used to identify text that requires bidirectional rendering. The Unicode Bidirectional algorithm provides the means to properly identify western content in mixed text.
- The dir attribute may be set on the root element, in combination with the xml:lang attribute. For example, to render Arabic text with embedded English content correctly in a web browser, set xml:lang="ar" and dir="rtl" on the root element. All text, including punctuation marks, is then set correctly.
- The dir attribute may be set to either "ltr" or "rtl" on an element in the document.
- The dir attribute may be set to either "lro" or "rlo" on an element in the document.
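The four values listed above can be checked mechanically. The following is a minimal Python sketch, not a definitive validator: the sample topic, the function name invalid_dir_values, and the decision to treat the values as case-sensitive are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The four values named in the list above. Treating them as case-sensitive
# is an assumption of this sketch.
ALLOWED_DIR_VALUES = {"ltr", "rtl", "lro", "rlo"}

# Hypothetical sample topic: Arabic root element with an embedded English phrase.
SAMPLE = """<topic id="sample" xml:lang="ar" dir="rtl">
  <title>عنوان</title>
  <body>
    <p>نص عربي <ph xml:lang="en" dir="ltr">English text</ph></p>
    <p dir="RTL">نص</p>
  </body>
</topic>"""


def invalid_dir_values(xml_text):
    """Return (tag, value) pairs for every dir value outside the allowed set."""
    root = ET.fromstring(xml_text)
    return [
        (elem.tag, elem.get("dir"))
        for elem in root.iter()  # iter() visits the root element as well
        if elem.get("dir") is not None
        and elem.get("dir") not in ALLOWED_DIR_VALUES
    ]


print(invalid_dir_values(SAMPLE))  # [('p', 'RTL')]
```

In this sample the root element and the embedded phrase pass, while the upper-case value on the second paragraph is reported.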
The Unicode bidirectional algorithm positions the punctuation correctly for a given language. The rendering mechanism is responsible for displaying the text properly.
The use of the dir attribute and the Unicode algorithm is explained in the article Specifying the direction of text and tables: the dir attribute (http://www.w3.org/TR/html4/struct/dirlang.html#adef-dir). This article contains several examples of how to use the dir attribute set to either left-to-right or right-to-left. There is no example of setting the dir attribute to either "lro" or "rlo", although the usage can be inferred from the example that uses the <bdo> element, a now-deprecated W3C mechanism for overriding the entire Unicode bidirectional algorithm.
Note that properly written mixed text does not need any special markers. The Unicode bidirectional algorithm is sufficient. However, some rendering systems may need directions for displaying bidirectional text, such as Arabic, properly. For example, the Apache FOP tool may not render Arabic properly unless the left-to-right and right-to-left indicators are used.
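One way a processor might supply explicit indicators to a renderer that ignores the dir attribute is to wrap the text run in Unicode directional formatting characters. The following is a sketch under that assumption only. The mapping follows the correspondence between dir values and these characters described in the HTML 4 article cited above; the function name wrap_with_controls is hypothetical, and whether a particular tool needs or honors these characters is not established here.

```python
# Unicode directional formatting characters.
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING
LRO = "\u202D"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE

# Assumed mapping from dir values to the character that opens the run.
DIR_TO_CONTROL = {"ltr": LRE, "rtl": RLE, "lro": LRO, "rlo": RLO}


def wrap_with_controls(text, dir_value):
    """Wrap text in the formatting characters that correspond to dir_value."""
    if dir_value not in DIR_TO_CONTROL:
        raise ValueError(f"unsupported dir value: {dir_value!r}")
    # Every embedding or override is closed by a single PDF character.
    return DIR_TO_CONTROL[dir_value] + text + PDF


wrapped = wrap_with_controls("English text", "ltr")
print([f"U+{ord(ch):04X}" for ch in (wrapped[0], wrapped[-1])])  # ['U+202A', 'U+202C']
```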
Recommended usage
The dir attribute, together with the xml:lang attribute, is essential for rendering table columns and definition lists (<dl>) in the proper order.
In general text, the Unicode Bidirectional algorithm, as specified by the xml:lang attribute together with the dir attribute, provides for various levels of bidirectionality, as follows:
- Directionality is either explicitly specified via the xml:lang attribute in combination with the dir attribute on the highest-level element (topic or derived peer for topics, map for ditamaps) or assumed by the processing application. If the dir attribute is used, it is recommended to specify it on the highest-level element of the topic or on the document element of the map.
- When embedding a right-to-left text run inside a left-to-right text run (or vice versa), the default direction may provide incorrect results based on the rendering mechanism, especially if the embedded text run includes punctuation that is located at one end of the embedded text run. Unicode defines spaces and punctuation as having neutral directionality and defines directionality for these neutral characters when they appear between characters having a strong directionality (most characters that are not spaces or punctuation). While the default direction is often sufficient to determine the correct directionality of the language, sometimes it renders the characters incorrectly (for example, a question mark at the end of a Hebrew question may appear at the beginning of the question instead of at the end or a parenthesis may render incorrectly). To control this behavior, the dir attribute is set to "ltr" or "rtl" as needed, to ensure that the desired direction is applied to the characters that have neutral bidirectionality. The "ltr" and "rtl" values override only the neutral characters (e.g. spaces and punctuation), not all Unicode characters.
note
Problems with Unicode rendering may be caused by the rendering mechanism. The problems are not due to the XML markup itself.
- Sometimes you may want to override the default directionality for characters that have strong directionality. Overrides are done using the "lro" and "rlo" values, which override the Unicode Bidirectional algorithm and force a direction on the contents of the element. These override values give the author a brute-force way of setting the directionality independent of the Unicode Bidirectional algorithm. The gentler "ltr" and "rtl" values have a less radical effect, affecting only punctuation and other so-called neutral characters.
For most authoring needs, the "ltr" and "rtl" values are sufficient. The override values should be used only when the desired effect cannot be achieved using "ltr" and "rtl".
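The distinction between strong and neutral characters can be inspected with the Unicode character database. The following minimal Python sketch uses the standard unicodedata module to list the bidirectional class of each character in a short Hebrew question; the function names and the sample string are illustrative assumptions, and the sketch does not run the bidirectional algorithm itself.

```python
import unicodedata

# Bidirectional classes for strong characters and for neutral characters.
STRONG_CLASSES = {"L", "R", "AL"}
NEUTRAL_CLASSES = {"B", "S", "WS", "ON"}


def bidi_class_sequence(text):
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


def neutral_characters(text):
    """Return the characters whose direction depends on their context."""
    return [ch for ch in text if unicodedata.bidirectional(ch) in NEUTRAL_CLASSES]


# Hypothetical sample: two Hebrew letters followed by a question mark,
# written with escapes so the source displays the same in any editor.
question = "\u05DE\u05D4?"
print(bidi_class_sequence(question))   # ['R', 'R', 'ON']
print(neutral_characters(question))    # ['?']
```

The trailing question mark has class ON (other neutral), so its position depends on the surrounding direction. This is the kind of character that the "ltr" and "rtl" values are described as affecting.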
Implementation precautions
Applications that process DITA documents, whether at the authoring, translation, publishing, or any other stage, should fully support the Unicode bidirectional algorithm to correctly implement the script and directionality for each language used in the document.
Applications should ensure that every highest-level topic element and the root map element explicitly set the dir attribute as well as the xml:lang attribute.
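A check of this kind might be sketched as follows. This is an illustration under stated assumptions, not a prescribed implementation: the function name missing_root_attributes is hypothetical, and the sketch inspects only the document element of a single topic or map passed in as a string.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def missing_root_attributes(xml_text):
    """Return the names of the recommended attributes absent from the root element."""
    root = ET.fromstring(xml_text)
    missing = []
    if root.get(XML_LANG) is None:
        missing.append("xml:lang")
    if root.get("dir") is None:
        missing.append("dir")
    return missing


# Hypothetical documents used only to exercise the check.
complete = '<topic id="t1" xml:lang="he" dir="rtl"><title>x</title></topic>'
partial = '<map xml:lang="en"/>'

print(missing_root_attributes(complete))  # []
print(missing_root_attributes(partial))   # ['dir']
```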