How did we end up looking uncivilized and half-illiterate?
Lack of standards, wrong standards, confusing workarounds and then a very slow adoption of good standards: no wonder the diacritics turned into a puzzle. Magazine headlines, television supers and advertisements obliviously use incorrect Romanian spelling, or non-Romanian letters. The situation is so bad that even on national banknotes the spelling is bastardized. If you know who designed this, please encourage the person(s) to quit design and pursue a career in agriculture:
What happened? How did we end up looking like idiots?
An introduction to diacritics
Romanian glyphs
If diacritics are understood as signs added to letters to alter their pronunciation or to distinguish between words, the Romanian alphabet does not have diacritics. There are, however, five special letters in the Romanian alphabet (two of them are associated with the same sound), formed by modifying other Latin letters. Strictly speaking they are not diacritics, but are generally referred to as such:
- Ă ă for the sound /ə/
- Â â for the sound /ɨ/
- Î î for the sound /ɨ/
- Ș ș for the sound /ʃ/
- Ț ț for the sound /ts/
Although we have only 5 diacritics (the Czech language has 15), we sometimes manage to get 3 out of 5 wrong. Most of the time we get 2 of those 3 wrong: Ș and Ț.
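The most common mistakes plaguing the written Romanian language are the following glyph substitutions:
- ã or ǎ instead of ă
- ş (S with cedilla) instead of ș (S with comma below)
- ţ (T with cedilla) instead of ț (T with comma below)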
While the first mistake is caused mainly by indolence, the second and the third have an epic story behind them and deserve a closer look.
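The substitutions this article complains about (cedilla letters in place of comma-below letters, and ã or ǎ in place of ă) can be detected and repaired mechanically. Below is a minimal Python sketch of that idea. The function name `fix_romanian_diacritics` and the mapping table are illustrative assumptions, not part of any standard, and the sketch assumes the input is known to be Romanian: ş and ã are perfectly legitimate letters in Turkish and Portuguese.

```python
import unicodedata

# Look-alike code points mapped to the letters Romanian actually uses.
# Assumption: the text is Romanian; do not run this on other languages.
_SUBSTITUTIONS = str.maketrans({
    "\u015e": "\u0218",  # S with cedilla -> S with comma below
    "\u015f": "\u0219",  # s with cedilla -> s with comma below
    "\u0162": "\u021a",  # T with cedilla -> T with comma below
    "\u0163": "\u021b",  # t with cedilla -> t with comma below
    "\u00c3": "\u0102",  # A with tilde   -> A with breve
    "\u00e3": "\u0103",  # a with tilde   -> a with breve
    "\u01cd": "\u0102",  # A with caron   -> A with breve
    "\u01ce": "\u0103",  # a with caron   -> a with breve
})


def fix_romanian_diacritics(text: str) -> str:
    """Replace common wrong glyphs with the correct Romanian letters."""
    # Compose first, so that "s" + combining cedilla is also caught.
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(_SUBSTITUTIONS)
```

For example, `fix_romanian_diacritics("Ştiinţã")` returns `"Știință"`.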
The story of Ș and Ț: an epic clusterfuck
When I say epic, I really mean exactly that. Imagine everything going atrociously wrong — for 20 years! Here is what happened.
- 1987 Romanian language is associated with ISO 8859‑2 (Latin 2) — the international standard stipulates S-cedilla and T-cedilla glyphs. Romanian officials are oblivious to the matter. Very, very bad.
- 1995 The Unicode consortium specifies, in version 1.1.5, code points U+015E (Latin Capital Letter S With cedilla), U+015F (Latin Small Letter S With cedilla), U+0162 (Latin Capital Letter T With cedilla), U+0163 (Latin Small Letter T With cedilla) as suitable for both Turkish and Romanian, and defines them as containing the cedilla accent. The Turkish language indeed uses the cedilla in U+015E and U+015F but makes no use of U+0162 or U+0163. The Romanian language doesn’t use any of them. Very bad.
- 1995 Windows 95 launches with no support for Romanian language by default. Support is available on CD-ROM Extras for Microsoft Windows 95 Upgrade. The typeface ILP Rumanian B100 substitutes Q/q with Ă/ă. Dark ages. Moronically bad.
- 1997 Apple’s Mac OS 7.6.1 honors Romanian S/s with comma below and T/t with comma below diacritics with MacRomanian (ten years before Microsoft). Interestingly enough, its tables do not resolve U+015E, U+015F, U+0162 or U+0163 (no S/s with cedilla and no T/t with cedilla) at all! Good.
- 1997 Adobe Glyph List (AGL 1.0 and 1.1) specifies “Tcommaaccent” and “tcommaaccent” instead of Tcedilla/tcedilla (no resolve for Scedilla and scedilla). The consequence of this decision is that Romanian documents using the (unofficial) Unicode code points U+015E/F and U+0162/3 (for Ș/ș and Ț/ț) are rendered in Adobe fonts in a visually inconsistent way, using S/s with cedilla and T/t with comma below. Good going bad…
- 1997 It takes ten years for ASRO to react. In 1997 the association complains to ISO about the S-cedilla and T-cedilla standardization requesting an amendment. Good.
- 1998 The revised version of ISO/IEC 8859‑2 (Latin 2) is ratified without the requested amendment. A note mentions that “the letters S and T with cedilla below may be used to substitute for the letters S and T with comma below”. Very bad.
- 1998 Adobe switches 0162/3 back to T/tcedilla and defines 0218/9 as S/scommaaccent and 021A/B as T/tcommaaccent, before Unicode’s 3.0 revision but after Apple’s Mac OS 7.6.1. Good.
- 1999 In its 3.0 release, the Unicode consortium adds the mappings U+0218 (Latin Capital Letter S With comma below), U+0219 (Latin Small Letter S With comma below), U+021A (Latin Capital Letter T With comma below), U+021B (Latin Small Letter T With comma below), and defines them as containing a “commaaccent”. Great.
- 1999 The Romanian Standards Association adopts SR 13411 standard that stipulates S/s-comma and T/t-comma as official Romanian letters. Good.
- 2001 ISO publishes ISO/IEC 8859-16 also known as Latin-10 or “South-Eastern European” incorporating Romanian SR 13411 standard, in spite of strong opposition from USA’s representatives and from Mr. J. W. van Wingen, Netherlands’ representative. Finally Romanian language’s standard form is also the correct one. Good.
- 2001 Microsoft Office v. X for Mac OS X is released crippled, without support for Unicode font display or input. Office documents with diacritics created on Windows won’t display properly on the Macintosh. Bad.
- 2001 Apple immediately aligns their OS X to ISO/IEC 8859-16. Good, but…
- 2001 Unfortunately, Mac OS X does not recognize the “*commaaccent” glyphnames that are defined by Adobe for Romanian and Baltic languages (such as Tcommaaccent, Rcommaaccent, Kcommaaccent, Ncommaaccent) but instead only recognizes the “*cedilla” names (T/tcedilla, R/rcedilla, K/kcedilla, N/ncedilla) or the “uni****” names (uni0162, uni0156, uni0136, uni0145). This means that Mac OS X will fail to recognize the glyphs T/tcommaaccent, R/rcommaaccent, K/kcommaaccent, N/ncommaaccent and map them to their respective Unicodes. [Adam Twardoch] Bad.
- 2001 Microsoft along with other software vendors disregards ISO/IEC 8859-16. Ugly.
- 2001 Microsoft Windows XP is launched. In order to correctly encode and render both S-comma and T-comma, one has to install the European Union Expansion Font Update. Unfortunately, there is no official way to add keyboard support for these characters. In order to type them, one has to either install 3rd party keyboards, or use the Character Map. Bad.
- 2003 Macromedia Freehand MX (11) is released without OpenType support. Bad.
- 2003 Adobe releases Creative Suite 1 applications with Unicode support. Designers are able to produce inter-platform Romanian typography without hacking fonts. Great.
- 2003 People protest against Microsoft practices — most notable is Mr. Cristian Secară with his 2003 open letter to Microsoft Romania (link in Romanian). Good.
- 2003 The dormant Linguistic Institute of the Romanian Academy finally honors the request concerning the exact form of the glyphs under letters S and T — says it must be a comma. Very late, still good.
- 2004 Microsoft Office 2004 for Mac is released with Unicode support. Good.
- 2007 Six years late and five months after Romania (and Bulgaria) joined the EU, Microsoft releases updated fonts that include all official glyphs of Romanian alphabet. This font update targeted Windows XP SP2, Windows Server 2003 and Windows Vista. Good, at last.
- 2007 Mac OS X ignores the glyph-to-Unicode mapping provided in the “cmap” table of OpenType PS (CFF/.otf) fonts, while it uses it for OpenType TT (.ttf) fonts. For OpenType PS fonts, Mac OS X uses the glyph-to-glyphname mapping provided in the font and then maps the glyphnames to Unicodes itself. Bad.
- 2007 The subset of Unicode most widely supported on Microsoft Windows systems, Windows Glyph List 4, still does not include the comma-below variants of S/s and T/t. Bad, as usual.
- 2008 Some OpenType fonts from Adobe and all C-series Vista fonts implement the optional OpenType feature GSUB/latn/ROM/locl. This feature forces S-cedilla to be rendered using the same glyph as S with comma below. When this second (but optional) remapping takes place, Romanian Unicode text is rendered with comma-below glyphs regardless of code point variants. Good.
- 2008 Very few Windows applications support the locl feature tag. From the Adobe CS3 suite, only InDesign has support for it. Bad.
- 2008 Apple updates iPhone OS to version 2.1, adding a Romanian keyboard and correct glyphs for Romanian diacritics. Good.
- 2008 Nokia phones still use incorrect S-cedilla and T-cedilla glyphs. Bad.
- 2013 Google fixes the Roboto font in Android 4.3. Android now has proper Romanian support in both keyboard and fonts [Cristian, reply no.: 87 and Mihai, reply no.: 89]. Good.
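The two families of code points in the timeline above are easy to mix up, so here is a small Python sketch that prints their official Unicode names side by side. It also checks one detail that, as far as Python's bundled codecs are concerned, helps explain the confusion: each comma-below letter in ISO/IEC 8859-16 sits at the same byte position that its cedilla look-alike occupies in ISO 8859-2. The helper names `describe` and `shared_byte` are made up for this illustration.

```python
import unicodedata

# Cedilla code points (Unicode 1.1) paired with the comma-below
# code points added in Unicode 3.0.
CEDILLA_TO_COMMA = {
    "\u015e": "\u0218",  # capital S
    "\u015f": "\u0219",  # small s
    "\u0162": "\u021a",  # capital T
    "\u0163": "\u021b",  # small t
}


def describe(ch: str) -> str:
    """Return 'U+XXXX OFFICIAL UNICODE NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def shared_byte(cedilla: str, comma: str):
    """Return the byte that encodes `cedilla` in ISO 8859-2 and `comma`
    in ISO 8859-16 if the two agree, otherwise None."""
    latin2 = cedilla.encode("iso8859_2")
    latin10 = comma.encode("iso8859_16")
    return latin2 if latin2 == latin10 else None


if __name__ == "__main__":
    for cedilla, comma in CEDILLA_TO_COMMA.items():
        byte = shared_byte(cedilla, comma)
        label = byte.hex() if byte is not None else "differs"
        print(f"{describe(cedilla)} | {describe(comma)} | byte: {label}")
```

A file saved as Latin-10 but opened as Latin-2 (or the other way round) therefore swaps commas and cedillas silently, with no decoding error to warn anyone.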
This is how the puzzle looks so far (if you have new pieces of this, please contribute — the comments are open).
But first, there’s room for some more bad news.
More bad news: the keyboards
Romanian keyboard layouts
The current Romanian National Standard SR 13392:2004 establishes two layouts for Romanian keyboards: a “primary” one and a “secondary” one.
The primary layout is intended for more traditional users that learned long ago how to type with older, Microsoft-style implementations of the Romanian keyboard. The secondary layout is mainly used by programmers and it doesn’t contradict the physical arrangement of keys on a US-style keyboard. The secondary arrangement is used as the default one by the majority of GNU/Linux distributions.
“Apple is indeed the only company which sells localized physical keyboards on the Romanian market, but must now re-locate the specific Romanian letters on the physical keyboard according to the Romanian standard. Sooner or later Apple must do that, but the sooner the better.” (Sorin Paliga, author of Romanian Keylayouts for MAC OS)
It turns out that the localized keyboards Apple ships to Romania — although functioning perfectly — are not standard compliant. And that’s not all.
Physical keyboard engraving
Even though Apple’s OS X was ahead of the diacritics adoption curve and Apple was the first hardware manufacturer to ship localized keyboards, those keyboards have had a glaring bug for years now: the Romanian keyboard is marked with the wrong glyphs!
Even though it works correctly, the S-comma key is engraved with S-cedilla and T-comma key is engraved with T-cedilla! Un-fșșțțing-believable!
I filed this with Apple’s bug tracker: bug ID 6287188.
And some good news: Romanian keyboard on iPhone
The new iPhone firmware 2.1 adds a Romanian keyboard with diacritics.
In order to use them, switch the Romanian Keyboard on (Settings → General → International → Keyboards → Romanian → On), then press the globe-key and you’ll notice the space bar reading “Spațiu” instead of “Space”. Then tap and hold one of the keys (A, I, S or T) and a row of additional letters will unfold, containing the diacritic marks.
Current status: embarrassment
Computers are supposed to be able to process text with ease, consistency and predictable output. In Romania — year 2008 — they’re still unable to accomplish this basic task.
Academic intelligentsia, when not oozing indolence, gets busy thinking of maddening spelling reforms. Local computer manufacturers happily crank out garage-quality boxes, completely oblivious to how those boxes are supposed to work. Foreign manufacturers list Romania under “others”. Microsoft does only what it’s best at: adds entropy via maligned standards, only to wrestle with its own mess later on. Big publishing, advertising and print shops have built closed ecosystems that often work with hacked keyboard layouts and fonts (if they care), adding to the general incompatibility. And the web just goes with the flow.
“The only officially responsible institution to set things right from the very beginning was and is the Romanian Academy (Academia Română) via the Institute for Linguistics (Inst. de lingvistică), nobody else. This institution was and is the only responsible for this remarkable mess.” (Sorin Paliga, author of Romanian Keylayouts for MAC OS, in the comments, reply no.: 81, 23 Jul 2013)
And so we’re stuck with this embarrassing mess — what’s really exasperating, though, is that in 20 years indolence has become a de facto standard: we know we stink but we’re comfortable with that.
Take a stand
How can we improve the situation? Well, by using the correct diacritics, obviously. But if or when that proves difficult, we had better drop diacritics altogether than use sloppy substitutions (ã or ǎ instead of ă, ş instead of ș, ţ instead of ț).
Long explanation: Because using the wrong substitution bastardizes the language — those letters do not exist in Romanian. Because it’s misleading for those who don’t know any better — they’ll think it’s perfectly acceptable to align to the bad practice. Because substitutions turn into a baggage of backwards-compatibility issues and are harder to parse. Because it means you’re a shitty designer. And because, well, in the end, it’s just bad taste.
Short explanation: Because wearing no underwear is preferable to wearing it on the outside, over your trousers.
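For the fallback case, where correct diacritics are not an option and dropping them altogether is the lesser evil, here is a minimal Python sketch. It relies on Unicode canonical decomposition (NFD), which splits each Romanian special letter into a base letter plus a combining mark, and then discards the marks. The function name `strip_diacritics` is an illustrative choice, and the approach is generic rather than anything specific to Romanian.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return text with all combining marks removed (ș -> s, ă -> a, ...)."""
    # NFD splits e.g. U+0219 into "s" followed by U+0326 (combining comma below).
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks, so keep the rest.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

For example, `strip_diacritics("Știință")` returns `"Stiinta"`. Because the cedilla forms decompose the same way, text that already contains the wrong glyphs comes out clean as well.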
The article “Romanian diacritic marks” was first published on October 27, 2008. If you have news regarding Romanian diacritics, please contribute to the topic in the comments. Thank you.