Category archive: Unicode

Creating a Russian Extended Keyboard Layout

In my spare time I am currently working on a Chuvash-Tatar phrasebook. I have used the Chuvash and Tatar keyboard layouts on Linux. They work fine, but switching between them takes time. So I decided to add the Tatar letters (as right Alt combinations) to my Chuvash keyboard layout. While adding them I found a combined Russian-Ukrainian United keyboard layout and I thought:

  • What if I create a new keyboard layout for Russian that will have almost all additional Cyrillic letters? A Russian Extended keyboard layout could be based on the Russian keyboard layout and have other non-Russian letters.

This is what I have come up with so far. The definition can be found in my GitHub project: russian-extended-kbd. I will keep updating it and provide more information about how it is organized and how to install it. I’ll also try to implement it for Windows and maybe for Mac (though I doubt it, everything is so locked down there).

[Image: rux-xkb-kbd]

This is just a proof of concept so far. It only works on Linux (with xkb). Nevertheless, here are some key characteristics of this layout:

  • It has all the letters of Russian, Erzya, Moksha, Chuvash, Udmurt, Mari (Meadow and Hill Mari), Bashkir, Tatar and other languages of the Russian Federation and other countries.
  • It provides powerful dead keys (breve, diaeresis, double acute, macron) for composing many non-Russian Cyrillic letters.
  • It is not as quick as “native” keyboard layouts, but you can type text in many languages without switching the keyboard layout.
  • It has many other characters that are not present in the standard Russian keyboard layout, for editing wiki markup, Markdown and other formats: [ ] { } ~, and mathematical symbols: ≈ ÷ ∞ ° ‰ ≤ < > ≥ × •
  • It leaves the numbers alone. Compared to many other keyboard layouts (see below), this layout does not “steal” the number row. You can still type numbers as usual.

Dead keys

As I mentioned above, dead keys are a powerful feature for composing letters. Typing with them takes more keystrokes, but the layout can cover many more letters.

These dead keys work:

diaeresis ӱ ӥ ӓ ӟ ӝ ӹ ӧ ӵ ӛ ё ї ӫ
double acute ӳ
breve ӑ ӗ ў й
macron ӣ ӯ
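The idea behind these dead keys can be illustrated with Unicode canonical composition: a base letter followed by a combining mark normalizes to a single precomposed letter. The following is only a sketch in Python; the names `COMBINING` and `compose` are made up for this example, and xkb itself resolves dead keys through Compose tables rather than through Unicode normalization.

```python
import unicodedata

# Combining marks that correspond to the four working dead keys.
COMBINING = {
    "diaeresis": "\u0308",
    "double acute": "\u030b",
    "breve": "\u0306",
    "macron": "\u0304",
}


def compose(base: str, mark: str) -> str:
    """Return base + combining mark after NFC normalization."""
    return unicodedata.normalize("NFC", base + COMBINING[mark])


if __name__ == "__main__":
    examples = [
        ("\u0443", "diaeresis"),     # Cyrillic u + diaeresis
        ("\u0443", "double acute"),  # Cyrillic u + double acute
        ("\u0430", "breve"),         # Cyrillic a + breve
        ("\u0438", "macron"),        # Cyrillic i + macron
    ]
    for base, mark in examples:
        result = compose(base, mark)
        # A length of 1 means a precomposed letter exists in Unicode.
        print(f"{base} + {mark} -> {result} (length {len(result)})")
```

If the result has length 1, Unicode has a precomposed letter for that combination, which is what a dead key is expected to produce.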

These do not work for now (but may in the future):

cedilla ҫ ҙ
bar ғ ұ
hook ң ҳ қ
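One difference between the two groups can be checked with Python's `unicodedata` module: letters such as ӱ or ӑ have a canonical decomposition (base letter plus combining mark) in Unicode, while the letters with cedilla-like descenders, bars and hooks listed here do not. Whether this is the actual reason these dead keys do not work in the layout is an assumption; the sketch below only shows the Unicode side, and the helper name is made up for this example.

```python
import unicodedata


def has_canonical_decomposition(letter: str) -> bool:
    """True if Unicode defines a decomposition for this character."""
    return unicodedata.decomposition(letter) != ""


# One letter per working dead key: u-diaeresis, u-double-acute, a-breve, i-macron.
WORKING = "\u04f1\u04f3\u04d1\u04e3"
# The letters listed as not working: es/ze with descender, ghe/straight-u
# with stroke, en/ha/ka with descender.
NOT_WORKING = "\u04ab\u0499\u0493\u04b1\u04a3\u04b3\u049b"

if __name__ == "__main__":
    for letter in WORKING + NOT_WORKING:
        print(
            letter,
            unicodedata.name(letter),
            "decomposable" if has_canonical_decomposition(letter) else "atomic",
        )
```

For the atomic letters, a dead key would need an explicit compose rule for each letter, because nothing can be derived from Unicode data.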

So many variants of similar letters

A big challenge in creating a Russian Extended keyboard layout is the fact that different languages use different letters for the same (or rather, similar) sounds.

  • /œ/ is ө (Tatar, Bashkir, Sakha…), and ӧ (Altay, Udmurt, Mari…)
  • /y/ is ӳ (Chuvash), ӱ (Mari, Altay, Khakas), ү (Tatar, Bashkir, Sakha)
  • /ŋ/ is ҥ (Altay, Sakha, Mari), and ң (Tatar, Bashkir, Khakas, Khanty)
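The list above can be written down as a small lookup table, which makes it easy to see which letters a given language needs. This is only a sketch: the table contains just the examples from this post (the “…” entries are left out), and the names `SOUND_VARIANTS` and `letters_for_language` are made up for this example.

```python
# Sound (IPA) -> letter -> languages that use this letter, as listed in the post.
SOUND_VARIANTS = {
    "œ": {
        "\u04e9": ["Tatar", "Bashkir", "Sakha"],   # barred o
        "\u04e7": ["Altay", "Udmurt", "Mari"],     # o with diaeresis
    },
    "y": {
        "\u04f3": ["Chuvash"],                     # u with double acute
        "\u04f1": ["Mari", "Altay", "Khakas"],     # u with diaeresis
        "\u04af": ["Tatar", "Bashkir", "Sakha"],   # straight u
    },
    "ŋ": {
        "\u04a5": ["Altay", "Sakha", "Mari"],      # ligature en ghe
        "\u04a3": ["Tatar", "Bashkir", "Khakas", "Khanty"],  # en with descender
    },
}


def letters_for_language(language: str) -> dict:
    """Return {sound: letter} for every sound the table lists for a language."""
    result = {}
    for sound, variants in SOUND_VARIANTS.items():
        for letter, languages in variants.items():
            if language in languages:
                result[sound] = letter
    return result


if __name__ == "__main__":
    for lang in ["Chuvash", "Tatar", "Mari"]:
        print(lang, letters_for_language(lang))
```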

Well, the sounds are not the same, but they are similar. The Swedish Ä is not the same as the German Ä either. If we had a more unified Cyrillic script, it would be easier to create a keyboard layout and to read and learn each other’s languages.

The letters from the different languages are compared in my Google document.

Some “Native” keyboard layouts of the minority languages of the Russian Federation

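  • Chuvash [image: chuvash-xkb-kbd]
  • Udmurt [image: udmurt-xkb-kbd]
  • Mari [image: mari-xkb-kbd]
  • Komi [image: komi-xkb-kbd]
  • Kalmyk [image: kalmyk-xkb-kbd]
  • Bashkir [image: bashkir-xkb-kbd]
  • Ossetian [image: ossetian-xkb-kbd]
  • Sakha [image: sakha-xkb-kbd]
  • Tatar [image: tatar-xkb-kbd]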

Other Cyrillic keyboard layouts (outside the Russian Federation)

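  • Ukrainian [image: ukrainian-xkb-kbd]
  • Belarusian [image: belarusian-xkb-kbd]
  • Tajik [image: tajik-xkb-kbd]
  • Bulgarian [image: bulgarian-xkb-kbd]
  • Kazakh [image: kazakh-xkb-kbd]
  • Mongolian [image: mongol-xkb-kbd]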

Chuvash localization

Recently I wanted to add Chuvash localization to the jQuery UI datepicker. Unfortunately, my pull request was rejected. The reason is that jQuery UI will be using the Globalize framework.

The jQuery Globalize framework relies on CLDR.

What is Unicode CLDR (Common Locale Data Repository)?

The Unicode CLDR provides key building blocks for software to support the world’s languages, with the largest and most extensive standard repository of locale data available. This data is used by a wide spectrum of companies for their software internationalization and localization, adapting software to the conventions of different languages for such common software tasks.

Today there is no Chuvash locale in the CLDR project. So it is time to add it.

I have filed a ticket on CLDR.

Other Chuvash localization projects

A Chuvash locale exists in a couple of projects:

Code

Just to be complete, here is the Chuvash locale for jQuery UI datepicker that I wanted to add:

It is time to standardize the Chuvash Keyboard Layout


Proto-Bulgarian runes (Chuvash is the closest language to the Proto-Bulgar language). I wonder if they are supported in Unicode :)

Chuvash computer keyboard layouts have existed since 2001, but due to the lack of Unicode support we were forced to use look-alike letters from other, Latin-based keyboard layouts. On Linux, the Chuvash keyboard layout was added in 2007, and Linux is still the only operating system that has a native keyboard layout for the Chuvash language. On Windows we have used the Keyboard Layout Creator and distributed the layout as an executable file.

Today, now that Windows XP is no longer supported, the majority of users have full support for the correct Chuvash letters from the Extended Cyrillic table. Four Chuvash letters are “additional” to the Russian alphabet: Ӑ, Ӗ, Ҫ and Ӳ.

Now that new “keyboards” are appearing on Android and in web browsers (they use the standardized letters), and hopefully in Windows and iOS, we have to consider putting the correct letters into the keyboard layouts. For Linux the /usr/share/X11/xkb/symbols/ru file has to be updated:

Impact

This switch will have a huge impact on the Chuvash language. Much of the content on forums, websites and the Chuvash Wikipedia will become hard to search. But we have to do it, to standardize and prepare for the future. The Chuvash Language Committee is not against it, although it has not updated its 2009 guidelines on letter usage.
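One way to soften the search problem would be to fold the old look-alike letters into the standard Cyrillic ones before indexing or comparing text. The sketch below shows the idea in Python. Note the assumptions: this post does not list the look-alike letters, so the mapping assumes the Latin letters Ă, Ĕ, Ç and Ÿ (and their lowercase forms), and the function name `fold_lookalikes` is made up for this example.

```python
import unicodedata

# ASSUMPTION: the look-alike letters are Latin A-breve, E-breve, C-cedilla
# and Y-diaeresis. Adjust the table if other look-alikes were used.
LOOKALIKE_TO_CYRILLIC = str.maketrans({
    "\u0102": "\u04d0",  # Latin A with breve      -> Cyrillic A with breve
    "\u0103": "\u04d1",  # Latin a with breve      -> Cyrillic a with breve
    "\u0114": "\u04d6",  # Latin E with breve      -> Cyrillic IE with breve
    "\u0115": "\u04d7",  # Latin e with breve      -> Cyrillic ie with breve
    "\u00c7": "\u04aa",  # Latin C with cedilla    -> Cyrillic ES with descender
    "\u00e7": "\u04ab",  # Latin c with cedilla    -> Cyrillic es with descender
    "\u0178": "\u04f2",  # Latin Y with diaeresis  -> Cyrillic U with double acute
    "\u00ff": "\u04f3",  # Latin y with diaeresis  -> Cyrillic u with double acute
})


def fold_lookalikes(text: str) -> str:
    """Replace the assumed Latin look-alikes with the standard Chuvash letters."""
    # NFC first, so that "a" + combining breve is treated like a precomposed letter.
    return unicodedata.normalize("NFC", text).translate(LOOKALIKE_TO_CYRILLIC)


if __name__ == "__main__":
    old_style = "\u00e7\u0103"  # Latin look-alikes for the Chuvash es and a-breve
    print(old_style, "->", fold_lookalikes(old_style))
```

Applying the same folding to both the stored text and the search query would let old and new spellings match each other.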

Edit 2014-04-30

The bug in the freedesktop bugzilla was solved very quickly. In fact, in the new Ubuntu 14.04 you’ll find a correct keyboard layout:

[Image: chuvash-keyboard-map]

Here is the source code:
[Image: chuvash-keyboard-xkb]