SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
Egor Kobylkin
2018-07-17 19:34:34 UTC
Dear locale maintainers,

Please fix glibc bug 2872, "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

by adding the Cyrillic transliteration table file translit_cyrillic

https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]

to localedata/locales/ and including it in all your locales going forward.

Patch included inline below.


From this patch I have excluded locales that already mention Cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
***@cyrillic

Their maintainers are requested to decide explicitly whether and how to
include this patch.



Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- It produces a string of question marks and spaces.
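
The shape of that output can be modelled in a few lines. This is only a rough sketch of the visible behaviour, not glibc's code: it assumes that a character with no transliteration rule degrades to a single '?', and the sample line is an assumption reconstructed from the expected output quoted in this message (the real translit-test-input.txt is attachment [9]).

```python
def translit_without_table(text: str) -> str:
    """Model of the pre-patch result: no Cyrillic rule exists, so every
    non-ASCII character becomes '?' while ASCII passes through unchanged."""
    return "".join(ch if ord(ch) < 128 else "?" for ch in text)


# Assumed test line, reconstructed from the expected output in this post.
sample = "CYRILLIC Съешь ещё этих мягких французских булок, да выпей же чаю."
print(translit_without_table(sample))
```

Spaces and punctuation are ASCII and survive, which is why the word boundaries of the sentence are still visible in the failed output.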

This is what it should produce, and does produce once the patch is
applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
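
To make the mapping behind that expected output concrete, here is a minimal sketch in Python. The table is a hand-written subset following GOST 7.79-2000 System B as I understand it, not the contents of the attached translit_cyrillic file. The names GOST_B and translit_gost_b are hypothetical, "ц" is simplified to "cz" in every position, and the capitalisation rule for uppercase letters is my own assumption.

```python
# Lowercase Russian letters -> ASCII, following GOST 7.79-2000 System B.
# Assumption: illustrative subset only, not the real translit_cyrillic
# table; "ц" is always rendered as "cz" here.
GOST_B = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ё": "yo", "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k",
    "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x", "ц": "cz",
    "ч": "ch", "ш": "sh", "щ": "shh", "ъ": "``", "ы": "y'", "ь": "`",
    "э": "e`", "ю": "yu", "я": "ya",
}


def translit_gost_b(text: str) -> str:
    """Transliterate Russian Cyrillic to ASCII using the table above."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower not in GOST_B:
            # ASCII passes through; anything else unknown becomes '?'.
            out.append(ch if ord(ch) < 128 else "?")
            continue
        latin = GOST_B[lower]
        # An uppercase source letter capitalises only the first Latin letter.
        out.append(latin.capitalize() if ch != lower else latin)
    return "".join(out)


print(translit_gost_b("Съешь ещё этих мягких французских булок, да выпей же чаю."))
```

With this subset the sentence comes out as the line quoted above, which is a quick way to sanity-check individual rows of a table before compiling a locale.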


Root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. In addition, the table has to be included in the active
locale at compilation time for iconv to use it.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text written in the Cyrillic alphabet to ASCII via
ASCII//TRANSLIT.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
transliteration contains only ASCII characters yet can still be read by
a native speaker. Among other things, this is useful for processing
Cyrillic texts and filenames with programs or on systems that are not
specifically prepared to work with Cyrillic, do not have the
corresponding fonts installed, or cannot handle UTF-8.

The transliteration table itself is attached as the file
translit_cyrillic [7]. Its mapping is based on the official GOST
7.79-2000 source (Federal Agency on Technical Regulating and Metrology
of the Russian Federation [2]). In practice, an independent but
identical source [3] was used, and the table was prepared in a
spreadsheet [6].
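
As an illustration of how one spreadsheet row could be turned into a locale-source rule, the sketch below renders a (Cyrillic letter, ASCII replacement) pair in the <UXXXX> notation that the existing translit sections use (compare the da_DK rule in the diff below). The helper names ucs and translit_rule are hypothetical, and whether translit_cyrillic uses exactly this layout is an assumption.

```python
def ucs(ch: str) -> str:
    """Return the <UXXXX> symbolic name of a single character."""
    return f"<U{ord(ch):04X}>"


def translit_rule(src: str, dst: str) -> str:
    """Format one rule: the source symbol, then the quoted replacement
    written as a sequence of <UXXXX> symbols."""
    replacement = "".join(ucs(c) for c in dst)
    return f'{ucs(src)} "{replacement}"'


# CYRILLIC SMALL LETTER SHCHA -> "shh"
print(translit_rule("щ", "shh"))
```

Generating the rules from data rather than typing them by hand keeps the code points and the human-readable mapping in one reviewable place.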

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
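
The edit applied to each locale file can be sketched as follows. This is not the script that produced the patch, only an approximation of what the hunks show: the include line lands directly before translit_end. The function name add_cyrillic_include is hypothetical, and skipping any file that mentions "cyrillic" is my reading of the exclusion rule stated earlier in this message.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(source: str) -> str:
    """Insert the include directly before the first translit_end line.

    The text is returned unchanged when it has no translit_end line or
    when it already mentions "cyrillic" (assumed exclusion rule)."""
    if "cyrillic" in source.lower():
        return source
    lines = source.split("\n")
    for i, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(i, INCLUDE_LINE)
            return "\n".join(lines)
    return source


before = (
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_cyrillic_include(before))
```

Because files that already mention "cyrillic" are left alone, running the function a second time on its own output changes nothing.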

Cyrillic transliteration of, e.g., Russian text may already have worked
to some extent in the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA locales,
which have their transliteration tables included inline.
However, it would not be the standard Cyrillic transliteration
described above.
I am excluding these locales from this proposed patch. I have written
directly to the locale maintainer emails listed in the files but have
received no reply so far, except from Volodymyr Lisivka
<***@gmail.com> (uk_UA), who has confirmed the exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8590
[7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=8618

Best regards,
Egor Kobylkin

---
2018-07-17 Egor Kobylkin <***@kobylkin.com>

[BZ #2872]
* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
table from Cyrillic to Latin.
* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
section.
* locales/aa_DJ: likewise
* locales/af_ZA: likewise
* locales/ak_GH: likewise
* locales/am_ET: likewise
* locales/ar_EG: likewise
* locales/be_BY: likewise
* locales/bem_ZM: likewise
* locales/ber_DZ: likewise
* locales/ber_MA: likewise
* locales/bg_BG: likewise
* locales/bi_VU: likewise
* locales/bn_BD: likewise
* locales/bo_CN: likewise
* locales/ca_ES: likewise
* locales/ce_RU: likewise
* locales/cs_CZ: likewise
* locales/cv_RU: likewise
* locales/cy_GB: likewise
* locales/da_DK: likewise
* locales/de_DE: likewise
* locales/dv_MV: likewise
* locales/dz_BT: likewise
* locales/el_GR: likewise
* locales/en_GB: likewise
* locales/en_NG: likewise
* locales/en_ZM: likewise
* locales/es_CU: likewise
* locales/es_ES: likewise
* locales/et_EE: likewise
* locales/fa_IR: likewise
* locales/ff_SN: likewise
* locales/fi_FI: likewise
* locales/fr_FR: likewise
* locales/ga_IE: likewise
* locales/gd_GB: likewise
* locales/gu_IN: likewise
* locales/gv_GB: likewise
* locales/he_IL: likewise
* locales/hi_IN: likewise
* locales/hif_FJ: likewise
* locales/hr_HR: likewise
* locales/ht_HT: likewise
* locales/hu_HU: likewise
* locales/hy_AM: likewise
* locales/id_ID: likewise
* locales/is_IS: likewise
* locales/it_IT: likewise
* locales/ja_JP: likewise
* locales/kk_KZ: likewise
* locales/km_KH: likewise
* locales/kn_IN: likewise
* locales/ko_KR: likewise
* locales/ks_IN: likewise
* locales/kw_GB: likewise
* locales/lb_LU: likewise
* locales/lg_UG: likewise
* locales/lij_IT: likewise
* locales/ln_CD: likewise
* locales/lo_LA: likewise
* locales/lt_LT: likewise
* locales/lv_LV: likewise
* locales/mg_MG: likewise
* locales/mhr_RU: likewise
* locales/mk_MK: likewise
* locales/ml_IN: likewise
* locales/ms_MY: likewise
* locales/mt_MT: likewise
* locales/***@latin: likewise
* locales/nb_NO: likewise
* locales/ne_NP: likewise
* locales/nhn_MX: likewise
* locales/niu_NU: likewise
* locales/niu_NZ: likewise
* locales/nl_NL: likewise
* locales/nr_ZA: likewise
* locales/oc_FR: likewise
* locales/om_KE: likewise
* locales/or_IN: likewise
* locales/os_RU: likewise
* locales/pa_IN: likewise
* locales/pa_PK: likewise
* locales/pl_PL: likewise
* locales/pt_PT: likewise
* locales/quz_PE: likewise
* locales/ro_RO: likewise
* locales/ru_RU: likewise
* locales/rw_RW: likewise
* locales/sa_IN: likewise
* locales/sd_IN: likewise
* locales/***@devanagari: likewise
* locales/sd_PK: likewise
* locales/se_NO: likewise
* locales/sgs_LT: likewise
* locales/si_LK: likewise
* locales/sk_SK: likewise
* locales/sl_SI: likewise
* locales/sm_WS: likewise
* locales/so_SO: likewise
* locales/sq_AL: likewise
* locales/ss_ZA: likewise
* locales/st_ZA: likewise
* locales/sv_SE: likewise
* locales/sw_KE: likewise
* locales/ta_IN: likewise
* locales/te_IN: likewise
* locales/th_TH: likewise
* locales/ti_ET: likewise
* locales/tn_ZA: likewise
* locales/to_TO: likewise
* locales/tpi_PG: likewise
* locales/tr_TR: likewise
* locales/ts_ZA: likewise
* locales/unm_US: likewise
* locales/ur_IN: likewise
* locales/ur_PK: likewise
* locales/ve_ZA: likewise
* locales/vi_VN: likewise
* locales/wa_BE: likewise
* locales/wo_SN: likewise
* locales/xh_ZA: likewise
* locales/yi_US: likewise
* locales/zh_CN: likewise
* locales/zu_ZA: likewise


diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
@@ -2292,6 +2292,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
@@ -72,6 +72,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
@@ -56,6 +56,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
@@ -1396,6 +1396,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
@@ -166,6 +166,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
@@ -86,6 +86,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
@@ -72,6 +72,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
@@ -2311,6 +2311,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
@@ -109,6 +109,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
@@ -167,6 +167,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
@@ -52,6 +52,7 @@
include "translit_combining";""


+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
@@ -55,6 +55,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
@@ -50,6 +50,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
@@ -73,6 +73,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
@@ -109,6 +109,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
@@ -79,6 +79,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
@@ -42,6 +42,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
@@ -137,6 +137,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
@@ -54,6 +54,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
@@ -47,6 +47,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
@@ -61,6 +61,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
@@ -153,6 +153,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
@@ -478,6 +478,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
@@ -77,6 +77,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
@@ -55,6 +55,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
@@ -2161,6 +2161,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
@@ -1682,6 +1682,7 @@
include "translit_combining";""
include "translit_cjk_variants";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
@@ -158,6 +158,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
@@ -873,6 +873,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
@@ -78,6 +78,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "<U0065><U005E>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
@@ -51,6 +51,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
@@ -77,6 +77,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
@@ -2122,6 +2122,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@latin b/localedata/locales/***@latin
--- a/localedata/locales/***@latin 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/***@latin 2018-07-17 17:55:50.000000000 +0000
@@ -53,6 +53,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
@@ -66,6 +66,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
@@ -140,6 +140,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
@@ -142,6 +142,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
@@ -144,6 +144,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
@@ -74,6 +74,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@devanagari b/localedata/locales/***@devanagari
--- a/localedata/locales/***@devanagari 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/***@devanagari 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
@@ -205,6 +205,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
@@ -91,6 +91,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
@@ -139,6 +139,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
@@ -866,6 +866,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>

include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -69,6 +69,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
@@ -36,6 +36,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
@@ -2430,6 +2430,7 @@

% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
@@ -0,0 +1,151 @@
+escape_char /
+comment_char %
+
+% Transliterations that convert Cyrillic letters to ASCII symbols, inspired by GOST 7.79-2000
+% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=8590
+% Up to three characters are required to do a reversible transliteration.
+
+LC_CTYPE
+
+translit_start
+
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>";<U0059>
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>";<U005A>
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>";<U0043>
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>";<U0043>
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0060><U0060>";<U0060>
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0027>";<U0059>
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>";<U0045>
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>";<U0059>
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>";<U0059>
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>";<U007A>
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>";<U0063>
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>";<U0063>
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>";<U0073>
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>";<U0073>
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>";<U0060>
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0027>";<U0079>
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>";<U0065>
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>";<U0079>
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>";<U0079>
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>";<U0079>
+
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
@@ -48,6 +48,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -67,6 +67,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
@@ -69,6 +69,7 @@
<U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
<U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -66,6 +66,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
@@ -73,6 +73,7 @@
<U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
<U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
<U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
Carlos O'Donell
2018-07-17 19:40:54 UTC
Post by Egor Kobylkin
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
We are currently preparing for the 2.28 release and it may take
a while to review this change and the structure of the changes,
and the data itself.

Is it OK if this material is reviewed for 2.29 inclusion (after
August 1st)?

Cheers,
Carlos.
Egor Kobylkin
2018-07-17 19:50:40 UTC
Post by Carlos O'Donell
Post by Egor Kobylkin
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
We are currently preparing for the 2.28 release and it may take
a while to review this change and the structure of the changes,
and the data itself.
Is it OK if this material is reviewed for 2.29 inclusion (after
August 1st)?
It's fine with me to postpone it for 2.29 inclusion (after August 1st).
Should I send a reminder in August?

Bests,
Egor
Carlos O'Donell
2018-07-17 19:59:27 UTC
Post by Egor Kobylkin
Post by Carlos O'Donell
Post by Egor Kobylkin
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
We are currently preparing for the 2.28 release and it may take
a while to review this change and the structure of the changes,
and the data itself.
Is it OK if this material is reviewed for 2.29 inclusion (after
August 1st)?
It's fine with me to postpone it for 2.29 inclusion (after August 1st).
Should I send a reminder in August?
Yes please, ping the original patches again in August and we can
review. In the meantime others may feel free to review, but we won't
consider the patches for inclusion yet, i.e. they won't block the release.
--
Cheers,
Carlos.
Egor Kobylkin
2018-08-06 19:00:30 UTC
Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add Cyrillic transliteration table translit_cyrillic file

https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]

to localedata/locales/ and include it in all your locales going forward.

Patch included inline below.

This is a re-submission for consideration for 2.29, at the request of
Carlos O'Donell: https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html

From this patch I have excluded locales that already mention Cyrillic or
already have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
***@cyrillic

Their maintainers are requested to make an explicit decision on whether
and how to include this patch.



Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

It produces a string of question marks and spaces.

This is what it should produce, and what it does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
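
For illustration only (this is not part of the patch), here is a minimal
Python sketch of the first-choice mappings in the proposed table. It assumes
that the CYRILLIC line of translit-test-input.txt is the common Russian
pangram, which matches the word lengths in the question-mark output above.
The names REPL, TABLE and translit are hypothetical, and the sketch ignores
the fallback alternatives that the real table also lists.

```python
# First-choice replacements for U+0430..U+044F (Cyrillic small a .. ya),
# in code point order, copied from the proposed translit_cyrillic table.
REPL = ["a", "b", "v", "g", "d", "e", "zh", "z", "i", "j", "k", "l", "m",
        "n", "o", "p", "r", "s", "t", "u", "f", "x", "cz", "ch", "sh",
        "shh", "``", "y'", "`", "e`", "yu", "ya"]

LOWER = {chr(0x0430 + i): r for i, r in enumerate(REPL)}
LOWER["\u0451"] = "yo"  # CYRILLIC SMALL LETTER IO

# In the table, the capital entries are the upper-cased small ones.
TABLE = dict(LOWER)
TABLE.update({k.upper(): v.upper() for k, v in LOWER.items()})


def translit(text):
    """Map Cyrillic letters to ASCII; other non-ASCII becomes '?'."""
    return "".join(
        TABLE.get(ch, ch if ord(ch) < 128 else "?") for ch in text
    )


if __name__ == "__main__":
    sample = "Съешь ещё этих мягких французских булок, да выпей же чаю."
    print(translit(sample))
    # S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
```

The '?' fallback mimics the current behaviour for characters without a
table entry; it is a simplification, not a model of iconv internals.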


Root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. In addition, the table has to be included in the
active locale at compilation time for iconv to use it.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text in the Cyrillic alphabet to ASCII//TRANSLIT text.

While UTF-encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration contains only ASCII codes but can still be read by a
native speaker. Among other things, it is useful for processing Cyrillic
texts and filenames with programs or on systems that are not specifically
prepared to work with Cyrillic, don't have the corresponding fonts
installed, or can't handle UTF-8.

The transliteration table itself is attached as the file translit_cyrillic
[7]. Its mapping is based on the official GOST 7.79-2000 source
(Federal Agency on Technical Regulating and Metrology of the Russian
Federation [2]). In practice, an independent but identical source [3] was
used, and the table was prepared in a spreadsheet [6].
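
To make the table format concrete, here is a small Python sketch (an
illustration, not a tool used for this patch) that reads one entry of the
table into a source character and its list of alternatives. The helper names
decode and parse_translit_line are hypothetical, and the sketch assumes that
alternatives are separated by ";" and never contain ";" themselves, which
holds for the entries in translit_cyrillic.

```python
import re

# Matches the <Uxxxx> code point notation used in locale source files.
UCODE = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn '<U0059><U004F>' (optionally quoted) into 'YO'."""
    return "".join(chr(int(h, 16)) for h in UCODE.findall(token))


def parse_translit_line(line):
    """Return (source_char, [alternatives]) or None for non-entry lines."""
    line = line.strip()
    if not line.startswith("<U"):
        return None  # comments (%), keywords, blank lines
    source, _, rest = line.partition(" ")
    alternatives = [decode(alt) for alt in rest.strip().split(";")]
    return decode(source), alternatives


if __name__ == "__main__":
    print(parse_translit_line('<U0401> "<U0059><U004F>";<U0059>'))
```

The alternatives are listed in order of preference, so the first element
is the GOST-style replacement and later ones are shorter fallbacks.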

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
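
The script actually used to generate the patch is not shown here. As a
rough sketch of the idea, assuming the rule is "insert the include line
directly before the first translit_end, unless the file already mentions
Cyrillic", it could look like the following Python. The function name
add_cyrillic_include is hypothetical.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_source):
    """Return the locale source with the include added before translit_end.

    The text is returned unchanged if it already mentions Cyrillic or has
    no translit_end line.
    """
    if "cyrillic" in locale_source.lower():
        return locale_source
    lines = locale_source.split("\n")
    for i, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(i, INCLUDE_LINE)
            break
    return "\n".join(lines)


if __name__ == "__main__":
    sample = (
        "translit_start\n"
        'include "translit_combining";""\n'
        "translit_end\n"
        "END LC_CTYPE\n"
    )
    print(add_cyrillic_include(sample))
```

Running the function over each file in localedata/locales/ and diffing
against the originals would yield hunks of the same shape as those in
this patch.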

Cyrillic transliteration of, e.g., Russian text may already work to some
extent in the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA locales, which have
their own transliteration tables inline.
However, the result would not be the standard Russian (GOST 7.79-2000)
transliteration described above.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <***@gmail.com>, Max Kutny <***@gmail.com> (uk_UA),
Данило Шеган <***@gnome.org> (sr_YU, sr_CS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (only
available as a low-quality GIF)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8590
[7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=8618

Best regards,
Egor Kobylkin

---
2018-07-17 Egor Kobylkin <***@kobylkin.com>

[BZ #2872]
* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
table from Cyrillic to Latin.
* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
section.
* locales/aa_DJ: likewise
* locales/af_ZA: likewise
* locales/ak_GH: likewise
* locales/am_ET: likewise
* locales/ar_EG: likewise
* locales/be_BY: likewise
* locales/bem_ZM: likewise
* locales/ber_DZ: likewise
* locales/ber_MA: likewise
* locales/bg_BG: likewise
* locales/bi_VU: likewise
* locales/bn_BD: likewise
* locales/bo_CN: likewise
* locales/ca_ES: likewise
* locales/ce_RU: likewise
* locales/cs_CZ: likewise
* locales/cv_RU: likewise
* locales/cy_GB: likewise
* locales/da_DK: likewise
* locales/de_DE: likewise
* locales/dv_MV: likewise
* locales/dz_BT: likewise
* locales/el_GR: likewise
* locales/en_GB: likewise
* locales/en_NG: likewise
* locales/en_ZM: likewise
* locales/es_CU: likewise
* locales/es_ES: likewise
* locales/et_EE: likewise
* locales/fa_IR: likewise
* locales/ff_SN: likewise
* locales/fi_FI: likewise
* locales/fr_FR: likewise
* locales/ga_IE: likewise
* locales/gd_GB: likewise
* locales/gu_IN: likewise
* locales/gv_GB: likewise
* locales/he_IL: likewise
* locales/hi_IN: likewise
* locales/hif_FJ: likewise
* locales/hr_HR: likewise
* locales/ht_HT: likewise
* locales/hu_HU: likewise
* locales/hy_AM: likewise
* locales/id_ID: likewise
* locales/is_IS: likewise
* locales/it_IT: likewise
* locales/ja_JP: likewise
* locales/kk_KZ: likewise
* locales/km_KH: likewise
* locales/kn_IN: likewise
* locales/ko_KR: likewise
* locales/ks_IN: likewise
* locales/kw_GB: likewise
* locales/lb_LU: likewise
* locales/lg_UG: likewise
* locales/lij_IT: likewise
* locales/ln_CD: likewise
* locales/lo_LA: likewise
* locales/lt_LT: likewise
* locales/lv_LV: likewise
* locales/mg_MG: likewise
* locales/mhr_RU: likewise
* locales/mk_MK: likewise
* locales/ml_IN: likewise
* locales/ms_MY: likewise
* locales/mt_MT: likewise
* locales/***@latin: likewise
* locales/nb_NO: likewise
* locales/ne_NP: likewise
* locales/nhn_MX: likewise
* locales/niu_NU: likewise
* locales/niu_NZ: likewise
* locales/nl_NL: likewise
* locales/nr_ZA: likewise
* locales/oc_FR: likewise
* locales/om_KE: likewise
* locales/or_IN: likewise
* locales/os_RU: likewise
* locales/pa_IN: likewise
* locales/pa_PK: likewise
* locales/pl_PL: likewise
* locales/pt_PT: likewise
* locales/quz_PE: likewise
* locales/ro_RO: likewise
* locales/ru_RU: likewise
* locales/rw_RW: likewise
* locales/sa_IN: likewise
* locales/sd_IN: likewise
* locales/***@devanagari: likewise
* locales/sd_PK: likewise
* locales/se_NO: likewise
* locales/sgs_LT: likewise
* locales/si_LK: likewise
* locales/sk_SK: likewise
* locales/sl_SI: likewise
* locales/sm_WS: likewise
* locales/so_SO: likewise
* locales/sq_AL: likewise
* locales/ss_ZA: likewise
* locales/st_ZA: likewise
* locales/sv_SE: likewise
* locales/sw_KE: likewise
* locales/ta_IN: likewise
* locales/te_IN: likewise
* locales/th_TH: likewise
* locales/ti_ET: likewise
* locales/tn_ZA: likewise
* locales/to_TO: likewise
* locales/tpi_PG: likewise
* locales/tr_TR: likewise
* locales/ts_ZA: likewise
* locales/unm_US: likewise
* locales/ur_IN: likewise
* locales/ur_PK: likewise
* locales/ve_ZA: likewise
* locales/vi_VN: likewise
* locales/wa_BE: likewise
* locales/wo_SN: likewise
* locales/xh_ZA: likewise
* locales/yi_US: likewise
* locales/zh_CN: likewise
* locales/zu_ZA: likewise


diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
@@ -2292,6 +2292,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
@@ -72,6 +72,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
@@ -56,6 +56,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
@@ -1396,6 +1396,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
@@ -166,6 +166,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
@@ -86,6 +86,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
@@ -72,6 +72,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
@@ -2311,6 +2311,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
@@ -109,6 +109,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
@@ -167,6 +167,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
@@ -52,6 +52,7 @@
include "translit_combining";""


+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
@@ -55,6 +55,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
@@ -50,6 +50,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
@@ -73,6 +73,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
@@ -109,6 +109,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
@@ -79,6 +79,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
@@ -42,6 +42,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
@@ -137,6 +137,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
@@ -54,6 +54,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
@@ -47,6 +47,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
@@ -61,6 +61,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
@@ -153,6 +153,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
@@ -478,6 +478,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
@@ -77,6 +77,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
@@ -55,6 +55,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
@@ -2161,6 +2161,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
@@ -1682,6 +1682,7 @@
include "translit_combining";""
include "translit_cjk_variants";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
@@ -158,6 +158,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
@@ -873,6 +873,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
@@ -78,6 +78,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "<U0065><U005E>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
@@ -51,6 +51,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
@@ -77,6 +77,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
@@ -2122,6 +2122,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@latin b/localedata/locales/***@latin
--- a/localedata/locales/***@latin 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/***@latin 2018-07-17 17:55:50.000000000 +0000
@@ -53,6 +53,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
@@ -66,6 +66,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
@@ -140,6 +140,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
@@ -142,6 +142,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
@@ -144,6 +144,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
@@ -74,6 +74,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@devanagari b/localedata/locales/***@devanagari
--- a/localedata/locales/***@devanagari 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/***@devanagari 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
@@ -205,6 +205,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
@@ -91,6 +91,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
@@ -139,6 +139,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
@@ -866,6 +866,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>

include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -69,6 +69,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
@@ -36,6 +36,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
@@ -2430,6 +2430,7 @@

% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
@@ -0,0 +1,151 @@
+escape_char /
+comment_char %
+
+% Transliterations that convert Cyrillic letters to ASCII symbols, inspired by GOST 7.79-2000
+% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=8590
+% Up to three characters are required to do a reversible transliteration.
+
+LC_CTYPE
+
+translit_start
+
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>";<U0059>
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>";<U005A>
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>";<U0043>
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>";<U0043>
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0060><U0060>";<U0060>
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0027>";<U0059>
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>";<U0045>
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>";<U0059>
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>";<U0059>
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>";<U007A>
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>";<U0063>
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>";<U0063>
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>";<U0073>
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>";<U0073>
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>";<U0060>
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0027>";<U0079>
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>";<U0065>
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>";<U0079>
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>";<U0079>
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>";<U0079>
+
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
@@ -48,6 +48,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -67,6 +67,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
@@ -69,6 +69,7 @@
<U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
<U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -66,6 +66,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
@@ -73,6 +73,7 @@
<U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
<U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
<U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
Egor Kobylkin
2018-10-03 08:26:40 UTC
Ping.

In the absence of feedback, I am wondering whether anything is missing
from this patch from the maintainers' standpoint. More than two months
have passed since the original submission.

If I can be of assistance, please do not hesitate to contact me,
Egor Kobylkin
Post by Egor Kobylkin
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add Cyrillic transliteration table translit_cyrillic file
https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
to localedata/locales/ and include it in all your locales going forward.
Patch included inline below.
This is a re-submission for the consideration for 2.29 on a request from
Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.
The glibc wiki explicitly lists this use case as its test example [8]:
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt
Currently it fails on Cyrillic texts in most locales, including ru_RU
[1] [8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
This is what it should produce, and does produce once the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
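
For illustration, the expected output above can be reproduced outside glibc with a minimal Python sketch. It hard-codes the first-choice replacement of each lowercase letter from translit_cyrillic and assumes, as holds for every entry in that table, that the replacement for a capital letter is the uppercase form of the lowercase one. The names LOWER, TABLE and translit are hypothetical helpers, not part of glibc or of this patch.

```python
# First-choice replacements from translit_cyrillic, lowercase letters only.
LOWER = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "x", "ц": "cz", "ч": "ch", "ш": "sh", "щ": "shh",
    "ъ": "``", "ы": "y'", "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}

# Assumption: capital letters map to the uppercase form of the same string
# (the backtick and apostrophe are unaffected by upper()).
TABLE = dict(LOWER)
TABLE.update({src.upper(): dst.upper() for src, dst in LOWER.items()})


def translit(text):
    """Replace every Cyrillic letter found in TABLE; keep all else as-is."""
    return "".join(TABLE.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(translit("Съешь ещё этих мягких французских булок, да выпей же чаю."))
```

This only mimics the first alternative of each entry; it does not model how iconv chooses between alternatives or handles characters outside the table.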
The root problem is the missing transliteration table that I am
supplying here. Furthermore, the table has to be included in the
active locale at compilation time for iconv to use it.
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text in the Cyrillic alphabet to ASCII//TRANSLIT text.
While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result
of a transliteration contains only ASCII codes but can still be read by
a native speaker. Among other things, it is useful for processing
Cyrillic texts and filenames with programs or on systems that are not
specifically prepared to work with Cyrillic, do not have the
corresponding fonts installed, or cannot handle UTF-8.
The transliteration table itself is attached as the file
translit_cyrillic [7]. Its content (the mapping) is based on the
official GOST 7.79-2000 source (Federal Agency on Technical Regulating
and Metrology of the Russian Federation [2]). Technically, an
independent but identical source [3] was used and prepared in a
spreadsheet [6].
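
Each entry in the table is one line: a source code point in <Uxxxx> notation, followed by one or more replacements separated by semicolons. A small Python sketch can decode such a line for inspection. The helper names decode and parse_entry are hypothetical, and the sketch assumes the simple line shape used in translit_cyrillic (no comments or semicolons inside quoted strings).

```python
import re

# Matches one symbolic code point such as <U0429>.
CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn a run of <Uxxxx> symbols into the text it denotes."""
    return "".join(chr(int(h, 16)) for h in CODEPOINT.findall(token))


def parse_entry(line):
    """Split one table line into (source character, list of alternatives)."""
    source, _, targets = line.strip().partition(" ")
    alternatives = [decode(t) for t in targets.split(";")]
    return decode(source), alternatives


if __name__ == "__main__":
    # CYRILLIC CAPITAL LETTER SHCHA, copied from the table in the patch.
    print(parse_entry('<U0429> "<U0053><U0048><U0048>";<U0053>'))
```

For the SHCHA entry this yields the two alternatives SHH and S. As far as I understand the locale format, the alternatives are tried in order and the first one representable in the target charset is used; treat that as an assumption rather than a statement about the implementation.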
The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the LC_CTYPE
translit_start section:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end section and generated a patch for them.
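
The kind of edit applied to each locale file can be sketched in Python as follows. This is not the script that generated the patch (that script is not shown in this thread); it is an assumed reconstruction of the visible result, namely that the include line ends up directly before translit_end. The name add_cyrillic_include is hypothetical.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text):
    """Return locale_text with INCLUDE_LINE inserted before translit_end.

    Files without a translit section, or that already contain the include,
    are returned unchanged.
    """
    lines = locale_text.splitlines()
    stripped = [line.strip() for line in lines]
    if INCLUDE_LINE in stripped or "translit_end" not in stripped:
        return locale_text
    # Insert directly before the first translit_end line.
    lines.insert(stripped.index("translit_end"), INCLUDE_LINE)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = (
        "translit_start\n"
        'include "translit_combining";""\n'
        "translit_end\n"
        "END LC_CTYPE\n"
    )
    print(add_cyrillic_include(sample), end="")
```

Running the function twice on the same text leaves it unchanged the second time, so it is safe to apply to a tree that is already partly patched. Excluding the locales listed above would still have to be done by hand.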
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
However it would not be the standard Russian Cyrillic transliteration as
described above.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
exclusion.
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8590
[7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=8618
Best regards,
Egor Kobylkin
---
[BZ #2872]
* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
table from Cyrillic to Latin.
* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
section.
* locales/aa_DJ: likewise
* locales/af_ZA: likewise
* locales/ak_GH: likewise
* locales/am_ET: likewise
* locales/ar_EG: likewise
* locales/be_BY: likewise
* locales/bem_ZM: likewise
* locales/ber_DZ: likewise
* locales/ber_MA: likewise
* locales/bg_BG: likewise
* locales/bi_VU: likewise
* locales/bn_BD: likewise
* locales/bo_CN: likewise
* locales/ca_ES: likewise
* locales/ce_RU: likewise
* locales/cs_CZ: likewise
* locales/cv_RU: likewise
* locales/cy_GB: likewise
* locales/da_DK: likewise
* locales/de_DE: likewise
* locales/dv_MV: likewise
* locales/dz_BT: likewise
* locales/el_GR: likewise
* locales/en_GB: likewise
* locales/en_NG: likewise
* locales/en_ZM: likewise
* locales/es_CU: likewise
* locales/es_ES: likewise
* locales/et_EE: likewise
* locales/fa_IR: likewise
* locales/ff_SN: likewise
* locales/fi_FI: likewise
* locales/fr_FR: likewise
* locales/ga_IE: likewise
* locales/gd_GB: likewise
* locales/gu_IN: likewise
* locales/gv_GB: likewise
* locales/he_IL: likewise
* locales/hi_IN: likewise
* locales/hif_FJ: likewise
* locales/hr_HR: likewise
* locales/ht_HT: likewise
* locales/hu_HU: likewise
* locales/hy_AM: likewise
* locales/id_ID: likewise
* locales/is_IS: likewise
* locales/it_IT: likewise
* locales/ja_JP: likewise
* locales/kk_KZ: likewise
* locales/km_KH: likewise
* locales/kn_IN: likewise
* locales/ko_KR: likewise
* locales/ks_IN: likewise
* locales/kw_GB: likewise
* locales/lb_LU: likewise
* locales/lg_UG: likewise
* locales/lij_IT: likewise
* locales/ln_CD: likewise
* locales/lo_LA: likewise
* locales/lt_LT: likewise
* locales/lv_LV: likewise
* locales/mg_MG: likewise
* locales/mhr_RU: likewise
* locales/mk_MK: likewise
* locales/ml_IN: likewise
* locales/ms_MY: likewise
* locales/mt_MT: likewise
* locales/nb_NO: likewise
* locales/ne_NP: likewise
* locales/nhn_MX: likewise
* locales/niu_NU: likewise
* locales/niu_NZ: likewise
* locales/nl_NL: likewise
* locales/nr_ZA: likewise
* locales/oc_FR: likewise
* locales/om_KE: likewise
* locales/or_IN: likewise
* locales/os_RU: likewise
* locales/pa_IN: likewise
* locales/pa_PK: likewise
* locales/pl_PL: likewise
* locales/pt_PT: likewise
* locales/quz_PE: likewise
* locales/ro_RO: likewise
* locales/ru_RU: likewise
* locales/rw_RW: likewise
* locales/sa_IN: likewise
* locales/sd_IN: likewise
* locales/sd_PK: likewise
* locales/se_NO: likewise
* locales/sgs_LT: likewise
* locales/si_LK: likewise
* locales/sk_SK: likewise
* locales/sl_SI: likewise
* locales/sm_WS: likewise
* locales/so_SO: likewise
* locales/sq_AL: likewise
* locales/ss_ZA: likewise
* locales/st_ZA: likewise
* locales/sv_SE: likewise
* locales/sw_KE: likewise
* locales/ta_IN: likewise
* locales/te_IN: likewise
* locales/th_TH: likewise
* locales/ti_ET: likewise
* locales/tn_ZA: likewise
* locales/to_TO: likewise
* locales/tpi_PG: likewise
* locales/tr_TR: likewise
* locales/ts_ZA: likewise
* locales/unm_US: likewise
* locales/ur_IN: likewise
* locales/ur_PK: likewise
* locales/ve_ZA: likewise
* locales/vi_VN: likewise
* locales/wa_BE: likewise
* locales/wo_SN: likewise
* locales/xh_ZA: likewise
* locales/yi_US: likewise
* locales/zh_CN: likewise
* locales/zu_ZA: likewise
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
@@ -2292,6 +2292,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
@@ -56,6 +56,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
@@ -1396,6 +1396,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
@@ -166,6 +166,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
@@ -86,6 +86,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
@@ -2311,6 +2311,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
@@ -167,6 +167,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
@@ -52,6 +52,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
@@ -50,6 +50,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
@@ -79,6 +79,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
@@ -137,6 +137,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
@@ -47,6 +47,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
@@ -153,6 +153,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
@@ -478,6 +478,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
@@ -2161,6 +2161,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
@@ -1682,6 +1682,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
@@ -158,6 +158,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
@@ -873,6 +873,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
@@ -78,6 +78,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "<U0065><U005E>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
@@ -51,6 +51,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
@@ -2122,6 +2122,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
@@ -53,6 +53,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
@@ -140,6 +140,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
@@ -142,6 +142,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
@@ -144,6 +144,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
@@ -74,6 +74,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
@@ -205,6 +205,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
@@ -91,6 +91,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
@@ -139,6 +139,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
@@ -866,6 +866,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
@@ -2430,6 +2430,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
@@ -0,0 +1,151 @@
+escape_char /
+comment_char %
+
+% Transliterations that convert Cyrillic letters to ASCII symbols, inspired by GOST 7.79-2000
+% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=8590
+% Up to three characters are required to do a reversible transliteration.
+
+LC_CTYPE
+
+translit_start
+
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>";<U0059>
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>";<U005A>
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>";<U0043>
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>";<U0043>
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0060><U0060>";<U0060>
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0027>";<U0059>
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>";<U0045>
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>";<U0059>
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>";<U0059>
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>";<U007A>
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>";<U0063>
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>";<U0063>
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>";<U0073>
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>";<U0073>
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>";<U0060>
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0027>";<U0079>
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>";<U0065>
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>";<U0079>
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>";<U0079>
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>";<U0079>
+
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
@@ -69,6 +69,7 @@
<U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
<U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
@@ -73,6 +73,7 @@
<U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
<U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
<U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
Keld Simonsen
2018-10-03 09:19:49 UTC
Hi

Please note that transliteration of Cyrillic to Latin is not universal.
There are different schemes for German, English and Danish, for example, and
there is also an ISO standard for it.

But do go forward with fixing this bug.

Best regards
Keld
Post by Egor Kobylkin
Ping.
In the absence of feedback, I am wondering whether anything is missing from
this patch from the maintainers' standpoint. More than two months have passed
since the original submission.
If I can be of assistance, please do not hesitate to contact me,
Egor Kobylkin
Post by Egor Kobylkin
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add Cyrillic transliteration table translit_cyrillic file
https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
to localedata/locales/ and include it in all your locales going forward.
Patch included inline below.
This is a re-submission for consideration for 2.29 at the request of
Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.
The glibc wiki explicitly lists this use case as the test example
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt
currently it fails on Cyrillic texts in most locales including ru_RU [1]
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
The root problem is the missing transliteration table that I am
supplying here. Furthermore, it has to be referenced/included in the
active locale at compilation time to be used by iconv.
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
While a UTF-encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration has only ASCII codes but still can be read by a native
speaker. Among other things it is useful for processing the Cyrillic
texts and filenames by programs or on systems that are not specifically
prepared to work with Cyrillic, don't have corresponding fonts installed
or can't handle UTF-8.
The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on the official GOST 7.79-2000 source
(Federal Agency on Technical Regulating and Metrology of the Russian
Federation [2]). In practice, an independent but identical source [3] was
used and prepared in a spreadsheet [6].
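As a rough illustration of what such a table does, the sketch below applies a
small character map in Python. The seven pairs are read off the expected
output quoted earlier in this message; the names SAMPLE_TABLE and
transliterate are hypothetical, and this is neither the contents of
translit_cyrillic nor a description of how iconv works internally.

```python
# Illustrative subset only: pairs taken from the expected output quoted in
# this message, not copied from the translit_cyrillic file.
SAMPLE_TABLE = {
    "а": "a",
    "е": "e",
    "ё": "yo",
    "ж": "zh",
    "ч": "ch",
    "щ": "shh",
    "ю": "yu",
}


def transliterate(text, table=SAMPLE_TABLE, fallback="?"):
    """Map each character through the table; ASCII passes through unchanged."""
    out = []
    for ch in text:
        if ch in table:
            out.append(table[ch])
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Unmapped non-ASCII character: emit '?', like the failing
            # output shown in the bug report.
            out.append(fallback)
    return "".join(out)


print(transliterate("ещё"))
print(transliterate("же чаю"))
```

Any Cyrillic letter missing from the subset comes out as a question mark,
which is the same symptom the bug report describes for the whole alphabet.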
The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I have searched for all locales that have a
translit_start/end stanza and generated a patch for them.
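The search-and-patch step could look roughly like the following Python
sketch. It is an assumption about the procedure, not the script actually used
to generate this patch; the function name add_cyrillic_include is
hypothetical.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text):
    """Insert the include line before the first translit_end line.

    The text is returned unchanged when it has no translit_end line or
    when the include line is already present.
    """
    lines = locale_text.splitlines(keepends=True)
    if any(line.strip() == INCLUDE_LINE for line in lines):
        return locale_text
    for i, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(i, INCLUDE_LINE + "\n")
            return "".join(lines)
    return locale_text


# A minimal stand-in for the tail of a locale file's LC_CTYPE section.
sample = (
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_cyrillic_include(sample), end="")
```

Placing the new include last, directly before translit_end, matches the
position used in every hunk of the diff below.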
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
However, it would not be the standard Russian Cyrillic transliteration
described above.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
exclusion.
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8590
[7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=8618
Best regards,
Egor Kobylkin
---
[BZ #2872]
* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
table from Cyrillic to Latin.
* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
section.
* locales/aa_DJ: likewise
* locales/af_ZA: likewise
* locales/ak_GH: likewise
* locales/am_ET: likewise
* locales/ar_EG: likewise
* locales/be_BY: likewise
* locales/bem_ZM: likewise
* locales/ber_DZ: likewise
* locales/ber_MA: likewise
* locales/bg_BG: likewise
* locales/bi_VU: likewise
* locales/bn_BD: likewise
* locales/bo_CN: likewise
* locales/ca_ES: likewise
* locales/ce_RU: likewise
* locales/cs_CZ: likewise
* locales/cv_RU: likewise
* locales/cy_GB: likewise
* locales/da_DK: likewise
* locales/de_DE: likewise
* locales/dv_MV: likewise
* locales/dz_BT: likewise
* locales/el_GR: likewise
* locales/en_GB: likewise
* locales/en_NG: likewise
* locales/en_ZM: likewise
* locales/es_CU: likewise
* locales/es_ES: likewise
* locales/et_EE: likewise
* locales/fa_IR: likewise
* locales/ff_SN: likewise
* locales/fi_FI: likewise
* locales/fr_FR: likewise
* locales/ga_IE: likewise
* locales/gd_GB: likewise
* locales/gu_IN: likewise
* locales/gv_GB: likewise
* locales/he_IL: likewise
* locales/hi_IN: likewise
* locales/hif_FJ: likewise
* locales/hr_HR: likewise
* locales/ht_HT: likewise
* locales/hu_HU: likewise
* locales/hy_AM: likewise
* locales/id_ID: likewise
* locales/is_IS: likewise
* locales/it_IT: likewise
* locales/ja_JP: likewise
* locales/kk_KZ: likewise
* locales/km_KH: likewise
* locales/kn_IN: likewise
* locales/ko_KR: likewise
* locales/ks_IN: likewise
* locales/kw_GB: likewise
* locales/lb_LU: likewise
* locales/lg_UG: likewise
* locales/lij_IT: likewise
* locales/ln_CD: likewise
* locales/lo_LA: likewise
* locales/lt_LT: likewise
* locales/lv_LV: likewise
* locales/mg_MG: likewise
* locales/mhr_RU: likewise
* locales/mk_MK: likewise
* locales/ml_IN: likewise
* locales/ms_MY: likewise
* locales/mt_MT: likewise
* locales/nb_NO: likewise
* locales/ne_NP: likewise
* locales/nhn_MX: likewise
* locales/niu_NU: likewise
* locales/niu_NZ: likewise
* locales/nl_NL: likewise
* locales/nr_ZA: likewise
* locales/oc_FR: likewise
* locales/om_KE: likewise
* locales/or_IN: likewise
* locales/os_RU: likewise
* locales/pa_IN: likewise
* locales/pa_PK: likewise
* locales/pl_PL: likewise
* locales/pt_PT: likewise
* locales/quz_PE: likewise
* locales/ro_RO: likewise
* locales/ru_RU: likewise
* locales/rw_RW: likewise
* locales/sa_IN: likewise
* locales/sd_IN: likewise
* locales/sd_PK: likewise
* locales/se_NO: likewise
* locales/sgs_LT: likewise
* locales/si_LK: likewise
* locales/sk_SK: likewise
* locales/sl_SI: likewise
* locales/sm_WS: likewise
* locales/so_SO: likewise
* locales/sq_AL: likewise
* locales/ss_ZA: likewise
* locales/st_ZA: likewise
* locales/sv_SE: likewise
* locales/sw_KE: likewise
* locales/ta_IN: likewise
* locales/te_IN: likewise
* locales/th_TH: likewise
* locales/ti_ET: likewise
* locales/tn_ZA: likewise
* locales/to_TO: likewise
* locales/tpi_PG: likewise
* locales/tr_TR: likewise
* locales/ts_ZA: likewise
* locales/unm_US: likewise
* locales/ur_IN: likewise
* locales/ur_PK: likewise
* locales/ve_ZA: likewise
* locales/vi_VN: likewise
* locales/wa_BE: likewise
* locales/wo_SN: likewise
* locales/xh_ZA: likewise
* locales/yi_US: likewise
* locales/zh_CN: likewise
* locales/zu_ZA: likewise
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
@@ -2292,6 +2292,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
@@ -56,6 +56,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
@@ -1396,6 +1396,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
@@ -166,6 +166,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
@@ -86,6 +86,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
@@ -2311,6 +2311,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
@@ -167,6 +167,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
@@ -52,6 +52,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
@@ -50,6 +50,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
@@ -79,6 +79,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
@@ -137,6 +137,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
@@ -47,6 +47,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
@@ -153,6 +153,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
@@ -478,6 +478,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
@@ -2161,6 +2161,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
@@ -1682,6 +1682,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
@@ -158,6 +158,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
@@ -873,6 +873,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
@@ -78,6 +78,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "<U0065><U005E>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
@@ -51,6 +51,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
@@ -2122,6 +2122,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
@@ -53,6 +53,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
@@ -140,6 +140,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
@@ -142,6 +142,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
@@ -144,6 +144,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
@@ -74,6 +74,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
+0000
+0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
@@ -205,6 +205,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
@@ -91,6 +91,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
@@ -139,6 +139,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
@@ -866,6 +866,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
@@ -2430,6 +2430,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
@@ -0,0 +1,151 @@
+escape_char /
+comment_char %
+
+% Transliteration table that converts Cyrillic letters to ASCII symbols, inspired by GOST 7.79-2000
+% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=8590
+% Up to three characters are required to do a reversible transliteration.
+
+LC_CTYPE
+
+translit_start
+
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>";<U0059>
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>";<U005A>
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>";<U0043>
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>";<U0043>
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0060><U0060>";<U0060>
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0027>";<U0059>
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>";<U0045>
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>";<U0059>
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>";<U0059>
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>";<U007A>
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>";<U0063>
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>";<U0063>
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>";<U0073>
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>";<U0073>
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>";<U0060>
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0027>";<U0079>
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>";<U0065>
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>";<U0079>
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>";<U0079>
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>";<U0079>
+
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
@@ -69,6 +69,7 @@
<U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
<U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
@@ -73,6 +73,7 @@
<U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
<U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
<U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
Egor Kobylkin
2018-10-03 09:32:00 UTC
Permalink
Post by Keld Simonsen
Hi
Please note that transliteration of Cyrillic to Latin is not universal.
There are different schemes for, for example, German, English and Danish, and
there is also an ISO standard for it.
Thanks for your feedback, Keld!

Could the locale maintainers that wouldn't like to include this patch
explicitly state so here?

That is:
- In the case that there is a different preferred cyrillic
transliteration table for any specific locale their maintainers may want
to point me to it so I can supply a separate table/patch.
- Or they could state explicitly that for some reason they would like to
exclude their locale from the patch for a default cyrillic
transliteration altogether.

--Egor
Post by Keld Simonsen
But do go forward with fixing this bug.
Best regards
Keld
Post by Egor Kobylkin
Ping.
In the absence of feedback, I am wondering whether anything is missing
from this patch from the maintainers' standpoint. More than two months
have passed since the original submission.
If I can be of assistance, please do not hesitate to contact me,
Egor Kobylkin
Post by Egor Kobylkin
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add Cyrillic transliteration table translit_cyrillic file
https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
to localedata/locales/ and include it in all your locales going forward.
Patch included inline below.
This is a re-submission for the consideration for 2.29 on a request from
Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.
The glibc wiki explicitly lists this use case as the test example:
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
Currently it fails on Cyrillic texts in most locales, including ru_RU [1] [8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
This is what it should produce, and what it does produce once the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
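To make the expected output concrete, here is a minimal Python sketch of the first-choice mappings in the translit_cyrillic table. It is an illustration only, not how iconv or glibc implement transliteration; the names LATIN, TABLE and translit are hypothetical, and unlike iconv the sketch passes unmapped characters through unchanged instead of replacing them.

```python
# Illustrative sketch only: first-choice mappings copied from translit_cyrillic.
# LATIN, TABLE and translit are hypothetical names, not glibc identifiers.

# Cyrillic small letters U+0430..U+044F, in code point order.
LOWER = [chr(cp) for cp in range(0x0430, 0x0450)]

# Primary replacement for each of the 32 letters above, in the same order.
LATIN = ("a b v g d e zh z i j k l m n o p r s t u f x cz ch sh shh "
         "`` y' ` e` yu ya").split()

TABLE = dict(zip(LOWER, LATIN))
TABLE["\u0451"] = "yo"  # CYRILLIC SMALL LETTER IO

# The table maps capital letters to all-caps replacements (e.g. ZH, SHH, YO).
TABLE.update({cyr.upper(): lat.upper() for cyr, lat in list(TABLE.items())})


def translit(text):
    """Replace each mapped Cyrillic letter; leave every other character as-is."""
    return "".join(TABLE.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(translit("Съешь ещё этих мягких французских булок, да выпей же чаю."))
```

Running the sketch on the pangram from translit-test-input.txt should reproduce the transliterated line quoted in this message.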
The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be referenced/included into the
active locale at the compilation time to be used by iconv.
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text written in the Cyrillic alphabet to ASCII via
ASCII//TRANSLIT.
While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration contains only ASCII codes but can still be read by a
native speaker. Among other things, it is useful for processing Cyrillic
texts and filenames with programs or on systems that are not specifically
prepared to work with Cyrillic, lack the corresponding fonts,
or cannot handle UTF-8.
The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on GOST 7.79-2000 official source
(Federal Agency on Technical Regulating and Metrology Of Russian
Federation [2]). Technically an independent but identical source [3] was
used and prepared in a spreadsheet [6].
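Each entry in the table names one source character followed by one or more candidate replacements separated by semicolons; the alternatives are understood to be tried in order until one can be represented in the target character set. The Python sketch below decodes such an entry for inspection. decode_entry is a hypothetical helper written for this illustration, not part of glibc, and it assumes the simple one-entry-per-line form used in translit_cyrillic.

```python
import re

# Matches one <UXXXX> symbol as written in the locale source files.
UCS_SYMBOL = re.compile(r"<U([0-9A-Fa-f]{4})>")


def symbols_to_text(field):
    """Turn a run of <UXXXX> symbols (optionally quoted) into a string."""
    return "".join(chr(int(hexcode, 16)) for hexcode in UCS_SYMBOL.findall(field))


def decode_entry(entry):
    """Split one table line into (source character, list of alternatives)."""
    source, _, replacements = entry.strip().partition(" ")
    alternatives = [symbols_to_text(alt) for alt in replacements.split(";")]
    return symbols_to_text(source), alternatives


if __name__ == "__main__":
    # CYRILLIC CAPITAL LETTER ZHE: first choice "ZH", fallback "Z".
    print(decode_entry('<U0416> "<U005A><U0048>";<U005A>'))
```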
The documentation suggests that a transliteration table is included
by adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
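The per-locale change is mechanical: one include line placed directly before translit_end. The Python sketch below shows one way such an edit could be generated. It is not the script that produced this patch; add_include and INCLUDE_LINE are hypothetical names, and the function assumes each file has at most one translit_end line.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_include(locale_text):
    """Insert the Cyrillic include before translit_end, if not already there."""
    lines = locale_text.splitlines()
    stripped = [line.strip() for line in lines]
    # Leave files alone that have no translit section or already include the table.
    if INCLUDE_LINE in stripped or "translit_end" not in stripped:
        return locale_text
    lines.insert(stripped.index("translit_end"), INCLUDE_LINE)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = (
        "translit_start\n"
        'include "translit_combining";""\n'
        "translit_end\n"
        "END LC_CTYPE\n"
    )
    print(add_include(sample), end="")
```

Applying the function a second time leaves the text unchanged, so it is safe to re-run over a tree that has already been patched.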
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
However it would not be the standard Russian Cyrillic transliteration as
described above.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
exclusion.
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8590
[7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=8618
Best regards,
Egor Kobylkin
---
[BZ #2872]
* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
table from Cyrillic to Latin.
* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
section.
* locales/aa_DJ: likewise
* locales/af_ZA: likewise
* locales/ak_GH: likewise
* locales/am_ET: likewise
* locales/ar_EG: likewise
* locales/be_BY: likewise
* locales/bem_ZM: likewise
* locales/ber_DZ: likewise
* locales/ber_MA: likewise
* locales/bg_BG: likewise
* locales/bi_VU: likewise
* locales/bn_BD: likewise
* locales/bo_CN: likewise
* locales/ca_ES: likewise
* locales/ce_RU: likewise
* locales/cs_CZ: likewise
* locales/cv_RU: likewise
* locales/cy_GB: likewise
* locales/da_DK: likewise
* locales/de_DE: likewise
* locales/dv_MV: likewise
* locales/dz_BT: likewise
* locales/el_GR: likewise
* locales/en_GB: likewise
* locales/en_NG: likewise
* locales/en_ZM: likewise
* locales/es_CU: likewise
* locales/es_ES: likewise
* locales/et_EE: likewise
* locales/fa_IR: likewise
* locales/ff_SN: likewise
* locales/fi_FI: likewise
* locales/fr_FR: likewise
* locales/ga_IE: likewise
* locales/gd_GB: likewise
* locales/gu_IN: likewise
* locales/gv_GB: likewise
* locales/he_IL: likewise
* locales/hi_IN: likewise
* locales/hif_FJ: likewise
* locales/hr_HR: likewise
* locales/ht_HT: likewise
* locales/hu_HU: likewise
* locales/hy_AM: likewise
* locales/id_ID: likewise
* locales/is_IS: likewise
* locales/it_IT: likewise
* locales/ja_JP: likewise
* locales/kk_KZ: likewise
* locales/km_KH: likewise
* locales/kn_IN: likewise
* locales/ko_KR: likewise
* locales/ks_IN: likewise
* locales/kw_GB: likewise
* locales/lb_LU: likewise
* locales/lg_UG: likewise
* locales/lij_IT: likewise
* locales/ln_CD: likewise
* locales/lo_LA: likewise
* locales/lt_LT: likewise
* locales/lv_LV: likewise
* locales/mg_MG: likewise
* locales/mhr_RU: likewise
* locales/mk_MK: likewise
* locales/ml_IN: likewise
* locales/ms_MY: likewise
* locales/mt_MT: likewise
* locales/nb_NO: likewise
* locales/ne_NP: likewise
* locales/nhn_MX: likewise
* locales/niu_NU: likewise
* locales/niu_NZ: likewise
* locales/nl_NL: likewise
* locales/nr_ZA: likewise
* locales/oc_FR: likewise
* locales/om_KE: likewise
* locales/or_IN: likewise
* locales/os_RU: likewise
* locales/pa_IN: likewise
* locales/pa_PK: likewise
* locales/pl_PL: likewise
* locales/pt_PT: likewise
* locales/quz_PE: likewise
* locales/ro_RO: likewise
* locales/ru_RU: likewise
* locales/rw_RW: likewise
* locales/sa_IN: likewise
* locales/sd_IN: likewise
* locales/sd_PK: likewise
* locales/se_NO: likewise
* locales/sgs_LT: likewise
* locales/si_LK: likewise
* locales/sk_SK: likewise
* locales/sl_SI: likewise
* locales/sm_WS: likewise
* locales/so_SO: likewise
* locales/sq_AL: likewise
* locales/ss_ZA: likewise
* locales/st_ZA: likewise
* locales/sv_SE: likewise
* locales/sw_KE: likewise
* locales/ta_IN: likewise
* locales/te_IN: likewise
* locales/th_TH: likewise
* locales/ti_ET: likewise
* locales/tn_ZA: likewise
* locales/to_TO: likewise
* locales/tpi_PG: likewise
* locales/tr_TR: likewise
* locales/ts_ZA: likewise
* locales/unm_US: likewise
* locales/ur_IN: likewise
* locales/ur_PK: likewise
* locales/ve_ZA: likewise
* locales/vi_VN: likewise
* locales/wa_BE: likewise
* locales/wo_SN: likewise
* locales/xh_ZA: likewise
* locales/yi_US: likewise
* locales/zh_CN: likewise
* locales/zu_ZA: likewise
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
@@ -2292,6 +2292,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
@@ -56,6 +56,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
@@ -1396,6 +1396,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
@@ -166,6 +166,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
@@ -86,6 +86,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
@@ -2311,6 +2311,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
@@ -167,6 +167,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
@@ -52,6 +52,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
@@ -50,6 +50,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
@@ -79,6 +79,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
@@ -137,6 +137,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
@@ -47,6 +47,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
@@ -153,6 +153,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
@@ -478,6 +478,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
@@ -2161,6 +2161,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
@@ -1682,6 +1682,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
@@ -158,6 +158,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
@@ -873,6 +873,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
@@ -78,6 +78,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "<U0065><U005E>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
@@ -51,6 +51,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
@@ -2122,6 +2122,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
@@ -53,6 +53,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
@@ -140,6 +140,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
@@ -142,6 +142,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
@@ -144,6 +144,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
@@ -74,6 +74,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
+0000
+0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
@@ -205,6 +205,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
@@ -91,6 +91,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
@@ -139,6 +139,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
@@ -866,6 +866,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
@@ -2430,6 +2430,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
@@ -0,0 +1,151 @@
+escape_char /
+comment_char %
+
+% Transliterations that convert Cyrillic letters to ASCII symbols, inspired by GOST 7.79-2000
+% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=8590
+% Up to three characters are required to do a reversible transliteration.
+
+LC_CTYPE
+
+translit_start
+
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>";<U0059>
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>";<U005A>
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>";<U0043>
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>";<U0043>
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0060><U0060>";<U0060>
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0027>";<U0059>
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>";<U0045>
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>";<U0059>
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>";<U0059>
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>";<U007A>
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>";<U0063>
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>";<U0063>
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>";<U0073>
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>";<U0073>
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>";<U0060>
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0027>";<U0079>
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>";<U0065>
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>";<U0079>
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>";<U0079>
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>";<U0079>
+
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
@@ -69,6 +69,7 @@
<U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
<U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
@@ -73,6 +73,7 @@
<U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
<U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
<U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
Marko Myllynen
2018-10-05 08:43:46 UTC
Permalink
Hi Egor,

Thanks for your patience with this one.
Post by Egor Kobylkin
Post by Keld Simonsen
Please note that transliteration of Cyrillic to Latin is not universal.
There are different schemes for, for example, German, English and Danish, and
there is also an ISO standard for it.
Thanks for your feedback, Keld!
Could the locale maintainers that wouldn't like to include this patch
explicitly state so here?
- In the case that there is a different preferred cyrillic
transliteration table for any specific locale their maintainers may want
to point me to it so I can supply a separate table/patch.
- Or they could state explicitly that for some reason they would like to
exclude their locale from the patch for a default cyrillic
transliteration altogether.
The Wikipedia article https://en.wikipedia.org/wiki/ISO_9 helps to
understand that ISO 9:1995 and GOST 7.79-2000 System A are identical so
perhaps you could mention both ISO 9 and the Wikipedia article in the
commit log. translit_cyrillic includes every transliteration defined in
ISO 9:1995 and GOST 7.79-2000, correct?

I think it would be best to leave the locales that already have Cyrillic
transliteration defined as-is (as you've done) unless there are some
issues with them; there's probably a good reason why those tables were
added in the first place.

For other locales, using ISO 9 instead of not doing transliteration at
all may not be entirely correct, but I'd suppose it's better to provide
at least some sort of transliteration than sequences of question marks.
But as you say, locale maintainers may know the case for individual
locales better.

Regarding the language-specific differences Keld mentioned, the Finnish
Wikipedia article on transliteration gives an example; see the table on
the right at https://fi.wikipedia.org/wiki/Siirtokirjoitus for Russian /
international / Finnish / Swedish / English / French / German / Polish /
phonetic transliteration of a Russian name. (The table also shows that
ASCII letters are not enough for correct transliteration in some
languages.)

Some of the differences and language-specific aspects are probably
impossible to take fully into account within the locale system we have
today. For example, in Finnish (the tables at
http://jkorpela.fi/iso9.html8 and
https://fi.wikipedia.org/wiki/Ven%C3%A4j%C3%A4n_translitterointi might
also be helpful):

1) transliteration of Russian is mostly as per ISO 9 but with national
differences defined in SFS 4900
2) transliteration of Russian and Ukrainian names have some slight
differences according to http://jkorpela.fi/iso9.html8
3) transliteration of a letter depends on its position within a word or
pronunciation of adjacent letters, for example U+0435 becomes U+0065 (e)
except when at the beginning of a word it becomes U+006A U+0065 (je)

Hopefully we'll hear comments from others as well. Once your patch is
merged, I'll try to come up with the needed locale-specific changes for
fi_FI. Some of the differences referred to in 1) above are
straightforward to implement, but for 2) and 3) some compromises will
probably need to be made, unfortunately.
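
To illustrate why point 3) does not fit a static per-character table, here is a minimal Python sketch. It is not part of the patch and not how glibc works internally; the function name and the tiny letter subset are hypothetical, and the only rule it encodes is the one given above (U+0435 becomes "e", except at the beginning of a word, where it becomes "je"). Capital letters and the rules from 1) and 2) are deliberately left out.

```python
import re

# Illustrative subset of a per-character table (assumption: lowercase only).
BASE = {"е": "e", "л": "l", "с": "s", "н": "n", "а": "a"}

# U+0435 at the start of a word; \b is Unicode-aware for str patterns.
WORD_INITIAL_IE = re.compile(r"\bе")


def translit_positional(text):
    """Apply the word-initial rule first, then the plain per-character table."""
    marked = WORD_INITIAL_IE.sub("je", text)
    return "".join(BASE.get(ch, ch) for ch in marked)


print(translit_positional("елена"))  # jelena
print(translit_positional("лес"))    # les
```

The positional rule needs to look at the neighbouring characters, which a table mapping one code point to a fixed list of replacements cannot express.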

Thanks,
Post by Egor Kobylkin
Post by Keld Simonsen
Post by Egor Kobylkin
Ping.
Absent feedback, I am wondering whether anything is missing from this
patch from the maintainers' standpoint. More than two months have passed
since the original submission.
If I can be of assistance, please do not hesitate to contact me,
Egor Kobylkin
Post by Egor Kobylkin
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add Cyrillic transliteration table translit_cyrillic file
https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
to localedata/locales/ and include it in all your locales going forward.
Patch included inline below.
This is a re-submission for the consideration for 2.29 on a request from
Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.
The glibc wiki explicitly lists this use case as the test example
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
currently it fails on Cyrillic texts in most locales including ru_RU [1]
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be referenced/included into the
active locale at the compilation time to be used by iconv.
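
As a rough cross-check of the mapping, independent of glibc and iconv, the first alternative of every entry in the proposed table can be applied with a few lines of Python. This is only a sketch: the names LOWER, TABLE and translit are made up for the illustration, the capital-letter entries are derived by upper-casing (which happens to match the table in the patch), and the single-letter fallbacks after the semicolon are ignored.

```python
# First (preferred) replacement for each lowercase letter in translit_cyrillic.
LOWER = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "x", "ц": "cz", "ч": "ch", "ш": "sh", "щ": "shh",
    "ъ": "``", "ы": "y'", "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}

# The capital entries in the patch are the upper-cased forms (e.g. "SHH").
TABLE = dict(LOWER)
TABLE.update({src.upper(): dst.upper() for src, dst in LOWER.items()})


def translit(text):
    """Replace each Cyrillic letter; leave every other character untouched."""
    return "".join(TABLE.get(ch, ch) for ch in text)


print(translit("Съешь ещё этих мягких французских булок, да выпей же чаю."))
# S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
```

The result matches the CYRILLIC line quoted above as the expected iconv output after the patch.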
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on Cyrillic alphabet to an ASCII//TRANSLIT text.
While a UTF-encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration has only ASCII codes but can still be read by a native
speaker. Among other things, it is useful for processing Cyrillic
texts and filenames by programs or on systems that are not specifically
prepared to work with Cyrillic, don't have the corresponding fonts
installed, or can't handle UTF-8.
The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on GOST 7.79-2000 official source
(Federal Agency on Technical Regulating and Metrology Of Russian
Federation [2]). Technically an independent but identical source [3] was
used and prepared in a spreadsheet [6].
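
For readers unfamiliar with the entry syntax used in translit_cyrillic, the following Python sketch decodes a single rule line into the source character and its list of replacements. It is an illustration only: parse_rule and decode are hypothetical helpers, not glibc code, and the sketch assumes (as I understand the format) that the replacements separated by semicolons are alternatives tried in order. It handles only rule lines, not comments or include directives.

```python
import re

# Matches the <Uxxxx> symbolic names used in the locale sources.
TOKEN = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(symbols):
    """Turn a run of <Uxxxx> tokens into the string they denote."""
    return "".join(chr(int(hexcode, 16)) for hexcode in TOKEN.findall(symbols))


def parse_rule(line):
    """Split one rule into (source character, [replacement alternatives])."""
    source, _, targets = line.strip().partition(" ")
    return decode(source), [decode(alt) for alt in targets.split(";")]


print(parse_rule('<U0429> "<U0053><U0048><U0048>";<U0053>'))
# ('Щ', ['SHH', 'S'])
```

Here the three-letter form "SHH" is the preferred replacement and the single letter "S" is the shorter fallback listed after it.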
The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I have searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
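
The thread does not show how the patch was actually generated, so the following Python sketch is only one possible way to do the step described above. The function name and the skip rule (leave alone any file that already mentions cyrillic) are assumptions for the illustration; the function works on the text of one locale file and places the include line directly before translit_end, as the hunks in the patch do.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text):
    """Return locale_text with INCLUDE_LINE inserted before translit_end.

    Text without a translit_end line, or text that already mentions
    cyrillic, is returned unchanged.
    """
    if "cyrillic" in locale_text.lower():
        return locale_text
    lines = locale_text.split("\n")
    for index, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(index, INCLUDE_LINE)
            break
    return "\n".join(lines)


sample = (
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_cyrillic_include(sample))
```

Because the function skips text that already contains the include, running it twice over the same file leaves the file unchanged after the first run.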
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
However it would not be the standard Russian Cyrillic transliteration as
described above.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
exclusion.
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8590
[7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=8618
Best regards,
Egor Kobylkin
---
[BZ #2872]
* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
table from Cyrillic to Latin.
* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
section.
* locales/aa_DJ: likewise
* locales/af_ZA: likewise
* locales/ak_GH: likewise
* locales/am_ET: likewise
* locales/ar_EG: likewise
* locales/be_BY: likewise
* locales/bem_ZM: likewise
* locales/ber_DZ: likewise
* locales/ber_MA: likewise
* locales/bg_BG: likewise
* locales/bi_VU: likewise
* locales/bn_BD: likewise
* locales/bo_CN: likewise
* locales/ca_ES: likewise
* locales/ce_RU: likewise
* locales/cs_CZ: likewise
* locales/cv_RU: likewise
* locales/cy_GB: likewise
* locales/da_DK: likewise
* locales/de_DE: likewise
* locales/dv_MV: likewise
* locales/dz_BT: likewise
* locales/el_GR: likewise
* locales/en_GB: likewise
* locales/en_NG: likewise
* locales/en_ZM: likewise
* locales/es_CU: likewise
* locales/es_ES: likewise
* locales/et_EE: likewise
* locales/fa_IR: likewise
* locales/ff_SN: likewise
* locales/fi_FI: likewise
* locales/fr_FR: likewise
* locales/ga_IE: likewise
* locales/gd_GB: likewise
* locales/gu_IN: likewise
* locales/gv_GB: likewise
* locales/he_IL: likewise
* locales/hi_IN: likewise
* locales/hif_FJ: likewise
* locales/hr_HR: likewise
* locales/ht_HT: likewise
* locales/hu_HU: likewise
* locales/hy_AM: likewise
* locales/id_ID: likewise
* locales/is_IS: likewise
* locales/it_IT: likewise
* locales/ja_JP: likewise
* locales/kk_KZ: likewise
* locales/km_KH: likewise
* locales/kn_IN: likewise
* locales/ko_KR: likewise
* locales/ks_IN: likewise
* locales/kw_GB: likewise
* locales/lb_LU: likewise
* locales/lg_UG: likewise
* locales/lij_IT: likewise
* locales/ln_CD: likewise
* locales/lo_LA: likewise
* locales/lt_LT: likewise
* locales/lv_LV: likewise
* locales/mg_MG: likewise
* locales/mhr_RU: likewise
* locales/mk_MK: likewise
* locales/ml_IN: likewise
* locales/ms_MY: likewise
* locales/mt_MT: likewise
* locales/nb_NO: likewise
* locales/ne_NP: likewise
* locales/nhn_MX: likewise
* locales/niu_NU: likewise
* locales/niu_NZ: likewise
* locales/nl_NL: likewise
* locales/nr_ZA: likewise
* locales/oc_FR: likewise
* locales/om_KE: likewise
* locales/or_IN: likewise
* locales/os_RU: likewise
* locales/pa_IN: likewise
* locales/pa_PK: likewise
* locales/pl_PL: likewise
* locales/pt_PT: likewise
* locales/quz_PE: likewise
* locales/ro_RO: likewise
* locales/ru_RU: likewise
* locales/rw_RW: likewise
* locales/sa_IN: likewise
* locales/sd_IN: likewise
* locales/sd_PK: likewise
* locales/se_NO: likewise
* locales/sgs_LT: likewise
* locales/si_LK: likewise
* locales/sk_SK: likewise
* locales/sl_SI: likewise
* locales/sm_WS: likewise
* locales/so_SO: likewise
* locales/sq_AL: likewise
* locales/ss_ZA: likewise
* locales/st_ZA: likewise
* locales/sv_SE: likewise
* locales/sw_KE: likewise
* locales/ta_IN: likewise
* locales/te_IN: likewise
* locales/th_TH: likewise
* locales/ti_ET: likewise
* locales/tn_ZA: likewise
* locales/to_TO: likewise
* locales/tpi_PG: likewise
* locales/tr_TR: likewise
* locales/ts_ZA: likewise
* locales/unm_US: likewise
* locales/ur_IN: likewise
* locales/ur_PK: likewise
* locales/ve_ZA: likewise
* locales/vi_VN: likewise
* locales/wa_BE: likewise
* locales/wo_SN: likewise
* locales/xh_ZA: likewise
* locales/yi_US: likewise
* locales/zh_CN: likewise
* locales/zu_ZA: likewise
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
@@ -2292,6 +2292,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
@@ -56,6 +56,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
@@ -1396,6 +1396,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
@@ -166,6 +166,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
@@ -86,6 +86,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
@@ -2311,6 +2311,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
@@ -167,6 +167,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
@@ -52,6 +52,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
@@ -50,6 +50,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
@@ -79,6 +79,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
@@ -137,6 +137,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
@@ -47,6 +47,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
@@ -153,6 +153,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
@@ -478,6 +478,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
@@ -2161,6 +2161,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
@@ -1682,6 +1682,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
@@ -158,6 +158,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
@@ -873,6 +873,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
@@ -78,6 +78,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "<U0065><U005E>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
@@ -51,6 +51,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
@@ -2122,6 +2122,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
@@ -53,6 +53,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
@@ -140,6 +140,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
@@ -142,6 +142,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
@@ -144,6 +144,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
@@ -74,6 +74,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
@@ -205,6 +205,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
@@ -91,6 +91,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
@@ -139,6 +139,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
@@ -866,6 +866,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
@@ -2430,6 +2430,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
@@ -0,0 +1,151 @@
+escape_char /
+comment_char %
+
+% Transliteration of Cyrillic letters to ASCII symbols, inspired by GOST 7.79-2000
+% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=8590
+% Up to three characters are required to do a reversible transliteration.
+
+LC_CTYPE
+
+translit_start
+
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>";<U0059>
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>";<U005A>
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>";<U0043>
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>";<U0043>
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0060><U0060>";<U0060>
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0027>";<U0059>
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>";<U0045>
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>";<U0059>
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>";<U0059>
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>";<U007A>
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>";<U0063>
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>";<U0063>
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>";<U0073>
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>";<U0073>
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>";<U0060>
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0027>";<U0079>
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>";<U0065>
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>";<U0079>
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>";<U0079>
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>";<U0079>
+
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
@@ -69,6 +69,7 @@
<U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
<U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
@@ -73,6 +73,7 @@
<U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
<U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
<U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
--
Marko Myllynen
Rafal Luzynski
2018-10-05 09:20:25 UTC
Permalink
Post by Egor Kobylkin
Post by Keld Simonsen
Hi
Please note that transliteration of Cyrillic to Latin is not universal.
There are different schemes for, for example, German, English and Danish, and
there is also an ISO standard for it.
Thanks for your feedback, Keld!
Could the locale maintainers that wouldn't like to include this patch
explicitly state so here?
I think this is about me, so I must reply. I am sorry about that and the sole
reason is my lack of time. I'm just a volunteer here, which means it's not
my regular job to work on locale data, or on anything in glibc or any other
open source project. I do these things only in my free time, of which I don't
have much. Of course you will see my contributions here and there, but they
are either trivial or take me months to complete. Your patches are on my
radar but I can't give any ETA for them. Of course, there are other people
around here and they are all welcome to come and join.
Post by Egor Kobylkin
- In the case that there is a different preferred cyrillic
transliteration table for any specific locale their maintainers may want
to point me to it so I can supply a separate table/patch.
- Or they could state explicitly that for some reason they would like to
exclude their locale from the patch for a default cyrillic
transliteration altogether.
As Keld wrote, there are probably separate rules for every language, so
I don't think you should treat your rules as universal and include them
in every locale. At first sight, it seems to me they work only for English
(as a destination locale). Also, although it is called "transliteration
from Cyrillic", it seems that it covers only the Russian alphabet. What about
other languages which use the Cyrillic alphabet but add their own diacritic
characters? Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
Mari, Ossetian, Yakut, Tatar, and more. What about languages which use the
Cyrillic alphabet but transliterate their respective letters in a different
way than Russian? For example, Russian "Ъ" is (I think) usually skipped
in transliteration, whereas I think you propose "``"; but when transliterating
from Bulgarian they usually transliterate this as "ă".

A few remarks:

* I think you transliterate "щ" as "shh"; wouldn't "shch" be better?
* You transliterate "ц" as "cz"; wouldn't "ts" be better? By the way,
in Polish "cz" is a correct transliteration of "ч".
* You transliterate "й" as "j"; this is fine in many languages, but wouldn't
"y" be better in English?
* In the case of "е": how will you know whether it is correct to transliterate
it to "e" or "ie" or "je" or "ye"?
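
As a rough illustration of the remarks above (not part of the patch or of
glibc), a per-character table can be sketched in Python. The first dictionary
holds the mappings attributed to the patch in these remarks; the second
overrides them with the alternatives suggested here. The names PATCH_RULES,
ALTERNATIVE_RULES and transliterate are made up for this sketch.

```python
# Per-character mappings quoted in the remarks above (as attributed to the patch).
PATCH_RULES = {"щ": "shh", "ц": "cz", "й": "j", "ъ": "``"}

# Alternatives floated in the same remarks; later keys override earlier ones.
ALTERNATIVE_RULES = {**PATCH_RULES, "щ": "shch", "ц": "ts", "й": "y"}


def transliterate(text, rules):
    """Replace each character that has a rule; leave everything else as-is."""
    return "".join(rules.get(ch, ch) for ch in text)


print(transliterate("щ ц й ъ", PATCH_RULES))
print(transliterate("щ ц й ъ", ALTERNATIVE_RULES))
```

A table of this shape maps one input character to one fixed output, so it
cannot express the context question raised for "е" in the last remark.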

These remarks are obviously incomplete, your patch deserves much more
attention to review.

Best regards,

Rafal
Egor Kobylkin
2018-10-05 10:36:29 UTC
Permalink
removed a png image attachment

Keld, Marko, Rafal, other locale maintainers,

all of this is written with a minimal viable fix for this bug in mind,
as soon as possible. I want to avoid wasting the maintainers' time by getting
into fundamental discussions here (although there are perfectly good reasons
for them).

I see three options:
1. those locale maintainers that are fine with using ISO
9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
in their locales. https://sourceware.org/bugzilla/attachment.cgi?id=11289
2. those that want to have a differing table can create their own
variety based on the spreadsheet I have prepared
https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
this patch.
3. those that want to omit a cyrillic transliteration altogether for now
state so and just carry over the bug #2872 from the year 2006.

Does this make sense to you?

Just to be super clear on this: the patch is a stopgap _ASCII_
transliteration table. ASCII being AMERICAN Standard Code for
Information Interchange, that is obviously orthogonal to any
transliteration rule of other countries. As such it is not explicitly
targeting transliteration standards of any country.

The fact that the patch is reflecting Russian variety of ISO
9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
available and can be helpful to a majority of cyrillic users b) I have
access to it including via being proficient in Russian.

It is offered to all the respective locale maintainers as a stopgap
solution. Stopgap in the sense that it is better to have some
transliteration than not to have any at all and carry over the bug from
2006. That it may be a somewhat officially correct transliteration for
ru_RU is a bonus. In that sense I would dub the discussion on the
correctness for other languages "offtopic". Let me know if this is not OK.

You are all correctly mentioning the deficiencies of this approach.
However, I couldn't find a better straightforward approach as of yet.
Happy to hear from you as to how this could be handled.

There is a danger of being caught in the web of language/country
differences. I propose just pruning the locales that are not comfortable
including this current table. We can address possible solutions in the
second wave of patching.

I am wary of getting into discussions on specific country variants just
because of the sheer complexity of this topic. It is probably better
addressed by the respective maintainers of their locales. I do not see a
"one size fits all" solution as possible in this first wave.

I would like to have this "three options plan of action" vetted first
and then we could go into the specific details. (Like, for instance, what
characters should be included in the table, and in which
transliteration form.)

I am looking forward to your reply,
Egor Kobylkin

P.S. specifically as to how to address languages other than Ru included in
GOST_7.79_System_B: we can take the first option left to right from that
table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
locales/languages but with errors where Ru supersedes their own variants.
Rafal Luzynski
2018-10-08 22:04:55 UTC
Permalink
Post by Egor Kobylkin
[...]
1. those locale maintainers that are fine with using ISO
9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
in their locales. https://sourceware.org/bugzilla/attachment.cgi?id=11289
2. those that want to have a differing table can create their own
variety based on the spreadsheet I have prepared
https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
this patch.
3. those that want to omit a cyrillic transliteration altogether for now
state so and just carry over the bug #2872 from the year 2006.
Does this make sense to you?
The problem is that we don't have a separate maintainer for each locale,
we have only 2 maintainers for about 200 locales and we must represent
them all. Sometimes a locale may happen to be our own native locale or
of someone in this list, or it may be a locale which we accidentally can
speak as a foreign language, or we may have friends who can speak it.
Or it may be totally unknown and we still must somehow handle it.

I think that these transliteration rules should be included in multiple
locales on an "opt-in" basis rather than "opt-out". I mean, we should not
include them in all locales unless someone explicitly provides different
rules. Instead, I think we should add them (maybe with modifications)
only to those locales where we have a good reason to think they will work.

Particularly, I think that those rules will not be helpful at all for
the languages which use neither Latin nor Cyrillic alphabet.
Post by Egor Kobylkin
[...]
The fact that the patch is reflecting Russian variety of ISO
9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
available and can be helpful to a majority of cyrillic users b) I have
access to it including via being proficient in Russian.
I took a look at these standards, and while at first I doubted they could be
correct for the English language, now I understand they were created for
Russian users. Therefore I think it is quite correct to include them
in the Russian locale data. Will it be OK if we say that it is only for
the Russian language? Will that be satisfactory for you and/or your users?
Post by Egor Kobylkin
It is offered to all the respective locale maintainers as a stopgap
solution. Stopgap in the sense that it is better to have some
transliteration than not to have any at all and carry over the bug from
2006. That it may be a somewhat officially correct transliteration for
ru_RU is a bonus. In that sense I would dub the discussion on the
correctness for other languages "offtopic". Let me know if this is not OK.
If you refer to other languages than Russian which also use the Cyrillic
alphabet but need different transliteration rules than Russian for
the same characters then it is OK for me now. I am afraid that the iconv
algorithm does not handle such a case. Of course, we should add this missing
feature eventually but I do not volunteer to do it now.
Post by Egor Kobylkin
[...]
P.S. specifically as to how to address languages other than Ru included in
GOST_7.79_System_B: we can take the first option left to right from that
table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
locales/languages but with errors where Ru supersedes their own variants.
Makes sense, as long as we cannot select the source language now.

But, while we are at this, is there anything that stops us from adding
transliteration rules for additional Cyrillic characters not used in Russian
but used in other languages?

Regards,

Rafal
Egor Kobylkin
2018-10-08 22:52:00 UTC
Permalink
Hi Rafal,
Post by Rafal Luzynski
But, while we are at this, is there anything that stops us from adding
transliteration rules for additional Cyrillic characters not used in
Russian but used in other languages?
Just to make sure we are not talking at cross purposes. Since your last
email on this topic on the suggestion from Marko I have already
implemented ISO 9 transliteration for all characters there are. This
should cover most if not all Slavic Cyrillic. You seem to have just
noticed and replied to this email of Marko as I write mine.

Please also check the spreadsheet version I have just uploaded
https://sourceware.org/bugzilla/attachment.cgi?id=11298

I am currently absorbing Marko's further suggestions and corrections to
that one and will get back for more discussion once done there. I am
reading your suggestions and taking them to heart, be sure of that.

Two professional translators independently indicated the difference
between transliteration and transcription to me. Transliteration is
normative (letter for letter) and transcription is phonetic - letter for
whatever combination of Latin letters in the target language that sounds
like it for a native speaker. While transliteration should be easy to
cover for all those languages via ISO 9, transcription is inherently
language specific. The problem is that we are (mis)using transcription as
transliteration to ASCII because the ASCII character set does not allow
for proper transcription. Another problem is that, to be really useful,
the ASCII transliteration should work outside of the source locale (i.e. not
only ru_RU but en_US, de_DE, en_DE, es_ES etc. or even just the C locale).

In fact, for my part, I would commit to doing all the work needed to cover
at least C, en_US, ru_RU, de_DE, in that order. ru_RU is a "courtesy": I
am not really using it, but I hope more contributors for locales may come
because of it and fix my bugs :-).
Post by Rafal Luzynski
The problem is that we don't have a separate maintainer for each
locale, we have only 2 maintainers for about 200 locales and we must
represent them all.
It was not clear to me that the glibc team cannot fall back on
individual locale maintainers to make the decision. But then that may make
the decision making even easier. If you guys have a list of requirements
(maybe implicit until now), could you please send them my way? We can
also certainly just keep this thread up and have all issues ironed out.

Anyway, hopefully with ISO 9 as the first column in translit_cyrillic
we now cover the issue of the completeness of transliteration. What we
need to figure out is transcription/transliteration to ASCII, the second
column.

Are we sharing the same view on this?

Speaking of decision making: maybe I can get an officially certified
court translator to answer our questions. Would you care to put together
a list of questions you would like answered in order to make a decision on
the table and its inclusion into various locales?

Hope this helps,
Egor
Rafal Luzynski
2018-10-09 21:43:05 UTC
Permalink
Post by Egor Kobylkin
[...]
Just to make sure we are not talking at cross purposes. Since your last
email on this topic on the suggestion from Marko I have already
implemented ISO 9 transliteration for all characters there are. This
should cover most if not all Slavic Cyrillic. You seem to have just
noticed and replied to this email of Marko as I write mine.
That's great. I'm sorry about not noticing this before; as you can see,
this only confirms that I'm unable to give proper attention to your bug.
Post by Egor Kobylkin
Post by Rafal Luzynski
Are the duplicates here because some Cyrillic letters may have multiple
Latin transliterations depending on the context, for example Cyrillic IE
must be transliterated sometimes as "e", sometimes as "ie", sometimes
as "ye" or "je"? Can we provide rules for groups of characters instead?
No, the duplicates are just by design of my line generating logic. I
have fixed (removed) them. The varying transcription between
languages/locales can not be handled in one file at all as far as I
understood.
No, I did not mean different languages here, but that some letters may need
to be transliterated in a different way depending on the context. For
example, the letter "е" might be transliterated as "e" or "ie" or "je"
depending on whether it appears after "ж" or after another consonant
or after a vowel or a soft or hard sign etc. All within the Russian language.
(Sorry if I'm getting this wrong; maybe what I wrote is incorrect but it may
be correct for another combination of letters.)
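
To make the context question concrete, here is a minimal Python sketch of a
rule that inspects the preceding character. The rule itself is only an
illustrative assumption (write "ye" at the start of a word and after a vowel,
"й", "ъ" or "ь", and "e" otherwise); it is not taken from the patch, and the
name ie_variants is invented for this example.

```python
# Illustrative assumption: which preceding characters trigger "ye" for Cyrillic "е".
VOWELS = set("аеёиоуыэюя")
SIGNS = set("йъь")


def ie_variants(word):
    """Return the Latin form chosen for each "е" in word, in order."""
    variants = []
    prev = ""
    for ch in word.lower():
        if ch == "е":
            # No preceding letter means we are at the start of a word.
            word_initial = not prev.isalpha()
            if word_initial or prev in VOWELS or prev in SIGNS:
                variants.append("ye")
            else:
                variants.append("e")
        prev = ch
    return variants


print(ie_variants("съел"))
print(ie_variants("желе"))
```

The point of the sketch is only that such a rule needs the neighbouring
character as input, which a one-character-to-one-string table does not see.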

Regards,

Rafal
Zack Weinberg
2018-10-08 23:20:23 UTC
Permalink
On Mon, Oct 8, 2018 at 6:05 PM Rafal Luzynski
Post by Rafal Luzynski
The problem is that we don't have a separate maintainer for each locale,
we have only 2 maintainers for about 200 locales and we must represent
them all. Sometimes a locale may happen to be our own native locale or
of someone in this list, or it may be a locale which we accidentally can
speak as a foreign language, or we may have friends who can speak it.
Or it may be totally unknown and we still must somehow handle it.
I just want to mention that this is also why most of the non-locale
maintainers tend to stay out of threads about locales. We know we're
even less expert on these issues than you are, and I think as a
general rule you should be assuming that the community is OK with what
you're doing unless someone speaks up to object.

zw
Carlos O'Donell
2018-10-09 15:26:25 UTC
Permalink
Post by Zack Weinberg
On Mon, Oct 8, 2018 at 6:05 PM Rafal Luzynski
Post by Rafal Luzynski
The problem is that we don't have a separate maintainer for each locale,
we have only 2 maintainers for about 200 locales and we must represent
them all. Sometimes a locale may happen to be our own native locale or
of someone in this list, or it may be a locale which we accidentally can
speak as a foreign language, or we may have friends who can speak it.
Or it may be totally unknown and we still must somehow handle it.
I just want to mention that this is also why most of the non-locale
maintainers tend to stay out of threads about locales. We know we're
even less expert on these issues than you are, and I think as a
general rule you should be assuming that the community is OK with what
you're doing unless someone speaks up to object.
I agree with Zack here.

Rafal and Mike are localedata subsystem maintainers, and your best efforts
are the best we have right now in the community.

I also agree that a conservative position is always a good place to start,
but it sounds like Egor has added enough coverage to perhaps make all of
these transliterations opt-in by default.

I don't have a good sense of this though, and so I defer to you as the
subsystem maintainer to review and formulate a position. If you have any
specific questions, I can certainly help review.
--
Cheers,
Carlos.
Rafal Luzynski
2018-10-09 21:51:28 UTC
Permalink
Post by Carlos O'Donell
[...]
but it sounds like Egor has added enough coverage to perhaps make all of
these transliterations opt-in by default.
I think that it is correct if this transliteration is meant to be "Russian
language as if it used a Latin alphabet (even if it does not actually
except in some computer systems which do not support Cyrillic)"
but not if it is meant to be "Russian language to make sure it is comfortable
for reading by English speakers (assuming that everyone else should be fine
with English if their native language is not supported)".

Regards,

Rafal
Marko Myllynen
2018-10-09 16:10:26 UTC
Permalink
Hi,
Post by Rafal Luzynski
Particularly, I think that those rules will not be helpful at all for
the languages which use neither Latin nor Cyrillic alphabet.
This is certainly a very good point.
Post by Rafal Luzynski
If you refer to other languages than Russian which also use the Cyrillic
alphabet but need different transliteration rules than Russian for
the same characters then it is OK for me now. I am afraid that the iconv
algorithm does not handle such a case. Of course, we should add this missing
feature eventually but I do not volunteer to do it now.
Yes, this would be needed for correct transliteration of different
languages, and this might be quite a bit of work. There's also the case
of transliteration and character sets, consider the transliteration
examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:

Russian: Борис Николаевич Ельцин
Int'l: Boris Nikolaevič Elʹcin
Finnish: Boris Nikolajevitš Jeltsin
French: Boris Nikolaïevitch Ieltsine
Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]

For French you'll get the correct transliteration with iconv by using -t
ISO-8859-1//TRANSLIT, for Finnish with -t ISO-8859-15//TRANSLIT, but it's
not so obvious how to get the above kind of transliteration for ISO 9
international or especially for the phonetic case.

One thing that might be helpful here could be something like:

$ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
ž

That is, force transliteration of each character (if defined) even if
it's part of the target character set. AFAICS this is not currently
possible.
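
The difference between the current behaviour and the proposed forced mode can
be simulated in a few lines of Python. This is only a sketch of the idea: the
TRANSLIT_FORCE suffix is the hypothetical one from the example above, the
single rule in RULES is assumed from that example, and convert is an invented
helper, not an iconv API.

```python
# Single rule assumed from the example above: Cyrillic "ж" becomes "ž".
RULES = {"ж": "ž"}


def convert(text, target, force=False):
    """Apply a rule only when needed, or always when force is set."""
    out = []
    for ch in text:
        try:
            ch.encode(target)
            representable = True
        except UnicodeEncodeError:
            representable = False
        # Default mode: transliterate only what the target charset cannot hold.
        # Forced mode: transliterate every character that has a rule.
        if ch in RULES and (force or not representable):
            out.append(RULES[ch])
        else:
            out.append(ch)
    return "".join(out)


print(convert("ж", "utf-8"))              # rule skipped, "ж" fits in UTF-8
print(convert("ж", "utf-8", force=True))  # rule applied anyway
```

With a UTF-8 target every character is representable, which is why the
default mode never reaches the transliteration rules in this simulation.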
Post by Rafal Luzynski
But, while we are at this, is there anything that stops us from adding
transliteration rules for additional Cyrillic characters not used in Russian
but used in other languages?
This would probably make sense.

FWIW, for Finnish the diff for Russian to be applied in the locale on
top of the translit_cyrillic (ISO 9) rules would be something like below. I
still need to check whether rules are needed for languages other than
Russian that could be added (I hope to submit a proper patch
against fi_FI shortly after translit_cyrillic has landed):

<U0446> "<U0074><U0073>"
<U0447> "<U0074><U0161>";"<U0074><U0073><U0068>"
<U0448> "<U0161>";"<U0073><U0068>"
<U0449> "<U0161><U0074><U0161>";"<U0073><U0068><U0074><U0073><U0068>"
<U044A> ""
<U044C> ""
<U044D> "<U0065>"
<U044E> "<U006A><U0075>"
<U044F> "<U006A><U0061>"
<U0451> "<U006A><U006F>"
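
The rules above list several candidates per character, separated by ";". A
minimal Python sketch of how such a candidate list could be resolved against
a target character set follows: the first candidate that the target can
encode wins. The dictionary repeats the rules quoted above; the helper names
and the "?" fallback are assumptions of this sketch, not glibc code.

```python
# The Finnish rules quoted above, as ordered candidate lists.
FI_RULES = {
    "\u0446": ["ts"],
    "\u0447": ["t\u0161", "tsh"],
    "\u0448": ["\u0161", "sh"],
    "\u0449": ["\u0161t\u0161", "shtsh"],
    "\u044a": [""],
    "\u044c": [""],
    "\u044d": ["e"],
    "\u044e": ["ju"],
    "\u044f": ["ja"],
    "\u0451": ["jo"],
}


def encodable(text, charset):
    """True if text can be encoded in charset without errors."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, rules, charset):
    out = []
    for ch in text:
        if encodable(ch, charset):
            out.append(ch)
            continue
        # Pick the first candidate the target charset can represent.
        candidates = rules.get(ch, [])
        out.append(next((c for c in candidates if encodable(c, charset)), "?"))
    return "".join(out)


print(transliterate("\u0448\u0447", FI_RULES, "iso-8859-15"))
print(transliterate("\u0448\u0447", FI_RULES, "ascii"))
```

With ISO-8859-15 as the target the "š" candidates are usable, while an ASCII
target falls through to the "sh" and "tsh" forms.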

Thanks,
--
Marko Myllynen
Egor Kobylkin
2018-10-09 16:22:22 UTC
Permalink
In the hope to be helpful: what you describe below from
https://fi.wikipedia.org/wiki/Siirtokirjoitus is called _transcription_,
not transliteration.

Transliteration is what we have done with ISO 9 or GOST 7.79 System A
and it could be the same for all languages indeed.

The transcription can be phonetic or serve other purposes and depends on
the target language or use case. We have used the GOST 7.79 System B.

Egor
Marko Myllynen
2018-10-09 16:49:06 UTC
Permalink
Hi,

To clarify, the page has a section explaining the differences between
transliteration and transcription and how the terminology is not
entirely unambiguous. It also explains that the national standard SFS
4900 overrides ISO 9, thus ISO 9 can't be used as-is in Finnish context.

Thanks,
Post by Egor Kobylkin
In the hope to be helpful: what you describe below from
https://fi.wikipedia.org/wiki/Siirtokirjoitus is called _transcription_,
not transliteration.
Transliteration is what we have done with ISO 9 or GOST 7.79 System A
and it could be the same for all languages indeed.
The transcription can be phonetic or serve other purposes and depends on
the target language or use case. We have used the GOST 7.79 System B.
Egor
Post by Marko Myllynen
Hi,
Post by Rafal Luzynski
Particularly, I think that those rules will not be helpful at all for
the languages which use neither Latin nor Cyrillic alphabet.
This is certainly a very good point.
Post by Rafal Luzynski
If you refer to other languages than Russian which also use the Cyrillic
alphabet but need a different transliteration rules than Russian for
the same characters then it is OK for me now. I am afraid that the iconv
algorithm does not handle such case. Of course, we should add this missing
feature eventually but I do not volunteer to do it now.
Yes, this would be needed for correct transliteration of different
languages, and this might be quite a bit of work. There's also the case
of transliteration and character sets, consider the transliteration
Russian: Борис Николаевич Ельцин
Int'l: Boris Nikolaevič Elʹcin
Finnish: Boris Nikolajevitš Jeltsin
French: Boris Nikolaïevitch Ieltsine
Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
For French you'll get the correct transliteration with iconv by using -t
ISO-8859-1//TRANSLIT, for Finnish with -t ISO-8859-15//TRANSLIT but it's
not so obvious how to get the above kind transliteration for ISO 9
international or especially for the phonetic case.
$ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
ž
That is, force transliteration of each character (if defined) even if
it's part of the target character set. AFAICS this is not currently
possible.
Post by Rafal Luzynski
But, while at this, is there anything that stops are from adding transliteration
rules for additional Cyrillic characters not used in Russian but used in
other languages?
This would probably make sense.
FWIW, for Finnish the diff for Russian to be applied in the locale on
top of translit_cyrillic (ISO 9) rules would be something like below, I
still need to check whether there are rules needed for other languages
than Russian that could be added (I hope to submit a proper patch
<U0446> "<U0074><U0073>"
<U0447> "<U0074><U0161>";"<U0074><U0073><U0068>"
<U0448> "<U0161>";"<U0073><U0068>"
<U0449> "<U0161><U0074><U0161>";"<U0073><U0068><U0074><U0073><U0068>"
<U044A> ""
<U044C> ""
<U044D> "<U0065>"
<U044E> "<U006A><U0075>"
<U044F> "<U006A><U0061>"
<U0451> "<U006A><U006F>"
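For readers not used to the `<Uxxxx>` notation, here is a small Python sketch that decodes rule lines of the shape shown above into readable characters. The helper names `decode` and `parse_rule` are made up for illustration, and the parser only handles the simple one-line form used in this diff, not the full locale file syntax.

```python
import re

# Matches one <Uxxxx> symbol as used in the rule lines above.
_CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(symbols):
    """Turn a run like '<U0074><U0161>' into the string it denotes."""
    return "".join(chr(int(h, 16)) for h in _CODEPOINT.findall(symbols))


def parse_rule(line):
    """Return (source character, list of alternatives in priority order)."""
    source, _, targets = line.strip().partition(" ")
    alternatives = [decode(t.strip().strip('"')) for t in targets.split(";")]
    return decode(source), alternatives


rules = dict(parse_rule(line) for line in [
    '<U0447> "<U0074><U0161>";"<U0074><U0073><U0068>"',
    '<U0448> "<U0161>";"<U0073><U0068>"',
    '<U044A> ""',
    '<U044E> "<U006A><U0075>"',
])

for source, alternatives in rules.items():
    print(source, alternatives)
```

Each rule lists alternatives separated by `;`, with the preferred (possibly non-ASCII) form first and the ASCII fallback after it; an empty string means the character is dropped.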
Thanks,
--
Marko Myllynen
Rafal Luzynski
2018-10-09 22:08:56 UTC
Permalink
Post by Marko Myllynen
Post by Rafal Luzynski
If you refer to other languages than Russian which also use the Cyrillic
alphabet but need different transliteration rules than Russian for
the same characters then it is OK for me now. I am afraid that the iconv
algorithm does not handle such a case. Of course, we should add this missing
feature eventually but I do not volunteer to do it now.
Yes, this would be needed for correct transliteration of different
languages, and this might be quite a bit of work. There's also the case
of transliteration and character sets, consider the transliteration
Russian: Борис Николаевич Ельцин
Int'l: Boris Nikolaevič Elʹcin
Finnish: Boris Nikolajevitš Jeltsin
French: Boris Nikolaïevitch Ieltsine
Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
No, I did not mean the transcription using the rules of the destination
locale using Latin but that the rules of transliteration may be different
depending on the language of the source text. For example, consider
this Cyrillic string: "нъг" (I'm not saying that it is actually used
in any existing word, but it still must be handled). By our transliteration
rules it will be transliterated as "n``g". That is fine for Russian, but
if we knew that the source string is Ukrainian it would be transliterated
as "n``h"; if it was Bulgarian it would be transliterated as "năg".
Similarly, if you had to transliterate the Latin letters "sch" to Cyrillic,
you would first have to ask what the source language was.

Unfortunately, I think that distinction of the source language is impossible
at the moment so let's assume that we fall back to Russian if there is
any ambiguity.
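A minimal Python sketch of that fallback, assuming the source language could somehow be passed in (which, as said, iconv cannot do today). The table names and the `source_lang` parameter are hypothetical; the mappings are only the ones from the "нъг" example above.

```python
# Base table: the Russian readings, used whenever nothing more specific is known.
RUSSIAN = {"н": "n", "ъ": "``", "г": "g"}

# Hypothetical per-source-language overrides, limited to the example above.
OVERRIDES = {
    "uk": {"г": "h"},  # Ukrainian
    "bg": {"ъ": "ă"},  # Bulgarian
}


def transliterate(text, source_lang=None):
    """Apply the Russian table, overridden by source-language rules if known."""
    table = {**RUSSIAN, **OVERRIDES.get(source_lang, {})}
    # Characters without a rule are passed through unchanged.
    return "".join(table.get(ch, ch) for ch in text)


for lang in (None, "uk", "bg"):
    print(lang, transliterate("нъг", lang))
```

An unknown or missing language simply yields the Russian result, which is the compromise proposed here.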

Regards,

Rafal
Marko Myllynen
2018-10-10 11:21:46 UTC
Permalink
Hi,
Post by Rafal Luzynski
Post by Marko Myllynen
Post by Rafal Luzynski
If you refer to other languages than Russian which also use the Cyrillic
alphabet but need different transliteration rules than Russian for
the same characters then it is OK for me now. I am afraid that the iconv
algorithm does not handle such a case. Of course, we should add this missing
feature eventually but I do not volunteer to do it now.
Yes, this would be needed for correct transliteration of different
languages, and this might be quite a bit of work. There's also the case
of transliteration and character sets, consider the transliteration
Russian: Борис Николаевич Ельцин
Int'l: Boris Nikolaevič Elʹcin
Finnish: Boris Nikolajevitš Jeltsin
French: Boris Nikolaïevitch Ieltsine
Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
No, I did not mean the transcription using the rules of the destination
locale using Latin but that the rules of transliteration may be different
depending on the language of the source text.
Yes, I mentioned this case in my earlier email:

https://sourceware.org/ml/libc-alpha/2018-10/msg00083.html
Post by Rafal Luzynski
this Cyrillic string: "нъг" (I'm not saying that it is actually used
in any existing word but still must be handled). By our transliteration
rules it will be transliterated as "n``g". But this is fine for Russian;
if we knew that the source string is Ukrainian it would be transliterated
as "n``h"; if it was Bulgarian it would be transliterated as "năg".
And according to SFS 4900, in fi_FI for this string we would see for
Russian ng, for Ukrainian nh, and for Bulgarian năg.
Post by Rafal Luzynski
Unfortunately, I think that distinction of the source language is impossible
at the moment so let's assume that we fall back to Russian if there is
any ambiguity.
Yeah, it's not optimal but probably the most decent compromise for now.

Thanks,
--
Marko Myllynen
Marko Myllynen
2018-10-11 10:10:00 UTC
Permalink
Hi,
Post by Marko Myllynen
$ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
ž
That is, force transliteration of each character (if defined) even if
it's part of the target character set. AFAICS this is not currently
possible.
FWIW, this is currently not possible with iconv(1) but uconv(1) supports
this with -x (AFAICS it's using ICU not glibc locale data):

https://en.wikipedia.org/wiki/uconv
https://linux.die.net/man/1/uconv
https://github.com/unicode-org/icu/tree/master/icu4c/source/extra/uconv

Cheers,
--
Marko Myllynen
Marko Myllynen
2018-10-05 11:54:10 UTC
Permalink
Hi,

Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
possible and if not, then fall back to GOST 7.79 System B?

Implementation-wise, current translit_* files have a few examples where a
non-ASCII transliteration is tried first before an ASCII fallback. These
examples are from translit_neutral:

% NARROW NO-BREAK SPACE
<U202F> <U00A0>;<U0020>
% REVERSED TRIPLE PRIME
<U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"

Thanks,
Post by Egor Kobylkin
Keld, Marko, Rafal, other locale maintainers,
all of this is written with a minimal viable fix for this bug in mind,
asap. I want to avoid wasting maintainers' time getting into
fundamental discussions here (although for perfectly good reasons).
I see three options:
1. those locale maintainers that are fine with using ISO
9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
in their locales (see attached screenshot of the table).
2. those that want to have a differing table can create their own
variety based on the spreadsheet I have prepared
https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
this patch.
3. those that want to omit a cyrillic transliteration altogether for now
state so and just carry over the bug #2872 from the year 2006.
Does this make sense to you?
Just to be super clear on this: the patch is a stopgap _ASCII_
transliteration table. ASCII being AMERICAN Standard Code for
Information Interchange, that is obviously orthogonal to any
transliteration rule of other countries. As such it is not explicitly
targeting transliteration standards of any country.
The fact that the patch is reflecting Russian variety of ISO
9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
available and can be helpful to a majority of cyrillic users b) I have
access to it including via being proficient in Russian.
It is offered to all the respective locale maintainers as a stopgap
solution. Stopgap in the sense that it is better to have some
transliteration than not to have any at all and carry over the bug from
2006. That it may be a somewhat officially correct transliteration for
ru_RU is a bonus. In that sense I would dub the discussion on the
correctness for other languages "offtopic". Let me know if this is not OK.
You are all correctly mentioning the deficiencies of this approach.
However, I couldn't find a better straightforward approach as of yet.
Happy to hear from you as on how this could be handled.
There is a danger of being caught in the web of language/country
differences. I propose just pruning the locales that are not comfortable
including this current table. We can address possible solutions in the
second wave of patching.
I am wary of getting into discussions on specific country variants just
because of the sheer complexity of this topic. It is probably better
addressed by respective maintainers of their locales. I do not see a
"one fits all" solution in this first wave possible.
I would like to have this "three options plan of action" vetted first
and then we could go to the specific detail. (Like, for instance, what
characters should be included in the table, and in which
transliteration form.)
I am looking forward to your reply,
Egor Kobylkin
P.S. specifically as to how to address languages other than Ru included in
GOST_7.79_System_B: we can take the first option left to right from that
table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
locales/languages but with errors where Ru supersedes their own variants.
Post by Rafal Luzynski
Post by Egor Kobylkin
Post by Keld Simonsen
Hi
Please note that transliteration of Cyrillic to Latin is not universal.
There are different schemes for, for example, German, English and Danish, and
there is also an ISO standard for it.
Thanks for your feedback, Keld!
Could the locale maintainers that wouldn't like to include this patch
explicitly state so here?
I think it is about me so I must reply. I am sorry about that and the sole
reason is my lack of time. I'm just a volunteer here, that means it's not
my regular job to work on locale data nor anything in glibc nor in any other
open source project. I do these things only in my free time which I don't
have much. Of course you will see my contributions here and there but they
are either trivial or take me months to complete. Your patches are on my
radar but I can't tell any ETA for them. Of course, there are other people
around here and they are all welcome to come and join.
Post by Egor Kobylkin
That is:
- In the case that there is a different preferred cyrillic
transliteration table for any specific locale their maintainers may want
to point me to it so I can supply a separate table/patch.
- Or they could state explicitly that for some reason they would like to
exclude their locale from the patch for a default cyrillic
transliteration altogether.
As Keld wrote, there are probably separate rules for every language so
I don't think you should treat your rules as universal and include them
in every locale. At first sight, it seems to me they work only for English
(as a destination locale). Also, although it is called "transliteration
from Cyrillic" it seems that it covers only Russian alphabet. What about
other languages which use Cyrillic alphabet but add their own diacritic
characters? Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
Mari, Ossetian, Yakut, Tatar, and more. What about languages which use
Cyrillic alphabet but transliterate their respective letters in a different
way than Russian? For example, Russian "Ъ" is (I think) usually skipped
in transliteration, I think you propose "``", but when transliterating from
Bulgarian they usually transliterate this as "ă".
* I think you transliterate "щ" as "shh", wouldn't "shch" be better?
* You transliterate "ц" as "cz", wouldn't "ts" be better? By the way,
in Polish language "cz" is a correct transliteration of "ч".
* You transliterate "й" as "j", this is fine in many languages but wouldn't
"y" be better in English?
* In case of "е": how will you know if it is correct to transliterate it
to "e" or "ie" or "je" or "ye"?
These remarks are obviously incomplete, your patch deserves much more
attention to review.
Best regards,
Rafal
--
Marko Myllynen
Egor Kobylkin
2018-10-05 12:00:02 UTC
Permalink
Hi Marko,

I have chosen System B because it is ASCII compatible. System A is
not ASCII compatible (diacritics in target).
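To illustrate the difference, here is a small Python sketch with a handful of letters. The five rows are what I believe ISO 9:1995 (System A) and the Russian column of GOST 7.79 System B give for these letters, so please check them against the standard; the names `SYSTEM_A`, `SYSTEM_B` and `apply_table` are made up for this example.

```python
# A few sample rows only, not the complete tables.
SYSTEM_A = {"ж": "ž", "ч": "č", "ш": "š", "ю": "û", "я": "â"}
SYSTEM_B = {"ж": "zh", "ч": "ch", "ш": "sh", "ю": "yu", "я": "ya"}


def apply_table(table, text):
    """Replace each character found in the table; leave others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


# System A: exactly one Latin character per Cyrillic character, not ASCII.
print(all(len(v) == 1 for v in SYSTEM_A.values()))
print(any(not v.isascii() for v in SYSTEM_A.values()))

# System B: one or more Latin characters, ASCII only.
print(all(v.isascii() for v in SYSTEM_B.values()))

print(apply_table(SYSTEM_A, "шя"), apply_table(SYSTEM_B, "шя"))
```

System A stays reversible character by character but needs a target encoding that has the diacritics; System B survives a plain ASCII target.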

https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
"GOST 7.79 contains two transliteration tables.

System A
one Cyrillic character to one Latin character, some with diacritics
– identical to ISO 9:1995

System B
one Cyrillic character to one or many Latin characters without
diacritics
"
Hope this helps,
Egor
Marko Myllynen
2018-10-05 12:21:09 UTC
Permalink
Hi,

The scheme I proposed would also be ASCII compatible; consider this example:

% CYRILLIC CAPITAL LETTER SHA
<U0428> "<U0160>";"<U0053><U0068>"

"printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv -f
ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf
\\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as per
System B.
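The selection logic behind this example can be sketched in a few lines of Python. This is an illustration of the behaviour described above, not the glibc implementation; `RULES`, `representable` and `translit` are hypothetical names, and Python's own codecs stand in for the iconv target encodings.

```python
# The rule from the example: CYRILLIC CAPITAL LETTER SHA, System A first,
# System B as the ASCII fallback.
RULES = {"\u0428": ["\u0160", "Sh"]}


def representable(text, encoding):
    """Return True if text can be encoded in the target encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, encoding):
    """Pick, per character, the first alternative the target can represent."""
    out = []
    for ch in text:
        if representable(ch, encoding):
            out.append(ch)  # already in the target character set
            continue
        for alternative in RULES.get(ch, []):
            if representable(alternative, encoding):
                out.append(alternative)
                break
        else:
            out.append("?")  # no rule, or no alternative fits
    return "".join(out)


for target in ("iso8859_15", "ascii"):
    print(target, translit("\u0428", target))
```

With the ISO-8859-15 target the first alternative is usable; with ASCII it is skipped and the second one is taken. A character without any rule ends up as a question mark.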

Thanks,
--
Marko Myllynen
Egor Kobylkin
2018-10-05 20:47:09 UTC
Permalink
After some kind help from Marko in the offline discussion
I realized the multi/single character approach I originally took was
against the iconv(1) logic anyway. So there is no harm in
dropping it and adopting Marko's suggestion instead. I will do so and
will resubmit the patch with ISO 9:1995/GOST 7.79 System A + fallback to
GOST 7.79 System B (for ASCII).

However, this doesn't resolve the issue of the ASCII part being different
for various locales. Again, I am asking the locale maintainers to let
me know if they want to 1) adopt the one I am supplying, 2) write their
own or 3) ignore the patch altogether. Your feedback is appreciated!
The first part (ISO-8859-15 or ASCII) defines the target encoding for
If the string //TRANSLIT is appended to to-encoding, characters
being converted are transliterated when needed and possible. This
means that when a character cannot be represented in the target
character set, it can be approximated through one or several
similar looking characters. Characters that are outside of the
target character set and cannot be transliterated are replaced
with a question mark (?) in the output.
So in the above examples, iconv(1) encounters the character U+0428
which is not part of either of the target encodings and since
//TRANSLIT is specified, iconv(1) tries transliteration according to
the rules defined above, in case of ASCII U+0160 is not part of the
target encoding so the next alternative is used.
Bests,
Egor Kobylkin
Hi,
% CYRILLIC CAPITAL LETTER SHA <U0428> "<U0160>";"<U0053><U0068>"
"printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv
-f ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf
\\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as
per System B.
Thanks,
Post by Egor Kobylkin
Hi Marko,
I have chosen the System B because it is ASCII compartible. System
A is not ASCII compartible (diacritics in target).
https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
"GOST 7.79 contains two transliteration tables.
Post by Egor Kobylkin
System A one Cyrillic character to one Latin character, some with
diacritics – identical to ISO 9:1995
System B one Cyrillic character to one or many Latin characters
without diacritics " Hope this helps, Egor
Post by Marko Myllynen
Hi,
Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
possible and if not, then fall back to GOST 7.79 System B?
Implementation-wise current translit_* files have few examples
where a non-ASCII transliteration is tried first before an ASCII
% NARROW NO-BREAK SPACE <U202F> <U00A0>;<U0020> % REVERSED
TRIPLE PRIME <U2037>
"<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
Thanks,
Post by Egor Kobylkin
Keld,Marko,Rafal, other locale maintainers,
this all is written with having in mind a minimal viable fix
for this bug asap. I want to avoid wasting maintainers time
getting into fundamental discussions here (although for
perfectly good reasons).
I see three options: 1. those locale maintainers that are fine
with using ISO 9:1995/GOST_7.79_System_B cyrillic
transliteration table (Ru) include it in their locales (see
attached screenshot of the table). 2. those that that want to
have a differing table can create their own variety based on
the spreadsheet I have prepared
https://sourceware.org/bugzilla/attachment.cgi?id=8590 and
include it in this patch. 3. those that want to omit a
cyrillic transliteration altogether for now state so and just
carry over the bug #2872 from the year 2006.
Does this make sense to you?
Just to be super clear on this: the patch is a stopgap _ASCII_
transliteration table. ASCII being AMERICAN Standard Code for
Information Interchange, that is obviously orthogonal to any
transliteration rule of other countries. As such it is not
explicitly targeting transliteration standards of any country.
The fact that the patch is reflecting Russian variety of ISO
9:1995/GOST_7.79_System_B is because a) ISO
9:1995/GOST_7.79_System_B is available and can be helpful to a
majority of cyrillic users b) I have access to it including
via being proficient in Russian.
It is offered to all the respective locale maintainers as a
stopgap solution. Stopgap in the sense that it is better to
have some transliteration than not to have any at all and
carry over the bug from 2006. That it may be a somewhat
officially correct transliteration for ru_RU is a bonus. In
that sense I would dub the discussion on the correctness for
other languages "offtopic". Let me know if this is not OK.
You are all are correctly mentioning the deficiencies of this
approach. However, I couldn't find a better straightforward
approach as of yet. Happy to hear from you as on how this
could be handled.
There is a danger of being caught in the web of
language/country differences. I propose just pruning the
locales that are not comfortable including this current table.
We can address possible solutions in the second wave of
patching.
I am vary of getting into discussions on specific country
variants just because of the sheer complexity of this topic.
It is probably better addressed by respective maintainers of
their locales. I do not see a "one fits all" solution in this
first wave possible.
I would like to have this "three options plan of action"
vetted first and then we could go to the specific detail.
(Like, for instance, what characters should be included in to
the table, and in which transliteration form.)
I am looking forward to your reply, Egor Kobylkin
P.S. specifically as to how address languages other than Ru
included in GOST_7.79_System_B: we can take the first option
left to right from that table (Ru,By,Uk,Bg,Mk). Then it will
technically work for all those locales/languages but with
errors where Ru supersedes their own variants.
Post by Rafal Luzynski
Post by Egor Kobylkin
Post by Keld Simonsen
Hi
Please note that translitteration of Cyrillic to latin
is not universal. There are different schemes for for
example German, English and Danish, and there is also an
ISO standard for it.
Thanks for your feedback, Keld!
Could the locale maintainers that wouldn't like to include
this patch explicitly state so here?
I think it is about me so I must reply. I am sorry about
that and the sole reason is my lack of time. I'm just a
volunteer here, that means it's not my regular job to work
on locale data nor anything in glibc nor in any other open
source project. I do these things only in my free time
which I don't have much. Of course you will see my
contributions here and there but they are either trivial or
take me months to complete. Your patches are on my radar but
I can't tell any ETA for them. Of course, there are other
people around here and they are all welcome to come and
join.
Post by Egor Kobylkin
That is: - In the case that there is a different preferred
cyrillic transliteration table for any specific locale
their maintainers may want to point me to it so I can
supply a separate table/patch. - Or they could state
explicitly that for some reason they would like to exclude
their locale from the patch for a default cyrillic
transliteration altogether.
As Keld wrote, there are probably separate rules for every
language so I don't think you should treat your rules as
universal and include them in every locale. At first sight,
it seems to me they work only for English (as a destination
locale). Also, although it is called "transliteration from
Cyrillic" it seems that it covers only Russian alphabet. What
about other languages which use Cyrillic alphabet but add
their own diacritic characters? Think about Belarusian,
Ukrainian, Serbian, Chechen, Chuvash, Mari, Ossetian, Yakut,
Tatar, and more. What about languages which use Cyrillic
alphabet but transliterate their respective letters in a
different way than Russian? For example, Russian "Ъ" is (I
think) usually skipped in transliteration, I think you
propose "``", but when transliterating from Bulgarian they
usually transliterate this as "ă".
* I think you transliterate "щ" as "shh", wouldn't "shch" be
  better?
* You transliterate "ц" as "cz", wouldn't "ts" be better? By
  the way, in Polish language "cz" is a correct
  transliteration of "ч".
* You transliterate "й" as "j", this is fine in many
  languages but wouldn't "y" be better in English?
* In case of "е": how will you know if it is correct to
  transliterate it to "e" or "ie" or "je" or "ye"?
These remarks are obviously incomplete, your patch deserves
much more attention to review.
Best regards,
Rafal
Marko Myllynen
2018-10-08 12:40:53 UTC
Permalink
Hi,

Thanks for the update. I have a few mostly cosmetic comments below,
hopefully we'll hear from others whether they agree with this direction.

- Please add the standard glibc locale header (see the existing
translit_* files for reference)
- Consider wrapping the header lines at or around column 70-72
- Consider describing which characters, character ranges, or blocks are
supported (perhaps also describe why some of those are not included, see
e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
- Please remove trailing whitespaces and spaces after ;
- No duplicates:

% CYRILLIC SMALL LETTER IE
<U0435> <U0065>; <U0065>

should become:

% CYRILLIC SMALL LETTER IE
<U0435> <U0065>

- There are a few issues with the definitions:

% CYRILLIC CAPITAL LETTER U
<U0423> <U0055>; <U0055>
% CYRILLIC UNDEFINED
<U0423><U0423> <U00DA>; "<U0055><U0060>"

% CYRILLIC SMALL LETTER U
<U0443> <U0075>; <U0075>
% CYRILLIC UNDEFINED
<U0443><U0443> <U00FA>; "<U0075><U0060>"

I wonder whether it would be possible to automate generation of this file so
that issues like the above could be avoided? But perhaps that could be the
next step once this initial patch lands.
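
As a rough illustration of the automation idea, here is a minimal Python sketch that formats one rule line from a source character and an ordered list of alternatives. The helper name translit_line and the quoting choice (quotes only around multi-character alternatives) are assumptions made for this example; this is not the spreadsheet logic used for the actual patch.

```python
def translit_line(source, alternatives):
    """Format one rule: source code point(s), then ';'-separated alternatives.

    Hypothetical helper for illustration; not part of glibc or the patch.
    """
    def ucs(text):
        # Render each character in the <UXXXX> notation used by locale files.
        return "".join(f"<U{ord(ch):04X}>" for ch in text)

    def fmt(alt):
        # Assumption: quote only multi-character alternatives.
        body = ucs(alt)
        return f'"{body}"' if len(alt) > 1 else body

    # Drop repeated alternatives while keeping their order.
    unique = []
    for alt in alternatives:
        if alt not in unique:
            unique.append(alt)

    # No space after ';' and no trailing whitespace.
    return f"{ucs(source)} " + ";".join(fmt(alt) for alt in unique)


if __name__ == "__main__":
    print("% CYRILLIC SMALL LETTER IE")
    print(translit_line("\u0435", ["e", "e"]))
```

Generating every line through one function like this would make duplicates and stray spaces after ";" impossible by construction, whatever the data source is.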

Thanks,
Post by Egor Kobylkin
After some kind help from Marko in the offline discussion
I realized the multi/single character approach I originally took was
against the iconv(1) logic anyway. So there is no harm in
dropping it and adopting Marko's suggestion instead. I will do so and
will resubmit the patch with ISO 9:1995/GOST 7.79 System A + fallback to
GOST 7.79 System B (for ASCII).
However this doesn't resolve the issue for ASCII part being different
for various locales. Again, I am offering the locale maintainers to let
me know if they want to 1) adopt the one I am supplying, 2) write their
own or 3) ignore the patch altogether. Your feedback is appreciated!
The first part (ISO-8859-15 or ASCII) defines the target encoding for
If the string //TRANSLIT is appended to to-encoding, characters
being converted are transliterated when needed and possible. This
means that when a character cannot be represented in the target
character set, it can be approximated through one or several
similar looking characters. Characters that are outside of the
target character set and cannot be transliterated are replaced
with a question mark (?) in the output.
So in the above examples, iconv(1) encounters the character U+0428
which is not part of either of the target encodings and since
//TRANSLIT is specified, iconv(1) tries transliteration according to
the rules defined above, in case of ASCII U+0160 is not part of the
target encoding so the next alternative is used.
Bests,
Egor Kobylkin
Hi,
% CYRILLIC CAPITAL LETTER SHA
<U0428> "<U0160>";"<U0053><U0068>"
"printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv
-f ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf
\\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as
per System B.
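
The fallback order described here can be modelled in a few lines of Python. This is only a sketch of the behaviour as explained in this thread (try each alternative left to right, use the first one the target encoding can represent, otherwise emit "?"); it is not glibc's implementation, and the RULES table holds just the single SHA rule quoted above.

```python
# Illustrative model only; the real lookup is done inside glibc's iconv.
# Rule from the thread: <U0428> "<U0160>";"<U0053><U0068>"
RULES = {
    "\u0428": ["\u0160", "Sh"],  # CYRILLIC CAPITAL LETTER SHA
}


def transliterate(text, target_encoding, rules=RULES):
    """Return text with unencodable characters replaced per the rules."""
    out = []
    for ch in text:
        try:
            ch.encode(target_encoding)
            out.append(ch)  # representable as-is, keep it
            continue
        except UnicodeEncodeError:
            pass
        # Try the alternatives in order; take the first that fits the target.
        for alt in rules.get(ch, []):
            try:
                alt.encode(target_encoding)
            except UnicodeEncodeError:
                continue
            out.append(alt)
            break
        else:
            out.append("?")  # no rule, or no alternative fits
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("\u0428", "iso-8859-15"))
    print(transliterate("\u0428", "ascii"))
```

With ISO-8859-15 as the target the first alternative (System A) is representable and wins; with ASCII it is skipped and the second alternative (System B) is used.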
Thanks,
Post by Egor Kobylkin
Hi Marko,
I have chosen the System B because it is ASCII compatible. System
A is not ASCII compatible (diacritics in target).
https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
"GOST 7.79 contains two transliteration tables.
Post by Egor Kobylkin
System A one Cyrillic character to one Latin character, some with
diacritics – identical to ISO 9:1995
System B one Cyrillic character to one or many Latin characters
without diacritics " Hope this helps, Egor
Post by Marko Myllynen
Hi,
Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
possible and if not, then fall back to GOST 7.79 System B?
Implementation-wise current translit_* files have few examples
where a non-ASCII transliteration is tried first before an ASCII
% NARROW NO-BREAK SPACE
<U202F> <U00A0>;<U0020>
% REVERSED TRIPLE PRIME
<U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
Thanks,
Post by Egor Kobylkin
Keld,Marko,Rafal, other locale maintainers,
this all is written with having in mind a minimal viable fix
for this bug asap. I want to avoid wasting maintainers time
getting into fundamental discussions here (although for
perfectly good reasons).
I see three options:
1. those locale maintainers that are fine with using
   ISO 9:1995/GOST_7.79_System_B cyrillic transliteration
   table (Ru) include it in their locales (see attached
   screenshot of the table).
2. those that want to have a differing table can create
   their own variety based on the spreadsheet I have prepared
   https://sourceware.org/bugzilla/attachment.cgi?id=8590 and
   include it in this patch.
3. those that want to omit a cyrillic transliteration
   altogether for now state so and just carry over the
   bug #2872 from the year 2006.
Does this make sense to you?
Just to be super clear on this: the patch is a stopgap _ASCII_
transliteration table. ASCII being AMERICAN Standard Code for
Information Interchange, that is obviously orthogonal to any
transliteration rule of other countries. As such it is not
explicitly targeting transliteration standards of any country.
The fact that the patch is reflecting Russian variety of ISO
9:1995/GOST_7.79_System_B is because a) ISO
9:1995/GOST_7.79_System_B is available and can be helpful to a
majority of cyrillic users b) I have access to it including
via being proficient in Russian.
It is offered to all the respective locale maintainers as a
stopgap solution. Stopgap in the sense that it is better to
have some transliteration than not to have any at all and
carry over the bug from 2006. That it may be a somewhat
officially correct transliteration for ru_RU is a bonus. In
that sense I would dub the discussion on the correctness for
other languages "offtopic". Let me know if this is not OK.
You are all correctly mentioning the deficiencies of this
approach. However, I couldn't find a better straightforward
approach as of yet. Happy to hear from you as on how this
could be handled.
There is a danger of being caught in the web of
language/country differences. I propose just pruning the
locales that are not comfortable including this current table.
We can address possible solutions in the second wave of
patching.
I am wary of getting into discussions on specific country
variants just because of the sheer complexity of this topic.
It is probably better addressed by respective maintainers of
their locales. I do not see a "one fits all" solution in this
first wave possible.
I would like to have this "three options plan of action"
vetted first and then we could go to the specific detail.
(Like, for instance, what characters should be included in
the table, and in which transliteration form.)
I am looking forward to your reply, Egor Kobylkin
P.S. specifically as to how address languages other than Ru
included in GOST_7.79_System_B: we can take the first option
left to right from that table (Ru,By,Uk,Bg,Mk). Then it will
technically work for all those locales/languages but with
errors where Ru supersedes their own variants.
Post by Rafal Luzynski
Post by Egor Kobylkin
Post by Keld Simonsen
Hi
Please note that transliteration of Cyrillic to Latin
is not universal. There are different schemes for, for
example, German, English and Danish, and there is also an
ISO standard for it.
Thanks for your feedback, Keld!
Could the locale maintainers that wouldn't like to include
this patch explicitly state so here?
I think it is about me so I must reply. I am sorry about
that and the sole reason is my lack of time. I'm just a
volunteer here, that means it's not my regular job to work
on locale data nor anything in glibc nor in any other open
source project. I do these things only in my free time
which I don't have much. Of course you will see my
contributions here and there but they are either trivial or
take me months to complete. Your patches are on my radar but
I can't tell any ETA for them. Of course, there are other
people around here and they are all welcome to come and
join.
Post by Egor Kobylkin
That is: - In the case that there is a different preferred
cyrillic transliteration table for any specific locale
their maintainers may want to point me to it so I can
supply a separate table/patch. - Or they could state
explicitly that for some reason they would like to exclude
their locale from the patch for a default cyrillic
transliteration altogether.
As Keld wrote, there are probably separate rules for every
language so I don't think you should treat your rules as
universal and include them in every locale. At first sight,
it seems to me they work only for English (as a destination
locale). Also, although it is called "transliteration from
Cyrillic" it seems that it covers only Russian alphabet. What
about other languages which use Cyrillic alphabet but add
their own diacritic characters? Think about Belarusian,
Ukrainian, Serbian, Chechen, Chuvash, Mari, Ossetian, Yakut,
Tatar, and more. What about languages which use Cyrillic
alphabet but transliterate their respective letters in a
different way than Russian? For example, Russian "Ъ" is (I
think) usually skipped in transliteration, I think you
propose "``", but when transliterating from Bulgarian they
usually transliterate this as "ă".
* I think you transliterate "щ" as "shh", wouldn't "shch" be
  better?
* You transliterate "ц" as "cz", wouldn't "ts" be better? By
  the way, in Polish language "cz" is a correct
  transliteration of "ч".
* You transliterate "й" as "j", this is fine in many
  languages but wouldn't "y" be better in English?
* In case of "е": how will you know if it is correct to
  transliterate it to "e" or "ie" or "je" or "ye"?
These remarks are obviously incomplete, your patch deserves
much more attention to review.
Best regards,
Rafal
--
Marko Myllynen
Rafal Luzynski
2018-10-08 22:23:42 UTC
Permalink
Post by Marko Myllynen
Hi,
Thanks for the update. I have a few mostly cosmetic comments below,
hopefully we'll hear from others whether they agree with this direction.
- Please add the standard glibc locale header (see the existing
translit_* files for reference)
- Consider wrapping the header lines at or around column 70-72
- Consider describing which characters, character ranges, or blocks are
supported (perhaps also describe why some of those are not included, see
e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
- Please remove trailing whitespaces and spaces after ;
Thanks for this, Marko. While at this, in the ChangeLog and in the commit
message these paths:

* locales/aa_DJ: likewise

1. Should be a relative path starting in the root directory of glibc source,
that is: "* localedata/locales/aa_DJ".
2. Should be "Likewise." (starting with an uppercase and ending with a dot).
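
The two ChangeLog conventions listed above are mechanical enough to check with a small script. The following Python sketch is a hypothetical helper (the name fix_changelog_entry and the regular expression are made up for this example); it assumes every path in question belongs under localedata/, which holds for the locale files in this patch but not for glibc entries in general.

```python
import re


def fix_changelog_entry(line):
    """Normalize one '* path: text' ChangeLog line (illustrative only)."""
    match = re.match(r"^\*\s+(\S+):\s*(.*)$", line.strip())
    if not match:
        return line  # not an entry line, leave untouched
    path, text = match.groups()
    # Rule 1: path relative to the root of the glibc source tree.
    # Assumption: all paths handled here live under localedata/.
    if not path.startswith("localedata/"):
        path = "localedata/" + path
    # Rule 2: start with an uppercase letter and end with a dot.
    text = text.strip()
    if text:
        text = text[0].upper() + text[1:]
        if not text.endswith("."):
            text += "."
    return f"* {path}: {text}"


if __name__ == "__main__":
    print(fix_changelog_entry("* locales/aa_DJ: likewise"))
```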
Post by Marko Myllynen
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>; <U0065>
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>
% CYRILLIC CAPITAL LETTER U
<U0423> <U0055>; <U0055>
% CYRILLIC UNDEFINED
<U0423><U0423> <U00DA>; "<U0055><U0060>"
% CYRILLIC SMALL LETTER U
<U0443> <U0075>; <U0075>
% CYRILLIC UNDEFINED
<U0443><U0443> <U00FA>; "<U0075><U0060>"
Are the duplicates here because some Cyrillic letters may have multiple
Latin transliterations depending on the context, for example Cyrillic IE
must be transliterated sometimes as "e", sometimes as "ie", sometimes
as "ye" or "je"? Can we provide rules for groups of characters instead?
Post by Marko Myllynen
I wonder whether it would be possible to automate generation of this file so
that issues like the above could be avoided? But perhaps that could be the
next step once this initial patch lands.
I agree with this.

Regards,

Rafal
Egor Kobylkin
2018-10-08 23:35:57 UTC
Permalink
Post by Rafal Luzynski
Post by Marko Myllynen
Hi,
Thanks for the update. I have a few mostly cosmetic comments below,
hopefully we'll hear from others whether they agree with this direction.
Yeah, the earlier we have feedback the more productive we are. I'd be
happy to get as much feedback on this as early as possible. So,
everybody concerned, please chime in.
Post by Rafal Luzynski
Post by Marko Myllynen
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>; <U0065>
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>
% CYRILLIC CAPITAL LETTER U
<U0423> <U0055>; <U0055>
% CYRILLIC UNDEFINED
<U0423><U0423> <U00DA>; "<U0055><U0060>"
% CYRILLIC SMALL LETTER U
<U0443> <U0075>; <U0075>
% CYRILLIC UNDEFINED
<U0443><U0443> <U00FA>; "<U0075><U0060>"
Are the duplicates here because some Cyrillic letters may have multiple
Latin transliterations depending on the context, for example Cyrillic IE
must be transliterated sometimes as "e", sometimes as "ie", sometimes
as "ye" or "je"? Can we provide rules for groups of characters instead?
No, the duplicates are just by design of my line generating logic. I
have fixed (removed) them. The varying transcription between
languages/locales can not be handled in one file at all as far as I
understood.
Post by Rafal Luzynski
Post by Marko Myllynen
I wonder whether it would be possible to automate generation of this file so
that issues like the above could be avoided? But perhaps that could be the
next step once this initial patch lands.
I am generating the content part of the translit_cyrillic from the
LibreOffice Spreadsheet. Not sure if you had time to view it by now?
https://sourceware.org/bugzilla/attachment.cgi?id=11299

Anyway I have just fixed the issues identified by Marko above in that
spreadsheet. I will do the changes for the below request and then upload
the new translit_cyrillic file to the bugzilla.
Post by Rafal Luzynski
Post by Marko Myllynen
- Please add the standard glibc locale header (see the existing
translit_* files for reference)
- Consider wrapping the header lines at or around column 70-72
- Consider describing which characters, character ranges, or blocks are
supported (perhaps also describe why some of those are not included, see
e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
- Please remove trailing whitespaces and spaces after ;
Thanks for this, Marko. While at this, in the ChangeLog and in the commit
* locales/aa_DJ: likewise
1. Should be a relative path starting in the root directory of glibc source,
that is: "* localedata/locales/aa_DJ".
2. Should be "Likewise." (starting with an uppercase and ending with a dot).
will do.

Bests,
Egor
Egor Kobylkin
2018-10-09 13:18:04 UTC
Permalink
Hi,

I have now implemented all the changes requested for translit_cyrillic
file but started hitting what seems like a bug:

- If the line <U0425> <U0048>;<U0058> is present in translit_cyrillic the
locale compilation fails i.e. grep CYRILLIC < $testfile |
LOCPATH=$workdir/compiled_locales/"$locale"/ LC_ALL="$locale".UTF-8
iconv -f UTF-8 -t ASCII//TRANSLIT is hanging frozen.

- If the line <U0425> <U0048>;<U0058> is absent from translit_cyrillic
everything works, just the transliteration of <U0425> fails as expected
(? is displayed)

- If translit_cyrillic contains <U0425> <U0048>;<U0058> as the _only_
line the transliteration of <U0425> works again (others as ?).

Would you have any idea in which direction I should look? The new
translit_cyrillic is attached.

(<U0425> is % CYRILLIC CAPITAL LETTER HA)

Best regards,
Egor
Egor Kobylkin
2018-10-09 18:34:08 UTC
Permalink
The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
"<U0443><U0301>" (<U00FA>).
It works now with
% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
% CYRILLIC UNDEFINED
<U0443><U0301> <U00FA>;"<U0075><U0060>"

The <U0301> is "combining" and obviously it doesn't work if enclosed in
quotes with the letter codepoint. Please let me know if there is another
explanation.

I will now make those changes and generate the patch itself.
Egor
Rafal Luzynski
2018-10-09 22:17:36 UTC
Permalink
Post by Egor Kobylkin
The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
"<U0443><U0301>" (<U00FA>).
It works now with
% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
% CYRILLIC UNDEFINED
<U0443><U0301> <U00FA>;"<U0075><U0060>"
[...]
I wonder why you need Cyrillic U with acute, and why you comment it
as "undefined" at all. I know that any Cyrillic vowel may appear with
an acute accent but "the diacritic is used only in dictionaries, children's
books, resources for foreign-language learners (...)". [1] So maybe
all vowels with an acute accent should be handled (which I think is fine)
rather than just U.

Regards,

Rafal


[1] https://en.wikipedia.org/wiki/Russian_alphabet#Diacritics
Egor Kobylkin
2018-10-09 22:40:31 UTC
Permalink
Post by Rafal Luzynski
Post by Egor Kobylkin
The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
"<U0443><U0301>" (<U00FA>).
It works now with
% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
% CYRILLIC UNDEFINED
<U0443><U0301> <U00FA>;"<U0075><U0060>"
[...]
I wonder why you need Cyrillic U with acute, and why you comment it
as "undefined" at all. I know that any Cyrillic vowel may appear with
an acute accent but "the diacritic is used only in dictionaries, children's
books, resources for foreign-language learners (...)". [1] So maybe
all vowels with an acute accent should be handled (which I think is fine)
rather than just U.
I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
implemented it on Marko's suggestion. Personally I have no opinion on
what letters should be included and under what name. These funny Us just
happened to be in the ISO9 table.

There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
in Unicode. That’s why it’s coming through that way from my worksheet as
it does a reverse lookup on the names based on the Unicode codepoints.
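
For readers who want to reproduce the name lookup, here is a small sketch using Python's standard unicodedata module (an assumption for illustration; the worksheet itself does the lookup differently). It shows that each code point has its own name, that U+0301 is a combining mark, and that NFC normalization does not merge the pair into a single precomposed character.

```python
import unicodedata


def describe(seq):
    """Return (code point, Unicode name, combining class) per character."""
    return [
        (f"U+{ord(ch):04X}",
         unicodedata.name(ch, "UNDEFINED"),
         unicodedata.combining(ch))
        for ch in seq
    ]


# CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT.
seq = "\u0423\u0301"

# NFC would replace the pair with a precomposed character if one existed.
composed = unicodedata.normalize("NFC", seq)

if __name__ == "__main__":
    for entry in describe(seq):
        print(entry)
    print("code points after NFC:", len(composed))
```

So a name for the pair has to be chosen by hand, for example by combining the two individual names in the comment line.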

Manually we can change it to whatever you’d suggest in the
translit_cyrillic. I just don’t know the right name.

On my side I think I have all outstanding tasks complete for the patch
https://sourceware.org/bugzilla/attachment.cgi?id=11144. So please let
me know explicitly if you'd like anything changed there.

I was planning to rewrite just the commit message according to your
earlier feedback and resubmit sometime soon.

Bests,
Egor
Egor Kobylkin
2018-10-09 22:42:58 UTC
Permalink
Ups, sorry, wrong link to the patch
correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
Marko Myllynen
2018-10-10 11:22:59 UTC
Permalink
Hi,
Post by Egor Kobylkin
Ups, sorry, wrong link to the patch
correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
Although I haven't checked every rule, this in general looks very good
(but see below). Not sure whether we want to add the few missing characters
mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
least initially the more exotic characters, like the historic ones,
though.) Perhaps filing a bug or two for these cases for separate
consideration would be ok.
Post by Egor Kobylkin
Post by Egor Kobylkin
Post by Rafal Luzynski
Post by Egor Kobylkin
The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
"<U0443><U0301>" (<U00FA>).
It works now with
% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
% CYRILLIC UNDEFINED
<U0443><U0301> <U00FA>;"<U0075><U0060>"
[...]
I wonder why you need Cyrillic U with acute, and why you comment it
as "undefined" at all. I know that any Cyrillic vowel may appear with
an acute accent but "the diacritic is used only in dictionaries, children's
books, resources for foreign-language learners (...)". [1] So maybe
all vowels with an acute accent should be handled (which I think is fine)
rather than just U.
I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
implemented it on Marko's suggestion. Personally I have no opinion on
what letters should be included and under what name. These funny Us just
happened to be in the ISO9 table.
There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
in Unicode. That’s why it’s coming through that way from my worksheet as
it does a reverse lookup on the names based on the Unicode codepoints.
Manually we can change it to whatever you’d suggest in the
translit_cyrillic. I just don’t know the right name.
I'm not sure this will work: no existing rule in the translit_* files
contains two characters, so I'd assume that the rule for U+0423 is applied
first and then the below rule is never used.

% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"

Perhaps this should be commented out or removed altogether if it's not
working as intended.

Thanks,
--
Marko Myllynen
Egor Kobylkin
2018-10-10 12:19:37 UTC
Permalink
Post by Marko Myllynen
Post by Egor Kobylkin
correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
Although I haven't checked every rule, this in general looks very good
(but see below).
Not sure whether we want to add the few missing characters
mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
least initially the more exotic characters, like the historic ones,
though.) Perhaps filing a bug or two for these cases for separate
consideration would be ok.
The question here is what should serve as their transliteration and
transcription?
They are covered neither by ISO9 nor by GOST 7.79. So maybe it would be
reasonable to assume there is no notable occurrence of those anywhere?

Anyway I am happy to include your specific suggestions for all and any
Unicode quartets in this form:
[Cyrillic Unicode
; ISO9 Latin Transliteration (System A) as Unicode
; Transcription (System B) as (multicharacter) ASCII
; name to put in %COMMENT
].
Post by Marko Myllynen
Post by Egor Kobylkin
Post by Egor Kobylkin
Post by Rafal Luzynski
Post by Egor Kobylkin
The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
"<U0443><U0301>" (<U00FA>).
It works now with
% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
% CYRILLIC UNDEFINED
<U0443><U0301> <U00FA>;"<U0075><U0060>"
[...]
I wonder why you need Cyrillic U with acute, and why you comment it
as "undefined" at all. I know that any Cyrillic vowel may appear with
an acute accent but "the diacritic is used only in dictionaries, children's
books, resources for foreign-language learners (...)". [1] So maybe
all vowels with an acute accent should be handled (which I think is fine)
rather than just U.
I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
implemented it on Marko's suggestion. Personally I have no opinion on
what letters should be included and under what name. These funny Us just
happened to be in the ISO9 table.
There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
in Unicode. That’s why it’s coming through that way from my worksheet as
it does a reverse lookup on the names based on the Unicode codepoints.
Manually we can change it to whatever you’d suggest in the
translit_cyrillic. I just don’t know the right name.
I'm not sure this will work: no existing rule in translit_* files
contains two characters, so I'd assume that the rule for U+0423 is applied
first and then the rule below is never used.
% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
Perhaps this should be commented out or removed altogether if it's not
working as intended.
Here is the result of my test on
https://sourceware.org/bugzilla/attachment.cgi?id=11304

U0423 0301-У́ -> U0423 0301-U
U0443 0301-у́ -> U0443 0301-u

So yes, they are not processed. I would drop them so as not to have
special cases. But I am also fine with keeping them because all the work
is done already.
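
A toy Python model of the behaviour observed above, for illustration
only. It is not glibc's implementation: RULES, per_character and
longest_match are hypothetical names, and the empty replacement for
U+0301 is an assumption meant to mimic the accent being dropped. The
model only shows why a lookup that handles one code point at a time
never reaches a two-character rule, while a longest-match lookup would.

```python
# Toy rule table; replacement values follow the test output in this thread.
RULES = {
    "\u0423": "U",          # CYRILLIC CAPITAL LETTER U
    "\u0301": "",           # COMBINING ACUTE ACCENT, assumed to be dropped
    "\u0423\u0301": "U`",   # the two-character rule under discussion
}


def per_character(text, rules):
    """Look up each code point on its own; multi-character keys are never consulted."""
    return "".join(rules.get(ch, "?") for ch in text)


def longest_match(text, rules):
    """Try the longest key first at every position (hypothetical alternative)."""
    max_len = max(len(key) for key in rules)
    out = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in rules:
                out.append(rules[chunk])
                i += size
                break
        else:
            # No rule matched at this position.
            out.append("?")
            i += 1
    return "".join(out)


print(per_character("\u0423\u0301", RULES))   # U
print(longest_match("\u0423\u0301", RULES))   # U`
```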

Result:
CYRILLIC RUSSIAN S``esh` eshhyo e`tih myagkih francuzskih bulok, da
vypej zhe chayu. SA`ESH` ESHHYO E`TIH MYAGKIH FRANCUZSKIH BULOK? DA
VYPEJ ZHE CHAYU!
CYRILLIC COMPLETE U0401-YO U0402-DJ U0403-G` U0404-Ye U0405-Z` U0406-I
U0407-Yi U0408-J U0409-L` U040A-N` U040B-TSH U040C-K` U040E-U` U040F-Dh
U0410-A U0411-B U0412-V U0413-G U0414-D U0415-E U0416-ZH U0417-Z U0418-I
U0419-J U041A-K U041B-L U041C-M U041D-N U041E-O U041F-P U0420-R U0421-S
U0422-T U0423-U U0423 0301-U U0424-F U0425-H U0426-C U0427-CH U0428-SH
U0429-SHH U042A-`` U042B-Y U042C-` U042D-E` U042E-YU U042F-YA U0430-a
U0431-b U0432-v U0433-g U0434-d U0435-e U0436-zh U0437-z U0438-i U0439-j
U043A-k U043B-l U043C-m U043D-n U043E-o U043F-p U0440-r U0441-s U0442-t
U0443-u U0443 0301-u U0444-f U0445-h U0446-c U0447-ch U0448-sh U0449-shh
U044A-A` U044B-y U044C-` U044D-e` U044E-yu U044F-ya U0451-yo U0452-dj
U0453-g` U0454-ye U0455-z` U0456-i U0457-yi U0458-j U0459-l` U045A-n`
U045B-tsh U045C-k` U045E-u` U045F-dh U046A-O` U046B-o` U0472-Fh U0473-fh
U0474-Yh U0475-yh U048C-E` U048D-e` U0490-G` U0491-g` U0492-GH U0493-gh
U0494-GH U0495-gh U0496-ZH` U0497-zh` U049A-K` U049B-k` U049E-K`
U049F-k` U04A2-N` U04A3-n` U04A4-NG U04A5-ng U04A6-P` U04A7-p` U04A8-O`
U04A9-o` U04AA-C` U04AB-C` U04AC-T` U04AD-t` U04AE-U U04AF-u U04B2-H`
U04B3-h` U04B4-TCZ U04B5-tcz U04BA-SH` U04BB-SH` U04BC-CH` U04BD-ch`
U04BE-CH` U04BF-ch` U04C0-i U04C1-ZH` U04C2-zh` U04CB-CH` U04CC-ch`
U04D0-A` U04D1-a` U04D2-A` U04D3-a` U04D6-E` U04D7-e` U04D8-A` U04D9-a`
U04DC-ZH` U04DD-zh` U04DE-Z` U04DF-z` U04E0-Z` U04E1-z` U04E4-I`
U04E5-i` U04E6-O` U04E7-o` U04E8-O` U04E9-o` U04F0-U` U04F1-u` U04F2-U`
U04F3-u` U04F4-CH` U04F5-ch` U04F8-Y` U04F9-y` U2019-'

Source:
CYRILLIC RUSSIAN Съешь ещё этих мягких французских булок, да выпей же
чаю. СЪЕШЬ ЕЩЁ ЭТИХ МЯГКИХ ФРАНЦУЗСКИХ БУЛОК? ДА ВЫПЕЙ ЖЕ ЧАЮ!
CYRILLIC COMPLETE U0401-Ё U0402-Ђ U0403-Ѓ U0404-Є U0405-Ѕ U0406-І
U0407-Ї U0408-Ј U0409-Љ U040A-Њ U040B-Ћ U040C-Ќ U040E-Ў U040F-Џ U0410-А
U0411-Б U0412-В U0413-Г U0414-Д U0415-Е U0416-Ж U0417-З U0418-И U0419-Й
U041A-К U041B-Л U041C-М U041D-Н U041E-О U041F-П U0420-Р U0421-С U0422-Т
U0423-У U0423 0301-У́ U0424-Ф U0425-Х U0426-Ц U0427-Ч U0428-Ш U0429-Щ
U042A-ъ U042B-Ы U042C-ь U042D-Э U042E-Ю U042F-Я U0430-а U0431-б U0432-в
U0433-г U0434-д U0435-е U0436-ж U0437-з U0438-и U0439-й U043A-к U043B-л
U043C-м U043D-н U043E-о U043F-п U0440-р U0441-с U0442-т U0443-у U0443
0301-у́ U0444-ф U0445-х U0446-ц U0447-ч U0448-ш U0449-щ U044A-Ъ U044B-ы
U044C-Ь U044D-э U044E-ю U044F-я U0451-ё U0452-ђ U0453-ѓ U0454-є U0455-ѕ
U0456-і U0457-ї U0458-ј U0459-љ U045A-њ U045B-ћ U045C-ќ U045E-ў U045F-џ
U046A-Ѫ U046B-ѫ U0472-Ѳ U0473-ѳ U0474-Ѵ U0475-ѵ U048C-Ҍ U048D-ҍ U0490-Ґ
U0491-ґ U0492-Ғ U0493-ғ U0494-Ҕ U0495-ҕ U0496-Җ U0497-җ U049A-Қ U049B-қ
U049E-Ҟ U049F-ҟ U04A2-Ң U04A3-ң U04A4-Ҥ U04A5-ҥ U04A6-Ҧ U04A7-ҧ U04A8-Ҩ
U04A9-ҩ U04AA-Ҫ U04AB-ҫ U04AC-Ҭ U04AD-ҭ U04AE-Ү U04AF-ү U04B2-Ҳ U04B3-ҳ
U04B4-Ҵ U04B5-ҵ U04BA-Һ U04BB-һ U04BC-Ҽ U04BD-ҽ U04BE-Ҿ U04BF-ҿ U04C0-Ӏ
U04C1-Ӂ U04C2-ӂ U04CB-Ӌ U04CC-ӌ U04D0-Ӑ U04D1-ӑ U04D2-Ӓ U04D3-ӓ U04D6-Ӗ
U04D7-ӗ U04D8-Ә U04D9-ә U04DC-Ӝ U04DD-ӝ U04DE-Ӟ U04DF-ӟ U04E0-Ӡ U04E1-ӡ
U04E4-Ӥ U04E5-ӥ U04E6-Ӧ U04E7-ӧ U04E8-Ө U04E9-ө U04F0-Ӱ U04F1-ӱ U04F2-Ӳ
U04F3-ӳ U04F4-Ӵ U04F5-ӵ U04F8-Ӹ U04F9-ӹ U2019-’
Marko Myllynen
2018-10-10 12:34:26 UTC
Hi,
Post by Egor Kobylkin
Post by Marko Myllynen
Post by Egor Kobylkin
correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
Although I haven't checked every rule this in general looks very good
(but see below).
Not sure whether we want to add the few missing characters
mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
least initially the more exotic characters, like the historic ones,
though.) Perhaps filing a bug or two for these cases for separate
consideration would be ok.
The question here is what should serve as their transliteration and
transcription?
Not sure, so filing a separate bug about this once your patch is merged
might be the most suitable action for now; I don't think we want to
postpone merging your work further due to these non-ISO 9 cases.
Post by Egor Kobylkin
Post by Marko Myllynen
I'm not sure this will work: no existing rule in translit_* files
contains two characters, so I'd assume that the rule for U+0423 is applied
first and then the rule below is never used.
% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
Perhaps this should be commented out or removed altogether if it's not
working as intended.
So yes, they are not processed. I would drop them so as not to have
special cases. But I am also fine with keeping them because all the work
is done already.
I'd probably drop them but I don't feel strongly about this either way.

Thanks for your efforts. I don't have any further comments, so I'll leave
this now for Rafal and Mike to provide additional feedback and hopefully
merge soon.

Thanks,
--
Marko Myllynen
Egor Kobylkin
2018-10-10 22:29:08 UTC
Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add Cyrillic transliteration table translit_cyrillic file

https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]

to localedata/locales/ and include it in all your locales going forward.

Patch included inline below.

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
***@cyrillic

Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- It produces a string of question marks and spaces.

This is what it should produce, and it does so after the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.


Root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here. Furthermore, it has to be referenced/included in the
active locale at compilation time to be used by iconv.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from
UTF-8 encoded text in the Cyrillic alphabet to ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce Latin transliteration as per
ISO 9.1995.
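
A minimal Python sketch of the two invocations above, for illustration
only. The names iconv_command and transliterate are hypothetical; the
sketch assumes a glibc iconv on PATH and a ru_RU.UTF-8 locale compiled
with the translit_cyrillic table. What it returns depends entirely on
the installed locale data, so no output is shown.

```python
import os
import subprocess


def iconv_command(target):
    """Argument list for one iconv call from UTF-8 to target with //TRANSLIT."""
    return ["iconv", "-f", "UTF-8", "-t", f"{target}//TRANSLIT"]


def transliterate(text, target="ASCII", locale="ru_RU.UTF-8"):
    """Run iconv under the given locale and return the converted bytes.

    Raises subprocess.CalledProcessError if iconv exits with an error,
    for example when a character cannot be converted.
    """
    env = dict(os.environ, LC_ALL=locale)
    result = subprocess.run(
        iconv_command(target),
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        env=env,
        check=True,
    )
    return result.stdout
```

Under those assumptions, transliterate(text) corresponds to the ASCII
transcription, and transliterate(text, "ISO-8859-15").decode("iso-8859-15")
plays the role of the second pipeline, with Python's decoder standing in
for the trailing iconv -f ISO-8859-15 -t UTF-8 step.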

While UTF-encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription has only Latin/ASCII codes but can still
be read by a native speaker. Among other things, it is useful for
processing Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
the corresponding fonts installed, or can't handle UTF-8.

The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on the ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 (official source: Federal Agency on Technical
Regulating and Metrology of the Russian Federation [2]). Technically, an
independent but mostly identical source [3] was used and prepared in a
spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I have searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
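
As an illustration of that step, here is a minimal Python sketch that
places the include line directly before translit_end. It is not the
script actually used to generate the patch. INCLUDE_LINE and
add_cyrillic_include are hypothetical names, and skipping any file that
already mentions "cyrillic" is a simplification of the exclusion rule
described at the top of this message.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_source):
    """Return the locale source with the include placed before translit_end."""
    if "cyrillic" in locale_source.lower():
        # Already mentions Cyrillic; left for its maintainer to decide.
        return locale_source
    lines = locale_source.split("\n")
    for index, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(index, INCLUDE_LINE)
            break
    return "\n".join(lines)


sample = (
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_cyrillic_include(sample))
```

A file without a translit_end line is returned unchanged, and running
the function a second time is a no-op because the file then already
mentions the table.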

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to the maintainer email addresses listed in the files. Volodymyr
Lisivka <***@gmail.com>, Max Kutny <***@gmail.com> (uk_UA), and
Данило Шеган <***@gnome.org> (sr_YU, sr_CS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11302
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-11 Egor Kobylkin <***@kobylkin.com>

[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
System A transliteration and System B transcription table from Cyrillic
to Latin/ASCII.
* localedata/locales/C: Add include "translit_cyrillic";"" to LC_CTYPE
translit section.
* localedata/locales/aa_DJ: Likewise.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/***@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/***@devanagari: Likewise.
* localedata/locales/sd_PK: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/C 2018-10-09 19:02:45.000000000 +0000
@@ -2293,6 +2293,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-09 19:02:45.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-09 19:02:45.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-09 19:02:45.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-09 19:02:45.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-09 19:02:45.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-09 19:02:45.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-09 19:02:45.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-09 19:02:45.000000000 +0000
@@ -165,6 +165,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-09 19:02:45.000000000 +0000
@@ -85,6 +85,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-09 19:02:45.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-09 19:02:45.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-09 19:02:46.000000000 +0000
@@ -61,6 +61,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-09 19:02:46.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-09 19:02:46.000000000 +0000
@@ -71,6 +71,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-09 19:02:46.000000000 +0000
@@ -38,6 +38,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-09 19:02:46.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-09 19:02:46.000000000 +0000
@@ -204,6 +204,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-09 19:02:46.000000000 +0000
@@ -108,6 +108,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-09 19:02:46.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-09 19:02:46.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-09 19:02:46.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-09 19:02:46.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""


+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-09 19:02:46.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-09 19:02:46.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-09 19:02:46.000000000 +0000
@@ -54,6 +54,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-09 19:02:46.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-09 19:02:46.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-09 19:02:47.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-09 19:02:47.000000000 +0000
@@ -72,6 +72,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-09 19:02:47.000000000 +0000
@@ -112,6 +112,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-09 19:02:47.000000000 +0000
@@ -78,6 +78,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-09 19:02:47.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-09 19:02:47.000000000 +0000
@@ -136,6 +136,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-09 19:02:47.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-09 19:02:47.000000000 +0000
@@ -53,6 +53,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-09 19:02:47.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-09 19:02:47.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-09 19:02:47.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-09 19:02:47.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-09 19:02:47.000000000 +0000
@@ -61,6 +61,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-09 19:02:47.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-09 19:02:47.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-09 19:02:48.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-09 19:02:48.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-09 19:02:48.000000000 +0000
@@ -75,6 +75,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-09 19:02:48.000000000 +0000
@@ -54,6 +54,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-09 19:02:48.000000000 +0000
@@ -149,6 +149,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-09 19:02:48.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-09 19:02:48.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-09 19:02:48.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-09 19:02:48.000000000 +0000
@@ -157,6 +157,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-09 19:02:48.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-09 19:02:49.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-09 19:02:49.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-09 19:02:49.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-09 19:02:49.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-09 19:02:49.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-09 19:02:49.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-09 19:02:49.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-09 19:02:49.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-09 19:02:49.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-09 19:02:49.000000000 +0000
@@ -163,6 +163,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-09 19:02:50.000000000 +0000
@@ -110,6 +110,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-09 19:02:50.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-09 19:02:50.000000000 +0000
@@ -48,6 +48,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-09 19:02:50.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-09 19:02:50.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-09 19:02:50.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@latin b/localedata/locales/***@latin
--- a/localedata/locales/***@latin 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/***@latin 2018-10-09 19:02:50.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-09 19:02:50.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-09 19:02:50.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-09 19:02:50.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-09 19:02:50.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-09 19:02:50.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-09 19:02:50.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-09 19:02:50.000000000 +0000
@@ -138,6 +138,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-09 19:02:51.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-09 19:02:51.000000000 +0000
@@ -69,6 +69,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-09 19:02:51.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-09 19:02:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-09 19:02:51.000000000 +0000
@@ -116,6 +116,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-09 19:02:51.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-09 19:02:51.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-09 19:02:51.000000000 +0000
@@ -73,6 +73,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-09 19:02:51.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-09 19:02:51.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@devanagari b/localedata/locales/***@devanagari
--- a/localedata/locales/***@devanagari 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/***@devanagari 2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-09 19:02:51.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-09 19:02:51.000000000 +0000
@@ -204,6 +204,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-09 19:02:52.000000000 +0000
@@ -67,6 +67,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-09 19:02:52.000000000 +0000
@@ -90,6 +90,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-09 19:02:52.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-09 19:02:52.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-09 19:02:52.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-09 19:02:52.000000000 +0000
@@ -66,6 +66,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-09 19:02:52.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-09 19:02:52.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-09 19:02:52.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-09 19:02:52.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-09 19:02:52.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-09 19:02:52.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-09 19:02:52.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>

include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-09 19:02:53.000000000 +0000
@@ -67,6 +67,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-09 19:02:53.000000000 +0000
@@ -36,6 +36,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-09 19:02:53.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-09 19:02:53.000000000 +0000
@@ -2423,6 +2423,7 @@

% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-09 19:02:54.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended from GOST 7.79 (Russian) using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-09 19:02:53.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-09 19:02:53.000000000 +0000
@@ -48,6 +48,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-09 19:02:53.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-09 19:02:53.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-09 19:02:53.000000000 +0000
@@ -65,6 +65,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-09 19:02:53.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-09 19:02:53.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-09 19:02:53.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-09 19:02:54.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-09 19:02:54.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-09 19:02:54.000000000 +0000
@@ -40,6 +40,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-09 19:02:54.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-09 19:02:54.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
Marko Myllynen
2018-10-11 09:59:54 UTC
Hi,

Looks like there's one rule after all which might be debatable. I'll
just highlight it and let others comment and decide what to do with it.
Post by Egor Kobylkin
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
translit_neutral (which is included by i18n) has:

% RIGHT SINGLE QUOTATION MARK
<U2019> <U0027> % not <U00B4> because it's often used as an apostrophe

In practice the end result might well be the same (since if U+2019 is
not available then U+2035 probably is not either, and both rules produce
U+0027). However, given that translit_cyrillic would be included in
every locale, I'm not sure whether this kind of minor discrepancy is OK.
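
The point about both rules ending at U+0027 can be illustrated with a small model. This is only a sketch, assuming that the semicolon-separated alternatives of a rule are tried in order and the first one representable in the target charset wins; `first_representable` is a hypothetical helper, not a glibc function, and Python codecs stand in for iconv's target charsets.

```python
def first_representable(candidates, charset):
    """Return the first candidate that can be encoded in `charset`, else None."""
    for candidate in candidates:
        try:
            candidate.encode(charset)
        except UnicodeEncodeError:
            continue
        return candidate
    return None


# Candidate lists copied from the two rules quoted above for U+2019.
cyrillic_rule = ["\u2035", "\u0027"]  # translit_cyrillic: <U2035>;<U0027>
neutral_rule = ["\u0027"]             # translit_neutral:  <U0027>

for charset in ("ascii", "iso8859_15", "utf-8"):
    print(
        charset,
        repr(first_representable(cyrillic_rule, charset)),
        repr(first_representable(neutral_rule, charset)),
    )
```

Under this model the two rules only diverge for a target that can represent U+2035, and such a target can usually represent U+2019 as well, in which case no transliteration is needed at all.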

Thanks,
--
Marko Myllynen
Rafal Luzynski
2018-10-11 11:04:28 UTC
Thank you, Egor. I am looking at your patch and although I have
not yet finished, here are some remarks:

First of all, I think that such a large patch should also include
the tests. Please see how automatic tests are performed in locale
data and write your own.
Post by Egor Kobylkin
[...]
From this patch I have excluded locales that already mention cyrillic or
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
[...]
I think that eventually we would like to include your translit_cyrillic
in these locales as well, because I assume that your rules should work well
for them and should cover more characters than the individual
language contributors took into account. This is similar to Mike's work on
collation: common rules were created and all locales include them, adding
their own language-specific modifications.
Post by Egor Kobylkin
[...]
[...]
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
I am not sure if we want Cyrillic text in the commit message. Shouldn't
it be, uhm, transliterated? :-)

"sr_CS" - I guess you meant "sr_RS".

"sr_YU" has been dropped, do we want to mention it?
Post by Egor Kobylkin
[...]
[BZ #2872]
* localedata/locales/translit_cyrillic: add ISO 9.1995, GOST 7.79
Please start "Add" with an uppercase. BTW, shouldn't it be "New file"
instead?
Post by Egor Kobylkin
System A transliteration System B transcription table from Cyrillic to
Latin/ASCII.
* localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
translit section.
Same, "Add" here.
Post by Egor Kobylkin
* localedata/locales/aa_DJ: Likewise.
Good (here and everywhere below).
Post by Egor Kobylkin
[...]
diff -uNr a/localedata/locales/translit_cyrillic
b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
+0000
+++ b/localedata/locales/translit_cyrillic 2018-10-09 19:02:54.000000000
+0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of cyrillic letters to latin and/or ascii symbols.
"cyrillic" -> "Cyrillic"; "latin" -> "Latin"; "ascii" -> "ASCII".
Post by Egor Kobylkin
+% Inspired by ISO 9.1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e [U4001-U4F9, U2019] but only the letters covered by ISO 9.1995
Typos:

"i.e" -> "i.e.," (somebody please fix me if I'm wrong here)
"U4001" - I guess you meant "U0401"
"U4F9" -> "U04F9". I think that "U4F9" is not definitely bad but
let's be consistent.

Also I can see some gaps in the range. Are you going to fill them
or maybe for now just mention that they exist?
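
One way to list such gaps mechanically is sketched below. It is an illustration only: `covered_code_points` and `gaps` are hypothetical helpers, the sample lines are copied from the quoted patch, and the regular expression assumes that every single-character rule starts with one `<Uxxxx>` key followed by whitespace.

```python
import re

# Matches a rule whose key is exactly one <Uxxxx> code point, with or
# without the leading "+" that the diff adds. Comment lines and rules with
# a multi-code-point key do not match.
SINGLE_KEY = re.compile(r"^\+?<U([0-9A-Fa-f]{4})>\s")


def covered_code_points(lines):
    """Collect the code points that have a single-character rule."""
    covered = set()
    for line in lines:
        match = SINGLE_KEY.match(line)
        if match:
            covered.add(int(match.group(1), 16))
    return covered


def gaps(covered, first, last):
    """Return the code points in [first, last] that have no rule."""
    return [cp for cp in range(first, last + 1) if cp not in covered]


SAMPLE = [
    "+% CYRILLIC CAPITAL LETTER KJE",
    '+<U040C> <U1E30>;"<U004B><U0060>"',
    "+% CYRILLIC CAPITAL LETTER SHORT U",
    '+<U040E> <U016C>;"<U0055><U0060>"',
]

missing = gaps(covered_code_points(SAMPLE), 0x040C, 0x040E)
print([f"U+{cp:04X}" for cp in missing])  # prints ['U+040D']
```

Feeding the whole translit_cyrillic file to the same functions would give a list of uncovered code points that could be quoted in the file header.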
Post by Egor Kobylkin
+% It implements the GOST_7.79 System A (Latin Script) as a first
+% option and System B Cyrillic (ASCII) as a second option. Check
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% The System B is extended from GOST_7.79-Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
What is "h/`" diacritics logic?
Post by Egor Kobylkin
+
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
Sure, I'm not going to stop you from pushing these changes just because
there are missing characters. I will consider adding them later.
Post by Egor Kobylkin
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
1. Is the file really generated with a script and not modified later?
If yes, then maybe you should contribute the script instead. In that case
you should also not post this file to libc-locale; maintainers and
developers should be able to regenerate it.
2. The link leads to a LibreOffice spreadsheet.
Post by Egor Kobylkin
+LC_CTYPE
+
+translit_start
+
<U0400> is missing here. Are you going to leave it for now?
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
[...]
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
<U040D> is missing here. Can we add it already?
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
[...]
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
This still makes me wonder.

Does it work at all?
What if we remove this rule? Won't it be transliterated as
<U0423> => "U", <U0301> left unchanged, so that "U" + <U0301>
will eventually produce "Ú"?
Why is it called "UNDEFINED"?
Do we need similar rules for other characters?
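
The hypothesis in the questions above can be checked on paper with a short model. This is a sketch under a stated assumption: rules are applied one code point at a time, which is what the reviewers suspect and not something confirmed here about iconv. `per_character` is a hypothetical stand-in, and whether a real conversion leaves U+0301 untouched depends on the target charset and on other included tables such as translit_combining, which is not modelled.

```python
import unicodedata

# Single-character rules copied from the patch: <U0423> <U0055>, <U0443> <U0075>.
SINGLE_RULES = {"\u0423": "U", "\u0443": "u"}


def per_character(text):
    """Apply single-character rules only; anything else passes through."""
    return "".join(SINGLE_RULES.get(ch, ch) for ch in text)


upper = per_character("\u0423\u0301")  # "U" followed by COMBINING ACUTE ACCENT
lower = per_character("\u0443\u0301")  # "u" followed by COMBINING ACUTE ACCENT

# Canonical composition (NFC) merges the base letter and the accent.
print(unicodedata.normalize("NFC", upper) == "\u00DA")  # prints True
print(unicodedata.normalize("NFC", lower) == "\u00FA")  # prints True
```

So, under this model, dropping the two-code-point rules would still lead to the same precomposed letters once the result is normalized.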
Post by Egor Kobylkin
[...]
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC UNDEFINED
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
Same here.
Post by Egor Kobylkin
[...]
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
Again <U0450> missing (because it is lowercase variant of <U0400>).
Post by Egor Kobylkin
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
[...]
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
<U045D> missing (same reason as <U040D>).
Post by Egor Kobylkin
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
More letters missing here. Is this because they are historic so we
don't want to include them now? Well, but "YUS" is also historic.
(Please, do not remove YUS for consistency).
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
[...]
I will continue but, again, I don't give any ETA so other reviewers
are welcome here.

Regards,

Rafal
Marko Myllynen
2018-10-11 13:10:49 UTC
Hi,
Post by Rafal Luzynski
First of all, I think that such a large patch should also include
the tests. Please see how automatic tests are performed in locale
data and write your own.
Also I can see some gaps in the range. Are you going to fill them
or maybe for now just mention that they exist?
<U040D> is missing here. Can we add it already?
Sure, I'm not going to stop you from pushing these changes just because
there are missing characters. I will consider adding them later.
<U0400> is missing here. Are you going to leave it for now?
See https://sourceware.org/ml/libc-alpha/2018-10/msg00160.html.
Post by Rafal Luzynski
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
This still makes me wonder.
Does it work at all?
No, see the above link.

More importantly, I realized that ICU uconv(1) I mentioned earlier
should make a great reference for this data; output of the currently
included transliteration rules should match uconv(1) output. If that is
not the case, the patch or uconv(1) might have an issue. If the outputs
match, then we should be able to safely assume the patch is ok.

We could also consider using uconv(1) output as a reference for how to
handle the currently missing characters.

(uconv(1) is part of the icu package on Fedora/CentOS/RHEL/openSUSE.)

Thanks,
--
Marko Myllynen
Volodymyr Lisivka
2018-10-11 13:50:46 UTC
Post by Rafal Luzynski
Thank you, Egor. I am looking at your patch and although I have
First of all, I think that such a large patch should also include
the tests. Please see how automatic tests are performed in locale
data and write your own.
Post by Egor Kobylkin
[...]
From this patch I have excluded locales that already mention cyrillic or
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
[...]
I think that eventually we would like to include your translit_cyrillic
also in these locales because I assume that your rules should work good
for them as well, also should include more characters than the individual
language contributors took into account.
It's a very good idea. Transliteration in the Ukrainian locale predates this
work by about a decade. It is well tested. I also have automatic test
cases, which I can adapt to the current standard. Let's drop the Russian
transliteration rules and replace them with the Ukrainian transliteration
rules. I assume that the Ukrainian rules should work well for them as
well.

The Ukrainian language is the oldest and most developed language in the Slavic
family: the last king of all Slavs, named Madzhak/Muzhik (Brave), leader
of the Volyniana union, lived in Western Ukraine in the Volyn` region.
After the capture of Madzhak, the kingdom was split into multiple
western parts and an eastern part, where 9 Slavic tribes were united by
the Rus` tribe, which abandoned its city, now known as Old Russa,
because of an epidemic. IMHO, it will be fair to use the rules of the
oldest Slavic union.
Post by Rafal Luzynski
Similarly to Mike's work on
collation: a common rules were created and all locales include them adding
their own language specific modifications.
It's a good idea too. In our own locale we prefer that words in our
language be at the top of a sorted list. Currently, in the Ukrainian
locale it works as intended, but the Russian locale has the inverted order.
IMHO, the Russian locale should use the Ukrainian rules.

$ echo 'один два three four'| tr ' ' '\n' | LANG=uk_UA.utf8 sort
два
один
four
three
$ echo 'один два three four'| tr ' ' '\n' | LANG=ru_RU.utf8 sort
four
three
два
один
Egor Kobylkin
2018-10-11 14:59:11 UTC
Hi Rafal
Post by Rafal Luzynski
Thank you, Egor. I am looking at your patch and although I have
First of all, I think that such a large patch should also include
the tests. Please see how automatic tests are performed in locale
data and write your own.
Could you please point me to the existing automatic tests?
Locally I am using the test suggested in glibc locales wiki.
From my commit message:
"The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt
"
I am visually checking whether any iconv run fails for all those locales,
but you presumably mean some automated unit test with a boolean outcome,
right?
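
A check with a boolean outcome could look roughly like the sketch below. It is not the existing glibc test machinery, which is not shown in this thread. `iconv_translit` and `mismatches` are hypothetical helpers; the expected strings are the ASCII alternatives of three rules in the patch; and `iconv_translit` assumes an `iconv` binary on PATH plus an installed locale built with the patch, so the demonstration at the bottom uses a dictionary stand-in instead of calling it.

```python
import os
import subprocess


def iconv_translit(text, locale="ru_RU.UTF-8"):
    """Pipe `text` through the iconv command from the wiki test example."""
    env = dict(os.environ, LC_ALL=locale)
    result = subprocess.run(
        ["iconv", "-f", "UTF-8", "-t", "ASCII//TRANSLIT"],
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        env=env,
        check=False,
    )
    return result.stdout.decode("ascii", errors="replace")


def mismatches(transliterate, cases):
    """Return (source, expected, actual) for every case that does not match."""
    failed = []
    for source, expected in cases:
        actual = transliterate(source)
        if actual != expected:
            failed.append((source, expected, actual))
    return failed


# Expected values are the second (ASCII) alternatives of rules in the patch.
CASES = [("\u0401", "YO"), ("\u045F", "dh"), ("\u0423", "U")]

# Stand-in transliterator so the sketch runs without a patched locale.
stand_in = dict(CASES).get
print(mismatches(stand_in, CASES) == [])  # prints True
```

With a patched locale installed, `mismatches(iconv_translit, CASES) == []` would be the pass/fail result, and the returned tuples would show which characters regressed.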
Post by Rafal Luzynski
Post by Egor Kobylkin
[...]
From this patch I have excluded locales that already mention cyrillic or
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
[...]
I think that eventually we would like to include your translit_cyrillic
in these locales as well, because I assume that your rules should work well
for them and should cover more characters than the individual
language contributors took into account. This is similar to Mike's work on
collation: common rules were created and all locales include them, adding
their own language-specific modifications.
This is fine with me. Should anybody supply translit_xxxxxxxxxxxx for
any of the mentioned locales we can include them as well. Wouldn't it be
easier to coordinate those as separate patches though?
Post by Rafal Luzynski
Post by Egor Kobylkin
[...]
[...]
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
I am not sure if we want Cyrillic text in the commit message. Shouldn't
it be, uhm, transliterated? :-)
I do not see any Cyrillic text in the commit message.
The ?????? you see are the actual "?" symbols coming out of iconv now.
Post by Rafal Luzynski
"sr_CS" - I guess you meant "sr_RS".
"sr_YU" has been dropped, do we want to mention it?
The list of locales and the patch itself is generated from the actual
locales - I do not hand pick them, only exclude the ones in the
exclusion list above.
Post by Rafal Luzynski
Post by Egor Kobylkin
[...]
[BZ #2872]
* localedata/locales/translit_cyrillic: add ISO 9.1995, GOST 7.79
Please start "Add" with an uppercase. BTW, shouldn't it be "New file"
instead?
Post by Egor Kobylkin
System A transliteration System B transcription table from Cyrillic to
Latin/ASCII.
* localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
translit section.
Same, "Add" here.
Post by Egor Kobylkin
* localedata/locales/aa_DJ: Likewise.
Good (here and everywhere below).
Post by Egor Kobylkin
[...]
diff -uNr a/localedata/locales/translit_cyrillic
b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
+0000
+++ b/localedata/locales/translit_cyrillic 2018-10-09 19:02:54.000000000
+0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of cyrillic letters to latin and/or ascii symbols.
"cyrillic" -> "Cyrillic"; "latin" -> "Latin"; "ascii" -> "ASCII".
Post by Egor Kobylkin
+% Inspired by ISO 9.1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e [U4001-U4F9, U2019] but only the letters covered by ISO 9.1995
"i.e" -> "i.e.," (somebody please fix me if I'm wrong here)
"U4001" - I guess you meant "U0401"
"U4F9" -> "U04F9". I think that "U4F9" is not definitely bad but
let's be consistent.
These are all good catches. I will fix them and resubmit.
Post by Rafal Luzynski
Also I can see some gaps in the range. Are you going to fill them
or maybe for now just mention that they exist?
Post by Egor Kobylkin
Post by Marko Myllynen
Post by Egor Kobylkin
correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
Although I haven't checked every rule this in general looks very good
(but see below).
Not sure do we want to add the few missing characters
mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
least initially the more exotic characters, like the historic ones,
though.) Perhaps filing a bug or two for these cases for separate
consideration would be ok.
The question here is what should serve as their transliteration and
transcription?
Not sure, so filing a separate bug about this once your patch is merged
might be the most suitable action for now, I don't think we want to
postpone merging your work further due to these non-ISO 9 cases.
Post by Egor Kobylkin
+% It implements the GOST_7.79 System A (Latin Script) as a first
+% option and System B Cyrillic (ASCII) as a second option. Check
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% The System B is extended from GOST_7.79-Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
What is "h/`" diacritics logic?
Basically, a linguist mentioned that they had chosen "h" and "`" to
represent the diacritics for the transcription (i.e. GOST 7.79 System
B). This way there is some resemblance to the watertight transliteration
as per ISO 9 (System A), but it is still all in ASCII. We have decided
to extend GOST 7.79 to all the ISO 9 characters, so I have extended
it following that linguist's logic.
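
A few rows from the patch make the "h/`" idea concrete. The sketch below only restates four rules as (System A, System B) pairs and checks that every System B form is plain ASCII; `system_b` is a hypothetical helper for illustration and is not how iconv selects an alternative.

```python
# (System A, System B) alternatives copied from rules in the patch.
RULES = {
    "\u0401": ("\u00CB", "YO"),    # IO:                 E-diaeresis, "YO"
    "\u045F": ("d\u0302", "dh"),   # DZHE:               d + circumflex, "dh"
    "\u049A": ("\u0136", "K`"),    # KA WITH DESCENDER:  K-cedilla, "K`"
    "\u04C1": ("Z\u0306", "ZH`"),  # ZHE WITH BREVE:     Z + breve, "ZH`"
}


def system_b(text):
    """Replace each known letter with its ASCII (System B) alternative."""
    return "".join(RULES[ch][1] if ch in RULES else ch for ch in text)


# Every System B alternative is plain ASCII: "h" and "`" stand in for the
# diacritics that System A expresses with accented Latin letters.
print(all(b.isascii() for _, b in RULES.values()))  # prints True
print(system_b("\u04C1\u049A"))                     # prints ZH`K`
```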
Post by Rafal Luzynski
Post by Egor Kobylkin
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
1. Is the file really generated with a script and not modified later?
If yes, then maybe you should contribute the script instead. In that case
you should also not post this file to libc-locale; maintainers and
developers should be able to regenerate it.
2. The link leads to a LibreOffice spreadsheet.
No, I do not have a script. "Generated" here means that it is the result of
formulas in that spreadsheet. People are welcome to write a script; it
should be a straightforward implementation of the rules in those formulas.
Post by Rafal Luzynski
Post by Egor Kobylkin
+LC_CTYPE
+
+translit_start
+
<U0400> is missing here. Are you going to leave it for now?
Yes, it is to be left out, not in ISO 9. See the exchange with Marko above.
Post by Rafal Luzynski
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
[...]
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
<U040D> is missing here. Can we add it already?
Yes, it is to be left out, not in ISO 9. See the exchange with Marko above.
Post by Rafal Luzynski
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
[...]
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
This still makes me wonder.
Does it work at all?
What if we remove this rule? Won't it be transliterated as
<U0423> => "U", <U0301> left unchanged, so that "U" + <U0301>
will eventually produce "Ú"?
Why is it called "UNDEFINED"?
...
Post by Rafal Luzynski
Post by Egor Kobylkin
Post by Marko Myllynen
I'm not sure this will work, no existing rule in translit_* files
contain two characters, I'd assume that the rule for U+0423 is applied
first and then the below rule is never used.
% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
Perhaps this should be commented out or removed altogether if it's not
working as intended.
So yes, they are not processed. I would drop them so as not to have special
cases. But I am also fine with keeping them because all the work is done
already.
I'd probably drop them but I don't feel strongly about this either way.
Thanks for your efforts, I don't have any further comments, I'll leave
this now for Rafal and Mike to provide additional feedback and hopefully
merge soon.
Could you also please check the discussion with Marko on UNDEFINED and
other related topics? You were on To: or CC: for those emails.
The same for the other characters below.
Post by Rafal Luzynski
Do we need similar rules for other characters?
Post by Egor Kobylkin
[...]
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC UNDEFINED
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
Same here.
Post by Egor Kobylkin
[...]
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
Again <U0450> missing (because it is lowercase variant of <U0400>).
Post by Egor Kobylkin
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
[...]
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
<U045D> missing (same reason as <U040D>).
Post by Egor Kobylkin
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
More letters missing here. Is this because they are historic so we
don't want to include them now? Well, but "YUS" is also historic.
(Please, do not remove YUS for consistency).
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
[...]
I will continue but, again, I don't give any ETA so other reviewers
are welcome here.
Regards,
Rafal
Bests,
Egor
Egor Kobylkin
2018-10-11 21:30:48 UTC
Post by Egor Kobylkin
Post by Rafal Luzynski
Post by Egor Kobylkin
[...]
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
I am not sure if we want Cyrillic text in the commit message. Shouldn't
it be, uhm, tranlisterated? :-)
I do not see any Cyrillic text in the commit message.
The ?????? you see are the actual "?" characters that iconv outputs now.
Post by Rafal Luzynski
"sr_CS" - I guess you meant "sr_RS".
"sr_YU" has been dropped, do we want to mention it?
The list of locales and the patch itself is generated from the actual
locales - I do not hand pick them, only exclude the ones in the
exclusion list above.
Ah, yes, that message above should read sr_RS. Will fix.
There is no sr_YU anymore indeed, so I will drop it. No changes to the
patch, just the commit message.

Bests,
Egor
Egor Kobylkin
2018-10-11 15:05:39 UTC
Post by Rafal Luzynski
Thank you, Egor. I am looking at your patch and although I have
...
Post by Rafal Luzynski
Post by Egor Kobylkin
[...]
[BZ #2872]
* localedata/locales/translit_cyrillic: add ISO 9.1995, GOST 7.79
Please start "Add" with an uppercase. BTW, shouldn't it be "New file"
instead?
"New file or Add" - I don't know. You tell me.
Post by Rafal Luzynski
Post by Egor Kobylkin
System A transliteration System B transcription table from Cyrillic to
Latin/ASCII.
* localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
translit section.
Same, "Add" here.
Same, please advise.
Bests,
Egor
Egor Kobylkin
2018-10-11 15:44:17 UTC
Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add Cyrillic transliteration table translit_cyrillic file

https://sourceware.org/bugzilla/attachment.cgi?id=11317 [7]

to localedata/locales/ and include it in all your locales going forward.

Patch included inline below.

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
***@cyrillic

Their maintainers are requested to make an explicit decision on whether
and how to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic texts in most locales, including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- It produces a string of question marks and spaces.

This is what it should produce, and what it does produce once the patch
is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.


Root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here. Furthermore, the table has to be included in the active
locale when that locale is compiled, so that iconv can use it.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce Latin transliteration as per
ISO 9.1995.
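The difference between the two targets comes from the list of alternatives in each rule. The following sketch illustrates that selection under stated assumptions: parse_rule and pick are hypothetical helper names, only the <Uxxxx> notation is handled, and the model simply takes the first alternative that the target character set can encode. It is an illustration of the idea, not glibc's implementation.

```python
import re

# Matches one <Uxxxx> code point token as used in the translit_* files.
CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(tokens):
    """Turn '<U0059><U004F>' (optionally quoted) into the string 'YO'."""
    return "".join(chr(int(h, 16)) for h in CODEPOINT.findall(tokens))


def parse_rule(line):
    """Split a rule line into (source string, list of alternatives)."""
    source, alternatives = line.strip().split(None, 1)
    return decode(source), [decode(alt) for alt in alternatives.split(";")]


def pick(alternatives, codec):
    """Return the first alternative that `codec` can encode, else '?'."""
    for alt in alternatives:
        try:
            alt.encode(codec)
            return alt
        except UnicodeEncodeError:
            continue
    return "?"


# Rule taken from the table quoted in this thread (CAPITAL LETTER IO).
source, alternatives = parse_rule('<U0401> <U00CB>;"<U0059><U004F>"')
print(pick(alternatives, "ascii"))        # second alternative
print(pick(alternatives, "iso-8859-15"))  # first alternative
```

In this model the ASCII target falls through to the two-letter transcription, while ISO-8859-15 can keep the single Latin letter with a diaeresis, which is why the two iconv invocations give different results.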

While a UTF-encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII codes but can
still be read by a native speaker. Among other things, it is useful for
processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
the corresponding fonts installed, or can't handle UTF-8.

The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 official source (Federal Agency on Technical
Regulating and Metrology Of Russian Federation [2]). Technically an
independent but mostly identical source [3] was used and prepared in a
spreadsheet [6].

According to the documentation, a transliteration table is pulled in by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
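As a rough illustration of that mechanical change, the sketch below inserts the include line directly before translit_end, which is where the hunks in the patch place it. The helper name add_cyrillic_include is hypothetical and this is not the script that was actually used to generate the patch; it only models the edit on the text of one locale file.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text):
    """Insert the include line right before 'translit_end'.

    Text without a translit_end line, or text that already contains the
    include line, is returned unchanged.
    """
    lines = locale_text.split("\n")
    if INCLUDE_LINE in lines:
        return locale_text
    for index, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(index, INCLUDE_LINE)
            return "\n".join(lines)
    return locale_text


sample = (
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
patched = add_cyrillic_include(sample)
print(patched)
```

Making the helper a no-op when the line is already present keeps it safe to run repeatedly over the same tree.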

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <***@gmail.com>, Max Kutny <***@gmail.com> (uk_UA),
Данило Шеган <***@gnome.org> (sr_YU, sr_CS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-11 Egor Kobylkin <***@kobylkin.com>

[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
System A transliteration and System B transcription table from Cyrillic
to Latin/ASCII.
* localedata/locales/C: Add include "translit_cyrillic";"" to LC_CTYPE
translit section.
* localedata/locales/aa_DJ: Likewise.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/***@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/***@devanagari: Likewise.
* localedata/locales/sd_PK: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""


+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@latin b/localedata/locales/***@latin
--- a/localedata/locales/***@latin 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/***@latin 2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@devanagari b/localedata/locales/***@devanagari
--- a/localedata/locales/***@devanagari 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/***@devanagari 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>

include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@

% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script, with diacritics) as
+% the first option and System B (plain ASCII) as the second option.
+% See https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond GOST 7.79 Russian using openly available
+% transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
Egor Kobylkin
2018-10-11 21:33:00 UTC
Dear locale maintainers,

Please fix glibc bug 2872, "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

by adding the Cyrillic transliteration table file translit_cyrillic

https://sourceware.org/bugzilla/attachment.cgi?id=11317 [7]

to localedata/locales/ and including it in all your locales going forward.

The patch is included inline below.

I have excluded from this patch the locales that already mention
Cyrillic or have their own transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
***@cyrillic

Their maintainers are asked to decide explicitly whether and how to
include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as a test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- It produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.


Root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. Furthermore, the table has to be included in the active
locale at compilation time for iconv to use it.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text in the Cyrillic alphabet to ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT produces an ASCII-compatible
transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 produces a Latin transliteration as per
ISO 9:1995.
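
The two pipelines in the examples could be wrapped roughly as follows.
This is only a sketch: it assumes a glibc iconv and a locale compiled
with translit_cyrillic included, and the function names to_ascii and
to_iso9 are made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrappers around the two example pipelines.
# Both read UTF-8 on stdin. The locale defaults to ru_RU.UTF-8; pass
# another locale name as the first argument to try a different one.

# ASCII-only transcription.
to_ascii() {
    LC_ALL="${1:-ru_RU.UTF-8}" iconv -f UTF-8 -t ASCII//TRANSLIT
}

# Latin transliteration via ISO-8859-15, converted back to UTF-8
# so that the result can be displayed on a UTF-8 terminal.
to_iso9() {
    LC_ALL="${1:-ru_RU.UTF-8}" iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
        iconv -f ISO-8859-15 -t UTF-8
}
```

Usage would then be, for instance, to_ascii < translit-test-input.txt.
Plain ASCII input passes through both functions unchanged.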

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters and
can still be read by a native speaker. Among other things, this is
useful for processing Cyrillic texts and filenames with programs or on
systems that are not specifically prepared to work with Cyrillic, do not
have the corresponding fonts installed, or cannot handle UTF-8.

The transliteration table itself is attached as the file
translit_cyrillic [7]. Its mapping is based on the ISO 9:1995 standard
[10] and its derivative GOST 7.79-2000, whose official source is the
Federal Agency on Technical Regulating and Metrology of the Russian
Federation [2]. In practice, an independent but mostly identical source
[3] was used, and the table was prepared in a spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the LC_CTYPE
translit_start section:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
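
For illustration only, such a search-and-insert step might look like the
sketch below. It is not the script actually used for this patch; the
function name add_translit_cyrillic is hypothetical, and "already
mentions Cyrillic" is approximated by a case-insensitive grep.

```shell
#!/bin/sh
# Hypothetical helper, not the script actually used for this patch.
# For every locale file in DIR that has a translit section and does not
# already mention Cyrillic, insert the include line just before the
# first translit_end.
add_translit_cyrillic() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Skip the shared translit_* tables themselves.
        case "${f##*/}" in translit_*) continue ;; esac
        grep -q '^translit_end' "$f" || continue
        grep -qi 'cyrillic' "$f" && continue
        awk '
            /^translit_end/ && !inserted {
                print "include \"translit_cyrillic\";\"\""
                inserted = 1
            }
            { print }
        ' "$f" > "$f.new" && mv "$f.new" "$f"
    done
}
```

Run on a copy of localedata/locales, this should leave already patched
files alone on a second pass, because they then mention Cyrillic
themselves. Comparing the two trees with diff -uNr a b would give a
patch in the same shape as the one in this mail.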

Transliteration of Cyrillic text, e.g. Russian, may already have worked
to some extent in the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA locales,
which have their own transliteration tables inline.

I am excluding these locales from the proposed patch. I have written
directly to the maintainer email addresses listed in the locale files.
Volodymyr Lisivka <***@gmail.com>, Max Kutny <***@gmail.com> (uk_UA) and
Данило Шеган <***@gnome.org> (sr_RS) have confirmed the exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-11 Egor Kobylkin <***@kobylkin.com>

[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9:1995 / GOST 7.79
System A transliteration and System B transcription table from Cyrillic
to Latin/ASCII.
* localedata/locales/C: Add include "translit_cyrillic";"" to LC_CTYPE
translit section.
* localedata/locales/aa_DJ: Likewise.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/***@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/***@devanagari: Likewise.
* localedata/locales/sd_PK: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""


+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@latin b/localedata/locales/***@latin
--- a/localedata/locales/***@latin 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/***@latin 2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@devanagari b/localedata/locales/***@devanagari
--- a/localedata/locales/***@devanagari 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/***@devanagari 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>

include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@

% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode range https://www.unicode.org/charts/PDF/U0400.pdf,
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin with diacritics) as the first
+% option and System B (ASCII only) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond the GOST 7.79 Russian table using open
+% sources of transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U + COMBINING ACUTE ACCENT (no precomposed form)
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U + COMBINING ACUTE ACCENT (no precomposed form)
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
Egor Kobylkin
2018-10-12 14:05:59 UTC
Dear locale maintainers,

Please fix glibc bug 2872, "Transliteration Cyrillic -> ASCII fails":

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

To do so, add the Cyrillic transliteration table file translit_cyrillic

https://sourceware.org/bugzilla/attachment.cgi?id=11317 [7]

to localedata/locales/ and include it in all your locales going forward.

The patch is included inline below.

From this patch I have excluded locales that already mention Cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
***@cyrillic

Their maintainers are requested to make an explicit decision on whether
and how to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently this fails on Cyrillic text in most locales, including ru_RU
[1] [8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- It produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is
applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
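
For reference, the expected line can be reproduced with a small
stand-alone sketch. It assumes the test line is the common Russian
pangram (its word lengths match the question-mark output above) and uses
a hypothetical SYSTEM_B dictionary covering only the letters of that
sentence. It is not the translit_cyrillic table itself and does not call
iconv.

```python
import unicodedata

# Hypothetical subset of a GOST 7.79 System B style mapping: only the
# letters needed for the pangram below, not the full translit_cyrillic table.
SYSTEM_B = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "x", "ц": "cz", "ч": "ch", "ш": "sh", "щ": "shh",
    "ъ": "``", "ы": "y'", "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}


def transcribe(text):
    """Replace each mapped Cyrillic letter; pass everything else through."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        latin = SYSTEM_B.get(ch.lower())
        if latin is None:
            out.append(ch)  # Latin letters, spaces, punctuation
        elif ch.isupper():
            out.append(latin[0].upper() + latin[1:])
        else:
            out.append(latin)
    return "".join(out)


# Assumed input: the well-known pangram, prefixed as in the test file output.
line = "CYRILLIC Съешь ещё этих мягких французских булок, да выпей же чаю."
print(transcribe(line))
```

The printed line can be compared by eye with the expected iconv output
quoted above.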


The root problem and the fix:

The root problem is the missing transliteration table, which I am
supplying here. The table must also be included in the active locale
when that locale is compiled, otherwise iconv will not use it.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text written in a Cyrillic alphabet to ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT produces an ASCII-compatible
transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 produces a Latin transliteration as per
ISO 9:1995.
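
How one table can yield both results is easier to see in a small sketch.
Each table line lists alternatives separated by semicolons, and the
sketch assumes they are tried in order until one fits the target
charset. The entry is taken from the table in this patch; the helper
names decode, parse_entry and pick are hypothetical, and this is not
glibc's implementation. It only understands <Uxxxx> tokens, not literal
characters inside quotes.

```python
import re

TOKEN = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(field):
    """Turn '<U0041><U0060>' (optionally quoted) into the string it denotes."""
    return "".join(chr(int(h, 16)) for h in TOKEN.findall(field))


def parse_entry(line):
    """Split one table line into (source character, [candidates in order])."""
    source, _, rest = line.strip().partition(" ")
    return decode(source), [decode(alt) for alt in rest.split(";")]


def pick(candidates, charset):
    """Return the first candidate that the target charset can encode."""
    for cand in candidates:
        try:
            cand.encode(charset)
            return cand
        except UnicodeEncodeError:
            continue
    return "?"  # nothing fits, as in the unpatched question-mark output


# CYRILLIC CAPITAL LETTER A WITH DIAERESIS, as listed in translit_cyrillic
src, alts = parse_entry('<U04D2> <U00C4>;"<U0041><U0060>"')
print(pick(alts, "ascii"))       # second alternative: A followed by a grave accent
print(pick(alts, "iso8859_15"))  # first alternative: A with diaeresis
```

Listing the ISO 9 letter first and the pure-ASCII form second is what
lets a single entry serve both target charsets.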

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters yet
can still be read by a native speaker. Among other things, this is useful
for processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, lack the
corresponding fonts, or cannot handle UTF-8.

The transliteration table itself is attached as the file translit_cyrillic
[7]. Its mapping is based on the ISO 9:1995 standard [10] and its
derivative GOST 7.79-2000, whose official source is the Federal Agency on
Technical Regulating and Metrology of the Russian Federation [2]. In
practice, an independent but mostly identical source [3] was used, and
the table was prepared in a spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
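
The per-locale change can be sketched as follows. This is not the script
that generated the patch, only an illustration of the rule visible in
the diff: the include line goes directly before translit_end, and files
that already mention Cyrillic are left alone. The function name
add_cyrillic_include and the sample text are hypothetical.

```python
INCLUDE = 'include "translit_cyrillic";""'


def add_cyrillic_include(text):
    """Insert INCLUDE directly before translit_end, unless Cyrillic is already mentioned."""
    if "cyrillic" in text.lower():
        return text  # mirrors the exclusion rule described in this message
    out = []
    done = False
    for line in text.splitlines(keepends=True):
        if not done and line.strip() == "translit_end":
            out.append(INCLUDE + "\n")
            done = True
        out.append(line)
    return "".join(out)


# Hypothetical tail of an LC_CTYPE section, shaped like the hunks below.
sample = (
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_cyrillic_include(sample), end="")
```

Because the check for an existing mention runs first, applying the
function twice leaves the file unchanged after the first pass.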

Cyrillic transliteration of, e.g., Russian text may already have worked
to some extent for the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA locales,
which have their transliteration tables included inline.

I am excluding these locales from the proposed patch. I have written
directly to the maintainer email addresses listed in the files. Volodymyr
Lisivka <***@gmail.com>, Max Kutny <***@gmail.com> (uk_UA), and
Данило Шеган <***@gnome.org> (sr_RS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-11 Egor Kobylkin <***@kobylkin.com>

[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9:1995 / GOST 7.79
System A transliteration and System B transcription table from Cyrillic
to Latin/ASCII.
* localedata/locales/C: Add include "translit_cyrillic";"" to LC_CTYPE
translit section.
* localedata/locales/aa_DJ: Likewise.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/***@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/***@devanagari: Likewise.
* localedata/locales/sd_PK: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""


+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@latin b/localedata/locales/***@latin
--- a/localedata/locales/***@latin 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/***@latin 2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/***@devanagari b/localedata/locales/***@devanagari
--- a/localedata/locales/***@devanagari 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/***@devanagari 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>

include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@

% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

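Note on the table that follows: each entry lists its replacements in priority order, separated by ";" (GOST 7.79 System A first, System B second). The sketch below is only a Python model of how I understand that fallback to behave with //TRANSLIT: the first candidate that fits the target charset is used, and "?" is emitted if none fits. It is not glibc code; the names TABLE, encodable and transliterate are made up for illustration, and the table is a small subset of the entries in the patch.

```python
# Small illustrative subset of translit_cyrillic.
# Candidates are in priority order: System A first, System B second.
TABLE = {
    "\u0436": ["\u017e", "zh"],   # CYRILLIC SMALL LETTER ZHE
    "\u0448": ["\u0161", "sh"],   # CYRILLIC SMALL LETTER SHA
    "\u0449": ["\u015d", "shh"],  # CYRILLIC SMALL LETTER SHCHA
    "\u0430": ["a"],              # CYRILLIC SMALL LETTER A
    "\u0431": ["b"],              # CYRILLIC SMALL LETTER BE
    "\u0438": ["i"],              # CYRILLIC SMALL LETTER I
}


def encodable(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, charset):
    """Model of the fallback: first candidate that fits charset wins."""
    out = []
    for ch in text:
        if encodable(ch, charset):
            # Character is already representable; copy it unchanged.
            out.append(ch)
            continue
        for candidate in TABLE.get(ch, []):
            if encodable(candidate, charset):
                out.append(candidate)
                break
        else:
            # No table entry, or no candidate fits the target charset.
            out.append("?")
    return "".join(out)


if __name__ == "__main__":
    word = "\u0436\u0430\u0431\u0430"            # Cyrillic "zhaba"
    print(transliterate(word, "ascii"))          # zhaba
    print(transliterate(word, "iso-8859-15"))    # first letter becomes U+017E
```

With an ASCII target only the System B spellings survive, while ISO-8859-15 keeps those System A letters it happens to contain (U+017E, U+0161) and falls back to System B for the rest (U+015D is not in ISO-8859-15, so SHCHA becomes "shh").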
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script with diacritics) as the
+% first option and System B (ASCII only) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended from GOST 7.79 for Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0041><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"

+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""

+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE

diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end

class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@

translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
Rafal Luzynski
2018-10-13 00:59:17 UTC
Egor,

Thank you for the update. I took a closer look at your patch so this
time my review is more complete than before although not yet fully complete.

As far as I understand, ISO-9 and its GOST variants are meant to be
universal rather than Russian-specific. Therefore it is correct to place
them in the external file, like translit_cyrillic, and then include this
file in other locales adding locale specific modifications, if required.
For example, if there are any Russian-specific rules not included in this
file, they should go to ru_RU.

The text of the ISO-9 standard is not publicly available. Do we have
anything better than a Wikipedia article?

Regarding the format of your commit message, I hesitate to say anything
more because there are more experienced maintainers around here. Please
take a look at the Contribution Checklist. [1]

While we are at it, what is your legal relationship with the glibc project?
Have you signed the FSF Copyright Assignment? It is not necessary for the
locale data but it might be necessary if you are going to contribute the
testing code.

Regarding the tests, I think there is no complete transliteration test
suite at the moment. Probably the only test is localedata/bug-iconv-trans.c.
You can also see the collation tests placed in the same directory, they
use those multiple *.UTF-8.in files.

You can skip the tests for now.

Technical issue: Please either attach your patch to the email message or
paste it inline, not both. The patch as it is now is not applicable.
I had to edit it manually to apply.
Post by Egor Kobylkin
[...]
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
I confirm that these locales are excluded and there are no other missing
locales.
Post by Egor Kobylkin
[...]
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
There is no such file. Where have you got the source code from? Are you
sure this is glibc? :-)
Post by Egor Kobylkin
[...]
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
Shouldn't “include "translit_cyrillic";""” be placed before the custom rules,
together with other includes? The same in more files, I will not mention
them all.
Post by Egor Kobylkin
[...]
+0000
+0000
Those 3 lines have been broken by the email agent, the patch is not applicable.
Post by Egor Kobylkin
[...]
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000
There is no such file in glibc.
Post by Egor Kobylkin
[...]
diff -uNr a/localedata/locales/translit_cyrillic
b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
+0000
+++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000
+0000
Again 3 lines broken, the patch is not applicable.
Post by Egor Kobylkin
[...]
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
I am still tempted to add more Cyrillic characters but I understand
that it must be clearly separated which transliteration rules come from
ISO-9 and which are our own invention. But that's not for now.
Post by Egor Kobylkin
[...]
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
This says that for ASCII (GOST 7.79 System B) you would like to transliterate
"Ё" as "YO" but the table in Wikipedia says "Yo". I understand that one or
another may be correct depending on the context but we should be consistent
and also better let's stick with the standard.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
This says "DJ" but System B does not mention it. Where does it come from?
Also, I think it should be "Dj" rather than "DJ".
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
Correct, according to both systems.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
"Ye" - correct.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
Correct.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
Correct. The table mentions an alternative transliteration "I`" but
says that it is "only before vowels for Old Russian and Old Bulgarian".
I think we can skip this other variant.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
"Yi" - correct.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
Correct.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
Correct, according to the standard. If Serbian language requires "Lj"
then overrides should go to sr_RS file.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
Correct, the same comment.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
Where does "TSH" come from? It is not mentioned by the System B table.
Also I am afraid this is not correct.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
Correct.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
"U`" - correct.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
"Dh" - correct.
Post by Egor Kobylkin
[...]
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
"ZH" - shouldn't be "Zh"?
Post by Egor Kobylkin
[...]
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
2. OK, the System A table mentions this letter but System B does not.
Somehow we should handle it. I think that "U`" is the best we can
do for now.
3. It must be tested whether this actually works.
Post by Egor Kobylkin
[...]
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
I don't think that "H" is unavailable in any encoding therefore it will
always be transliterated as "H" and never as "X". We can't help it and
I don't think it is bad.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
1. "CZ" - maybe should be "Cz"?
2. Are we able to implement the rule: "c before i, e, y, j"?
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
"CH" -> "Ch"?
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
"SH" -> "Sh"?
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
"SHH" -> "Shh"?
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0041><U0060>"
"A`" is only for Bulgarian and should go to bg_BG. How should
we transliterate an upper case hard sign to plain ASCII? I think
that just "``", same as lower case.
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
Again, as "Y" is always available it will never be transliterated
as "Y`".
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
OK, I like it to be transliterated to plain ASCII as "`".
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
OK
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
"YU" -> "Yu"?
Post by Egor Kobylkin
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
"YA" -> "Ya"?
Post by Egor Kobylkin
[...]
I am sorry, this is of course incomplete but that's enough for tonight.

Regards,

Rafal


[1] https://sourceware.org/glibc/wiki/Contribution%20checklist
Egor Kobylkin
2018-10-13 16:58:17 UTC
Hi Rafal,

Thanks for the thorough checking, it really helps.
Post by Rafal Luzynski
Technical issue: Please either attach your patch to the email
message or paste it inline, not both. The patch as it is now is not
applicable. I had to edit it manually to apply.
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
There is no such file. Where have you got the source code from?
Are you sure this is glibc? :-)
I was running my patch process against the Ubuntu 18.04 version of
localedata/locales. Now I have checked out the GitHub glibc source v2.28
and done the same. Please find the new patch attached. I am not
submitting it as a patch request because we have not yet addressed the
rest of your comments below. But at least this should be working as a
patch for you. Please let me know if there is any problem there still.
Post by Rafal Luzynski
[...] From this patch I have excluded locales that already mention
cyrillic or have a transliteration table for it: az_AZ
iso14651_t1_common ky_KG mn_MN sr_RS tg_TJ tk_TM tt_RU uk_UA uz_UZ
I confirm that these locales are excluded and there are no other
missing locales.
Because of the surprisingly different list of locales between Ubuntu and
glibc there is now a different list of excluded ones as well.

mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
***@cyrillic
uk_UA

az_AZ, ky_KG are now included because they don't have cyrillic translit
in glibc. iso14651_t1_common is still implicitly excluded, because it
doesn't have 'translit_end' string.

Somehow az_AZ and tr_TR from glibc fail to transliterate Cyrillic even
after the patch applied (az_AZ is explicitly including tr_TR). I do not
see a reason, maybe you could check?
Post by Rafal Luzynski
Regarding the tests, I think there is no complete transliteration
test suite at the moment. Probably the only test is
localedata/bug-iconv-trans.c. You can also see the collation tests
placed in the same directory, they use those multiple *.UTF-8.in
files.
You can skip the tests for now.
In the copy of localedata/bug-iconv-trans.c lines 10-11 we could just
change the list of the symbols we are now transliterating

const char str[] = "ÄäÖöÜüß";
const char expected[] = "AEaeOEoeUEuess";

like this

const char str[] =
"ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩъЫьЭЮЯабвгдежзийклмнопрстуу́фхцчшщЪыЬэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ"
"ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’";
const char expected[] =
"YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU`FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu`fxczchsh"
"shh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`"
"T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`"
"Y`y`'";

First I thought they could just be added but not all locales
transliterate Umlauts so just extending the current test won't do as it
will fail for those locales.
Post by Rafal Luzynski
[...]
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
Shouldn't “include "translit_cyrillic";""” be placed before the
custom rules, together with other includes? The same in more files,
I will not mention them all.
If I recall correctly it is because of the
"translit_end
END LC_CTYPE"
part at the end of the translit_cyrillic. This way it works for any
locale, regardless whether it has translit itself or not. And being at
the end it does not supersede any previous transliteration that may be
there for a reason.

As with some other comments, I am not super familiar with the formats of
glibc files. So if you have a definitive suggestion - pls. formulate it
as an imperative, not a question.
Post by Rafal Luzynski
[...]
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
This says that for ASCII (GOST 7.79 System B) you would like to
transliterate "Ё" as "YO" but the table in Wikipedia says "Yo". I
understand that one or another may be correct depending on the
context but we should be consistent and also better let's stick with
the standard.
The choice of YO, SH, YA, ZH etc. avoids collisions, for example between
"Сх" and "Ш", which would otherwise both transliterate to "Sh":
With SH: "Схема"->"Shema" but "Шема"->"SHema"
With Sh: "Схема"->"Shema" and "Шема"->"Shema". Collision!
This is important e.g. for renaming files, grouping with uniq, etc.
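To make the collision concrete, here is a minimal C sketch. It is not the
patch's table: the six-letter rule set and the name translit_demo are made up
for illustration, and the mapping of "х" to "h" is taken from the example
above rather than from translit_cyrillic. It only shows that the all-caps
digraph keeps "Схема" and "Шема" apart while the title-case digraph merges
them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mini-table for illustration only; NOT the patch's table.
   The source file is assumed to be saved as UTF-8. */
struct rule {
    const char *cyr;    /* UTF-8 encoded Cyrillic letter */
    const char *caps;   /* all-caps style, e.g. "SH" */
    const char *title;  /* title-case style, e.g. "Sh" */
};

const struct rule demo_rules[] = {
    { "С", "S",  "S"  },
    { "Ш", "SH", "Sh" },
    { "х", "h",  "h"  },   /* taken from the example in the email */
    { "е", "e",  "e"  },
    { "м", "m",  "m"  },
    { "а", "a",  "a"  },
};

/* Transliterate `in` into `out` (capacity `outsz`).
   Bytes without a rule are copied unchanged.
   Returns 0 on success, -1 if `out` is too small. */
int translit_demo(const char *in, int title_style, char *out, size_t outsz)
{
    size_t o = 0;

    if (outsz == 0)
        return -1;
    while (*in != '\0') {
        const char *rep = NULL;
        size_t adv = 1;

        /* Find the first rule whose Cyrillic key is a prefix of the input. */
        for (size_t i = 0; i < sizeof demo_rules / sizeof demo_rules[0]; i++) {
            size_t n = strlen(demo_rules[i].cyr);
            if (strncmp(in, demo_rules[i].cyr, n) == 0) {
                rep = title_style ? demo_rules[i].title : demo_rules[i].caps;
                adv = n;
                break;
            }
        }

        size_t rl = rep ? strlen(rep) : 1;
        if (o + rl >= outsz) {
            out[o] = '\0';
            return -1;
        }
        if (rep)
            memcpy(out + o, rep, rl);
        else
            out[o] = *in;
        o += rl;
        in += adv;
    }
    out[o] = '\0';
    return 0;
}
```

With title_style set to 0, "Схема" becomes "Shema" and "Шема" becomes
"SHema"; with title_style set to 1 both become "Shema".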
Post by Rafal Luzynski
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
This says "DJ" but System B does not mention it. Where does it come
from? Also, I think it should be "Dj" rather than "DJ".
I took the first two letters from its name.
Post by Rafal Luzynski
[...]
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
2. OK, the System A table mentions this letter but System B does not.
Somehow we should handle it. I think that "U`" is the best we can do
for now.
3. It must be tested whether this actually works.
1. Could we do it just before you are ready to commit the patch? It
breaks formulas in my worksheet and I will have to do it manually.
3. I have tested it and it doesn't work; the rule gets ignored. But if you
were to handle COMBINING it would work, wouldn't it?
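For reference, the key of that rule is not a single character but a sequence
of two code points: U+0423 followed by the combining acute accent U+0301. The
sketch below, a hypothetical helper that is not taken from glibc, decodes the
UTF-8 bytes to show this. It does not explain why the rule is ignored; it only
makes visible what a converter would have to match. The decoder assumes valid
UTF-8 input.

```c
#include <stddef.h>

/* Decode a NUL-terminated UTF-8 string into at most `max` code points.
   Hypothetical helper for illustration; assumes well-formed input.
   Returns the number of code points written to `cps`. */
size_t utf8_decode(const char *s, unsigned long *cps, size_t max)
{
    const unsigned char *p = (const unsigned char *) s;
    size_t n = 0;

    while (*p != 0 && n < max) {
        unsigned long cp;
        int extra;  /* number of continuation bytes that follow */

        if (*p < 0x80)      { cp = *p;        extra = 0; }
        else if (*p < 0xE0) { cp = *p & 0x1F; extra = 1; }
        else if (*p < 0xF0) { cp = *p & 0x0F; extra = 2; }
        else                { cp = *p & 0x07; extra = 3; }
        p++;

        /* Each continuation byte contributes its low six bits. */
        while (extra-- > 0 && (*p & 0xC0) == 0x80)
            cp = (cp << 6) | (unsigned long) (*p++ & 0x3F);

        cps[n++] = cp;
    }
    return n;
}
```

Decoding the four bytes D0 A3 CC 81 (the letter U followed by the combining
acute) yields two code points, 0x0423 and 0x0301.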
Post by Rafal Luzynski
[...]
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
I don't think that "H" is unavailable in any encoding therefore it
will always be transliterated as "H" and never as "X". We can't
help it and I don't think it is bad.
But we can keep this for when/if there is a way to explicitly request
transcription instead of transliteration.
Post by Rafal Luzynski
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
1. "CZ" - maybe should be "Cz"?
2. Are we able to implement the rule: "c before i, e, y, j"?
1. See my answer for CYRILLIC CAPITAL LETTER IO.
2. I am not sure what you mean here, but I believe it's not possible as
per Marko's email.
Post by Rafal Luzynski
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0041><U0060>"
"A`" is only for Bulgarian and should go to bg_BG. How should we
transliterate an upper case hard sign to plain ASCII? I think that
just "``", same as lower case.
This is to avoid collision. Besides AFAIK e.g. in Russian there is no
capital hard sign because there are no words starting with it.
Post by Rafal Luzynski
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
Again, as "Y" is always available it will never be transliterated as
"Y`".
But we can keep this for when/if there is a way to explicitly request
transcription instead of transliteration.


Bests,
Egor
Marko Myllynen
2018-10-15 11:04:52 UTC
Hi,
Post by Egor Kobylkin
Post by Rafal Luzynski
Regarding the tests, I think there is no complete transliteration
test suite at the moment. Probably the only test is
localedata/bug-iconv-trans.c. You can also see the collation tests
placed in the same directory, they use those multiple *.UTF-8.in
files.
You can skip the tests for now.
First I thought they could just be added but not all locales
transliterate Umlauts so just extending the current test won't do as it
will fail for those locales.
I still think a one-time check against uconv(1) (part of Unicode's ICU
project) for discrepancies would be worthwhile.
Post by Egor Kobylkin
Post by Rafal Luzynski
[...]
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
Shouldn't “include "translit_cyrillic";""” be placed before the
custom rules, together with other includes? The same in more files,
I will not mention them all.
If I recall correctly it is because of the
"translit_end
END LC_CTYPE"
part at the end of the translit_cyrillic. This way it works for any
locale, regardless whether it has translit itself or not. And being at
the end it does not supersede any previous transliteration that may be
there for a reason.
I suspect one problem would be that the latter rule wins, so if there
are some locale-specific rules then possible translit_* inclusions would
override them if not included before the locale-specific rules.

Cheers,
--
Marko Myllynen
Egor Kobylkin
2018-10-15 11:54:53 UTC
Post by Marko Myllynen
Hi,
Post by Egor Kobylkin
Post by Rafal Luzynski
Regarding the tests, I think there is no complete transliteration
test suite at the moment. Probably the only test is
localedata/bug-iconv-trans.c. You can also see the collation tests
placed in the same directory, they use those multiple *.UTF-8.in
files.
You can skip the tests for now.
First I thought they could just be added but not all locales
transliterate Umlauts so just extending the current test won't do as it
will fail for those locales.
I still think a one-time check against uconv(1) (part of Unicode's ICU
project) for discrepancies would be worthwhile.
Just an addition: I have changed a few constants to see whether
localedata/bug-iconv-trans.c could be made to test Cyrillic. Attached is
bug-iconv-trans-cyr.c, which goes through in this form. I had to save
it as UTF-8 instead of the ISO-8859-15 used for localedata/bug-iconv-trans.c.
Post by Marko Myllynen
Post by Egor Kobylkin
Post by Rafal Luzynski
[...]
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
Shouldn't “include "translit_cyrillic";""” be placed before the
custom rules, together with other includes? The same in more files,
I will not mention them all.
If I recall correctly it is because of the
"translit_end
END LC_CTYPE"
part at the end of the translit_cyrillic. This way it works for any
locale, regardless whether it has translit itself or not. And being at
the end it does not supersede any previous transliteration that may be
there for a reason.
I suspect one problem would be that the latter rule wins, so if there
are some locale-specific rules then possible translit_* inclusions would
override them if not included before the locale-specific rules.
What is the best way forward here? Can somebody make an explicit
suggestion on how to change the current approach if needed?

Bests,
Egor
Rafal Luzynski
2018-10-23 23:08:33 UTC
Hi Egor,

Thank you for your updates and again I'm sorry for my delayed response.
A general remark about this: if you are in a hurry and you need the
corrected transliteration rules for yourself or for your users then
you don't have to wait for the patch to be reviewed and accepted here.
You can make your own locale and use it, you don't need to rebuild glibc,
you don't even need root privileges to do it. The locale data subsystem
is designed to allow users to create and use their own locales.

I have seen and tested locally your newer patch [1] but I will reply
in this thread because I think it is easier to reply in context.

I would like to summarize the differences between v5 [2] and v6 to make
sure that I noticed them all and that you have not introduced any changes
inadvertently. (Yes, that means I have skipped another patch which you
sent between those two.)

* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* You consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".

Again I must say that I experienced lots of technical difficulties to apply
the patch and I had to rework it manually because it is not applicable as
Post by Egor Kobylkin
Hi Rafal,
Thanks for the thorough checking, it really helps.
Post by Rafal Luzynski
Technical issue: Please either attach your patch to the email
message or paste it inline, not both. The patch as it is now is not
applicable. I had to edit it manually to apply.
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
There is no such file. Where have you got the source code from?
Are you sure this is glibc? :-)
I was running my patch process against the Ubuntu 18.04 version of
localedata/locales. Now I have checked out the GitHub glibc source v2.28
and done the same. [...]
Remarks:

* Please use the repository at https://sourceware.org/git/?p=glibc.git
rather than a copy at GitHub.
* Please use the master branch rather than 2.28.
* Commit your work locally.
* Use "git format-patch" (e.g., "git format-patch HEAD^..HEAD") to generate
the patch, then you can email it to this list.
* You can email it inline or, if your email client breaks the lines and
inserts other unnecessary characters, send it as an attachment.
* Use "git pull --rebase" to keep your work up to date.
* Read the Contribution Checklist [3] for more details.
Post by Egor Kobylkin
Post by Rafal Luzynski
[...] From this patch I have excluded locales that already mention
cyrillic or have a transliteration table for it: az_AZ
iso14651_t1_common ky_KG mn_MN sr_RS tg_TJ tk_TM tt_RU uk_UA uz_UZ
I confirm that these locales are excluded and there are no other
missing locales.
Because of the surprisingly different list of locales between Ubuntu and
glibc there is now a different list of excluded ones as well.
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uk_UA
az_AZ, ky_KG are now included
As far as I can see, there are no other differences between those two
patches.
Post by Egor Kobylkin
because they don't have cyrillic translit
in glibc. iso14651_t1_common is still implicitly excluded, because it
doesn't have 'translit_end' string.
Somehow az_AZ and tr_TR from glibc fail to transliterate Cyrillic even
after the patch applied (az_AZ is explicitly including tr_TR). I do not
see a reason, maybe you could check?
I noticed that az_AZ does not build at all, localedef program reports
a "circular dependency" (if I recall correctly). I think that since az_AZ
contains “copy "tr_TR"” and tr_TR already contains (in your patch)
“include "translit_cyrillic";""” you should just remove
“include "translit_cyrillic";""” from az_AZ which effectively means that
there are no changes in az_AZ. Optionally, you can add a comment to az_AZ
to explain why it does not contain “include "translit_cyrillic";""” and to
make sure that if anyone removes “copy "tr_TR"” ever in the future, the
“include "translit_cyrillic";""” will be added at the same time. I have
verified that removing that line makes the locale data build without an
error but I have not yet verified that they work as expected.
Post by Egor Kobylkin
Post by Rafal Luzynski
Regarding the tests, I think there is no complete transliteration
test suite at the moment. Probably the only test is
localedata/bug-iconv-trans.c. You can also see the collation tests
placed in the same directory, they use those multiple *.UTF-8.in
files.
You can skip the tests for now.
In the copy of localedata/bug-iconv-trans.c lines 10-11 we could just
change the list of the symbols we are now transliterating
const char str[] = "ÄäÖöÜüß";
const char expected[] = "AEaeOEoeUEuess";
like this
const char str[] =
"ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩъЫьЭЮЯабвгдежзийклмнопрстуу́фхцчшщЪыЬэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ
ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’"
const char expected[] =
"YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU`FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu`fxczchsh
shh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`
T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`
Y`y`'";
First I thought they could just be added but not all locales
transliterate Umlauts so just extending the current test won't do as it
will fail for those locales.
I noticed that you pasted a patch in a Bugzilla comment. [4] If I understand
correctly you suggest to rework the existing test case to test Cyrillic
transliteration instead of German. Please don't do it: the existing test
cases may be extended but must not be removed. I think we should rework
this test case to handle multiple locales and multiple transliteration pairs;
optionally we can add a new case instead. Currently I lean toward reworking
the existing test case.
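
A rough sketch of what "multiple locales and multiple transliteration pairs" could look like as a table-driven check. This is Python rather than the C of localedata/bug-iconv-trans.c, purely for illustration. The names iconv_translit, CASES and failing_cases are hypothetical, the de_DE.UTF-8 locale for the existing German pair is an assumption, and the ru_RU.UTF-8 expectation is taken from the table proposed in this thread, not from current glibc behaviour.

```python
import os
import subprocess


def iconv_translit(text, locale, target="ASCII"):
    """Run `iconv -f UTF-8 -t TARGET//TRANSLIT` under LC_ALL=locale."""
    env = dict(os.environ, LC_ALL=locale)
    result = subprocess.run(
        ["iconv", "-f", "UTF-8", "-t", target + "//TRANSLIT"],
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        env=env,
        check=False,
    )
    return result.stdout.decode("ascii", errors="replace")


# (locale, input, expected output); one row per transliteration pair.
CASES = [
    # Existing German pair; the locale name is an assumption.
    ("de_DE.UTF-8", "ÄäÖöÜüß", "AEaeOEoeUEuess"),
    # Expectation from the proposed translit_cyrillic table.
    ("ru_RU.UTF-8", "Ё", "YO"),
]


def failing_cases(cases, translit=iconv_translit):
    """Return (locale, input, expected, actual) for every mismatch."""
    failures = []
    for locale, source, expected in cases:
        actual = translit(source, locale)
        if actual != expected:
            failures.append((locale, source, expected, actual))
    return failures
```

Keeping the German row in the table means the existing check is extended rather than removed, and each locale only needs to pass its own rows.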
Post by Egor Kobylkin
Post by Rafal Luzynski
[...] diff -uNr a/localedata/locales/am_ET
b/localedata/locales/am_ET --- a/localedata/locales/am_ET
2018-10-11 15:10:11.000000000 +0000 +++ b/localedata/locales/am_ET
<U0060><U0039><U0030> <U137B> <U0060><U0031><U0030><U0030> <U137C>
<U0060><U0031><U0030><U0030><U0030><U0030> +include
"translit_cyrillic";"" translit_end % END LC_CTYPE
Shouldn't “include "translit_cyrillic";""” be placed before the
custom rules, together with other includes? The same in more files,
I will not mention them all.
If I recall correctly it is because of the
"translit_end
END LC_CTYPE"
part at the end of the translit_cyrillic. This way it works for any
locale, regardless whether it has translit itself or not. And being at
the end it does not supersede any previous transliteration that may be
there for a reason.
As with some other comments, I am not super familiar with the formats of
glibc files. So if you have a definitive suggestion - pls. formulate it
as an imperative, not a question.
I feel like a newcomer here so it was meant to be a question to other
more experienced maintainers but probably it's time to change this attitude.
So, also taking into account what Marko wrote, [5] please put the include
directive after all other include directives, or after the "translit_start"
directive if there are no other includes, rather than putting it just before
"translit_end", even if putting it at the end works sometimes or even
always. It is the same as putting #include's near the top of the file when
writing a C program, even if sometimes you may put them anywhere and it will
work. If you use a script to insert your include directives then please
rework it; if you insert them manually then just move them manually.
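
For the scripted case, the placement rule above could be sketched roughly like this. It is only an illustration in Python: insert_translit_include is a hypothetical helper, not the script actually used for the patch, and it assumes that include directives inside the translit section start at the beginning of a line.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def insert_translit_include(lines, new_line=INCLUDE_LINE):
    """Return a copy of `lines` with `new_line` placed after the last
    include of the translit_start/translit_end section, or directly
    after translit_start when the section has no includes."""
    lines = list(lines)
    start = None
    last_include = None
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped == "translit_start":
            start = i
            last_include = None
        elif stripped == "translit_end":
            break
        elif start is not None and stripped.startswith("include "):
            last_include = i
    if start is None:
        return lines  # no translit section: leave the file alone
    if any(l.strip() == new_line for l in lines):
        return lines  # directive already present
    anchor = last_include if last_include is not None else start
    return lines[:anchor + 1] + [new_line] + lines[anchor + 1:]
```

Custom rules that follow the includes stay where they are, so the new directive never ends up between them and "translit_end".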
Post by Egor Kobylkin
Post by Rafal Luzynski
[...] +translit_start + +% CYRILLIC CAPITAL LETTER IO +<U0401>
<U00CB>;"<U0059><U004F>"
This says that for ASCII (GOST 7.79 System B) you would like to
transliterate "Ё" as "YO" but the table in Wikipedia says "Yo". I
understand that one or another may be correct depending on the
context but we should be consistent and also better let's stick with
the standard.
The choice for YO, SH, YA, ZH etc. is to avoid naming collisions for
With SH:"Схема"->"Shema" but "Шема"->"SHema"
With Sh:"Схема"->"Shema" and "Шема"->"Shema". Collision!
This is important e.g. for renaming files, grouping as in using uniq etc.
I understand this idea. Is this part of any existing standard? I can't
see it regulated by GOST 7.79.

I'd rather not include transliteration rules which seem reasonable to
us (the developers) but are not known to, and therefore not accepted by,
the outer world.
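
To make the collision argument quoted above concrete, here is a toy reproduction in Python. The table holds only the letters needed for the two example words and follows the spellings used in the example ("х" as "h"); it is not the proposed translit_cyrillic table, and the names are hypothetical.

```python
# Letters shared by both variants of the toy table.
BASE = {"С": "S", "с": "s", "х": "h", "е": "e", "м": "m", "а": "a"}

# Variant 1: an uppercase letter maps to an all-uppercase digraph.
ALL_CAPS = {**BASE, "Ш": "SH"}
# Variant 2: an uppercase letter maps to a title-case digraph.
TITLE_CASE = {**BASE, "Ш": "Sh"}


def transliterate(word, table):
    """Map each character through `table`; unknown characters become '?'."""
    return "".join(table.get(ch, "?") for ch in word)


def collides(a, b, table):
    """True when two different inputs produce the same output."""
    return a != b and transliterate(a, table) == transliterate(b, table)
```

With ALL_CAPS the two words stay distinct ("Shema" and "SHema"); with TITLE_CASE both become "Shema". Whether avoiding this collision justifies departing from the published tables is the open question in this exchange.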
Post by Egor Kobylkin
Post by Rafal Luzynski
+% CYRILLIC CAPITAL LETTER DJE +<U0402> <U0110>;"<U0044><U004A>"
This says "DJ" but System B does not mention it. Where does it come
from? Also, I think it should be "Dj" rather than "DJ".
I took the first two letters from its name.
As I said previously, I would like to add more Cyrillic letters even if
they are not regulated by any standard. But let's separate them and make
it clear that these rules are based on GOST 7.79 and those are our own
invention (or come from other standard etc.) I think that all these
rules may even be in the same file but in different parts of it.
Post by Egor Kobylkin
Post by Rafal Luzynski
[...] +% CYRILLIC UNDEFINED +<U0423><U0301>
<U00DA>;"<U0055><U0060>"
1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
2. OK, the System A table mentions this letter but System B does not.
Somehow we should handle it. I think that "U`" is the best we can do
for now. 3. It must be tested whether this actually works.
1. Let's do it just before you are ready to commit the patch, because it
breaks formulas in my worksheet and I will have to do it manually?
3. I have tested and it doesn't work/gets ignored. But if you were to
handle COMBINING it would work, wouldn't it?
My guess is that since translit_combining just removes all those combining
diacritic characters and translit_combining is usually included before
translit_cyrillic then <U0301> is removed even before <U0423> is taken
into account. My other guess is that it might work well if you
just removed this rule: <U0423> would be translated to "U" and <U0301>
would remain unchanged and eventually those two characters would produce
"Ú". But, again, that's just a guess; I have not tested it.
Post by Egor Kobylkin
Post by Rafal Luzynski
[...] +% CYRILLIC CAPITAL LETTER HA +<U0425> <U0048>;<U0058>
I don't think that "H" is unavailable in any encoding therefore it
will always be transliterated as "H" and never as "X". We can't
help it and I don't think it is bad.
But we can keep this for when/if there is a way to explicitly request
transcription instead of transliteration.
Note that either it will make the test cases fail or we will have to
prepare the test cases to deliberately skip the translation of <U0425>
into "X" because "H" will always work. We can't force iconv
to choose the second transliteration rule if the first one works.

That means we will have a problem constructing the test cases.
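
The "first rule that works wins" behaviour described above can be modelled in a few lines of Python. This is a simplified model of the selection order only, not glibc's implementation; first_representable is a hypothetical name, and Python's codecs stand in for the target charset.

```python
def first_representable(alternatives, encoding):
    """Return the first alternative that can be encoded in `encoding`,
    or None when none of them can."""
    for candidate in alternatives:
        try:
            candidate.encode(encoding)
        except UnicodeEncodeError:
            continue
        return candidate
    return None


# <U0425> <U0048>;<U0058>: "H" is ASCII, so "X" is never reached.
HA = ["H", "X"]
# <U0401> <U00CB>;"<U0059><U004F>": "Ë" exists in ISO-8859-15 but not ASCII.
IO = ["\u00cb", "YO"]
```

In this model HA yields "H" for every target charset, while IO yields "Ë" for ISO-8859-15 and "YO" for ASCII. A test can therefore only observe a second alternative when the first one is missing from the target charset.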
Post by Egor Kobylkin
Post by Rafal Luzynski
+% CYRILLIC CAPITAL LETTER TSE +<U0426> <U0043>;"<U0043><U005A>"
1. "CZ" - maybe should be "Cz"? 2. Are we able to implement the
rule: "c before i, e, y, j"?
1. see for CYRILLIC CAPITAL LETTER IO
2. not sure what you are talking about in 2. but I believe it's not
possible as per Marko's email.
Hm... I can't find a good example now. Maybe I was misled by the rules
of Cyrillic transliteration which I learned at school and which are not
necessarily universal and not necessarily useful for English readers.
Post by Egor Kobylkin
Post by Rafal Luzynski
+% CYRILLIC CAPITAL LETTER HARD SIGN +<U042A>
<U02BA>;"<U0041><U0060>"
"A`" is only for Bulgarian and should go to bg_BG. How should we
transliterate an upper case hard sign to plain ASCII? I think that
just "``", same as lower case.
This is to avoid collision.
What collision?
Post by Egor Kobylkin
Besides AFAIK e.g. in Russian there is no
capital hard sign because there are no words starting with it.
True but it can be used in ALL UPPERCASE text. Therefore we need a clear
and correct transliteration rule for it.
Post by Egor Kobylkin
Post by Rafal Luzynski
+% CYRILLIC CAPITAL LETTER YERU +<U042B> <U0059>;"<U0059><U0060>"
Again, as "Y" is always available it will never be transliterated as
"Y`".
But we can keep this for when/if there is a way to explicitly request
transcription instead of transliteration.
Again, it will be difficult or impossible to construct a correct test case
and we must be aware of this.

Regards,

Rafal


[1] https://sourceware.org/ml/libc-alpha/2018-10/msg00300.html
[2] https://sourceware.org/ml/libc-alpha/2018-10/msg00213.html
[3] https://sourceware.org/glibc/wiki/Contribution%20checklist
[4] https://sourceware.org/bugzilla/show_bug.cgi?id=2872#c47
[5] https://sourceware.org/ml/libc-alpha/2018-10/msg00232.html
Egor Kobylkin
2018-10-17 14:16:32 UTC
Permalink
Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add the Cyrillic transliteration table translit_cyrillic file

https://sourceware.org/bugzilla/attachment.cgi?id=11340 [7]

to localedata/locales/ and include it in all your locales going forward.

The patch included inline below.

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:

mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
***@cyrillic
uk_UA

Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt

currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- It produces a string of question marks and spaces.

This is what it should produce, and it does so after the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.


The root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be referenced/included into the
active locale at the compilation time to be used by iconv.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce Latin transliteration as per
ISO 9.1995.

While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.

The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 official source (Federal Agency on Technical
Regulating and Metrology Of Russian Federation [2]). Technically an
independent but mostly identical source [3] was used and prepared in a
spreadsheet [6].

The documentation suggests that the transliteration tables inclusion is
done by adding *include "translit_cyrillic";""* string into LC_CTYPE
translit_start section
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice I have searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <***@gmail.com>, Max Kutny <***@gmail.com> (uk_UA),
Данило Шеган <***@gnome.org> (sr_RS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-17 Egor Kobylkin <***@kobylkin.com>

[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
System A transliteration and System B transcription table from Cyrillic to
Latin/ASCII.
* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
LC_CTYPE translit section.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/az_AZ: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/ky_KG: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/***@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/***@devanagari: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
Egor Kobylkin
2018-11-01 22:51:55 UTC
Permalink
Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.


Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".

Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add the Cyrillic transliteration table translit_cyrillic file

https://sourceware.org/bugzilla/attachment.cgi?id=11340 [7]

to localedata/locales/ and include it in all your locales going forward.

The patch included inline below.

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:

mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
***@cyrillic
uk_UA

Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt

currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- It produces a string of question marks and spaces.

This is what it should produce, and it does so after the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.


The root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be referenced/included into the
active locale at the compilation time to be used by iconv.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce Latin transliteration as per
ISO 9.1995.

While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.

The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 official source (Federal Agency on Technical
Regulating and Metrology Of Russian Federation [2]). Technically an
independent but mostly identical source [3] was used and prepared in a
spreadsheet [6].

The documentation suggests that the transliteration tables inclusion is
done by adding *include "translit_cyrillic";""* string into LC_CTYPE
translit_start section
http://man7.org/linux/man-pages/man5/locale.5.html [5]
Practically I have searched for all locales that already have
'include .*translit.*;""' string and generated a patch for them.

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <***@gmail.com>, Max Kutny <***@gmail.com> (uk_UA),
Данило Шеган <***@gnome.org> (sr_RS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-17 Egor Kobylkin <***@kobylkin.com>

[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
System A transliteration and System B transcription table from Cyrillic to
Latin/ASCII.
* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
LC_CTYPE translit section.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/ky_KG: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/***@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/***@devanagari: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
Egor Kobylkin
2018-11-02 00:00:26 UTC
Permalink
Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).

Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.


Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".

Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add the Cyrillic transliteration table translit_cyrillic file

https://sourceware.org/bugzilla/attachment.cgi?id=11340 [7]

to localedata/locales/ and include it in all your locales going forward.

The patch included inline below.

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:

mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
***@cyrillic
uk_UA

Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt

currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- It produces a string of question marks and spaces.

This is what it should produce, and it does so after the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.


The root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be referenced/included into the
active locale at the compilation time to be used by iconv.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce Latin transliteration as per
ISO 9.1995.

While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.

The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 official source (Federal Agency on Technical
Regulating and Metrology Of Russian Federation [2]). Technically an
independent but mostly identical source [3] was used and prepared in a
spreadsheet [6].

The documentation suggests that the transliteration tables inclusion is
done by adding *include "translit_cyrillic";""* string into LC_CTYPE
translit_start section
http://man7.org/linux/man-pages/man5/locale.5.html [5]
Practically I have searched for all locales that already have
'include .*translit.*;""' string and generated a patch for them.

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <***@gmail.com>, Max Kutny <***@gmail.com> (uk_UA),
Данило Шеган <***@gnome.org> (sr_RS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-11-02 Egor Kobylkin <***@kobylkin.com>

[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
System A transliteration System B transcription table from Cyrillic to
Latin/ASCII.
* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
LC_CTYPE translit section.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/ky_KG: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/***@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/***@devanagari: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
Rafal Luzynski
2018-11-02 22:22:08 UTC
Permalink
Hi Egor,

I have applied your patch locally and I am going to start reviewing it.
I can tell you already that it applies correctly but git reports these
warnings:

Applying: v8 Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
.git/rebase-apply/patch:1520: trailing whitespace.
% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9.1995
.git/rebase-apply/patch:1521: trailing whitespace.
% It implements the GOST_7.79 System A (Latin Script) as a first
.git/rebase-apply/patch:1523: trailing whitespace.
% https://en.wikipedia.org/wiki/ISO_9 for reference.
.git/rebase-apply/patch:1524: trailing whitespace.
% The System B is extended from GOST_7.79-Russian using open sources
.git/rebase-apply/patch:1535: trailing whitespace.
% Generated from UnicodeData.txt with a spreadsheet referenced
warning: 5 lines add whitespace errors.
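One way to locate and strip such trailing whitespace before regenerating the patch is sketched below. The temporary file and its three comment lines are hypothetical; in practice the commands would be run against translit_cyrillic itself.

```shell
# Hypothetical file in which two of the three lines end in a stray space.
file=$(mktemp)
printf '%s\n' '% first comment ' '% second comment' '% third comment ' > "$file"

# Count the lines that end in whitespace.
before=$(grep -c '[[:space:]]$' "$file")

# Strip trailing whitespace into a cleaned copy.
sed 's/[[:space:]]*$//' "$file" > "$file.clean"

# grep -c exits non-zero when the count is 0, hence the "|| true".
after=$(grep -c '[[:space:]]$' "$file.clean" || true)

echo "before: $before, after: $after"
rm -f "$file" "$file.clean"
```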

Also, the commit message is missing from your patch, probably because it is
missing from your local repository. Please re-add it and please remember
that it must contain a summary like this:

[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
System A transliteration System B transcription table from Cyrillic
to Latin/ASCII.
* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
LC_CTYPE translit section.
* localedata/locales/af_ZA: Likewise.

Hm... as I look at this now I think it should rather be:

[BZ #2872]
* localedata/locales/translit_cyrillic: New file.
* localedata/locales/aa_DJ (LC_CTYPE): Add
'include "translit_cyrillic";""'.
* localedata/locales/af_ZA (LC_CTYPE): Likewise.

... and so on. Optionally you can use:

* localedata/locales/translit_cyrillic: New file. Supports
ISO 9.1995, GOST 7.79 System A transliteration System B
transcription table from Cyrillic to Latin/ASCII.

I would appreciate more hints from more experienced maintainers about
how to write the ChangeLog entry correctly.
Post by Egor Kobylkin
[...]
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
I confirm that this is the only relevant difference between v6 and v8.
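The placement rule quoted above (new include directly after the last existing 'include "translit_XXX";""' line) could be implemented along these lines. This is a sketch under assumptions, not the script used to produce v8: the sample locale fragment is invented, and the two-pass awk approach is just one possible way to do it.

```shell
# Hypothetical locale fragment used only for this illustration.
file=$(mktemp)
printf '%s\n' 'translit_start' 'include "translit_combining";""' \
  'include "translit_neutral";""' '% locale-specific rules follow' \
  'translit_end' > "$file"

# Read the file twice: the first pass finds the last translit include,
# the second pass prints every line and adds the new include after it.
patched=$(awk '
  NR == FNR { if ($0 ~ /^include "translit_[^"]*";""$/) last = FNR; next }
  { print }
  FNR == last { print "include \"translit_cyrillic\";\"\"" }
' "$file" "$file")

printf '%s\n' "$patched"
rm -f "$file"
```

With this input, the new line ends up fourth, between the translit_neutral include and the comment line, rather than just before translit_end.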
Post by Egor Kobylkin
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
Has this list changed, that is, has any locale been added or removed?
Post by Egor Kobylkin
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.
True, this is another difference and I hope this is correct (I have not
yet tested).
Post by Egor Kobylkin
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
Correct.
Post by Egor Kobylkin
* Consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".
I think you have not yet explained whether this is required by any existing
standard (please provide links) or whether this is your genuine idea to
distinguish between the cases like "Ш" transliterated to "Sh" and "Сх"
also transliterated to "Sh".

Again, I have not yet started reviewing and testing; this is just feedback
after applying the patch locally.

Regards,

Rafal
Egor Kobylkin
2018-11-02 23:27:10 UTC
Permalink
Moving everybody from To: and CC: to BCC. It seems that at this stage it is
just Rafal and me. This is still going to libc-alpha and libc-locales. If
you would like to be put back on CC, please let me know.
Post by Rafal Luzynski
* Consistently transliterate single uppercase Cyrillic letters to
sequences of all uppercase Latin letters in all languages
(whenever a Cyrillic letter is transliterated to more than one
Latin letter), for example "Ї" is now transliterated as "YI" rather
than "Yi".
I think you have not yet explained whether this is required by any
existing standard (please provide links) or whether this is your
genuine idea to distinguish between the cases like "Ш" transliterated
to "Sh" and "Сх" also transliterated to "Sh".

I remember seeing this form of capitalization in actual transliterated
texts a long time ago, but I can't find a formal description as of now.
I just don't want to claim this as my original idea.
Post by Rafal Luzynski
The choice for YO, SH, YA, ZH etc. is to avoid naming collisions for
With SH:"Схема"->"Shema" but "Шема"->"SHema"
With Sh:"Схема"->"Shema" and "Шема"->"Shema". Collision!
This is important e.g. for renaming files, grouping as in using uniq, etc.
As for the users - I am a user and I have demonstrated the use cases
where the collisions due to "one symbol capitalization" would cause
irreversible damage to data. For a library like glibc this seems like a
relevant issue to consider.
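The collision can be reproduced with plain ASCII strings. The sketch below takes the transliterated forms from the "Схема"/"Шема" example as given and simply counts how many distinct names survive under each capitalization scheme; it does not call iconv.

```shell
# Transliterations of "Схема" and "Шема" under the two-letter scheme (SH).
two_letter=$(printf '%s\n' 'Shema' 'SHema' | LC_ALL=C sort -u | wc -l | tr -d ' ')

# The same two inputs under the one-letter scheme (Sh).
one_letter=$(printf '%s\n' 'Shema' 'Shema' | LC_ALL=C sort -u | wc -l | tr -d ' ')

echo "distinct names with SH: $two_letter, with Sh: $one_letter"
```

Two distinct names remain with SH but only one with Sh, which is the case where a rename would silently overwrite a file.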

The "two symbol capitalization" on the other hand would prevent
collision and can be easily corrected in the userspace if needed
with something like

foo="SHema"
foo="${foo:0:1}$(tr '[:upper:]' '[:lower:]' <<<${foo:1})"
echo "$f