Re: BUG #1721: mutiple bytes character string comaprison
From: Tatsuo Ishii <t-ishii@sra.co.jp>
To: pgman@candle.pha.pa.us
Cc: tgl@sss.pgh.pa.us, books@ejurka.com, cdliou@mail.cyut.edu.tw, pgsql-bugs@postgresql.org
Date: 2005-06-20T22:37:12Z
Lists: pgsql-bugs
> Tom Lane wrote:
> > Kris Jurka <books@ejurka.com> writes:
> > > On Sun, 19 Jun 2005, Tom Lane wrote:
> > >> Sorry, but UTF-8 encoding doesn't work properly on Windows (yet).
> > >> Use some other database encoding.
> > >
> > > Shouldn't we forbid its creation then?
> >
> > There was serious discussion of that before the 8.0 release, but
> > we decided not to forbid it. Check the archives; I don't recall
> > the reasoning at the moment.
>
> UTF8 encoding works with the C locale assuming you don't care about
> ordering of the character set, e.g. Japanese.

No, sometimes Japanese needs character ordering too, and I think this
is not a Windows-only problem. The real problem is that Unicode defines
character order in a totally random manner, because
Chinese/Japanese/Korean Kanji characters are "Unified" in Unicode.

To solve the problem, we can convert UTF8 to EUC_JP using CONVERT. See
the archives for more details.

Or you can use a Unicode locale, but only if your platform's locale
database is not broken and you use only a single locale.
--
Tatsuo Ishii
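
The ordering difference behind the EUC_JP workaround can be seen outside the database. The following is only a minimal Python sketch, not PostgreSQL's CONVERT: it assumes byte-wise comparison (as under the C locale), and the sample words and the helper names `utf8_key` and `euc_jp_key` are made up for illustration. It sorts the same three Kanji once by their UTF-8 bytes and once by their EUC_JP bytes.

```python
# Illustration only: mimics byte-wise (C locale) comparison in two encodings.
# The sample data and helper names are hypothetical, not from PostgreSQL.
words = ["愛", "一", "亜"]


def utf8_key(s: str) -> bytes:
    """Sort key: the string's UTF-8 bytes (same order as Unicode code points)."""
    return s.encode("utf-8")


def euc_jp_key(s: str) -> bytes:
    """Sort key: the string's EUC_JP bytes (JIS X 0208 code order for Kanji)."""
    return s.encode("euc_jp")


utf8_sorted = sorted(words, key=utf8_key)
euc_jp_sorted = sorted(words, key=euc_jp_key)

print("UTF-8 byte order: ", utf8_sorted)
print("EUC_JP byte order:", euc_jp_sorted)
```

The two lists come out in different orders, which is the reason for comparing on the converted EUC_JP value rather than on the stored UTF8 value.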