Thread

  1. Re: Patch for collation using ICU

    Tatsuo Ishii <t-ishii@sra.co.jp> — 2005-05-08T13:08:27Z

    > > I don't buy it. If the current conversion tables do the right
    > > thing, why do we need to replace them? Or if the conversion
    > > tables are not correct, why not fix them? I think the rules of
    > > character conversion will not change frequently, especially
    > > for LATIN languages. Thus the maintenance cost is not too high.
    > 
    > I never said we need to, but if we're going to implement ICU,
    > then we might as well go all the way.
    
    So you admit there's no benefit in using ICU to replace the existing
    conversions?
    
    Besides the fact that ICU does not support all existing conversions,
    I think ICU has a serious flaw when used for conversion. If I
    understand correctly, ICU uses UNICODE internally to do the
    conversion. For example, to implement SJIS->EUC_JP conversion, ICU
    first converts SJIS to UNICODE, then converts UNICODE to EUC_JP. The
    problem is that these conversions are not round-trip safe (conversion
    between SJIS/EUC_JP and UNICODE will lose some information). Thus
    SJIS->EUC_JP->SJIS conversion using ICU does not preserve the
    original text.
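
    The round-trip concern can be sketched in a few lines. This is only
    an illustration under stated assumptions, not ICU itself: it uses
    Python's built-in codecs as a stand-in for a Unicode-pivot converter,
    and Microsoft's CP932 variant of SJIS, where the "because" sign has
    two byte encodings that map to the same Unicode code point. The
    helper names are hypothetical.

```python
# Illustration only: Python codecs stand in for a Unicode-pivot converter.
# Assumption: "cp932" (Microsoft's Shift_JIS variant) is used as the SJIS side.

def convert_via_unicode(data: bytes, source: str, target: str) -> bytes:
    """Convert by decoding to Unicode first, then encoding to the target."""
    return data.decode(source).encode(target)

def round_trip(data: bytes) -> bytes:
    """SJIS -> EUC_JP -> SJIS, each step going through Unicode."""
    euc = convert_via_unicode(data, "cp932", "euc_jp")
    return convert_via_unicode(euc, "euc_jp", "cp932")

# Two distinct CP932 byte sequences for the same "because" sign.
JIS_BECAUSE = b"\x81\xe6"  # JIS X 0208 position
NEC_BECAUSE = b"\x87\x9a"  # duplicate in the NEC special-character row

# Both decode to one code point, so the pivot cannot tell them apart.
same_pivot = JIS_BECAUSE.decode("cp932") == NEC_BECAUSE.decode("cp932")

# After the round trip both inputs collapse to a single byte sequence,
# so at least one of the two originals is not preserved.
results = {round_trip(JIS_BECAUSE), round_trip(NEC_BECAUSE)}
print(same_pivot, len(results))
```

    Because the two inputs differ but share one pivot code point, no
    choice of encoder output can restore both of them.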
    --
    Tatsuo Ishii