Thread

  1. Re: BUG #1775: UTF8 to ISO does not convert the german

    Tatsuo Ishii <t-ishii@sra.co.jp> — 2005-07-20T08:27:26Z

    > Hi Tatsuo,
    > 
    > Am Mittwoch, 20. Juli 2005 um 01:00 schrieben Sie:
    > 
    > TI> conversion tables. So if german umlauts are converted fine, there's no
    > TI> reason the conversion for german sharp s does not work.
    > 
    > TI> Marcus,
    > 
    > TI> Can you give me the exact error message from PostgreSQL when the
    > TI> conversio failed?
    > 
    > Well - actually, there IS no error message, convert() just returns an
    > empty sting. In detail, here's what I did:
    > 
    > -Set up PostgreSQL (Ascii)
    > 
    > -Imported the opengeodb
    > http://sourceforge.net/project/showfiles.php?group_id=132421
    > As they offer a PostgreSQL dump, I chose this one.
    > 
    > -Imported the dump using pgAdminIII
    > 
    > -Created a view that returns all german cities with ZIP and
    > opengeodb-locationID:
    > 
    > CREATE OR REPLACE VIEW orte_de AS
    > SELECT code.text_val AS plz, code.loc_id, town.text_val AS ort
    > FROM geodb_hierarchies hi, geodb_textdata state, geodb_textdata town, geodb_textdata code
    > WHERE hi.id_lvl2 = state.loc_id AND state.text_val = 'DE'::text
    > AND state.text_type = 500100001 AND town.loc_id = hi.loc_id
    > AND town.text_type = 500100000 AND code.loc_id = town.loc_id
    > AND code.text_type = 500300000;
    > 
    > So now I've got three columns: "plz" (zip), "ort" (city) and loc_id.
    > Assuming I want to retrieve cites in the Hamburg / Hannover area...
    > (perfect for this task, as they have pretty strange city names there
    > :)
    > 
    > SELECT plz, loc_id, ort, convert(ort using utf_8_to_iso_8859_1) as
    > ort_conv
    > from orte_de
    > where plz between 20000 and 30000
    > order by ort_conv
    > 
    > This query returns empty values for "ort_conv" if "ort" contains a
    > sharp s.
    > Btw, it seems like it is the same for "" (&Auml;), have a look at
    > loc_id 25182.
    
    I see no problem with encoding conversion itself:
    
    utf8=# \encoding latin1
    utf8=# select * from t1;
     t  
    ----
     籖
    (1 row)
    
    > SELECT plz, loc_id, ort, convert(ort using utf_8_to_iso_8859_1) as
    > ort_conv
    > from orte_de
    > where plz between 20000 and 30000
    > order by ort_conv
    
    Problem here is the result of convert(ort using utf_8_to_iso_8859_1)
    is ISO-8859-1 but your database encoding is UTF-8, so the terminal
    treats ISO-8859-1 chars as UTF-8 which will result in unexpected
    characters displayed.
    
    I guess what you want to do is:
    
    SELECT plz, loc_id, ort from orte_de
    where plz between 20000 and 30000
    order by convert(ort using utf_8_to_iso_8859_1)
    --
    Tatsuo Ishii