Re: GB18030-2022 Support in PostgreSQL

From: John Naylor <johncnaylorls@gmail.com>
To: Chao Li <li.evan.chao@gmail.com>
Cc: Peter Eisentraut <peter@eisentraut.org>, pgsql-hackers@lists.postgresql.org, Tom Lane <tgl@sss.pgh.pa.us>, Andrew Dunstan <andrew@dunslane.net>
Date: 2025-09-29T09:32:15Z
Lists: pgsql-hackers

Commits

  1. Generate EUC_CN mappings from gb18030-2022.ucm

  2. Update GB18030 encoding from version 2000 to 2022

  3. Generate GB18030 mappings from the Unicode Consortium's UCM file

On Wed, Sep 24, 2025 at 4:18 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
> I found that both EUC_CN and UHC use the same XML file, so I updated both.

When you say "same file", that implies to me the file we have checked
in to our repo. The EUC_CN and UHC files have different names, and the
UHC file is downloaded on demand, so it doesn't seem like we need to
change UHC at all in order to delete gb-18030-2000.xml. Is that right?

-- 
John Naylor
Amazon Web Services