Re: GB18030-2022 Support in PostgreSQL
From: Chao Li <li.evan.chao@gmail.com>
To: John Naylor <johncnaylorls@gmail.com>
Cc: Peter Eisentraut <peter@eisentraut.org>,
pgsql-hackers@lists.postgresql.org,
Tom Lane <tgl@sss.pgh.pa.us>,
Andrew Dunstan <andrew@dunslane.net>
Date: 2025-09-29T10:36:27Z
Lists: pgsql-hackers
Commits
- Generate EUC_CN mappings from gb18030-2022.ucm: 48566180efff, 19 (unreleased), landed
- Update GB18030 encoding from version 2000 to 2022: 5334620eef8f, 19 (unreleased), landed
- Generate GB18030 mappings from the Unicode Consortium's UCM file: cfa6cd29271e, 19 (unreleased), landed

> On Sep 29, 2025, at 17:32, John Naylor <johncnaylorls@gmail.com> wrote:
>
> On Wed, Sep 24, 2025 at 4:18 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>
>> I found that both EUC_CN and UHC use the same XML file, so I updated both.
>
> When you say "same file", that implies to me the file we have checked
> in our repo. They have different names and the UHC file is downloaded
> on demand, so it doesn't seem like we need to change UHC at all to
> delete gb-18030-2000.xml. Is that right?
>
> --
> John Naylor
> Amazon Web Services

"Same file" was a mistake. windows-949-2000.ucm is a different file from
gb-18030-2000(2022).ucm. In theory, we don't need to change UHC if our only
goal is to delete gb-18030-2000.xml.

However, as you can see, with the switch to UCM, UHC, EUC_CN and GB18030 now
share the same download URL in the Makefile, and their Perl scripts use the
same logic to process UCM files, so I think the change is good for
maintenance.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/