Re: GB18030-2022 Support in PostgreSQL
From: Chao Li <li.evan.chao@gmail.com>
To: John Naylor <johncnaylorls@gmail.com>
Cc: Peter Eisentraut <peter@eisentraut.org>,
pgsql-hackers@lists.postgresql.org, Tom Lane <tgl@sss.pgh.pa.us>,
Andrew Dunstan <andrew@dunslane.net>
Date: 2025-08-13T08:08:45Z
Lists: pgsql-hackers
Commits
- Generate EUC_CN mappings from gb18030-2022.ucm
  - 48566180efff 19 (unreleased) landed
- Update GB18030 encoding from version 2000 to 2022
  - 5334620eef8f 19 (unreleased) landed
- Generate GB18030 mappings from the Unicode Consortium's UCM file
  - cfa6cd29271e 19 (unreleased) landed
Attachments
- v1-0001-GB18030-Switch-to-using-gb-18030-2000.ucm.patch (text/plain) patch v1-0001
On 2025/8/13 15:20, Chao Li wrote:
> Sounds good. Let me recreate the patch.

Attached is the new patch. It downloads the UCM file during make:

```
Unicode % make gb18030_to_utf8.map
wget -O gb-18030-2000.ucm --no-use-server-timestamps https://raw.githubusercontent.com/unicode-org/icu-data/d9d3a6ed27bb98a7106763e940258f0be8cd995b/charset/data/ucm/gb-18030-2000.ucm
--2025-08-13 15:54:53-- https://raw.githubusercontent.com/unicode-org/icu-data/d9d3a6ed27bb98a7106763e940258f0be8cd995b/charset/data/ucm/gb-18030-2000.ucm
HTTP request sent, awaiting response... 200 OK
Length: 672885 (657K) [text/plain]
Saving to: ‘gb-18030-2000.ucm’

gb-18030-2000.ucm 100%[=====================================>] 657.11K 2.78MB/s in 0.2s

2025-08-13 15:54:54 (2.78 MB/s) - ‘gb-18030-2000.ucm’ saved [672885/672885]

'/usr/bin/perl' -I . UCS_to_GB18030.pl
- Writing UTF8=>GB18030 conversion table: utf8_to_gb18030.map
- Writing GB18030=>UTF8 conversion table: gb18030_to_utf8.map
Unicode % git diff
Unicode %
```

After regenerating the map files, git diff shows no changes to them.

Best regards,
Chao Li (Evan)
--------------------
HighGo Software Co., Ltd.
https://www.highgo.com/
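
For readers unfamiliar with the UCM format that the make step downloads, here is a minimal Python sketch of reading its mapping lines into a code point to byte sequence table. It is not the logic of UCS_to_GB18030.pl, which is not shown in this message. The names `parse_ucm` and `SAMPLE` are hypothetical, the sample lines are illustrative rather than quoted from gb-18030-2000.ucm, and keeping only `|0` (round-trip) entries is an assumption of this sketch.

```python
import re

# One UCM mapping line looks like: <U00A4> \xA1\xE8 |0
# i.e. a Unicode code point, the encoded bytes, and a precision flag.
UCM_LINE = re.compile(
    r"^<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|(\d)"
)
HEX_BYTE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def parse_ucm(lines):
    """Return {code_point: encoded_bytes} for lines flagged |0."""
    mapping = {}
    for line in lines:
        m = UCM_LINE.match(line.strip())
        if m is None:
            continue  # header, comment, or CHARMAP delimiter
        if m.group(3) != "0":
            continue  # assumption: ignore fallback-only entries
        code_point = int(m.group(1), 16)
        encoded = bytes(int(h, 16) for h in HEX_BYTE.findall(m.group(2)))
        mapping[code_point] = encoded
    return mapping


# Illustrative input in UCM syntax (not copied from the real file).
SAMPLE = [
    "# sample header comment",
    '<code_set_name> "sample"',
    "CHARMAP",
    r"<U0041> \x41 |0",
    r"<U00A4> \xA1\xE8 |0",
    r"<U0080> \x81\x30\x81\x30 |0",
    r"<U2015> \xA1\xAA |1",  # hypothetical fallback entry, skipped
    "END CHARMAP",
]

if __name__ == "__main__":
    for cp, encoded in sorted(parse_ucm(SAMPLE).items()):
        print(f"U+{cp:04X} -> {encoded.hex()}")
```

Running the same parser over the UCM file before and after a change and comparing the resulting dictionaries is one way to cross-check that regenerated map files should be identical, alongside the empty `git diff` shown above.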