Re: GB18030-2022 Support in PostgreSQL

Peter Eisentraut <peter@eisentraut.org>

From: Peter Eisentraut <peter@eisentraut.org>
To: John Naylor <johncnaylorls@gmail.com>, Chao Li <li.evan.chao@gmail.com>
Cc: pgsql-hackers@lists.postgresql.org, Tom Lane <tgl@sss.pgh.pa.us>, Andrew Dunstan <andrew@dunslane.net>
Date: 2025-08-12T19:41:47Z
Lists: pgsql-hackers

Commits

  1. Generate EUC_CN mappings from gb18030-2022.ucm

  2. Update GB18030 encoding from version 2000 to 2022

  3. Generate GB18030 mappings from the Unicode Consortium's UCM file

On 12.08.25 06:57, John Naylor wrote:
> Before getting to that, I thought I'd bring this up to the community:
> 
> +# Copyright (C) 2000-2009, International Business Machines Corporation and others.
> +# All Rights Reserved.
> 
> The previous XML file didn't contain a copyright notice -- does anyone
> want to make a case for not checking unicode-org's source file into
> our tree because of this? The 2022 update changes it to
> 
> # Copyright (C) 2016 and later: Unicode, Inc. and others.
> # License & terms of use: http://www.unicode.org/copyright.html
> # Copyright (C) 2000-2012, International Business Machines Corporation and others.
> # All Rights Reserved.
> 
> ...and the above links to https://www.unicode.org/license.txt

Could we download this file on demand, like we do for the other input 
files for the conversion mappings?
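
To illustrate what "on demand" means here, a rough sketch in Python follows. It is not the mechanism the tree actually uses for the other input files. The function name fetch_if_missing, the injectable opener parameter, and any URL passed to it are assumptions made up for this example. The idea is only that the file is fetched when it is missing and reused otherwise, so nothing carrying the upstream copyright notice has to be checked in.

```python
import os
import shutil
import urllib.request


def fetch_if_missing(url, dest, opener=urllib.request.urlopen):
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if the existing copy was kept.
    The opener argument is a hypothetical hook so the fetch can be replaced
    in tests; by default it is urllib.request.urlopen.
    """
    if os.path.exists(dest):
        # Already downloaded by an earlier run; do not touch the network.
        return False

    # Write to a temporary name first so an interrupted download never
    # leaves a truncated file under the final name.
    tmp = dest + ".tmp"
    with opener(url) as resp, open(tmp, "wb") as out:
        shutil.copyfileobj(resp, out)
    os.replace(tmp, dest)
    return True
```

A build step would call something like fetch_if_missing(SOME_URL, "gb18030-2022.ucm") before generating the mappings, where SOME_URL stands for whatever upstream location is agreed on.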