Re: GB18030-2022 Support in PostgreSQL

wenhui qiu <qiuwenhuifx@gmail.com>

From: wenhui qiu <qiuwenhuifx@gmail.com>
To: JiaoShuntian <jiaoshuntian@highgo.com>
Cc: pgsql-hackers@lists.postgresql.org
Date: 2025-08-04T09:34:48Z
Lists: pgsql-hackers

Commits

  1. Generate EUC_CN mappings from gb18030-2022.ucm

  2. Update GB18030 encoding from version 2000 to 2022

  3. Generate GB18030 mappings from the Unicode Consortium's UCM file

Hi
    😂 Not long ago, many people were rushing to remove this character set
because of a security vulnerability. I was honestly quite shocked when I
saw it.


Thanks

On Mon, Aug 4, 2025 at 4:08 PM JiaoShuntian <jiaoshuntian@highgo.com> wrote:

> Hi hackers,
>
> I noticed that PostgreSQL currently supports GB18030 encoding based on the
> older GB18030-2000 standard (as seen in commits like "extend GB18030
> conversion"
> <https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=...>).
> However, China has since updated its mandatory character set standard
> to GB18030-2022, which includes additional characters and stricter
> compliance requirements. GB18030-2022 is now the official standard in China,
> and ensuring PostgreSQL’s full compliance would be beneficial for users in
> Chinese-speaking regions.
>
> I would like to ask:
>
> Are there any plans to upgrade PostgreSQL’s GB18030 support to the 2022
> version? Would the community be open to contributions in this area?
>
> Best regards,
>
>
> JiaoShuntian
>
> HighGo Inc.
>