Re: [PATCH] Reject ENCODING option for COPY TO FORMAT JSON
From: Andrew Dunstan <andrew@dunslane.net>
To: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Cc: pgsql-hackers@postgresql.org, Tom Lane <tgl@sss.pgh.pa.us>
Date: 2026-05-04T14:19:21Z
Lists: pgsql-hackers
On 2026-04-29 We 12:49 PM, Ayush Tiwari wrote:
> Hi,
>
> On Mon, 20 Apr 2026 at 20:31, Ayush Tiwari
> <ayushtiwari.slg01@gmail.com> wrote:
> > Hi,
> >
> > On Mon, 20 Apr 2026 at 19:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > Seems to me the correct thing here is to make it work like the
> > > other cases, ie perform pg_server_to_any(). I have exactly no
> > > sympathy for the argument about the RFC saying it must be UTF-8,
> > > not least because that's not in fact what is implemented (what
> > > if the server encoding isn't UTF-8?).
> >
> > Agreed. I initially thought rejecting the option was the safer
> > route given the RFC, but as you pointed out, we aren't enforcing
> > UTF-8 strictly on the server side anyway.
> >
> > > Rejecting this option altogether doesn't improve anything, not
> > > functionally, not specs-compliance-wise, nor according to the
> > > principle of least surprise.
> >
> > Makes sense. Implementing the conversion properly keeps JSON
> > format consistent with how the text and CSV formats behave.
> >
> > > No, you don't get to punt this till later. Once we ship v19
> > > there's going to be a strong expectation of backwards
> > > compatibility.
> > >
> > > The idea of sending UTF-8 to a client that's set client_encoding
> > > to something else would be risible, if it weren't a security
> > > hazard.
> >
> > I agree sending unconverted bytes to a mismatched client encoding
> > is clearly a security hazard that needs addressing. Did not
> > consider the backward compatibility part, my bad.
> >
> > Was trying out adding pg_server_to_any() to the json_buf after
> > composite_to_json() returns, correctly covering both explicit
> > ENCODING option specifications and implicit client_encoding
> > mismatches.
> >
> > Let me send a patch with code and associated test cases.
>
> Attached patch with round trip test case. Please review and let me
> know if it's in the right direction.
>
> I have registered this patch set in the CommitFest for tracking:
> https://commitfest.postgresql.org/patch/6700/
>
> Please let me know if the patch looks good, and if I need to add it
> in the open items list for PG 19.

Basically good, I think. I have modified your test a bit, testing more
directly for the presence of the LATIN-1 encoded character and the
absence of the UTF-8 encoded character, by reading in the file with
pg_read_binary_file, and adding a test for implicit encoding by
setting client_encoding.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
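
For readers following the thread, the byte-level check described in the reply above can be illustrated outside the server. This is only a sketch and not the regression test from the attached patch: the sample character, the JSON row, and the helper name `encoded_as` are assumptions made up for illustration. It relies on the fact that "é" is the single byte 0xE9 in LATIN-1 and the two bytes 0xC3 0xA9 in UTF-8.

```python
# Hypothetical sample row; not taken from the patch's test data.
SAMPLE = '{"t":"é"}\n'


def encoded_as(data: bytes, text: str, wanted: str, unwanted: str) -> bool:
    """Return True if `data` contains `text` in the `wanted` encoding
    and does not contain it in the `unwanted` encoding."""
    return text.encode(wanted) in data and text.encode(unwanted) not in data


# Stand-ins for the raw bytes of an output file, as one might obtain
# them by reading the file in binary mode.
latin1_out = SAMPLE.encode("latin-1")  # what a converted file would hold
utf8_out = SAMPLE.encode("utf-8")      # what an unconverted file would hold

if __name__ == "__main__":
    print(encoded_as(latin1_out, "é", "latin-1", "utf-8"))
    print(encoded_as(utf8_out, "é", "latin-1", "utf-8"))
```

Checking for both the presence of one byte sequence and the absence of the other is what distinguishes a real conversion from output that merely happens to contain the expected byte.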