Re: Support UTF-8 files with BOM in COPY FROM
From: Robert Haas <robertmhaas@gmail.com>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Tatsuo Ishii <ishii@postgresql.org>, david@kineticode.com, itagaki.takahiro@gmail.com, pgsql-hackers@postgresql.org
Date: 2011-09-26T17:34:23Z
Lists: pgsql-hackers

On Mon, Sep 26, 2011 at 1:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> The thing that makes me doubt that is this comment from Tatsuo Ishii:
>> TI> COPY explicitly specifies the encoding (to be UTF-8 in this case). So
>> TI> I think we should not regard U+FEFF as "BOM" in COPY, rather we should
>> TI> regard U+FEFF as "ZERO WIDTH NO-BREAK SPACE".
>
> Yeah, that's a reasonable argument for rejecting the patch altogether.

Yeah, or for making the behavior optional.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company