Re: Support UTF-8 files with BOM in COPY FROM

Tom Lane <tgl@sss.pgh.pa.us>

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Peter Eisentraut <peter_e@gmx.net>
Cc: Robert Haas <robertmhaas@gmail.com>, Tatsuo Ishii <ishii@postgresql.org>, david@kineticode.com, itagaki.takahiro@gmail.com, pgsql-hackers@postgresql.org
Date: 2011-09-27T14:32:03Z
Lists: pgsql-hackers

Peter Eisentraut <peter_e@gmx.net> writes:
> Alternative consideration: We could allow this in CSV format if we made
> users quote the first value if it starts with a BOM.  This might be a
> reasonable way to get MS compatibility.

I don't think we can get away with a retroactive restriction on the
contents of data files.

If we're going to do this at all, I still think an explicit BOM option
for COPY, to either eat (and require) a BOM on input or emit a BOM on
output, would be the sanest way.  None of the "automatic" approaches
seem safe to me.
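
To make the explicit-option behavior concrete, here is a rough sketch in
Python rather than backend C. It is only an illustration under stated
assumptions: COPY has no such option today, and the names read_input,
write_output and the bom flag are hypothetical. With the flag set, input
must start with the UTF-8 BOM bytes (EF BB BF), which are removed, and
output gets the same bytes prepended. Without the flag, nothing is touched.

```python
import codecs

UTF8_BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def read_input(data: bytes, bom: bool) -> bytes:
    """Hypothetical input side of an explicit BOM option."""
    if not bom:
        # Option not given: pass the bytes through unchanged, even if
        # they happen to start with EF BB BF.
        return data
    if not data.startswith(UTF8_BOM):
        # Option given: a BOM is required, so its absence is an error.
        raise ValueError("BOM option given but input has no UTF-8 BOM")
    return data[len(UTF8_BOM):]


def write_output(data: bytes, bom: bool) -> bytes:
    """Hypothetical output side: emit a BOM only when asked to."""
    return UTF8_BOM + data if bom else data


if __name__ == "__main__":
    payload = b"a,b\n"
    with_bom = write_output(payload, bom=True)
    # Round trip with the option set on both sides.
    print(read_input(with_bom, bom=True) == payload)
    # Without the option, leading EF BB BF stays part of the data.
    print(read_input(with_bom, bom=False) == with_bom)
```

The point of the bom=False branch is that U+FEFF at the start of the first
field encodes to the same three bytes as a BOM, so the bytes alone cannot
tell the two apart. Only the caller's stated intent can.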

			regards, tom lane