Re: Support UTF-8 files with BOM in COPY FROM
David Wheeler <david@kineticode.com>
From: "David E. Wheeler" <david@kineticode.com>
To: Itagaki Takahiro <itagaki.takahiro@gmail.com>
Cc: PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Date: 2011-09-26T06:14:03Z
Lists: pgsql-hackers
On Sep 25, 2011, at 9:58 PM, Itagaki Takahiro wrote:

> I'd like to support UTF-8 text or csv files that has BOM (byte order mark)
> in COPY FROM command. BOM will be automatically detected and ignored
> if the file encoding is UTF-8. WIP patch attached.

By my reading of http://unicode.org/faq/utf_bom.html#bom5, I'd say +1. So I think what you propose makes sense.

> I'm thinking about only COPY FROM for reads, but if someone wants to add
> BOM in COPY TO, we might also support COPY TO WITH BOM for writes.

I think it would have to be optional, since "some recipients of UTF-8 encoded data do not expect a BOM."

Best,

David