Re: Support UTF-8 files with BOM in COPY FROM
From: Magnus Hagander <magnus@hagander.net>
To: Itagaki Takahiro <itagaki.takahiro@gmail.com>
Cc: PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Date: 2011-09-26T11:12:42Z
Lists: pgsql-hackers
On Mon, Sep 26, 2011 at 06:58, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote:
> Hi,
>
> I'd like to support UTF-8 text or csv files that has BOM (byte order mark)
> in COPY FROM command. BOM will be automatically detected and ignored
> if the file encoding is UTF-8. WIP patch attached.
>
> I'm thinking about only COPY FROM for reads, but if someone wants to add
> BOM in COPY TO, we might also support COPY TO WITH BOM for writes.
>
> Comments welcome.

I like it in general. But if we're looking at the BOM, shouldn't we
also look and *reject* the file if it's a BOM for a non-UTF8 file? Say
if the BOM claims it's UTF16?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
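
To make the suggestion concrete, here is a rough sketch of the check being discussed: skip a UTF-8 BOM, and refuse input that opens with a UTF-16 or UTF-32 BOM. It is written in Python for brevity and is not taken from the attached patch or from PostgreSQL's COPY code. The function name `strip_or_reject_bom` and the error text are hypothetical, and it assumes the caller only invokes it when the declared file encoding is UTF-8.

```python
import codecs

# Hypothetical table of BOMs that indicate a non-UTF-8 file.
# UTF-32LE (FF FE 00 00) must be tested before UTF-16LE (FF FE),
# because the shorter BOM is a prefix of the longer one.
_FOREIGN_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def strip_or_reject_bom(head: bytes) -> bytes:
    """Return `head` without a leading UTF-8 BOM.

    `head` is the first chunk of the input file. Assumes the declared
    file encoding is UTF-8. Raises ValueError if the data starts with
    a UTF-16 or UTF-32 BOM instead.
    """
    if head.startswith(codecs.BOM_UTF8):
        # EF BB BF: silently drop it, as the proposal describes.
        return head[len(codecs.BOM_UTF8):]
    for bom, name in _FOREIGN_BOMS:
        if head.startswith(bom):
            raise ValueError(
                f"input starts with a {name} byte order mark, expected UTF-8"
            )
    # No BOM at all: pass the data through unchanged.
    return head
```

For example, `strip_or_reject_bom(b"\xef\xbb\xbfa,b\n")` returns `b"a,b\n"`, while input beginning with `b"\xff\xfe"` raises `ValueError`. The bytes 0xFF and 0xFE never occur in valid UTF-8, so such a file would fail encoding validation anyway; an explicit check would mainly buy a clearer error message.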