Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.
Alex Hunsaker <badalex@gmail.com>
From: Alex Hunsaker <badalex@gmail.com>
To: Andrew Dunstan <andrew@dunslane.net>
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
Date: 2011-02-12T09:18:36Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Force strings passed to and from plperl to be in UTF8 encoding.
- 50d89d422f9c 9.1.0 cited
Attachments
- plperl_utf8_mbverify.patch (text/x-patch) patch
On Sun, Feb 6, 2011 at 15:31, Andrew Dunstan <andrew@dunslane.net> wrote: > Force strings passed to and from plperl to be in UTF8 encoding. > > String are converted to UTF8 on the way into perl and to the > database encoding on the way back. This avoids a number of > observed anomalies, and ensures Perl a consistent view of the > world. So I noticed a problem while playing with this in my discussion with David Wheeler. pg_do_encoding() does nothing when the src encoding == the dest encoding. That means on a UTF-8 database we fail make sure our strings are valid utf8. An easy way to see this is to embed a null in the middle of a string: => create or replace function zerob() returns text as $$ return "abcd\0efg"; $$ language plperl; => SELECT zerob(); abcd Also It seems bogus to bogus to do any encoding conversion when we are SQL_ASCII, and its really trivial to fix. With the attached: - when we are on a utf8 database make sure to verify our output string in sv2cstr (we assume database strings coming in are already valid) - Do no string conversion when we are SQL_ASCII in or out - add plperl_helpers.h as a dep to plperl.o in our makefile - remove some redundant calls to pg_verify_mbstr() - as utf_e2u only as one caller dont pstrdup() instead have the caller check (saves some cycles and memory)