Re: Invalid byte sequence for encoding "UTF8", caused due to non wide-char-aware downcase_truncate_identifier() function on WINDOWS

From: Tom Lane <tgl@sss.pgh.pa.us>

To: Robert Haas <robertmhaas@gmail.com>

Cc: Jeevan Chalke <jeevan.chalke@enterprisedb.com>, pgsql-hackers@postgresql.org

Date: 2011-06-09T14:07:29Z

Lists: pgsql-hackers

Robert Haas <robertmhaas@gmail.com> writes:
> But now that I re-think about it, I guess what I'm confused about is
> this code here:

>                 if (ch >= 'A' && ch <= 'Z')
>                         ch += 'a' - 'A';
>                 else if (IS_HIGHBIT_SET(ch) && isupper(ch))
>                         ch = tolower(ch);
>                 result[i] = (char) ch;

The expected behavior there is that case-folding of non-ASCII characters
will occur in single-byte encodings, but multi-byte characters will be
left unchanged.  We are relying on isupper() to not return true when
presented with a character fragment (a single byte of a multi-byte
character) in a multibyte locale.
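
As a minimal sketch of that expectation, the quoted loop can be lifted
into a standalone function and run in the default "C" locale, where
isupper() is true only for A-Z.  The name fold_bytes() is hypothetical,
and IS_HIGHBIT_SET is redefined locally as a stand-in for the macro the
quoted code uses; neither is taken from the PostgreSQL tree.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in (assumption): true if the top bit of the byte is set. */
#define IS_HIGHBIT_SET(ch) ((unsigned char) (ch) & 0x80)

/*
 * Hypothetical helper: fold the NUL-terminated string src into dst one
 * byte at a time, following the quoted loop.  dst must have room for
 * strlen(src) + 1 bytes.
 */
static void
fold_bytes(const char *src, char *dst)
{
    size_t len;
    size_t i;

    assert(src != NULL && dst != NULL);
    len = strlen(src);

    for (i = 0; i < len; i++)
    {
        unsigned char ch = (unsigned char) src[i];

        if (ch >= 'A' && ch <= 'Z')
            ch += 'a' - 'A';        /* plain ASCII: fold without the locale */
        else if (IS_HIGHBIT_SET(ch) && isupper(ch))
            ch = tolower(ch);       /* high-bit byte: defer to the locale */
        dst[i] = (char) ch;
    }
    dst[len] = '\0';
}
```

With no setlocale() call the program stays in the "C" locale, so the two
bytes of a UTF-8 character such as 0xC3 0x89 pass through untouched while
the ASCII letters around them are folded.  Under a single-byte locale
(Latin-1, say, if one is installed) the second branch would be the one
that folds a byte like 0xC9.  The failure discussed in this thread would
correspond to isupper() returning true for one of those UTF-8 bytes, in
which case tolower() rewrites it and the byte sequence may no longer be
valid UTF-8.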

			regards, tom lane