Re: Regression with large XML data input
Erik Wienhold <ewie@ewie.name>
From: Erik Wienhold <ewie@ewie.name>
To: Michael Paquier <michael@paquier.xyz>
Cc: Postgres hackers <pgsql-hackers@lists.postgresql.org>, Tom Lane <tgl@sss.pgh.pa.us>
Date: 2025-07-24T19:01:11Z
Lists: pgsql-hackers
Attachments
- 0001-Fix-xml2-regression-v2.patch (text/plain)
On 2025-07-24 05:12 +0200, Michael Paquier wrote:
> Switching back to the previous code, where we rely on
> xmlParseBalancedChunkMemory() fixes the issue.  A quick POC is
> attached.  It fails one case in check-world with SERIALIZE because I
> am not sure it is possible to pass down some options through
> xmlParseBalancedChunkMemory(), still the regression is gone, and I am
> wondering if there is not a better solution to be able to dodge the
> original problem and still accept this case.

The whitespace can be preserved by setting xmlKeepBlanksDefault before
parsing.  See attached v2.  That function is deprecated, though.  But
libxml2 uses thread-local globals, so it should be safe.  Other than
that, I see no other way to set XML_PARSE_NOBLANKS with
xmlParseBalancedChunkMemory.

[1] https://gitlab.gnome.org/GNOME/libxml2/-/blob/408bd0e18e6ddba5d18e51d52da0f7b3ca1b4421/parserInternals.c#L2833

-- 
Erik Wienhold