Re: XMLSerialize: version and explicit XML declaration
From: Jim Jones <jim.jones@uni-muenster.de>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2024-09-30T08:08:34Z
Lists: pgsql-hackers
Attachments
- v1-0001-Add-XMLSerialize-version-and-explicit-XML-declara.patch (text/x-patch)
Hi Tom

On 25.09.24 18:02, Tom Lane wrote:
> AFAICS, all we do with an embedded XML version string is pass it to
> libxml2's xmlNewDoc(), which is the authority on whether it means
> anything. I'd be inclined to do the same here.

Thanks. I used xml_is_document(), which calls xmlNewDoc(), to check if
the returned document is valid or not. It then decides if an unexpected
version deserves an error or just a warning.

Attached v1 with the first attempt to implement these features.

==== INCLUDING / EXCLUDING XMLDECLARATION (SQL/XML X078) ====

The flags INCLUDING XMLDECLARATION and EXCLUDING XMLDECLARATION include
or remove the XML declaration in the XMLSerialize output of the given
DOCUMENT or CONTENT, respectively.

SELECT
  xmlserialize(
    DOCUMENT '<foo><bar>42</bar></foo>'::xml AS text
    INCLUDING XMLDECLARATION);

                         xmlserialize
---------------------------------------------------------------
 <?xml version="1.0" encoding="UTF8"?><foo><bar>42</bar></foo>
(1 row)

SELECT
  xmlserialize(
    DOCUMENT '<?xml version="1.0" encoding="UTF-8"?><foo><bar>42</bar></foo>'::xml AS text
    EXCLUDING XMLDECLARATION);

       xmlserialize
--------------------------
 <foo><bar>42</bar></foo>
(1 row)

If omitted, the output will contain an XML declaration only if the
given XML value had one.

SELECT
  xmlserialize(
    DOCUMENT '<?xml version="1.0" encoding="UTF-8"?><foo><bar>42</bar></foo>'::xml AS text);

                          xmlserialize
----------------------------------------------------------------
 <?xml version="1.0" encoding="UTF-8"?><foo><bar>42</bar></foo>
(1 row)

SELECT
  xmlserialize(
    DOCUMENT '<foo><bar>42</bar></foo>'::xml AS text);

       xmlserialize
--------------------------
 <foo><bar>42</bar></foo>
(1 row)

==== VERSION (SQL/XML X076) ====

VERSION can be used to specify the version in the XML declaration of
the serialized DOCUMENT or CONTENT.

SELECT
  xmlserialize(
    DOCUMENT '<foo><bar>42</bar></foo>'::xml AS text
    VERSION '1.0'
    INCLUDING XMLDECLARATION);

                         xmlserialize
---------------------------------------------------------------
 <?xml version="1.0" encoding="UTF8"?><foo><bar>42</bar></foo>
(1 row)

In case of XML values of type DOCUMENT, the version will be validated
by libxml2's xmlNewDoc(), which will raise an error for invalid
versions or a warning for unsupported ones. For CONTENT values no
validation is performed.

SELECT
  xmlserialize(
    DOCUMENT '<foo><bar>42</bar></foo>'::xml AS text
    VERSION '1.1'
    INCLUDING XMLDECLARATION);

WARNING:  line 1: Unsupported version '1.1'
<?xml version="1.1" encoding="UTF8"?><foo><bar>42</bar></foo>
                   ^
                         xmlserialize
---------------------------------------------------------------
 <?xml version="1.1" encoding="UTF8"?><foo><bar>42</bar></foo>
(1 row)

SELECT
  xmlserialize(
    DOCUMENT '<foo><bar>42</bar></foo>'::xml AS text
    VERSION '2.0'
    INCLUDING XMLDECLARATION);

ERROR:  Invalid XML declaration: VERSION '2.0'

SELECT
  xmlserialize(
    CONTENT '<foo><bar>42</bar></foo>'::xml AS text
    VERSION '2.0'
    INCLUDING XMLDECLARATION);

                         xmlserialize
---------------------------------------------------------------
 <?xml version="2.0" encoding="UTF8"?><foo><bar>42</bar></foo>
(1 row)

This option is ignored if the XML value had no XML declaration and
INCLUDING XMLDECLARATION was not used.

SELECT
  xmlserialize(
    CONTENT '<foo><bar>42</bar></foo>'::xml AS text
    VERSION '1111');

       xmlserialize
--------------------------
 <foo><bar>42</bar></foo>
(1 row)

Best, Jim
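
As a rough illustration, the rules described in this mail can be modelled
in a few lines of Python. This is only a sketch of the observable
behaviour, not the patch's C code. The function names (emits_declaration,
check_version, version_is_used) are hypothetical, and treating every
"1.x" version other than "1.0" as a warning is an assumption generalised
from the single '1.1' example above; the actual decision is left to
libxml2.

```python
from typing import Optional


def emits_declaration(had_declaration: bool, flag: Optional[str]) -> bool:
    """Return True if the serialized output should carry an XML declaration.

    flag is 'including', 'excluding', or None when the clause is omitted.
    """
    if flag == "including":
        return True
    if flag == "excluding":
        return False
    if flag is None:
        # Clause omitted: keep a declaration only if the input value had one.
        return had_declaration
    raise ValueError(f"unknown flag: {flag!r}")


def check_version(version: str, is_document: bool) -> str:
    """Classify a VERSION string as 'ok', 'warning', or 'error'.

    ASSUMPTION: only '1.0' (ok), '1.1' (warning) and '2.0' (error) are shown
    in the mail; the '1.x' generalisation below is a guess, not taken from
    the patch.
    """
    if not is_document:
        # CONTENT values: no validation is performed.
        return "ok"
    if version == "1.0":
        return "ok"
    if version.startswith("1."):
        return "warning"
    return "error"


def version_is_used(had_declaration: bool, flag: Optional[str],
                    version: Optional[str]) -> bool:
    """VERSION only matters when a declaration ends up in the output."""
    return version is not None and emits_declaration(had_declaration, flag)


if __name__ == "__main__":
    # Mirrors the last example: CONTENT without declaration, VERSION '1111'.
    print(version_is_used(False, None, "1111"))
    # Mirrors the DOCUMENT examples with VERSION '1.1' and '2.0'.
    print(check_version("1.1", True), check_version("2.0", True))
```

Keeping the three questions separate (is a declaration emitted, is the
version acceptable, does the version matter at all) follows the structure
of the examples above. How VERSION interacts with EXCLUDING XMLDECLARATION
is not covered in this mail, so the model simply treats the version as
unused in that case.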