Commit Graph

7 Commits

Author SHA1 Message Date
Alvaro Herrera e474c2b7e4 Set correct context for XPath evaluation
According to the SQL standard, the context of XMLTABLE's XPath
row_expression is the document node of the XML input document, not the
root node.  This becomes visible when a relative path rather than
absolute is used as row expression.  Absolute paths is what was used in
original tests and docs (and the most common form used in examples
throughout the interwebs), which explains why this wasn't noticed
before.

Other functions such as xpath() and xpath_exists() also have this
problem.  While not specified by the SQL standard, it would be pretty
odd to leave those functions to behave differently than XMLTABLE, so
change them too.  However, this is a backwards-incompatible change.

No backpatch, out of fear of breaking code depending on the original
broken behavior.

Author: Markus Winand
Reported-By: Markus Winand
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/0684A598-002C-42A2-AE12-F024A324EAE4@winand.at
2018-06-21 15:56:11 -04:00
Alvaro Herrera b7f0be9a7e Accept TEXT and CDATA nodes in XMLTABLE's column_expression.
Column expressions that match TEXT or CDATA nodes must return the
contents of the nodes themselves, not the content of non-existing
children (i.e. the empty string).

Author: Markus Winand
Reported-by: Markus Winand
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/0684A598-002C-42A2-AE12-F024A324EAE4@winand.at
2018-06-20 12:58:12 -04:00
Noah Misch 4b865aee25 Fix previous commit's test, for non-UTF8 databases with non-XML builds.
To ensure stable output, catch one more configuration-specific error.
Back-patch to 9.3, like the commit that added the test.
2017-11-11 13:07:46 -08:00
Noah Misch 2918fcedbf Ignore XML declaration in xpath_internal(), for UTF8 databases.
When a value contained an XML declaration naming some other encoding,
this function interpreted UTF8 bytes as the named encoding, yielding
mojibake.  xml_parse() already has similar logic.  This would be
necessary but not sufficient for non-UTF8 databases, so preserve
behavior there until the xpath facility can support such databases
comprehensively.  Back-patch to 9.3 (all supported versions).

Pavel Stehule and Noah Misch

Discussion: https://postgr.es/m/CAFj8pRC-dM=tT=QkGi+Achkm+gwPmjyOayGuUfXVumCxkDgYWg@mail.gmail.com
2017-11-11 11:10:53 -08:00
Alvaro Herrera fcec6caafa Support XMLTABLE query expression
XMLTABLE is defined by the SQL/XML standard as a feature that allows
turning XML-formatted data into relational form, so that it can be used
as a <table primary> in the FROM clause of a query.

This new construct provides significant simplicity and performance
benefit for XML data processing; what in a client-side custom
implementation was reported to take 20 minutes can be executed in 400ms
using XMLTABLE.  (The same functionality was said to take 10 seconds
using nested PostgreSQL XPath function calls, and 5 seconds using
XMLReader under PL/Python).

The implemented syntax deviates slightly from what the standard
requires.  First, the standard indicates that the PASSING clause is
optional and that multiple XML input documents may be given to it; we
make it mandatory and accept a single document only.  Second, we don't
currently support a default namespace to be specified.

This implementation relies on a new executor node based on a hardcoded
method table.  (Because the grammar is fixed, there is no extensibility
in the current approach; further constructs can be implemented on top of
this such as JSON_TABLE, but they require changes to core code.)

Author: Pavel Stehule, Álvaro Herrera
Extensively reviewed by: Craig Ringer
Discussion: https://postgr.es/m/CAFj8pRAgfzMD-LoSmnMGybD0WsEznLHWap8DO79+-GTRAPR4qA@mail.gmail.com
2017-03-08 12:40:26 -03:00
Kevin Grittner e2f1765ce0 Remove xmlparse(document '') test
This one test was behaving differently between the ubuntu fix for
CVE-2015-7499 and the base "expected" file.  It's not worth having
yet another version of the expected file for this test, so drop it.
Perhaps at some point when all distros have settled down to the
same behavior on this test, it can be restored.

Problem found by me on libxml2 (2.9.1+dfsg1-3ubuntu4.6).
Solution suggested by Tom Lane.
Backpatch to 9.5, where the test was added.
2015-12-14 11:37:26 -06:00
Tom Lane 085423e3e3 Add an expected-file to match behavior of latest libxml2.
Recent releases of libxml2 do not provide error context reports for errors
detected at the very end of the input string.  This appears to be a bug, or
at least an infelicity, introduced by the fix for libxml2's CVE-2015-7499.
We can hope that this behavioral change will get undone before too long;
but the security patch is likely to spread a lot faster/further than any
follow-on cleanup, which means this behavior is likely to be present in the
wild for some time to come.  As a stopgap, add a variant regression test
expected-file that matches what you get with a libxml2 that acts this way.
2015-12-11 19:09:04 -05:00