POSIX defines the behavior of back-references thus: The back-reference expression '\n' shall match the same (possibly empty) string of characters as was matched by a subexpression enclosed between "\(" and "\)" preceding the '\n'. As far as I can see, the back-reference is supposed to consider only the data characters matched by the referenced subexpression. However, because our engine copies the NFA constructed from the referenced subexpression, it effectively enforces any constraints therein, too. As an example, '(^.)\1' ought to match 'xx', or any other string starting with two occurrences of the same character; but in our code it does not, and indeed can't match anything, because the '^' anchor constraint is included in the backref's copied NFA. If POSIX intended that, you'd think they'd mention it. Perl for one doesn't act that way, so it's hard to conclude that this isn't a bug. Fix by modifying the backref's NFA immediately after it's copied from the reference, replacing all constraint arcs by EMPTY arcs so that the constraints are treated as automatically satisfied. This still allows us to enforce matching rules that depend only on the data characters; for example, in '(^\d+).*\1' the NFA matching step will still know that the backref can only match strings of digits. Perhaps surprisingly, this change does not affect the results of any of a rather large corpus of real-world regexes. Nonetheless, I would not consider back-patching it, since it's a clear compatibility break. Patch by me, reviewed by Joel Jacobson Discussion: https://postgr.es/m/661609.1614560029@sss.pgh.pa.us |
||
---|---|---|
.. | ||
authentication | ||
examples | ||
isolation | ||
kerberos | ||
ldap | ||
locale | ||
mb | ||
modules | ||
perl | ||
recovery | ||
regress | ||
ssl | ||
subscription | ||
Makefile | ||
README |
README
PostgreSQL tests ================ This directory contains a variety of test infrastructure as well as some of the tests in PostgreSQL. Not all tests are here -- in particular, there are more in individual contrib/ modules and in src/bin. Not all these tests get run by "make check". Check src/test/Makefile to see which tests get run automatically. authentication/ Tests for authentication (but see also below) examples/ Demonstration programs for libpq that double as regression tests via "make check" isolation/ Tests for concurrent behavior at the SQL level kerberos/ Tests for Kerberos/GSSAPI authentication and encryption ldap/ Tests for LDAP-based authentication locale/ Sanity checks for locale data, encodings, etc mb/ Tests for multibyte encoding (UTF-8) support modules/ Extensions used only or mainly for test purposes, generally not suitable for installing in production databases perl/ Infrastructure for Perl-based TAP tests recovery/ Test suite for recovery and replication regress/ PostgreSQL's main regression test suite, pg_regress ssl/ Tests to exercise and verify SSL certificate handling subscription/ Tests for logical replication