Commit Graph

3198 Commits

Author SHA1 Message Date
Tom Lane 7107d58ec5 Fix misplacement of submake-generated-headers prerequisites.
The sequence "configure; cd src/pl/plpython; make -j" failed due to
trying to compile plpython's .o files before the generated headers
finished building.  (This is an important real-world case, since it's
the typical second step when building both plpython2 and plpython3.)
This happens because the submake-generated-headers target is not
placed in a way to make it a prerequisite to compiling the .o files.
Fix that.

Checking other uses of submake-generated-headers, I noted that the one
attached to pg_regress was similarly misplaced; but it's actually not
needed at all for pg_regress.o, rather regress.o, so move it to be a
prerequisite of that.

Back-patch to 9.6 where submake-generated-headers was introduced
(by commit 548af97fc).  It's not immediately clear to me why the
previous coding didn't have the same issue; but since we've not
had field reports of plpython make failing, leave it alone in the
older branches.

Pavel Raiskup and Tom Lane

Discussion: <1925924.izSMJEZO3x@unused-4-107.brq.redhat.com>
2016-10-01 13:35:13 -04:00
Peter Eisentraut a4327296df Set log_line_prefix and application name in test drivers
Before pg_regress runs psql, set the application name to the test name.
Similarly, set the application name to the test file name in the TAP
tests.  Also, set a default log_line_prefix that show the application
name, as well as the PID and a time stamp.

That way, the server log output can be correlated to the test input
files, making debugging a bit easier.
2016-09-30 21:32:33 -04:00
Tom Lane d3cd36a133 Make to_timestamp() and to_date() range-check fields of their input.
Historically, something like to_date('2009-06-40','YYYY-MM-DD') would
return '2009-07-10' because there was no prohibition on out-of-range
month or day numbers.  This has been widely panned, and it also turns
out that Oracle throws an error in such cases.  Since these functions
are nominally Oracle-compatibility features, let's change that.

There's no particular restriction on year (modulo the fact that the
scanner may not believe that more than 4 digits are year digits,
a matter to be addressed separately if at all).  But we now check month,
day, hour, minute, second, and fractional-second fields, as well as
day-of-year and second-of-day fields if those are used.

Currently, no checks are made on ISO-8601-style week numbers or day
numbers; it's not very clear what the appropriate rules would be there,
and they're probably so little used that it's not worth sweating over.

Artur Zakirov, reviewed by Amul Sul, further adjustments by me

Discussion: <1873520224.1784572.1465833145330.JavaMail.yahoo@mail.yahoo.com>
See-Also: <57786490.9010201@wars-nicht.de>
2016-09-28 14:36:17 -04:00
Tom Lane 72daabc7a3 Disallow pushing volatile quals past set-returning functions.
Pushing an upper-level restriction clause into an unflattened
subquery-in-FROM is okay when the subquery contains no SRFs in its
targetlist, or when it does but the SRFs are unreferenced by the clause
*and the clause is not volatile*.  Otherwise, we're changing the number
of times the clause is evaluated, which is bad for volatile quals, and
possibly changing the result, since a volatile qual might succeed for some
SRF output rows and not others despite not referencing any of the changing
columns.  (Indeed, if the clause is something like "random() > 0.5", the
user is probably expecting exactly that behavior.)

We had most of these restrictions down, but not the one about the upper
clause not being volatile.  Fix that, and add a regression test to
illustrate the expected behavior.

Although this is definitely a bug, it doesn't seem like back-patch
material, since possibly some users don't realize that the broken
behavior is broken and are relying on what happens now.  Also, while
the added test is quite cheap in the wake of commit a4c35ea1c, it would
be much more expensive (or else messier) in older branches.

Per report from Tom van Tilburg.

Discussion: <CAP3PPDiucxYCNev52=YPVkrQAPVF1C5PFWnrQPT7iMzO1fiKFQ@mail.gmail.com>
2016-09-27 18:43:36 -04:00
Tom Lane da6c4f6ca8 Refer to OS X as "macOS", except for the port name which is still "darwin".
We weren't terribly consistent about whether to call Apple's OS "OS X"
or "Mac OS X", and the former is probably confusing to people who aren't
Apple users.  Now that Apple has rebranded it "macOS", follow their lead
to establish a consistent naming pattern.  Also, avoid the use of the
ancient project name "Darwin", except as the port code name which does not
seem desirable to change.  (In short, this patch touches documentation and
comments, but no actual code.)

I didn't touch contrib/start-scripts/osx/, either.  I suspect those are
obsolete and due for a rewrite, anyway.

I dithered about whether to apply this edit to old release notes, but
those were responsible for quite a lot of the inconsistencies, so I ended
up changing them too.  Anyway, Apple's being ahistorical about this,
so why shouldn't we be?
2016-09-25 15:40:57 -04:00
Peter Eisentraut 8b845520fb Add tests for various connection string issues
Add tests for consistent support of connection strings in frontend
programs as well as proper handling of unusual characters in database
and user names.  These tests were developed for the issues of
CVE-2016-5424.

To allow testing of names with spaces, change the pg_regress
command-line options --create-role and --dbname to split their arguments
by comma only, not space or comma as before.  Only commas were actually
used in existing uses.

Noah Misch, Michael Paquier, Peter Eisentraut
2016-09-22 12:00:00 -04:00
Peter Eisentraut 656df624c0 Add overflow checks to money type input function
The money type input function did not have any overflow checks at all.
There were some regression tests that purported to check for overflow,
but they actually checked for the overflow behavior of the int8 type
before casting to money.  Remove those unnecessary checks and add some
that actually check the money input function.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
2016-09-14 12:00:00 -05:00
Tom Lane 0dac5b5174 Tweak targetlist-SRF tests some more.
Seems like it would be good to have a test case documenting the
existing behavior for non-top-level SRFs.
2016-09-14 19:48:50 -04:00
Tom Lane a163c006ca Tweak targetlist-SRF tests.
Add a test case showing that we don't support SRFs in window-function
arguments.  Remove a duplicate test case for SRFs in aggregate arguments.
2016-09-14 14:30:40 -04:00
Tom Lane a4c35ea1c2 Improve parser's and planner's handling of set-returning functions.
Teach the parser to reject misplaced set-returning functions during parse
analysis using p_expr_kind, in much the same way as we do for aggregates
and window functions (cf commit eaccfded9).  While this isn't complete
(it misses nesting-based restrictions), it's much better than the previous
error reporting for such cases, and it allows elimination of assorted
ad-hoc expression_returns_set() error checks.  We could add nesting checks
later if it seems important to catch all cases at parse time.

There is one case the parser will now throw error for although previous
versions allowed it, which is SRFs in the tlist of an UPDATE.  That never
behaved sensibly (since it's ill-defined which generated row should be
used to perform the update) and it's hard to see why it should not be
treated as an error.  It's a release-note-worthy change though.

Also, add a new Query field hasTargetSRFs reporting whether there are
any SRFs in the targetlist (including GROUP BY/ORDER BY expressions).
The parser can now set that basically for free during parse analysis,
and we can use it in a number of places to avoid expression_returns_set
searches.  (There will be more such checks soon.)  In some places, this
allows decontorting the logic since it's no longer expensive to check for
SRFs in the tlist --- so I made the checks parallel to the handling of
hasAggs/hasWindowFuncs wherever it seemed appropriate.

catversion bump because adding a Query field changes stored rules.

Andres Freund and Tom Lane

Discussion: <24639.1473782855@sss.pgh.pa.us>
2016-09-13 13:54:24 -04:00
Andres Freund 0dba54f166 Remove user_relns() SRF from regression tests.
The output of the function changes whenever previous (or, as in this
case, concurrent) tests leave a table in place. That causes unneeded
churn.

This should fix failures due to the tests added bfe16d1a5, like on
lapwing, caused by the tsrf test running concurrently with misc. Those
could also have been addressed by using temp tables, but that test has
annoyed me before.

Discussion: <27626.1473729905@sss.pgh.pa.us>
2016-09-12 19:37:16 -07:00
Andres Freund 9f478b4f19 Address portability issues in bfe16d1a5 test output. 2016-09-12 18:15:10 -07:00
Andres Freund bfe16d1a5d Add more tests for targetlist SRFs.
We're considering changing the implementation of targetlist SRFs
considerably, and a lot of the current behaviour isn't tested in our
regression tests. Thus it seems useful to increase coverage to avoid
accidental behaviour changes.

It's quite possible that some of the plans here will require adjustments
to avoid falling afoul of ordering differences (e.g. hashed group
bys). The buildfarm will tell us.

Reviewed-By: Tom Lane
Discussion: <20160827214829.zo2dfb5jaikii5nw@alap3.anarazel.de>
2016-09-12 17:27:47 -07:00
Alvaro Herrera 5c609a742f Fix locking a tuple updated by an aborted (sub)transaction
When heap_lock_tuple decides to follow the update chain, it tried to
also lock any version of the tuple that was created by an update that
was subsequently rolled back.  This is pointless, since for all intents
and purposes that tuple exists no more; and moreover it causes
misbehavior, as reported independently by Marko Tiikkaja and Marti
Raudsepp: some SELECT FOR UPDATE/SHARE queries may fail to return
the tuples, and assertion-enabled builds crash.

Fix by having heap_lock_updated_tuple test the xmin and return success
immediately if the tuple was created by an aborted transaction.

The condition where tuples become invisible occurs when an updated tuple
chain is followed by heap_lock_updated_tuple, which reports the problem
as HeapTupleSelfUpdated to its caller heap_lock_tuple, which in turn
propagates that code outwards possibly leading the calling code
(ExecLockRows) to believe that the tuple exists no longer.

Backpatch to 9.3.  Only on 9.5 and newer this leads to a visible
failure, because of commit 27846f02c176; before that, heap_lock_tuple
skips the whole dance when the tuple is already locked by the same
transaction, because of the ancient HeapTupleSatisfiesUpdate behavior.
Still, the buggy condition may also exist in more convoluted scenarios
involving concurrent transactions, so it seems safer to fix the bug in
the old branches too.

Discussion:
	https://www.postgresql.org/message-id/CABRT9RC81YUf1=jsmWopcKJEro=VoeG2ou6sPwyOUTx_qteRsg@mail.gmail.com
	https://www.postgresql.org/message-id/48d3eade-98d3-8b9a-477e-1a8dc32a724d@joh.to
2016-09-09 15:54:29 -03:00
Tom Lane 0ab9c56d0f Support renaming an existing value of an enum type.
Not much to be said about this patch: it does what it says on the tin.

In passing, rename AlterEnumStmt.skipIfExists to skipIfNewValExists
to clarify what it actually does.  In the discussion of this patch
we considered supporting other similar options, such as IF EXISTS
on the type as a whole or IF NOT EXISTS on the target name.  This
patch doesn't actually add any such feature, but it might happen later.

Dagfinn Ilmari Mannsåker, reviewed by Emre Hasegeli

Discussion: <CAO=2mx6uvgPaPDf-rHqG8=1MZnGyVDMQeh8zS4euRyyg4D35OQ@mail.gmail.com>
2016-09-07 16:11:56 -04:00
Tom Lane c54159d44c Make locale-dependent regex character classes work for large char codes.
Previously, we failed to recognize Unicode characters above U+7FF as
being members of locale-dependent character classes such as [[:alpha:]].
(Actually, the same problem occurs for large pg_wchar values in any
multibyte encoding, but UTF8 is the only case people have actually
complained about.)  It's impractical to get Spencer's original code to
handle character classes or ranges containing many thousands of characters,
because it insists on considering each member character individually at
regex compile time, whether or not the character will ever be of interest
at run time.  To fix, choose a cutoff point MAX_SIMPLE_CHR below which
we process characters individually as before, and deal with entire ranges
or classes as single entities above that.  We can actually make things
cheaper than before for chars below the cutoff, because the color map can
now be a simple linear array for those chars, rather than the multilevel
tree structure Spencer designed.  It's more expensive than before for
chars above the cutoff, because we must do a binary search in a list of
high chars and char ranges used in the regex pattern, plus call iswalpha()
and friends for each locale-dependent character class used in the pattern.
However, multibyte encodings are normally designed to give smaller codes
to popular characters, so that we can expect that the slow path will be
taken relatively infrequently.  In any case, the speed penalty appears
minor except when we have to apply iswalpha() etc. to high character codes
at runtime --- and the previous coding gave wrong answers for those cases,
so whether it was faster is moot.

Tom Lane, reviewed by Heikki Linnakangas

Discussion: <15563.1471913698@sss.pgh.pa.us>
2016-09-05 17:06:29 -04:00
Tom Lane 15bc038f9b Relax transactional restrictions on ALTER TYPE ... ADD VALUE.
To prevent possibly breaking indexes on enum columns, we must keep
uncommitted enum values from getting stored in tables, unless we
can be sure that any such column is new in the current transaction.

Formerly, we enforced this by disallowing ALTER TYPE ... ADD VALUE
from being executed at all in a transaction block, unless the target
enum type had been created in the current transaction.  This patch
removes that restriction, and instead insists that an uncommitted enum
value can't be referenced unless it belongs to an enum type created
in the same transaction as the value.  Per discussion, this should be
a bit less onerous.  It does require each function that could possibly
return a new enum value to SQL operations to check this restriction,
but there aren't so many of those that this seems unmaintainable.

Andrew Dunstan and Tom Lane

Discussion: <4075.1459088427@sss.pgh.pa.us>
2016-09-05 12:59:55 -04:00
Tom Lane c7f68bea22 Add regression test coverage for non-default timezone abbreviation sets.
After further reflection about the mess cleaned up in commit 39b691f25,
I decided the main bit of test coverage that was still missing was to
check that the non-default abbreviation-set files we supply are usable.
Add that.

Back-patch to supported branches, just because it seems like a good
idea to keep this all in sync.
2016-09-04 20:02:16 -04:00
Tom Lane a2d75b67bc Remove useless pg_strdup() operations.
split_to_stringlist() doesn't modify its first argument nor expect it
to remain valid after exit, so there's no need to duplicate the optarg
string at the call sites.  Per Coverity.  (This has been wrong all along,
but commit 052cc223d changed the useless calls from "strdup" to
"pg_strdup", which apparently made Coverity think it's a new bug.
It's not, but it's also not worth back-patching.)
2016-09-04 12:33:58 -04:00
Tom Lane 39b691f251 Don't require dynamic timezone abbreviations to match underlying time zone.
Previously, we threw an error if a dynamic timezone abbreviation did not
match any abbreviation recorded in the referenced IANA time zone entry.
That seemed like a good consistency check at the time, but it turns out
that a number of the abbreviations in the IANA database are things that
Olson and crew made up out of whole cloth.  Their current policy is to
remove such names in favor of using simple numeric offsets.  Perhaps
unsurprisingly, a lot of these made-up abbreviations have varied in meaning
over time, which meant that our commit b2cbced9e and later changes made
them into dynamic abbreviations.  So with newer IANA database versions
that don't mention these abbreviations at all, we fail, as reported in bug
#14307 from Neil Anderson.  It's worse than just a few unused-in-the-wild
abbreviations not working, because the pg_timezone_abbrevs view stops
working altogether (since its underlying function tries to compute the
whole view result in one call).

We considered deleting these abbreviations from our abbreviations list, but
the problem with that is that we can't stay ahead of possible future IANA
changes.  Instead, let's leave the abbreviations list alone, and treat any
"orphaned" dynamic abbreviation as just meaning the referenced time zone.
It will behave a bit differently than it used to, in that you can't any
longer override the zone's standard vs. daylight rule by using the "wrong"
abbreviation of a pair, but that's better than failing entirely.  (Also,
this solution can be interpreted as adding a small new feature, which is
that any abbreviation a user wants can be defined as referencing a time
zone name.)

Back-patch to all supported branches, since this problem affects all
of them when using tzdata 2016f or newer.

Report: <20160902031551.15674.67337@wrigleys.postgresql.org>
Discussion: <6189.1472820913@sss.pgh.pa.us>
2016-09-02 17:30:02 -04:00
Heikki Linnakangas 9cca11c915 Speed up SUM calculation in numeric aggregates.
This introduces a numeric sum accumulator, which performs better than
repeatedly calling add_var(). The performance comes from using wider digits
and delaying carry propagation, tallying positive and negative values
separately, and avoiding a round of palloc/pfree on every value. This
speeds up SUM(), as well as other standard aggregates like AVG() and
STDDEV() that also calculate a sum internally.

Reviewed-by: Andrey Borodin
Discussion: <c0545351-a467-5b76-6d46-4840d1ea8aa4@iki.fi>
2016-09-02 11:51:49 +03:00
Tom Lane 052cc223d5 Fix a bunch of places that called malloc and friends with no NULL check.
Where possible, use palloc or pg_malloc instead; otherwise, insert
explicit NULL checks.

Generally speaking, these are places where an actual OOM is quite
unlikely, either because they're in client programs that don't
allocate all that much, or they're very early in process startup
so that we'd likely have had a fork() failure instead.  Hence,
no back-patch, even though this is nominally a bug fix.

Michael Paquier, with some adjustments by me

Discussion: <CAB7nPqRu07Ot6iht9i9KRfYLpDaF2ZuUv5y_+72uP23ZAGysRg@mail.gmail.com>
2016-08-30 18:22:43 -04:00
Tom Lane 2533ff0aa5 Fix instability in parallel regression tests.
Commit f0c7b789a added a test case in case.sql that creates and then drops
both an '=' operator and the type it's for.  Given the right timing, that
can cause a "cache lookup failed for type" failure in concurrent sessions,
which see the '=' operator as a potential match for '=' in a query, but
then the type is gone by the time they inquire into its properties.
It might be nice to make that behavior more robust someday, but as a
back-patchable solution, adjust the new test case so that the operator
is never visible to other sessions.  Like the previous commit, back-patch
to all supported branches.

Discussion: <5983.1471371667@sss.pgh.pa.us>
2016-08-25 09:57:09 -04:00
Tom Lane 2c00fad286 Fix improper repetition of previous results from a hashed aggregate.
ExecReScanAgg's check for whether it could re-use a previously calculated
hashtable neglected the possibility that the Agg node might reference
PARAM_EXEC Params that are not referenced by its input plan node.  That's
okay if the Params are in upper tlist or qual expressions; but if one
appears in aggregate input expressions, then the hashtable contents need
to be recomputed when the Param's value changes.

To avoid unnecessary performance degradation in the case of a Param that
isn't within an aggregate input, add logic to the planner to determine
which Params are within aggregate inputs.  This requires a new field in
struct Agg, but fortunately we never write plans to disk, so this isn't
an initdb-forcing change.

Per report from Jeevan Chalke.  This has been broken since forever,
so back-patch to all supported branches.

Andrew Gierth, with minor adjustments by me

Report: <CAM2+6=VY8ykfLT5Q8vb9B6EbeBk-NGuLbT6seaQ+Fq4zXvrDcA@mail.gmail.com>
2016-08-24 14:38:12 -04:00
Tom Lane 77e2906821 Create an SP-GiST opclass for inet/cidr.
This seems to offer significantly better search performance than the
existing GiST opclass for inet/cidr, at least on data with a wide mix
of network mask lengths.  (That may suggest that the data splitting
heuristics in the GiST opclass could be improved.)

Emre Hasegeli, with mostly-cosmetic adjustments by me

Discussion: <CAE2gYzxtth9qatW_OAqdOjykS0bxq7AYHLuyAQLPgT7H9ZU0Cw@mail.gmail.com>
2016-08-23 15:16:30 -04:00
Robert Haas 86f31695f3 Add txid_current_ifassigned().
Add a variant of txid_current() that returns NULL if no transaction ID
is assigned.  This version can be used even on a standby server,
although it will always return NULL since no transaction IDs can be
assigned during recovery.

Craig Ringer, per suggestion from Jim Nasby.  Reviewed by Petr Jelinek
and by me.
2016-08-23 10:30:52 -04:00
Peter Eisentraut f9472d7256 Run select_parallel test by itself
Remove the plpgsql wrapping that hides the context.  So now the test
will fail if the work doesn't actually happen in a parallel worker.  Run
the test in its own test group to ensure it won't run out of resources
for that.
2016-08-22 12:00:00 -04:00
Tom Lane 8299471c37 Use LEFT JOINs in some system views in case referenced row doesn't exist.
In particular, left join to pg_authid so that rows in pg_stat_activity
don't disappear if the session's owning user has been dropped.
Also convert a few joins to pg_database to left joins, in the same spirit,
though that case might be harder to hit.  We were doing this in other
views already, so it was a bit inconsistent that these views didn't.

Oskari Saarenmaa, with some further tweaking by me

Discussion: <56E87CD8.60007@ohmu.fi>
2016-08-19 17:13:47 -04:00
Tom Lane cf9b0fea5f Implement regexp_match(), a simplified alternative to regexp_matches().
regexp_match() is like regexp_matches(), but it disallows the 'g' flag
and in consequence does not need to return a set.  Instead, it returns
a simple text array value, or NULL if there's no match.  Previously people
usually got that behavior with a sub-select, but this way is considerably
more efficient.

Documentation adjusted so that regexp_match() is presented first and then
regexp_matches() is introduced as a more complicated version.  This is
a bit historically revisionist but seems pedagogically better.

Still TODO: extend contrib/citext to support this function.

Emre Hasegeli, reviewed by David Johnston

Discussion: <CAE2gYzy42sna2ME_e3y1KLQ-4UBrB-eVF0SWn8QG39sQSeVhEw@mail.gmail.com>
2016-08-17 18:33:01 -04:00
Tom Lane 0bb51aa967 Improve parsetree representation of special functions such as CURRENT_DATE.
We implement a dozen or so parameterless functions that the SQL standard
defines special syntax for.  Up to now, that was done by converting them
into more or less ad-hoc constructs such as "'now'::text::date".  That's
messy for multiple reasons: it exposes what should be implementation
details to users, and performance is worse than it needs to be in several
cases.  To improve matters, invent a new expression node type
SQLValueFunction that can represent any of these parameterless functions.

Bump catversion because this changes stored parsetrees for rules.

Discussion: <30058.1463091294@sss.pgh.pa.us>
2016-08-16 20:33:01 -04:00
Peter Eisentraut f0fe1c8f70 Fix typos
From: Alexander Law <exclusion@gmail.com>
2016-08-16 14:52:29 -04:00
Tom Lane 9389fbd038 Remove bogus dependencies on NUMERIC_MAX_PRECISION.
NUMERIC_MAX_PRECISION is a purely arbitrary constraint on the precision
and scale you can write in a numeric typmod.  It might once have had
something to do with the allowed range of a typmod-less numeric value,
but at least since 9.1 we've allowed, and documented that we allowed,
any value that would physically fit in the numeric storage format;
which is something over 100000 decimal digits, not 1000.

Hence, get rid of numeric_in()'s use of NUMERIC_MAX_PRECISION as a limit
on the allowed range of the exponent in scientific-format input.  That was
especially silly in view of the fact that you can enter larger numbers as
long as you don't use 'e' to do it.  Just constrain the value enough to
avoid localized overflow, and let make_result be the final arbiter of what
is too large.  Likewise adjust ecpg's equivalent of this code.

Also get rid of numeric_recv()'s use of NUMERIC_MAX_PRECISION to limit the
number of base-NBASE digits it would accept.  That created a dump/restore
hazard for binary COPY without doing anything useful; the wire-format
limit on number of digits (65535) is about as tight as we would want.

In HEAD, also get rid of pg_size_bytes()'s unnecessary intimacy with what
the numeric range limit is.  That code doesn't exist in the back branches.

Per gripe from Aravind Kumar.  Back-patch to all supported branches,
since they all contain the documentation claim about allowed range of
NUMERIC (cf commit cabf5d84b).

Discussion: <2895.1471195721@sss.pgh.pa.us>
2016-08-14 15:06:01 -04:00
Tom Lane ed0097e4f9 Add SQL-accessible functions for inspecting index AM properties.
Per discussion, we should provide such functions to replace the lost
ability to discover AM properties by inspecting pg_am (cf commit
65c5fcd35).  The added functionality is also meant to displace any code
that was looking directly at pg_index.indoption, since we'd rather not
believe that the bit meanings in that field are part of any client API
contract.

As future-proofing, define the SQL API to not assume that properties that
are currently AM-wide or index-wide will remain so unless they logically
must be; instead, expose them only when inquiring about a specific index
or even specific index column.  Also provide the ability for an index
AM to override the behavior.

In passing, document pg_am.amtype, overlooked in commit 473b93287.

Andrew Gierth, with kibitzing by me and others

Discussion: <87mvl5on7n.fsf@news-spur.riddles.org.uk>
2016-08-13 18:31:14 -04:00
Tom Lane 0f249fe5f5 Fix busted Assert for CREATE MATVIEW ... WITH NO DATA.
Commit 874fe3aea changed the command tag returned for CREATE MATVIEW/CREATE
TABLE AS ... WITH NO DATA, but missed that there was code in spi.c that
expected the command tag to always be "SELECT".  Fortunately, the
consequence was only an Assert failure, so this oversight should have no
impact in production builds.

Since this code path was evidently un-exercised, add a regression test.

Per report from Shivam Saxena. Back-patch to 9.3, like the previous commit.

Michael Paquier

Report: <97218716-480B-4527-B5CD-D08D798A0C7B@dresources.com>
2016-08-11 11:22:25 -04:00
Tom Lane f0c7b789ab Fix two errors with nested CASE/WHEN constructs.
ExecEvalCase() tried to save a cycle or two by passing
&econtext->caseValue_isNull as the isNull argument to its sub-evaluation of
the CASE value expression.  If that subexpression itself contained a CASE,
then *isNull was an alias for econtext->caseValue_isNull within the
recursive call of ExecEvalCase(), leading to confusion about whether the
inner call's caseValue was null or not.  In the worst case this could lead
to a core dump due to dereferencing a null pointer.  Fix by not assigning
to the global variable until control comes back from the subexpression.
Also, avoid using the passed-in isNull pointer transiently for evaluation
of WHEN expressions.  (Either one of these changes would have been
sufficient to fix the known misbehavior, but it's clear now that each of
these choices was in itself dangerous coding practice and best avoided.
There do not seem to be any similar hazards elsewhere in execQual.c.)

Also, it was possible for inlining of a SQL function that implements the
equality operator used for a CASE comparison to result in one CASE
expression's CaseTestExpr node being inserted inside another CASE
expression.  This would certainly result in wrong answers since the
improperly nested CaseTestExpr would be caused to return the inner CASE's
comparison value not the outer's.  If the CASE values were of different
data types, a crash might result; moreover such situations could be abused
to allow disclosure of portions of server memory.  To fix, teach
inline_function to check for "bare" CaseTestExpr nodes in the arguments of
a function to be inlined, and avoid inlining if there are any.

Heikki Linnakangas, Michael Paquier, Tom Lane

Report: https://github.com/greenplum-db/gpdb/pull/327
Report: <4DDCEEB8.50602@enterprisedb.com>
Security: CVE-2016-5423
2016-08-08 10:33:46 -04:00
Noah Misch 984e5beb38 Sort out paired double quotes in \connect, \password and \crosstabview.
In arguments, these meta-commands wrongly treated each pair as closing
the double quoted string.  Make the behavior match the documentation.
This is a compatibility break, but I more expect to find software with
untested reliance on the documented behavior than software reliant on
today's behavior.  Back-patch to 9.1 (all supported versions).

Reviewed by Tom Lane and Peter Eisentraut.

Security: CVE-2016-5424
2016-08-08 10:07:46 -04:00
Peter Eisentraut 8a56d4e361 Make format() error messages consistent again
07d25a964 changed only one occurrence.  Change the other one as well.
2016-08-08 08:15:41 -04:00
Tom Lane f10eab73df Make array_to_tsvector() sort and de-duplicate the given strings.
This is required for the result to be a legal tsvector value.
Noted while fooling with Andreas Seltenreich's ts_delete() crash.

Discussion: <87invhoj6e.fsf@credativ.de>
2016-08-05 16:09:06 -04:00
Tom Lane c50d192ce3 Fix ts_delete(tsvector, text[]) to cope with duplicate array entries.
Such cases either failed an Assert, or produced a corrupt tsvector in
non-Assert builds, as reported by Andreas Seltenreich.  The reason is
that tsvector_delete_by_indices() just assumed that its input array had
no duplicates.  Fix by explicitly de-duping.

In passing, improve some comments, and fix a number of tests for null
values to use ERRCODE_NULL_VALUE_NOT_ALLOWED not
ERRCODE_INVALID_PARAMETER_VALUE.

Discussion: <87invhoj6e.fsf@credativ.de>
2016-08-05 15:14:19 -04:00
Tom Lane a3c7a993d5 Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better.
Previously, if an INSERT with multiple rows of VALUES had indirection
(array subscripting or field selection) in its target-columns list, the
parser handled that by applying transformAssignedExpr() to each element
of each VALUES row independently.  This led to having ArrayRef assignment
nodes or FieldStore nodes in each row of the VALUES RTE.  That works for
simple cases, but in bug #14265 Nuri Boardman points out that it fails
if there are multiple assignments to elements/fields of the same target
column.  For such cases to work, rewriteTargetListIU() has to nest the
ArrayRefs or FieldStores together to produce a single expression to be
assigned to the column.  But it failed to find them in the top-level
targetlist and issued an error about "multiple assignments to same column".

We could possibly fix this by teaching the rewriter to apply
rewriteTargetListIU to each VALUES row separately, but that would be messy
(it would change the output rowtype of the VALUES RTE, for example) and
inefficient.  Instead, let's fix the parser so that the VALUES RTE outputs
are just the user-specified values, cast to the right type if necessary,
and then the ArrayRefs or FieldStores are applied in the top-level
targetlist to Vars representing the RTE's outputs.  This is the same
parsetree representation already used for similar cases with INSERT/SELECT
syntax, so it allows simplifications in ruleutils.c, which no longer needs
to treat INSERT-from-multiple-VALUES as its own special case.

This implementation works by applying transformAssignedExpr to the VALUES
entries as before, and then stripping off any ArrayRefs or FieldStores it
adds.  With lots of VALUES rows it would be noticeably more efficient to
not add those nodes in the first place.  But that's just an optimization
not a bug fix, and there doesn't seem to be any good way to do it without
significant refactoring.  (A non-invasive answer would be to apply
transformAssignedExpr + stripping to just the first VALUES row, and then
just forcibly cast remaining rows to the same data types exposed in the
first row.  But this way would lead to different, not-INSERT-specific
errors being reported in casting failure cases, so it doesn't seem very
nice.)  So leave that for later; this patch at least isn't making the
per-row parsing work worse, and it does make the finished parsetree
smaller, saving rewriter and planner work.

Catversion bump because stored rules containing such INSERTs would need
to change.  Because of that, no back-patch, even though this is a very
long-standing bug.

Report: <20160727005725.7438.26021@wrigleys.postgresql.org>
Discussion: <9578.1469645245@sss.pgh.pa.us>
2016-08-03 16:37:03 -04:00
Tom Lane bf4ae685ae Fix tqueue.c's range-remapping code.
It's depressingly clear that nobody ever tested this.
2016-07-29 14:13:19 -04:00
Robert Haas 3153b1a52f Eliminate a few more user-visible "cache lookup failed" errors.
Michael Paquier
2016-07-29 12:06:18 -04:00
Tom Lane 9492cf86e4 Fix assorted fallout from IS [NOT] NULL patch.
Commits 4452000f3 et al established semantics for NullTest.argisrow that
are a bit different from its initial conception: rather than being merely
a cache of whether we've determined the input to have composite type,
the flag now has the further meaning that we should apply field-by-field
testing as per the standard's definition of IS [NOT] NULL.  If argisrow
is false and yet the input has composite type, the construct instead has
the semantics of IS [NOT] DISTINCT FROM NULL.  Update the comments in
primnodes.h to clarify this, and fix ruleutils.c and deparse.c to print
such cases correctly.  In the case of ruleutils.c, this merely results in
cosmetic changes in EXPLAIN output, since the case can't currently arise
in stored rules.  However, it represents a live bug for deparse.c, which
would formerly have sent a remote query that had semantics different
from the local behavior.  (From the user's standpoint, this means that
testing a remote nested-composite column for null-ness could have had
unexpected recursive behavior much like that fixed in 4452000f3.)

In a related but somewhat independent fix, make plancat.c set argisrow
to false in all NullTest expressions constructed to represent "attnotnull"
constructs.  Since attnotnull is actually enforced as a simple null-value
check, this is a more accurate representation of the semantics; we were
previously overpromising what it meant for composite columns, which might
possibly lead to incorrect planner optimizations.  (It seems that what the
SQL spec expects a NOT NULL constraint to mean is an IS NOT NULL test, so
arguably we are violating the spec and should fix attnotnull to do the
other thing.  If we ever do, this part should get reverted.)

Back-patch, same as the previous commit.

Discussion: <10682.1469566308@sss.pgh.pa.us>
2016-07-28 16:09:15 -04:00
Tom Lane d8411a6c8b Allow functions that return sets of tuples to return simple NULLs.
ExecMakeTableFunctionResult(), which is used in SELECT FROM function(...)
cases, formerly treated a simple NULL output from a function that both
returnsSet and returnsTuple as a violation of the SRF protocol.  What seems
better is to treat a NULL output as equivalent to ROW(NULL,NULL,...).
Without this, cases such as SELECT FROM unnest(...) on an array of
composite are vulnerable to unexpected and not-very-helpful failures.
Old code comments here suggested an alternative of just ignoring
simple-NULL outputs, but that doesn't seem very principled.

This change had been hung up for a long time due to uncertainty about
how much we wanted to buy into the equivalence of simple NULL and
ROW(NULL,NULL,...).  I think that's been mostly resolved by the discussion
around bug #14235, so let's go ahead and do it.

Per bug #7808 from Joe Van Dyk.  Although this is a pretty old report,
fixing it smells a bit more like a new feature than a bug fix, and the
lack of other similar complaints suggests that we shouldn't take much risk
of destabilization by back-patching.  (Maybe that could be revisited once
this patch has withstood some field usage.)

Andrew Gierth and Tom Lane

Report: <E1TurJE-0006Es-TK@wrigleys.postgresql.org>
2016-07-26 21:34:02 -04:00
Robert Haas 976b24fb47 Change various deparsing functions to return NULL for invalid input.
Previously, some functions returned various fixed strings and others
failed with a cache lookup error.  Per discussion, standardize on
returning NULL.  Although user-exposed "cache lookup failed" error
messages might normally qualify for bug-fix treatment, no back-patch;
the risk of breaking user code which is accustomed to the current
behavior seems too high.

Michael Paquier
2016-07-26 16:07:02 -04:00
Tom Lane 4452000f31 Fix constant-folding of ROW(...) IS [NOT] NULL with composite fields.
The SQL standard appears to specify that IS [NOT] NULL's tests of field
nullness are non-recursive, ie, we shouldn't consider that a composite
field with value ROW(NULL,NULL) is null for this purpose.
ExecEvalNullTest got this right, but eval_const_expressions did not,
leading to weird inconsistencies depending on whether the expression
was such that the planner could apply constant folding.

Also, adjust the docs to mention that IS [NOT] DISTINCT FROM NULL can be
used as a substitute test if a simple null check is wanted for a rowtype
argument.  That motivated reordering things so that IS [NOT] DISTINCT FROM
is described before IS [NOT] NULL.  In HEAD, I went a bit further and added
a table showing all the comparison-related predicates.

Per bug #14235.  Back-patch to all supported branches, since it's certainly
undesirable that constant-folding should change the semantics.

Report and patch by Andrew Gierth; assorted wordsmithing and revised
regression test cases by me.

Report: <20160708024746.1410.57282@wrigleys.postgresql.org>
2016-07-26 15:25:02 -04:00
Tom Lane 9d7abca901 Fix regression tests to work in Welsh locale.
Welsh (cy_GB) apparently sorts 'dd' after 'f', creating problems
analogous to the odd sorting of 'aa' in Danish.  Adjust regression
test case to not use data that provokes that.

Jeff Janes

Patch: <CAMkU=1zx-pqcfSApL2pYDQurPOCfcYU0wJorsmY1OrYPiXRbLw@mail.gmail.com>
2016-07-22 15:41:39 -04:00
Tom Lane b3399cb0f6 Make core regression tests safe for Danish locale.
Some tests added in 9.5 depended on 'aa' sorting before 'bb', which
doesn't hold true in Danish.  Use slightly different test data to
avoid the problem.

Jeff Janes

Report: <CAMkU=1w-cEDbA+XHdNb=YS_4wvZbs66Ni9KeSJKAJGNJyOsgQw@mail.gmail.com>
2016-07-21 13:11:00 -04:00
Tom Lane 18555b1323 Establish conventions about global object names used in regression tests.
To ensure that "make installcheck" can be used safely against an existing
installation, we need to be careful about what global object names
(database, role, and tablespace names) we use; otherwise we might
accidentally clobber important objects.  There's been a weak consensus that
test databases should have names including "regression", and that test role
names should start with "regress_", but we didn't have any particular rule
about tablespace names; and neither of the other rules was followed with
any consistency either.

This commit moves us a long way towards having a hard-and-fast rule that
regression test databases must have names including "regression", and that
test role and tablespace names must start with "regress_".  It's not
completely there because I did not touch some test cases in rolenames.sql
that test creation of special role names like "session_user".  That will
require some rethinking of exactly what we want to test, whereas the intent
of this patch is just to hit all the cases in which the needed renamings
are cosmetic.

There is no enforcement mechanism in this patch either, but if we don't
add one we can expect that the tests will soon be violating the convention
again.  Again, that's not such a cosmetic change and it will require
discussion.  (But I did use a quick-hack enforcement patch to find these
cases.)

Discussion: <16638.1468620817@sss.pgh.pa.us>
2016-07-17 18:42:43 -04:00
Tom Lane 606ccc5e7e Improve test case exercising the sorting path for hash index build.
On second thought, we should probably do at least a minimal check that
the constructed index is valid, since the big problem with the most
recent breakage was not whether the sorting was correct but that the
index had incorrect hash codes placed in it.
2016-07-16 16:25:43 -04:00