Commit Graph

5243 Commits

Author SHA1 Message Date
Robert Haas
ea91a6be89 Remove IRIX port.
Development of IRIX has been discontinued, and support is scheduled
to end in December of 2013.  Therefore, there will be no supported
versions of this operating system by the time PostgreSQL 9.4 is
released.  Furthermore, we have no maintainer for this platform.
2013-10-18 08:14:21 -04:00
Bruce Momjian
7778ddc7a2 Allow 5+ digit years for non-ISO timestamp/date strings, where appropriate
Report from Haribabu Kommi
2013-10-16 13:22:55 -04:00
Robert Haas
05a0283e7a Fix details missed by dynamic shared memory patch.
Additional documentation update, and a comment fix.

Both issues reported by Amit Kapila.
2013-10-14 08:00:26 -04:00
Peter Eisentraut
5b6d08cd29 Add use of asprintf()
Add asprintf(), pg_asprintf(), and psprintf() to simplify string
allocation and composition.  Replacement implementations taken from
NetBSD.

Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Asif Naeem <anaeem.it@gmail.com>
2013-10-13 00:09:18 -04:00
Kevin Grittner
4cbb646334 Fix several possibly non-portable gaffs in record_image_ops.
Sparc machines in the buildfarm were made happy by the previous
fix, but PowerPC machines still are still failing.  Hopefully this
will cure that.
2013-10-11 13:02:52 -05:00
Alvaro Herrera
31cf1a1a43 Rework SSL renegotiation code
The existing renegotiation code was home for several bugs: it might
erroneously report that renegotiation had failed; it might try to
execute another renegotiation while the previous one was pending; it
failed to terminate the connection if the renegotiation never actually
took place; if a renegotiation was started, the byte count was reset,
even if the renegotiation wasn't completed (this isn't good from a
security perspective because it means continuing to use a session that
should be considered compromised due to volume of data transferred.)

The new code is structured to avoid these pitfalls: renegotiation is
started a little earlier than the limit has expired; the handshake
sequence is retried until it has actually returned successfully, and no
more than that, but if it fails too many times, the connection is
closed.  The byte count is reset only when the renegotiation has
succeeded, and if the renegotiation byte count limit expires, the
connection is terminated.

This commit only touches the master branch, because some of the changes
are controversial.  If everything goes well, a back-patch might be
considered.

Per discussion started by message
20130710212017.GB4941@eldon.alvh.no-ip.org
2013-10-10 23:45:20 -03:00
Kevin Grittner
15e46fd1dd Fix bug in record_image_ops on big endian machines.
The buildfarm pointed out the problem.

Fix based on suggestion by Robert Haas.
2013-10-10 11:25:30 -05:00
Andrew Dunstan
4d212bac17 json_typeof function.
Andrew Tipton.
2013-10-10 12:21:59 -04:00
Peter Eisentraut
261c7d4b65 Revive line type
Change the input/output format to {A,B,C}, to match the internal
representation.

Complete the implementations of line_in, line_out, line_recv, line_send.
Remove comments and error messages about the line type not being
implemented.  Add regression tests for existing line operators and
functions.

Reviewed-by: rui hua <365507506hua@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Jeevan Chalke <jeevan.chalke@enterprisedb.com>
2013-10-09 22:34:38 -04:00
Robert Haas
0ac5e5a7e1 Allow dynamic allocation of shared memory segments.
Patch by myself and Amit Kapila.  Design help from Noah Misch.  Review
by Andres Freund.
2013-10-09 21:05:02 -04:00
Kevin Grittner
f566515192 Add record_image_ops opclass for matview concurrent refresh.
REFRESH MATERIALIZED VIEW CONCURRENTLY was broken for any matview
containing a column of a type without a default btree operator
class.  It also did not produce results consistent with a non-
concurrent REFRESH or a normal view if any column was of a type
which allowed user-visible differences between values which
compared as equal according to the type's default btree opclass.
Concurrent matview refresh was modified to use the new operators
to solve these problems.

Documentation was added for record comparison, both for the
default btree operator class for record, and the newly added
operators.  Regression tests now check for proper behavior both
for a matview with a box column and a matview containing a citext
column.

Reviewed by Steve Singer, who suggested some of the doc language.
2013-10-09 14:26:09 -05:00
Bruce Momjian
0c6b675076 Centralize effective_cache_size default setting 2013-10-09 08:33:12 -04:00
Bruce Momjian
6648775028 Update postgres.conf.sample for effective_cache_size's new default 2013-10-08 12:50:05 -04:00
Bruce Momjian
ee1e5662d8 Auto-tune effective_cache size to be 4x shared buffers 2013-10-08 12:12:24 -04:00
Bruce Momjian
a54141aebc Issue error on SET outside transaction block in some cases
Issue error for SET LOCAL/CONSTRAINTS/TRANSACTION outside a transaction
block, as they have no effect.

Per suggestion from Morten Hustveit
2013-10-04 13:50:28 -04:00
Bruce Momjian
d50f281210 Adjust C comments that would be wrap-able. 2013-10-01 19:45:01 -04:00
Robert Haas
4334639f4b Allow printf-style padding specifications in log_line_prefix.
David Rowley, after a suggestion from Heikki Linnakangas.  Reviewed by
Albe Laurenz, and further edited by me.
2013-09-26 17:56:31 -04:00
Heikki Linnakangas
77ae7f7c35 Plug memory leak in range_cmp function.
B-tree operators are not allowed to leak memory into the current memory
context. Range_cmp leaked detoasted copies of the arguments. That caused
a quick out-of-memory error when creating an index on a range column.

Reported by Marian Krucina, bug #8468.
2013-09-25 16:02:00 +03:00
Heikki Linnakangas
0892ecbc01 Add a GUC to report whether data page checksums are enabled.
Bernd Helmle
2013-09-16 14:36:01 +03:00
Kevin Grittner
277607d600 Eliminate pg_rewrite.ev_attr column and related dead code.
Commit 95ef6a3448 removed the
ability to create rules on an individual column as of 7.3, but
left some residual code which has since been useless.  This cleans
up that dead code without any change in behavior other than
dropping the useless column from the catalog.
2013-09-05 14:03:43 -05:00
Heikki Linnakangas
20cb18db46 Make catalog cache hash tables resizeable.
If the hash table backing a catalog cache becomes too full (fillfactor > 2),
enlarge it. A new buckets array, double the size of the old, is allocated,
and all entries in the old hash are moved to the right bucket in the new
hash.

This has two benefits. First, cache lookups don't get so expensive when
there are lots of entries in a cache, like if you access hundreds of
thousands of tables. Second, we can make the (initial) sizes of the caches
much smaller, which saves memory.

This patch dials down the initial sizes of the catcaches. The new sizes are
chosen so that a backend that only runs a few basic queries still won't need
to enlarge any of them.
2013-09-05 20:20:03 +03:00
Bruce Momjian
f5c2f5a8f6 Add GUC descriptions for compile-time postgresql.conf settings
Previous text was "No description available".

Tianyin Xu
2013-09-04 17:44:04 -04:00
Tom Lane
0c66a22377 Update comments concerning PGC_S_TEST.
This GUC context value was once only used by ALTER DATABASE SET and
ALTER USER SET.  That's not true anymore, though, so rewrite the
comments to be a bit more general.

Patch in HEAD only, since this is just an internal documentation issue.
2013-09-03 18:56:22 -04:00
Tom Lane
0d3f4406df Allow aggregate functions to be VARIADIC.
There's no inherent reason why an aggregate function can't be variadic
(even VARIADIC ANY) if its transition function can handle the case.
Indeed, this patch to add the feature touches none of the planner or
executor, and little of the parser; the main missing stuff was DDL and
pg_dump support.

It is true that variadic aggregates can create the same sort of ambiguity
about parameters versus ORDER BY keys that was complained of when we
(briefly) had both one- and two-argument forms of string_agg().  However,
the policy formed in response to that discussion only said that we'd not
create any built-in aggregates with varying numbers of arguments, not that
we shouldn't allow users to do it.  So the logical extension of that is
we can allow users to make variadic aggregates as long as we're wary about
shipping any such in core.

In passing, this patch allows aggregate function arguments to be named, to
the extent of remembering the names in pg_proc and dumping them in pg_dump.
You can't yet call an aggregate using named-parameter notation.  That seems
like a likely future extension, but it'll take some work, and it's not what
this patch is really about.  Likewise, there's still some work needed to
make window functions handle VARIADIC fully, but I left that for another
day.

initdb forced because of new aggvariadic field in Aggref parse nodes.
2013-09-03 17:08:46 -04:00
Alvaro Herrera
e246cfc95f Initialize cached OID to Invalid in new hash entries
Andres Freund; bug detected by valgrind
2013-08-27 14:53:17 -04:00
Tom Lane
2aac3399ae Account better for planning cost when choosing whether to use custom plans.
The previous coding in plancache.c essentially used 10% of the estimated
runtime as its cost estimate for planning.  This can be pretty bogus,
especially when the estimated runtime is very small, such as in a simple
expression plan created by plpgsql, or a simple INSERT ... VALUES.

While we don't have a really good handle on how planning time compares
to runtime, it seems reasonable to use an estimate based on the number of
relations referenced in the query, with a rather large multiplier.  This
patch uses 1000 * cpu_operator_cost * (nrelations + 1), so that even a
trivial query will be charged 1000 * cpu_operator_cost for planning.
This should address the problem reported by Marc Cousin and others that
9.2 and up prefer custom plans in cases where the planning time greatly
exceeds what can be saved.
2013-08-24 15:14:17 -04:00
Tom Lane
3d5282c6f0 Emit a log message if output is about to be redirected away from stderr.
We've seen multiple cases of people looking at the postmaster's original
stderr output to try to diagnose problems, not realizing/remembering that
their logging configuration is set up to send log messages somewhere else.
This seems particularly likely to happen in prepackaged distributions,
since many packagers patch the code to change the factory-standard logging
configuration to something more in line with their platform conventions.

In hopes of reducing confusion, emit a LOG message about this at the point
in startup where we are about to switch log output away from the original
stderr, providing a pointer to where to look instead.  This message will
appear as the last thing in the original stderr output.  (We might later
also try to emit such link messages when logging parameters are changed
on-the-fly; but that case seems to be both noticeably harder to do nicely,
and much less frequently a problem in practice.)

Per discussion, back-patch to 9.3 but not further.
2013-08-13 15:24:52 -04:00
Peter Eisentraut
072457b360 Message punctuation and pluralization fixes 2013-08-09 08:02:44 -04:00
Peter Eisentraut
9d775d8894 Message style improvements 2013-08-07 22:48:40 -04:00
Tom Lane
221e92f64c Make sure float4in/float8in accept all standard spellings of "infinity".
The C99 and POSIX standards require strtod() to accept all these spellings
(case-insensitively): "inf", "+inf", "-inf", "infinity", "+infinity",
"-infinity".  However, pre-C99 systems might accept only some or none of
these, and apparently Windows still doesn't accept "inf".  To avoid
surprising cross-platform behavioral differences, manually check for each
of these spellings if strtod() fails.  We were previously handling just
"infinity" and "-infinity" that way, but since C99 is most of the world
now, it seems likely that applications are expecting all these spellings
to work.

Per bug #8355 from Basil Peace.  It turns out this fix won't actually
resolve his problem, because Python isn't being this careful; but that
doesn't mean we shouldn't be.
2013-08-03 12:40:27 -04:00
Alvaro Herrera
706f9dd914 Fix old visibility bug in HeapTupleSatisfiesDirty
If a tuple is locked but not updated by a concurrent transaction,
HeapTupleSatisfiesDirty would return that transaction's Xid in xmax,
causing callers to wait on it, when it is not necessary (in fact, if the
other transaction had used a multixact instead of a plain Xid to mark
the tuple, HeapTupleSatisfiesDirty would have behave differently and
*not* returned the Xmax).

This bug was introduced in commit 3f7fbf85dc, dated December 1998,
so it's almost 15 years old now.  However, it's hard to see this
misbehave, because before we had NOWAIT the only consequence of this is
that transactions would wait for slightly more time than necessary; so
it's not surprising that this hasn't been reported yet.

Craig Ringer and Andres Freund
2013-08-02 17:02:36 -04:00
Robert Haas
813fb03155 Remove SnapshotNow and HeapTupleSatisfiesNow.
We now use MVCC catalog scans, and, per discussion, have eliminated
all other remaining uses of SnapshotNow, so that we can now get rid of
it.  This will break third-party code which is still using it, which
is intentional, as we want such code to be updated to do things the
new way.
2013-08-01 10:46:19 -04:00
Stephen Frost
ddef1a39c6 Allow a context to be passed in for error handling
As pointed out by Tom Lane, we can allow other users of the error
handler callbacks to provide their own memory context by adding
the context to use to ErrorData and using that instead of explicitly
using ErrorContext.

This then allows GetErrorContextStack() to be called from inside
exception handlers, so modify plpgsql to take advantage of that and
add an associated regression test for it.
2013-08-01 01:07:20 -04:00
Alvaro Herrera
a59516b631 Fix mis-indented lines
Per Coverity
2013-07-31 17:57:15 -04:00
Tom Lane
d074b4e50d Fix regexp_matches() handling of zero-length matches.
We'd find the same match twice if it was of zero length and not immediately
adjacent to the previous match.  replace_text_regexp() got similar cases
right, so adjust this search logic to match that.  Note that even though
the regexp_split_to_xxx() functions share this code, they did not display
equivalent misbehavior, because the second match would be considered
degenerate and ignored.

Jeevan Chalke, with some cosmetic changes by me.
2013-07-31 11:31:22 -04:00
Greg Stark
c62736cc37 Add SQL Standard WITH ORDINALITY support for UNNEST (and any other SRF)
Author: Andrew Gierth, David Fetter
Reviewers: Dean Rasheed, Jeevan Chalke, Stephen Frost
2013-07-29 16:38:01 +01:00
Robert Haas
ed93feb808 Change currtid functions to use an MVCC snapshot, not SnapshotNow.
This has a slight performance cost, but the only known consumers
of these functions, known at the SQL level as currtid and currtid2,
is pgsql-odbc; whose usage, we hope, is not sufficiently intensive
to make this a problem.

Per discussion.
2013-07-25 16:32:02 -04:00
Robert Haas
3483f4332d Don't use SnapshotNow in get_actual_variable_range.
Instead, use the active snapshot.  Per Tom Lane, this function is
most interested in knowing the range of tuples our scan will actually
see.

This is another step towards full removal of SnapshotNow.
2013-07-25 14:30:00 -04:00
Stephen Frost
9bd0feeba8 Improvements to GetErrorContextStack()
As GetErrorContextStack() borrowed setup and tear-down code from other
places, it was less than clear that it must only be called as a
top-level entry point into the error system and can't be called by an
exception handler (unlike the rest of the error system, which is set up
to be reentrant-safe).

Being called from an exception handler is outside the charter of
GetErrorContextStack(), so add a bit more protection against it,
improve the comments addressing why we have to set up an errordata
stack for this function at all, and add a few more regression tests.

Lack of clarity pointed out by Tom Lane; all bugs are mine.
2013-07-25 09:41:55 -04:00
Stephen Frost
8312832567 Add GET DIAGNOSTICS ... PG_CONTEXT in PL/PgSQL
This adds the ability to get the call stack as a string from within a
PL/PgSQL function, which can be handy for logging to a table, or to
include in a useful message to an end-user.

Pavel Stehule, reviewed by Rushabh Lathia and rather heavily whacked
around by Stephen Frost.
2013-07-24 18:53:27 -04:00
Tom Lane
b32a25c3d5 Fix booltestsel() for case where we have NULL stats but not MCV stats.
In a boolean column that contains mostly nulls, ANALYZE might not find
enough non-null values to populate the most-common-values stats,
but it would still create a pg_statistic entry with stanullfrac set.
The logic in booltestsel() for this situation did the wrong thing for
"col IS NOT TRUE" and "col IS NOT FALSE" tests, forgetting that null
values would satisfy these tests (so that the true selectivity would
be close to one, not close to zero).  Per bug #8274.

Fix by Andrew Gierth, some comment-smithing by me.
2013-07-24 00:44:09 -04:00
Tom Lane
10a509d829 Move strip_implicit_coercions() from optimizer to nodeFuncs.c.
Use of this function has spread into the parser and rewriter, so it seems
like time to pull it out of the optimizer and put it into the more central
nodeFuncs module.  This eliminates the need to #include optimizer/clauses.h
in most of the calling files, demonstrating that this function was indeed a
bit outside the normal code reference patterns.
2013-07-23 18:21:19 -04:00
Tom Lane
ef655663c5 Further hacking on ruleutils' new column-alias-assignment code.
After further thought about implicit coercions appearing in a joinaliasvars
list, I realized that they represent an additional reason why we might need
to reference the join output column directly instead of referencing an
underlying column.  Consider SELECT x FROM t1 LEFT JOIN t2 USING (x) where
t1.x is of type date while t2.x is of type timestamptz.  The merged output
variable is of type timestamptz, but it won't go to null when t2 does,
therefore neither t1.x nor t2.x is a valid substitute reference.

The code in get_variable() actually gets this case right, since it knows
it shouldn't look through a coercion, but we failed to ensure that the
unqualified output column name would be globally unique.  To fix, modify
the code that trawls for a dangerous situation so that it actually scans
through an unnamed join's joinaliasvars list to see if there are any
non-simple-Var entries.
2013-07-23 17:55:04 -04:00
Tom Lane
a7cd853b75 Change post-rewriter representation of dropped columns in joinaliasvars.
It's possible to drop a column from an input table of a JOIN clause in a
view, if that column is nowhere actually referenced in the view.  But it
will still be there in the JOIN clause's joinaliasvars list.  We used to
replace such entries with NULL Const nodes, which is handy for generation
of RowExpr expansion of a whole-row reference to the view.  The trouble
with that is that it can't be distinguished from the situation after
subquery pull-up of a constant subquery output expression below the JOIN.
Instead, replace such joinaliasvars with null pointers (empty expression
trees), which can't be confused with pulled-up expressions.  expandRTE()
still emits the old convention, though, for convenience of RowExpr
generation and to reduce the risk of breaking extension code.

In HEAD and 9.3, this patch also fixes a problem with some new code in
ruleutils.c that was failing to cope with implicitly-casted joinaliasvars
entries, as per recent report from Feike Steenbergen.  That oversight was
because of an inadequate description of the data structure in parsenodes.h,
which I've now corrected.  There were some pre-existing oversights of the
same ilk elsewhere, which I believe are now all fixed.
2013-07-23 16:23:45 -04:00
Robert Haas
0518eceec3 Adjust HeapTupleSatisfies* routines to take a HeapTuple.
Previously, these functions took a HeapTupleHeader, but upcoming
patches for logical replication will introduce new a new snapshot
type under which the tuple's TID will be used to lookup (CMIN, CMAX)
for visibility determination purposes.  This makes that information
available.  Code churn is minimal since HeapTupleSatisfiesVisibility
took the HeapTuple anyway, and deferenced it before calling the
satisfies function.

Independently of logical replication, this allows t_tableOid and
t_self to be cross-checked via assertions in tqual.c.  This seems
like a useful way to make sure that all callers are setting these
values properly, which has been previously put forward as
desirable.

Andres Freund, reviewed by Álvaro Herrera
2013-07-22 13:38:44 -04:00
Alvaro Herrera
0aeb5ae204 Silence compiler warning on an unused variable
Also, tweak wording in comments (per Andres) and documentation (myself)
to point out that it's the database's default tablespace that can be
passed as 0, not DEFAULTTABLESPACE_OID.  Robert Haas noticed the bug in
the code, but didn't update the accompanying prose.
2013-07-22 13:15:13 -04:00
Robert Haas
f01d1ae3a1 Add infrastructure for mapping relfilenodes to relation OIDs.
Future patches are expected to introduce logical replication that
works by decoding WAL.  WAL contains relfilenodes rather than relation
OIDs, so this infrastructure will be needed to find the relation OID
based on WAL contents.

If logical replication does not make it into this release, we probably
should consider reverting this, since it will add some overhead to DDL
operations that create new relations.  One additional index insert per
pg_class row is not a large overhead, but it's more than zero.
Another way of meeting the needs of logical replication would be to
the relation OID to WAL, but that would burden DML operations, not
only DDL.

Andres Freund, with some changes by me.  Design review, in earlier
versions, by Álvaro Herrera.
2013-07-22 11:09:10 -04:00
Peter Eisentraut
ff41a5de09 Clean up new JSON API typedefs
The new JSON API uses a bit of an unusual typedef scheme, where for
example OkeysState is a pointer to okeysState.  And that's not applied
consistently either.  Change that to the more usual PostgreSQL style
where struct typedefs are upper case, and use pointers explicitly.
2013-07-20 06:38:31 -04:00
Alvaro Herrera
6737aa72ba Fix HeapTupleSatisfiesVacuum on aborted updater xacts
By using only the macro that checks infomask bits
HEAP_XMAX_IS_LOCKED_ONLY to verify whether a multixact is not an
updater, and not the full HeapTupleHeaderIsOnlyLocked, it would come to
the wrong result in case of a multixact containing an aborted update;
therefore returning the wrong result code.  This would cause predicate.c
to break completely (as in bug report #8273 from David Leverton), and
certain index builds would misbehave.  As far as I can tell, other
callers of the bogus routine would make harmless mistakes or not be
affected by the difference at all; so this was a pretty narrow case.

Also, no other user of the HEAP_XMAX_IS_LOCKED_ONLY macro is as
careless; they all check specifically for the HEAP_XMAX_IS_MULTI case,
and they all verify whether the updater is InvalidXid before concluding
that it's a valid updater.  So there doesn't seem to be any similar bug.
2013-07-19 18:47:37 -04:00
Tom Lane
d9f37e6661 Add checks for valid multibyte character length in UtfToLocal, LocalToUtf.
This is mainly to suppress "uninitialized variable" warnings from very
recent versions of gcc.  But it seems like a good robustness thing anyway,
not to mention that we might someday decide to support 6-byte UTF8.

Per report from Karol Trzcionka.  No back-patch since there's no reason
at the moment to think this is more than cosmetic.
2013-07-18 21:55:38 -04:00
Andrew Dunstan
d26888bc4d Move checking an explicit VARIADIC "any" argument into the parser.
This is more efficient and simpler . It does mean that an untyped NULL
can no longer be used in such cases, which should be mentioned in
Release Notes, but doesn't seem a terrible loss. The workaround is to
cast the NULL to some array type.

Pavel Stehule, reviewed by Jeevan Chalke.
2013-07-18 11:52:12 -04:00
Heikki Linnakangas
3f2adace1e Fix end-of-loop optimization in pglz_find_match() function.
After the recent pglz optimization patch, the next/prev pointers in the
hash table are never NULL, INVALID_ENTRY_PTR is used to represent invalid
entries instead. The end-of-loop check in pglz_find_match() function didn't
get the memo. The result was the same from a correctness point of view, but
because the NULL-check would never fail, the tiny optimization turned into
a pessimization.

Reported by Stephen Frost, using Coverity scanner.
2013-07-17 20:37:09 +03:00
Noah Misch
b560ec1b0d Implement the FILTER clause for aggregate function calls.
This is SQL-standard with a few extensions, namely support for
subqueries and outer references in clause expressions.

catversion bump due to change in Aggref and WindowFunc.

David Fetter, reviewed by Dean Rasheed.
2013-07-16 20:15:36 -04:00
Robert Haas
42c80c696e Assert that syscache lookups don't happen outside transactions.
Andres Freund
2013-07-15 13:31:36 -04:00
Stephen Frost
273dcd1628 Ensure 64bit arithmetic when calculating tapeSpace
In tuplesort.c:inittapes(), we calculate tapeSpace by first figuring
out how many 'tapes' we can use (maxTapes) and then multiplying the
result by the tape buffer overhead for each.  Unfortunately, when
we are on a system with an 8-byte long, we allow work_mem to be
larger than 2GB and that allows maxTapes to be large enough that the
32bit arithmetic can overflow when multiplied against the buffer
overhead.

When this overflow happens, we end up adding the overflow to the
amount of space available, causing the amount of memory allocated to
be larger than work_mem.

Note that to reach this point, you have to set work mem to at least
24GB and be sorting a set which is at least that size.  Given that a
user who can set work_mem to 24GB could also set it even higher, if
they were looking to run the system out of memory, this isn't
considered a security issue.

This overflow risk was found by the Coverity scanner.

Back-patch to all supported branches, as this issue has existed
since before 8.4.
2013-07-14 16:26:16 -04:00
Peter Eisentraut
070518ddab Add session_preload_libraries configuration parameter
This is like shared_preload_libraries except that it takes effect at
backend start and can be changed without a full postmaster restart.  It
is like local_preload_libraries except that it is still only settable by
a superuser.  This can be a better way to load modules such as
auto_explain.

Since there are now three preload parameters, regroup the documentation
a bit.  Put all parameters into one section, explain common
functionality only once, update the descriptions to reflect current and
future realities.

Reviewed-by: Dimitri Fontaine <dimitri@2ndQuadrant.fr>
2013-07-12 21:23:50 -04:00
Peter Eisentraut
7888c61238 Fix bool abuse
path_encode's "closed" argument used to take three values: TRUE, FALSE,
or -1, while being of type bool.  Replace that with a three-valued enum
for more clarity.
2013-07-08 22:42:39 -04:00
Heikki Linnakangas
9a20a9b21b Improve scalability of WAL insertions.
This patch replaces WALInsertLock with a number of WAL insertion slots,
allowing multiple backends to insert WAL records to the WAL buffers
concurrently. This is particularly useful for parallel loading large amounts
of data on a system with many CPUs.

This has one user-visible change: switching to a new WAL segment with
pg_switch_xlog() now fills the remaining unused portion of the segment with
zeros. This potentially adds some overhead, but it has been a very common
practice by DBA's to clear the "tail" of the segment with an external
pg_clearxlogtail utility anyway, to make the WAL files compress better.
With this patch, it's no longer necessary to do that.

This patch adds a new GUC, xloginsert_slots, to tune the number of WAL
insertion slots. Performance testing suggests that the default, 8, works
pretty well for all kinds of worklods, but I left the GUC in place to allow
others with different hardware to test that easily. We might want to remove
that before release.

Reviewed by Andres Freund.
2013-07-08 11:23:56 +03:00
Magnus Hagander
c87ff71f37 Expose the estimation of number of changed tuples since last analyze
This value, now pg_stat_all_tables.n_mod_since_analyze, was already
tracked and used by autovacuum, but not exposed to the user.

Mark Kirkwood, review by Laurenz Albe
2013-07-05 15:10:15 +02:00
Noah Misch
79e0f87a15 Use type "int64" for memory accounting in tuplesort.c/tuplestore.c.
Commit 263865a489 switched tuplesort.c and
tuplestore.c variables representing memory usage from type "long" to
type "Size".  This was unnecessary; I thought doing so avoided overflow
scenarios on 64-bit Windows, but guc.c already limited work_mem so as to
prevent the overflow.  It was also incomplete, not touching the logic
that assumed a signed data type.  Change the affected variables to
"int64".  This is perfect for 64-bit platforms, and it reduces the need
to contemplate platform-specific overflow scenarios.  It also puts us
close to being able to support work_mem over 2 GiB on 64-bit Windows.

Per report from Andres Freund.
2013-07-04 23:13:54 -04:00
Robert Haas
6bc8ef0b7f Add new GUC, max_worker_processes, limiting number of bgworkers.
In 9.3, there's no particular limit on the number of bgworkers;
instead, we just count up the number that are actually registered,
and use that to set MaxBackends.  However, that approach causes
problems for Hot Standby, which needs both MaxBackends and the
size of the lock table to be the same on the standby as on the
master, yet it may not be desirable to run the same bgworkers in
both places.  9.3 handles that by failing to notice the problem,
which will probably work fine in nearly all cases anyway, but is
not theoretically sound.

A further problem with simply counting the number of registered
workers is that new workers can't be registered without a
postmaster restart.  This is inconvenient for administrators,
since bouncing the postmaster causes an interruption of service.
Moreover, there are a number of applications for background
processes where, by necessity, the background process must be
started on the fly (e.g. parallel query).  While this patch
doesn't actually make it possible to register new background
workers after startup time, it's a necessary prerequisite.

Patch by me.  Review by Michael Paquier.
2013-07-04 11:24:24 -04:00
Fujii Masao
2ef085d0e6 Get rid of pg_class.reltoastidxid.
Treat TOAST index just the same as normal one and get the OID
of TOAST index from pg_index but not pg_class.reltoastidxid.
This change allows us to handle multiple TOAST indexes, and
which is required infrastructure for upcoming
REINDEX CONCURRENTLY feature.

Patch by Michael Paquier, reviewed by Andres Freund and me.
2013-07-04 03:24:09 +09:00
Robert Haas
568d4138c6 Use an MVCC snapshot, rather than SnapshotNow, for catalog scans.
SnapshotNow scans have the undesirable property that, in the face of
concurrent updates, the scan can fail to see either the old or the new
versions of the row.  In many cases, we work around this by requiring
DDL operations to hold AccessExclusiveLock on the object being
modified; in some cases, the existing locking is inadequate and random
failures occur as a result.  This commit doesn't change anything
related to locking, but will hopefully pave the way to allowing lock
strength reductions in the future.

The major issue has held us back from making this change in the past
is that taking an MVCC snapshot is significantly more expensive than
using a static special snapshot such as SnapshotNow.  However, testing
of various worst-case scenarios reveals that this problem is not
severe except under fairly extreme workloads.  To mitigate those
problems, we avoid retaking the MVCC snapshot for each new scan;
instead, we take a new snapshot only when invalidation messages have
been processed.  The catcache machinery already requires that
invalidation messages be sent before releasing the related heavyweight
lock; else other backends might rely on locally-cached data rather
than scanning the catalog at all.  Thus, making snapshot reuse
dependent on the same guarantees shouldn't break anything that wasn't
already subtly broken.

Patch by me.  Review by Michael Paquier and Andres Freund.
2013-07-02 09:47:01 -04:00
Bruce Momjian
7408c5d29b Add timezone offset output option to to_char()
Add ability for to_char() to output the timezone's UTC offset (OF).  We
already have the ability to return the timezone abbeviation (TZ/tz).
Per request from Andrew Dunstan
2013-07-01 13:40:32 -04:00
Heikki Linnakangas
031cc55bbe Optimize pglz compressor for small inputs.
The pglz compressor has a significant startup cost, because it has to
initialize to zeros the history-tracking hash table. On a 64-bit system, the
hash table was 64kB in size. While clearing memory is pretty fast, for very
short inputs the relative cost of that was quite large.

This patch alleviates that in two ways. First, instead of storing pointers
in the hash table, store 16-bit indexes into the hist_entries array. That
slashes the size of the hash table to 1/2 or 1/4 of the original, depending
on the pointer width. Secondly, adjust the size of the hash table based on
input size. For very small inputs, you don't need a large hash table to
avoid collisions.

Review by Amit Kapila.
2013-07-01 11:00:14 +03:00
Noah Misch
263865a489 Permit super-MaxAllocSize allocations with MemoryContextAllocHuge().
The MaxAllocSize guard is convenient for most callers, because it
reduces the need for careful attention to overflow, data type selection,
and the SET_VARSIZE() limit.  A handful of callers are happy to navigate
those hazards in exchange for the ability to allocate a larger chunk.
Introduce MemoryContextAllocHuge() and repalloc_huge().  Use this in
tuplesort.c and tuplestore.c, enabling internal sorts of up to INT_MAX
tuples, a factor-of-48 increase.  In particular, B-tree index builds can
now benefit from much-larger maintenance_work_mem settings.

Reviewed by Stephen Frost, Simon Riggs and Jeff Janes.
2013-06-27 14:53:57 -04:00
Noah Misch
19085116ee Cooperate with the Valgrind instrumentation framework.
Valgrind "client requests" in aset.c and mcxt.c teach Valgrind and its
Memcheck tool about the PostgreSQL allocator.  This makes Valgrind
roughly as sensitive to memory errors involving palloc chunks as it is
to memory errors involving malloc chunks.  Further client requests in
PageAddItem() and printtup() verify that all bits being added to a
buffer page or furnished to an output function are predictably-defined.
Those tests catch failures of C-language functions to fully initialize
the bits of a Datum, which in turn stymie optimizations that rely on
_equalConst().  Define the USE_VALGRIND symbol in pg_config_manual.h to
enable these additions.  An included "suppression file" silences nominal
errors we don't plan to fix.

Reviewed in earlier versions by Peter Geoghegan and Korry Douglas.
2013-06-26 20:22:25 -04:00
Noah Misch
a855148a29 Refactor aset.c and mcxt.c in preparation for Valgrind cooperation.
Move some repeated debugging code into functions and store intermediates
in variables where not presently necessary.  No code-generation changes
in a production build, and no functional changes.  This simplifies and
focuses the main patch.
2013-06-26 19:56:03 -04:00
Noah Misch
5f538ad004 Renovate display of non-ASCII messages on Windows.
GNU gettext selects a default encoding for the messages it emits in a
platform-specific manner; it uses the Windows ANSI code page on Windows
and follows LC_CTYPE on other platforms.  This is inconvenient for
PostgreSQL server processes, so realize consistent cross-platform
behavior by calling bind_textdomain_codeset() on Windows each time we
permanently change LC_CTYPE.  This primarily affects SQL_ASCII databases
and processes like the postmaster that do not attach to a database,
making their behavior consistent with PostgreSQL on non-Windows
platforms.  Messages from SQL_ASCII databases use the encoding implied
by the database LC_CTYPE, and messages from non-database processes use
LC_CTYPE from the postmaster system environment.  PlatformEncoding
becomes unused, so remove it.

Make write_console() prefer WriteConsoleW() to write() regardless of the
encodings in use.  In this situation, write() will invariably mishandle
non-ASCII characters.

elog.c has assumed that messages conform to the database encoding.
While usually true, this does not hold for SQL_ASCII and MULE_INTERNAL.
Introduce MessageEncoding to track the actual encoding of message text.
The present consumers are Windows-specific code for converting messages
to UTF16 for use in system interfaces.  This fixes the appearance in
Windows event logs and consoles of translated messages from SQL_ASCII
processes like the postmaster.  Note that SQL_ASCII inherently disclaims
a strong notion of encoding, so non-ASCII byte sequences interpolated
into messages by %s may yet yield a nonsensical message.  MULE_INTERNAL
has similar problems at present, albeit for a different reason: its lack
of libiconv support or a conversion to UTF8.

Consequently, one need no longer restart Windows with a different
Windows ANSI code page to broadly test backend logging under a given
language.  Changing the user's locale ("Format") is enough.  Several
accounts can simultaneously run postmasters under different locales, all
correctly logging localized messages to Windows event logs and consoles.

Alexander Law and Noah Misch
2013-06-26 11:17:33 -04:00
Fujii Masao
bab54e383d Support TB (terabyte) memory unit in GUC variables.
Patch by Simon Riggs, reviewed by Jeff Janes and me.
2013-06-20 08:17:14 +09:00
Jeff Davis
b8fd1a09f3 Add buffer_std flag to MarkBufferDirtyHint().
MarkBufferDirtyHint() writes WAL, and should know if it's got a
standard buffer or not. Currently, the only callers where buffer_std
is false are related to the FSM.

In passing, rename XLOG_HINT to XLOG_FPI, which is more descriptive.

Back-patch to 9.3.
2013-06-17 08:02:12 -07:00
Tom Lane
a64ca63e59 Use WaitLatch, not pg_usleep, for delaying in pg_sleep().
This avoids platform-dependent behavior wherein pg_sleep() might fail to be
interrupted by statement timeout, query cancel, SIGTERM, etc.  Also, since
there's no reason to wake up once a second any more, we can reduce the
power consumption of a sleeping backend a tad.

Back-patch to 9.3, since use of SA_RESTART for SIGALRM makes this a bigger
issue than it used to be.
2013-06-15 16:23:24 -04:00
Tom Lane
c62866eeaf Remove special-case treatment of LOG severity level in standalone mode.
elog.c has historically treated LOG messages as low-priority during
bootstrap and standalone operation.  This has led to confusion and even
masked a bug, because the normal expectation of code authors is that
elog(LOG) will put something into the postmaster log, and that wasn't
happening during initdb.  So get rid of the special-case rule and make
the priority order the same as it is in normal operation.  To keep from
cluttering initdb's output and the behavior of a standalone backend,
tweak the severity level of three messages routinely issued by xlog.c
during startup and shutdown so that they won't appear in these cases.
Per my proposal back in December.
2013-06-13 23:15:15 -04:00
Noah Misch
66008564f8 Avoid reading past datum end when parsing JSON.
Several loops in the JSON parser examined a byte in memory just before
checking whether its address was in-bounds, so they could read one byte
beyond the datum's allocation.  A SIGSEGV is possible.  New in 9.3, so
no back-patch.
2013-06-12 19:51:12 -04:00
Tom Lane
dc3eb56383 Improve updatability checking for views and foreign tables.
Extend the FDW API (which we already changed for 9.3) so that an FDW can
report whether specific foreign tables are insertable/updatable/deletable.
The default assumption continues to be that they're updatable if the
relevant executor callback function is supplied by the FDW, but finer
granularity is now possible.  As a test case, add an "updatable" option to
contrib/postgres_fdw.

This patch also fixes the information_schema views, which previously did
not think that foreign tables were ever updatable, and fixes
view_is_auto_updatable() so that a view on a foreign table can be
auto-updatable.

initdb forced due to changes in information_schema views and the functions
they rely on.  This is a bit unfortunate to do post-beta1, but if we don't
change this now then we'll have another API break for FDWs when we do
change it.

Dean Rasheed, somewhat editorialized on by Tom Lane
2013-06-12 17:53:33 -04:00
Andrew Dunstan
78ed8e03c6 Fix unescaping of JSON Unicode escapes, especially for non-UTF8.
Per discussion  on -hackers. We treat Unicode escapes when unescaping
them similarly to the way we treat them in PostgreSQL string literals.
Escapes in the ASCII range are always accepted, no matter what the
database encoding. Escapes for higher code points are only processed in
UTF8 databases, and attempts to process them in other databases will
result in an error. \u0000 is never unescaped, since it would result in
an impermissible null byte.
2013-06-12 13:35:24 -04:00
Tom Lane
e262755bfc Fix cache flush hazard in cache_record_field_properties().
We need to increment the refcount on the composite type's cached tuple
descriptor while we do lookups of its column types.  Otherwise a cache
flush could occur and release the tuple descriptor before we're done with
it.  This fails reliably with -DCLOBBER_CACHE_ALWAYS, but the odds of a
failure in a production build seem rather low (since the pfree'd descriptor
typically wouldn't get scribbled on immediately).  That may explain the
lack of any previous reports.  Buildfarm issue noted by Christian Ullrich.

Back-patch to 9.1 where the bogus code was added.
2013-06-11 17:26:42 -04:00
Andrew Dunstan
94e3311b97 Handle Unicode surrogate pairs correctly when processing JSON.
In 9.2, Unicode escape sequences are not analysed at all other than
to make sure that they are in the form \uXXXX. But in 9.3 many of the
new operators and functions try to turn JSON text values into text in
the server encoding, and this includes de-escaping Unicode escape
sequences. This processing had not taken into account the possibility
that this might contain a surrogate pair to designate a character
outside the BMP. That is now handled correctly.

This also enforces correct use of surrogate pairs, something that is not
done by the type's input routines. This fact is noted in the docs.
2013-06-08 09:12:48 -04:00
Stephen Frost
f129615fe7 Additional spelling corrections
A few more minor spelling corrections, no functional changes.

Thom Brown
2013-06-03 08:40:27 -04:00
Stephen Frost
c9fc28a7f1 Minor spelling fixes
Fix a few spelling mistakes.

Per bug report #8193 from Lajos Veres.
2013-06-01 10:18:59 -04:00
Noah Misch
97c4d9b7c7 Don't emit non-canonical empty arrays in array_remove().
Dean Rasheed
2013-05-31 21:50:59 -04:00
Peter Eisentraut
97a11fd0e3 postgresql.conf.sample: Improve whitespace 2013-05-29 22:00:13 -04:00
Bruce Momjian
9af4159fce pgindent run for release 9.3
This is the first run of the Perl-based pgindent script.  Also update
pgindent instructions.
2013-05-29 16:58:43 -04:00
Tom Lane
403bd6a18b Fix crash when trying to display a NOTIFY rule action.
Fixes oversight in commit 2ffa740be9.
Per report from Josh Kupershmidt.

I think we've broken this case before, so let's add a regression test
this time.
2013-05-16 16:47:26 -04:00
Tom Lane
35d50b527a Fix to_number() to correctly ignore thousands separator when it's '.'.
The existing code in NUM_numpart_from_char has hard-wired logic to treat
'.' as decimal point, even when we're using a locale-aware format string
and the locale says that '.' is the thousands separator.  This results in
clearly wrong answers in FM mode (where we must be able to identify the
decimal point location), as per bug report from Patryk Kordylewski.

Since the initialization code in NUM_prepare_locale already sets up
Np->decimal as either the locale decimal-point string or "." depending
on which decimal-point format code was used, there's really no need to
have any extra logic at all in NUM_numpart_from_char: we only need to
test for a match to Np->decimal.

(Note: AFAICS there's nothing in here that explicitly checks for thousands
separators --- rather, any unmatched character is silently skipped over.
That's pretty bogus IMO but it's not the issue being complained of.)

This is a longstanding bug, but it's possible that some existing apps
are depending on '.' being recognized as decimal point even when using
a D format code.  Hence, no back-patch.  We should probably list this
as a potential incompatibility in the 9.3 release notes.
2013-05-11 16:35:03 -04:00
Tom Lane
69cc60dcfd Guard against input_rows == 0 in estimate_num_groups().
This case doesn't normally happen, because the planner usually clamps
all row estimates to at least one row; but I found that it can arise
when dealing with relations excluded by constraints.  Without a defense,
estimate_num_groups() can return zero, which leads to divisions by zero
inside the planner as well as assertion failures in the executor.

An alternative fix would be to change set_dummy_rel_pathlist() to make
the size estimate for a dummy relation 1 row instead of 0, but that seemed
pretty ugly; and probably someday we'll want to drop the convention that
the minimum rowcount estimate is 1 row.

Back-patch to 8.4, as the problem can be demonstrated that far back.
2013-05-10 17:15:30 -04:00
Tom Lane
1d6c72a55b Move materialized views' is-populated status into their pg_class entries.
Previously this state was represented by whether the view's disk file had
zero or nonzero size, which is problematic for numerous reasons, since it's
breaking a fundamental assumption about heap storage.  This was done to
allow unlogged matviews to revert to unpopulated status after a crash
despite our lack of any ability to update catalog entries post-crash.
However, this poses enough risk of future problems that it seems better to
not support unlogged matviews until we can find another way.  Accordingly,
revert that choice as well as a number of existing kluges forced by it
in favor of creating a pg_class.relispopulated flag column.
2013-05-06 13:27:22 -04:00
Bruce Momjian
8b06e6aba8 Revert idea of zer-padding padding session id in log_line_prefix
Removal of doc adjustment and release note mention as well.
2013-05-06 08:59:39 -04:00
Andrew Dunstan
5f8b4319b9 Use correct length to convert json unicode escapes.
Bug reported on IRC - fix due to Andrew Gierth.
2013-05-01 18:47:18 -04:00
Tom Lane
ac63dca607 Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction.  However, plancache.c busily
saved or examined the search_path for every cached plan.  If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt.  This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.

This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 17:00:23 -04:00
Peter Eisentraut
cc26ea9fe2 Clean up references to SQL92
In most cases, these were just references to the SQL standard in
general.  In a few cases, a contrast was made between SQL92 and later
standards -- those have been kept unchanged.
2013-04-20 11:04:41 -04:00
Andrew Dunstan
728ec9731f Correct handling of NULL arguments in json funcs.
Per gripe from Tom Lane.
2013-04-15 16:20:21 -04:00
Kevin Grittner
52e6e33ab4 Create a distinction between a populated matview and a scannable one.
The intent was that being populated would, long term, be just one
of the conditions which could affect whether a matview was
scannable; being populated should be necessary but not always
sufficient to scan the relation.  Since only CREATE and REFRESH
currently determine the scannability, names and comments
accidentally conflated these concepts, leading to confusion.

Also add missing locking for the SQL function which allows a
test for scannability, and fix a modularity violatiion.

Per complaints from Tom Lane, although its not clear that these
will satisfy his concerns.  Hopefully this will at least better
frame the discussion.
2013-04-09 13:02:49 -05:00
Tom Lane
3ccae48f44 Support indexing of regular-expression searches in contrib/pg_trgm.
This works by extracting trigrams from the given regular expression,
in generally the same spirit as the previously-existing support for
LIKE searches, though of course the details are far more complicated.

Currently, only GIN indexes are supported.  We might be able to make
it work with GiST indexes later.

The implementation includes adding API functions to backend/regex/
to provide a view of the search NFA created from a regular expression.
These functions are meant to be generic enough to be supportable in
a standalone version of the regex library, should that ever happen.

Alexander Korotkov, reviewed by Heikki Linnakangas and Tom Lane
2013-04-09 01:06:54 -04:00
Andrew Dunstan
e75feb2834 Fix off by one error in JSON extract path code.
Bug report by David Wheeler, diagnosis assistance from Tom Lane.
2013-04-04 18:26:52 -04:00
Tom Lane
f7b0006f42 Avoid updating our PgBackendStatus entry when track_activities is off.
The point of turning off track_activities is to avoid this reporting
overhead, but a thinko in commit 4f42b546fd
caused pgstat_report_activity() to perform half of its updates anyway.
Fix that, and also make sure that we clear all the now-disabled fields
when transitioning to the non-reporting state.
2013-04-03 14:13:28 -04:00
Tom Lane
17fe2793ea Fix insecure parsing of server command-line switches.
An oversight in commit e710b65c1c allowed
database names beginning with "-" to be treated as though they were secure
command-line switches; and this switch processing occurs before client
authentication, so that even an unprivileged remote attacker could exploit
the bug, needing only connectivity to the postmaster's port.  Assorted
exploits for this are possible, some requiring a valid database login,
some not.  The worst known problem is that the "-r" switch can be invoked
to redirect the process's stderr output, so that subsequent error messages
will be appended to any file the server can write.  This can for example be
used to corrupt the server's configuration files, so that it will fail when
next restarted.  Complete destruction of database tables is also possible.

Fix by keeping the database name extracted from a startup packet fully
separate from command-line switches, as had already been done with the
user name field.

The Postgres project thanks Mitsumasa Kondo for discovering this bug,
Kyotaro Horiguchi for drafting the fix, and Noah Misch for recognizing
the full extent of the danger.

Security: CVE-2013-1899
2013-04-01 14:00:51 -04:00
Tom Lane
ce9ab88981 Make REPLICATION privilege checks test current user not authenticated user.
The pg_start_backup() and pg_stop_backup() functions checked the privileges
of the initially-authenticated user rather than the current user, which is
wrong.  For example, a user-defined index function could successfully call
these functions when executed by ANALYZE within autovacuum.  This could
allow an attacker with valid but low-privilege database access to interfere
with creation of routine backups.  Reported and fixed by Noah Misch.

Security: CVE-2013-1901
2013-04-01 13:09:24 -04:00
Andrew Dunstan
a570c98d7f Add new JSON processing functions and parser API.
The JSON parser is converted into a recursive descent parser, and
exposed for use by other modules such as extensions. The API provides
hooks for all the significant parser event such as the beginning and end
of objects and arrays, and providing functions to handle these hooks
allows for fairly simple construction of a wide variety of JSON
processing functions. A set of new basic processing functions and
operators is also added, which use this API, including operations to
extract array elements, object fields, get the length of arrays and the
set of keys of a field, deconstruct an object into a set of key/value
pairs, and create records from JSON objects and arrays of objects.

Catalog version bumped.

Andrew Dunstan, with some documentation assistance from Merlin Moncure.
2013-03-29 14:12:13 -04:00
Alvaro Herrera
473ab40c8b Add sql_drop event for event triggers
This event takes place just before ddl_command_end, and is fired if and
only if at least one object has been dropped by the command.  (For
instance, DROP TABLE IF EXISTS of a table that does not in fact exist
will not lead to such a trigger firing).  Commands that drop multiple
objects (such as DROP SCHEMA or DROP OWNED BY) will cause a single event
to fire.  Some firings might be surprising, such as
ALTER TABLE DROP COLUMN.

The trigger is fired after the drop has taken place, because that has
been deemed the safest design, to avoid exposing possibly-inconsistent
internal state (system catalogs as well as current transaction) to the
user function code.  This means that careful tracking of object
identification is required during the object removal phase.

Like other currently existing events, there is support for tag
filtering.

To support the new event, add a new pg_event_trigger_dropped_objects()
set-returning function, which returns a set of rows comprising the
objects affected by the command.  This is to be used within the user
function code, and is mostly modelled after the recently introduced
pg_identify_object() function.

Catalog version bumped due to the new function.

Dimitri Fontaine and Álvaro Herrera
Review by Robert Haas, Tom Lane
2013-03-28 13:05:48 -03:00
Simon Riggs
593c39d156 Revoke bc5334d867 2013-03-28 09:18:02 +00:00
Simon Riggs
d139a5e26b Revoke 7a5a59d378 2013-03-28 09:12:55 +00:00
Simon Riggs
7a5a59d378 Set recovery_config_directory for EXEC_BACKEND.
Remove comment questioning whether this is necessary for DataDir.
From buildfarm failures on Windows.
2013-03-27 16:35:38 +00:00
Simon Riggs
bc5334d867 Allow external recovery_config_directory
If required, recovery.conf can now be located outside of the data directory.
Server needs read/write permissions on this directory.
2013-03-27 11:45:42 +00:00
Simon Riggs
96ef3b8ff1 Allow I/O reliability checks using 16-bit checksums
Checksums are set immediately prior to flush out of shared buffers
and checked when pages are read in again. Hint bit setting will
require full page write when block is dirtied, which causes various
infrastructure changes. Extensive comments, docs and README.

WARNING message thrown if checksum fails on non-all zeroes page;
ERROR thrown but can be disabled with ignore_checksum_failure = on.

Feature enabled by an initdb option, since transition from option off
to option on is long and complex and has not yet been implemented.
Default is not to use checksums.

Checksum used is WAL CRC-32 truncated to 16-bits.

Simon Riggs, Jeff Davis, Greg Smith
Wide input and assistance from many community members. Thank you.
2013-03-22 13:54:07 +00:00
Simon Riggs
13fe298ca0 Change commit_delay to be SUSET for 9.3+
Prior to 9.3 the commit_delay affected only the current user,
whereas now only the group leader waits while holding the
WALWriteLock. Deliberate or accidental settings to a poor
value could seriously degrade performance for all users.
Privileges may be delegated by SECURITY DEFINER functions
for anyone that needs per-user settings in real situations.
Request for change from Peter Geoghegan
2013-03-22 12:01:16 +00:00
Heikki Linnakangas
f897c4744f Fix "element <@ range" cost estimation.
The statistics-based cost estimation patch for range types broke that, by
incorrectly assuming that the left operand of all range oeprators is a
range. That lead to a "type x is not a range type" error. Because it took so
long for anyone to notice, add a regression test for that case.

We still don't do proper statistics-based cost estimation for that, so you
just get a default constant estimate. We should look into implementing that,
but this patch at least fixes the regression.

Spotted by Tom Lane, when testing query from Josh Berkus.
2013-03-21 11:21:51 +02:00
Alvaro Herrera
f8348ea32e Allow extracting machine-readable object identity
Introduce pg_identify_object(oid,oid,int4), which is similar in spirit
to pg_describe_object but instead produces a row of machine-readable
information to uniquely identify the given object, without resorting to
OIDs or other internal representation.  This is intended to be used in
the event trigger implementation, to report objects being operated on;
but it has usefulness of its own.

Catalog version bumped because of the new function.
2013-03-20 18:19:19 -03:00
Tom Lane
6ac7facdd3 Improve signal-handler lockout mechanism in timeout.c.
Rather than doing a fairly-expensive setitimer() call to prevent interrupts
from happening, let's just invent a simple boolean flag that the signal
handler is required to check.  This is not only faster but considerably
more robust than before, since the previous code effectively assumed that
only ITIMER_REAL events would ever fire the SIGALRM handler, which is
obviously something that can be broken easily by third-party code.

Zoltán Böszörményi and Tom Lane
2013-03-17 22:42:19 -04:00
Tom Lane
da5aeccf64 Move pqsignal() to libpgport.
We had two copies of this function in the backend and libpq, which was
already pretty bogus, but it turns out that we need it in some other
programs that don't use libpq (such as pg_test_fsync).  So put it where
it probably should have been all along.  The signal-mask-initialization
support in src/backend/libpq/pqsignal.c stays where it is, though, since
we only need that in the backend.
2013-03-17 12:06:42 -04:00
Tom Lane
d43837d030 Add lock_timeout configuration parameter.
This GUC allows limiting the time spent waiting to acquire any one
heavyweight lock.

In support of this, improve the recently-added timeout infrastructure
to permit efficiently enabling or disabling multiple timeouts at once.
That reduces the performance hit from turning on lock_timeout, though
it's still not zero.

Zoltán Böszörményi, reviewed by Tom Lane,
Stephen Frost, and Hari Babu
2013-03-16 23:22:57 -04:00
Tom Lane
73e7025bd8 Extend format() to handle field width and left/right alignment.
This change adds some more standard sprintf() functionality to format().

Pavel Stehule, reviewed by Dean Rasheed and Kyotaro Horiguchi
2013-03-14 22:56:56 -04:00
Heikki Linnakangas
59d0bf9dca Add cost estimation of range @> and <@ operators.
The estimates are based on the existing lower bound histogram, and a new
histogram of range lengths.

Bump catversion, because the range length histogram now needs to be present
in statistic slot kind 6, or you get an error on @> and <@ queries. (A
re-ANALYZE would be enough to fix that, though)

Alexander Korotkov, with some refactoring by me.
2013-03-14 15:36:56 +02:00
Andrew Dunstan
38fb4d978c JSON generation improvements.
This adds the following:

    json_agg(anyrecord) -> json
    to_json(any) -> json
    hstore_to_json(hstore) -> json (also used as a cast)
    hstore_to_json_loose(hstore) -> json

The last provides heuristic treatment of numbers and booleans.

Also, in json generation, if any non-builtin type has a cast to json,
that function is used instead of the type's output function.

Andrew Dunstan, reviewed by Steve Singer.

Catalog version bumped.
2013-03-10 17:35:36 -04:00
Heikki Linnakangas
23f10b6473 SP-GiST support of the range adjacent operator -|-
Alexander Korotkov, reviewed by Jeff Davis.
2013-03-08 15:03:19 +02:00
Tom Lane
1908abc4a3 Arrange to cache FdwRoutine structs in foreign tables' relcache entries.
This saves several catalog lookups per reference.  It's not all that
exciting right now, because we'd managed to minimize the number of places
that need to fetch the data; but the upcoming writable-foreign-tables patch
needs this info in a lot more places.
2013-03-06 23:48:09 -05:00
Robert Haas
f90cc26982 Code beautification for object-access hook machinery.
KaiGai Kohei
2013-03-06 20:53:25 -05:00
Tom Lane
80b011ef0a Fix to_char() to use ASCII-only case-folding rules where appropriate.
formatting.c used locale-dependent case folding rules in some code paths
where the result isn't supposed to be locale-dependent, for example
to_char(timestamp, 'DAY').  Since the source data is always just ASCII
in these cases, that usually didn't matter ... but it does matter in
Turkish locales, which have unusual treatment of "i" and "I".  To confuse
matters even more, the misbehavior was only visible in UTF8 encoding,
because in single-byte encodings we used pg_toupper/pg_tolower which
don't have locale-specific behavior for ASCII characters.  Fix by providing
intentionally ASCII-only case-folding functions and using these where
appropriate.  Per bug #7913 from Adnan Dursun.  Back-patch to all active
branches, since it's been like this for a long time.
2013-03-05 13:02:30 -05:00
Tom Lane
542eeba269 Fix overflow check in tm2timestamp (this time for sure).
I fixed this code back in commit 841b4a2d5, but didn't think carefully
enough about the behavior near zero, which meant it improperly rejected
1999-12-31 24:00:00.  Per report from Magnus Hagander.
2013-03-04 15:13:31 -05:00
Tom Lane
bc61878682 Fix map_sql_value_to_xml_value() to treat domains like their base types.
This was already the case for domains over arrays, but not for domains
over certain built-in types such as boolean.  The special formatting
rules for those types should apply to domains over them as well.
Per discussion.

While this is a bug fix, it's also a behavioral change that seems likely
to trip up some applications.  So no back-patch.

Pavel Stehule
2013-03-03 19:32:22 -05:00
Kevin Grittner
3bf3ab8c56 Add a materialized view relations.
A materialized view has a rule just like a view and a heap and
other physical properties like a table.  The rule is only used to
populate the table, references in queries refer to the
materialized data.

This is a minimal implementation, but should still be useful in
many cases.  Currently data is only populated "on demand" by the
CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements.
It is expected that future releases will add incremental updates
with various timings, and that a more refined concept of defining
what is "fresh" data will be developed.  At some point it may even
be possible to have queries use a materialized in place of
references to underlying tables, but that requires the other
above-mentioned features to be working first.

Much of the documentation work by Robert Haas.
Review by Noah Misch, Thom Brown, Robert Haas, Marko Tiikkaja
Security review by KaiGai Kohei, with a decision on how best to
implement sepgsql still pending.
2013-03-03 18:23:31 -06:00
Alvaro Herrera
a730183926 Move relpath() to libpgcommon
This enables non-backend code, such as pg_xlogdump, to use it easily.
The previous location, in src/backend/catalog/catalog.c, made that
essentially impossible because that file depends on many backend-only
facilities; so this needs to live separately.
2013-02-21 22:46:17 -03:00
Alvaro Herrera
187492b6c2 Split pgstat file in smaller pieces
We now write one file per database and one global file, instead of
having the whole thing in a single huge file.  This reduces the I/O that
must be done when partial data is required -- which is all the time,
because each process only needs information on its own database anyway.
Also, the autovacuum launcher does not need data about tables and
functions in each database; having the global stats for all DBs is
enough.

Catalog version bumped because we have a new subdir under PGDATA.

Author: Tomas Vondra.  Some rework by Álvaro
Testing by Jeff Janes
Other discussion by Heikki Linnakangas, Tom Lane.
2013-02-18 18:12:52 -03:00
Peter Eisentraut
9475db3a4e Add ALTER ROLE ALL SET command
This generalizes the existing ALTER ROLE ... SET and ALTER DATABASE
... SET functionality to allow creating settings that apply to all users
in all databases.

reviewed by Pavel Stehule
2013-02-17 23:45:36 -05:00
Tom Lane
71627f3d19 Fix CVE-2013-0255 properly.
Revert commit ab0f7b6089 (in HEAD only)
in favor of the proper solution, which is to declare enum_recv() correctly
in the system catalogs.  It should be declared to take type "internal"
not "cstring".

Also improve the type_sanity regression test, which should have caught
this typo, so that it actually would.  Most of the relevant checks on
the signature of type I/O functions should not have been restricted to
basetypes/pseudotypes, as they should apply to any type's I/O functions.
2013-02-13 16:20:01 -05:00
Alvaro Herrera
8396447cdb Create libpgcommon, and move pg_malloc et al to it
libpgcommon is a new static library to allow sharing code among the
various frontend programs and backend; this lets us eliminate duplicate
implementations of common routines.  We avoid libpgport, because that's
intended as a place for porting issues; per discussion, it seems better
to keep them separate.

The first use case, and the only implemented by this patch, is pg_malloc
and friends, which many frontend programs were already using.

At the same time, we can use this to provide palloc emulation functions
for the frontend; this way, some palloc-using files in the backend can
also be used by the frontend cleanly.  To do this, we change palloc() in
the backend to be a function instead of a macro on top of
MemoryContextAlloc().  This was previously believed to cause loss of
performance, but this implementation has been tweaked by Tom and Andres
so that on modern compilers it provides a slight improvement over the
previous one.

This lets us clean up some places that were already with
localized hacks.

Most of the pg_malloc/palloc changes in this patch were authored by
Andres Freund. Zoltán Böszörményi also independently provided a form of
that.  libpgcommon infrastructure was authored by Álvaro.
2013-02-12 11:21:05 -03:00
Tom Lane
f806c191a3 Simplify box_overlap computations.
Given the assumption that a box's high coordinates are not less than its
low coordinates, the tests in box_ov() are overly complicated and can be
reduced to about half as much work.  Since many other functions in
geo_ops.c rely on that assumption, there doesn't seem to be a good reason
not to use it here.

Per discussion of Alexander Korotkov's GiST fix, which was already using
the simplified logic (in a non-fuzzy form, but the equivalence holds just
as well for fuzzy).
2013-02-08 18:26:08 -05:00
Andrew Dunstan
e1c1e21732 Enable building with Microsoft Visual Studio 2012.
Backpatch to release 9.2

Brar Piening and Noah Misch, reviewed by Craig Ringer.
2013-02-06 14:52:29 -05:00
Tom Lane
ab0f7b6089 Prevent execution of enum_recv() from SQL.
This function was misdeclared to take cstring when it should take internal.
This at least allows crashing the server, and in principle an attacker
might be able to use the function to examine the contents of server memory.

The correct fix is to adjust the system catalog contents (and fix the
regression tests that should have caught this but failed to).  However,
asking users to correct the catalog contents in existing installations
is a pain, so as a band-aid fix for the back branches, install a check
in enum_recv() to make it throw error if called with a cstring argument.
We will later revert this in HEAD in favor of correcting the catalogs.

Our thanks to Sumit Soni (via Secunia SVCRP) for reporting this issue.

Security: CVE-2013-0255
2013-02-04 16:25:01 -05:00
Simon Riggs
f480e29449 Reset vacuum_defer_cleanup_age to PGC_SIGHUP.
Revert commit 84725aa5ef
2013-02-04 16:39:55 +00:00
Tom Lane
62e666400d Perform line wrapping and indenting by default in ruleutils.c.
This patch changes pg_get_viewdef() and allied functions so that
PRETTY_INDENT processing is always enabled.  Per discussion, only the
PRETTY_PAREN processing (that is, stripping of "unnecessary" parentheses)
poses any real forward-compatibility risk, so we may as well make dump
output look as nice as we safely can.

Also, set the default wrap length to zero (i.e, wrap after each SELECT
or FROM list item), since there's no very principled argument for the
former default of 80-column wrapping, and most people seem to agree this
way looks better.

Marko Tiikkaja, reviewed by Jeevan Chalke, further hacking by Tom Lane
2013-02-03 15:56:45 -05:00
Simon Riggs
84725aa5ef Mark vacuum_defer_cleanup_age as PGC_POSTMASTER.
Following bug analysis of #7819 by Tom Lane
2013-02-02 18:49:54 +00:00
Tom Lane
9afc58396a Reject nonzero day fields in AT TIME ZONE INTERVAL functions.
It's not sensible for an interval that's used as a time zone value to be
larger than a day.  When we changed the interval type to contain a separate
day field, check_timezone() was adjusted to reject nonzero day values, but
timetz_izone(), timestamp_izone(), and timestamptz_izone() evidently were
overlooked.

While at it, make the error messages for these three cases consistent.
2013-01-31 12:12:23 -05:00
Tom Lane
991f3e5ab3 Provide database object names as separate fields in error messages.
This patch addresses the problem that applications currently have to
extract object names from possibly-localized textual error messages,
if they want to know for example which index caused a UNIQUE_VIOLATION
failure.  It adds new error message fields to the wire protocol, which
can carry the name of a table, table column, data type, or constraint
associated with the error.  (Since the protocol spec has always instructed
clients to ignore unrecognized field types, this should not create any
compatibility problem.)

Support for providing these new fields has been added to just a limited set
of error reports (mainly, those in the "integrity constraint violation"
SQLSTATE class), but we will doubtless add them to more calls in future.

Pavel Stehule, reviewed and extensively revised by Peter Geoghegan, with
additional hacking by Tom Lane.
2013-01-29 17:08:26 -05:00
Bruce Momjian
7e2322dff3 Allow CREATE TABLE IF EXIST so succeed if the schema is nonexistent
Previously, CREATE TABLE IF EXIST threw an error if the schema was
nonexistent.  This was done by passing 'missing_ok' to the function that
looks up the schema oid.
2013-01-26 13:24:50 -05:00
Tom Lane
0d5fbdc157 Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan.  The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes.  Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids.  So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.

This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution.  There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse.  We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 14:14:41 -05:00
Tom Lane
760f3c043a Fix concat() and format() to handle VARIADIC-labeled arguments correctly.
Previously, the VARIADIC labeling was effectively ignored, but now these
functions act as though the array elements had all been given as separate
arguments.

Pavel Stehule
2013-01-25 00:19:56 -05:00
Alvaro Herrera
0ac5ad5134 Improve concurrency of foreign key locking
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE".  UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.

Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.

The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid.  Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates.  This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed.  pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.

Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header.  This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.

Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This causes additional WAL-logging, also (there was previously a single
WAL record for a locked tuple; now there are as many as updated copies
of the tuple there exist.)

With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.

As a bonus, the old behavior that a subtransaction grabbing a stronger
tuple lock than the parent (sub)transaction held on a given tuple and
later aborting caused the weaker lock to be lost, has been fixed.

Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane.  There's probably room for several more tests.

There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time in it.  Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.

This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
	1290721684-sup-3951@alvh.no-ip.org
	1294953201-sup-2099@alvh.no-ip.org
	1320343602-sup-2290@alvh.no-ip.org
	1339690386-sup-8927@alvh.no-ip.org
	4FE5FF020200002500048A3D@gw.wicourts.gov
	4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 12:04:59 -03:00
Tom Lane
75b39e7909 Add infrastructure for storing a VARIADIC ANY function's VARIADIC flag.
Originally we didn't bother to mark FuncExprs with any indication whether
VARIADIC had been given in the source text, because there didn't seem to be
any need for it at runtime.  However, because we cannot fold a VARIADIC ANY
function's arguments into an array (since they're not necessarily all the
same type), we do actually need that information at runtime if VARIADIC ANY
functions are to respond unsurprisingly to use of the VARIADIC keyword.
Add the missing field, and also fix ruleutils.c so that VARIADIC ANY
function calls are dumped properly.

Extracted from a larger patch that also fixes concat() and format() (the
only two extant VARIADIC ANY functions) to behave properly when VARIADIC is
specified.  This portion seems appropriate to review and commit separately.

Pavel Stehule
2013-01-21 20:26:15 -05:00
Robert Haas
841a5150c5 Add ddl_command_end support for event triggers.
Dimitri Fontaine, with slight changes by me
2013-01-21 18:00:24 -05:00
Tom Lane
535e69a43f Fix error-checking typo in check_TSCurrentConfig().
The code failed to detect an out-of-memory failure.

Xi Wang
2013-01-20 23:09:35 -05:00
Tom Lane
d5b31cc32b Fix an O(N^2) performance issue for sessions modifying many relations.
AtEOXact_RelationCache() scanned the entire relation cache at the end of
any transaction that created a new relation or assigned a new relfilenode.
Thus, clients such as pg_restore had an O(N^2) performance problem that
would start to be noticeable after creating 10000 or so tables.  Since
typically only a small number of relcache entries need any cleanup, we
can fix this by keeping a small list of their OIDs and doing hash_searches
for them.  We fall back to the full-table scan if the list overflows.

Ideally, the maximum list length would be set at the point where N
hash_searches would cost just less than the full-table scan.  Some quick
experimentation says that point might be around 50-100; I (tgl)
conservatively set MAX_EOXACT_LIST = 32.  For the case that we're worried
about here, which is short single-statement transactions, it's unlikely
there would ever be more than about a dozen list entries anyway; so it's
probably not worth being too tense about the value.

We could avoid the hash_searches by instead keeping the target relcache
entries linked into a list, but that would be noticeably more complicated
and bug-prone because of the need to maintain such a list in the face of
relcache entry drops.  Since a relcache entry can only need such cleanup
after a somewhat-heavyweight filesystem operation, trying to save a
hash_search per cleanup doesn't seem very useful anyway --- it's the scan
over all the not-needing-cleanup entries that we wish to avoid here.

Jeff Janes, reviewed and tweaked a bit by Tom Lane
2013-01-20 13:45:10 -05:00
Tom Lane
8ae35e9180 Improve memory space management in tuplesort and tuplestore.
The code originally just doubled the size of the tuple-pointer array so
long as that would fit in allowedMem.  This could result in failing to use
as much as half of allowedMem, if (as is typical) the last doubling attempt
didn't quite fit.  Worse, we might double the array size but be unable to
use most of the added slots, because there was no room left within the
allowedMem limit for tuples the slots should point to.  To fix, double only
so long as we've used less than half of allowedMem in total.  Then do one
more array enlargement, but scale it based on total memory consumption so
far.  This will work nicely as long as the average tuple size is reasonably
stable, and in any case should be better than the old method.

This change will result in large sort operations consuming a larger
fraction of work_mem than they typically did in the past.  The release
notes should mention that users may want to revisit their work_mem
settings, if they'd tuned those settings based on the old behavior of
sorting.

Jeff Janes, reviewed by Peter Geoghegan and Robert Haas
2013-01-17 13:12:56 -05:00
Magnus Hagander
bba486f372 Base the default SSL ciphers on DEFAULT instead of ALL
It's better to start from what the OpenSSL people consider a good
default and then remove insecure things (low encryption, exportable
encryption and md5 at this point) from that, instead of starting
from everything that exists and remove from that. We trust the
OpenSSL people to make good choices about what the default is.
2013-01-17 15:04:44 +01:00
Tom Lane
1b794d3f32 Fix hash_update_hash_key() to handle same-bucket case correctly.
Original coding would corrupt the hashtable if the item being updated was
at the end of its bucket chain and the new hash key hashed to that same
bucket.  Diagnosis and fix by Heikki Linnakangas.
2013-01-14 21:57:15 -05:00
Tom Lane
5c4eb9166e Reject out-of-range dates in to_date().
Dates outside the supported range could be entered, but would not print
reasonably, and operations such as conversion to timestamp wouldn't behave
sanely either.  Since this has the potential to result in undumpable table
data, it seems worth back-patching.

Hitoshi Harada
2013-01-14 15:19:48 -05:00
Tom Lane
2065dd2834 Prevent very-low-probability PANIC during PREPARE TRANSACTION.
The code in PostPrepare_Locks supposed that it could reassign locks to
the prepared transaction's dummy PGPROC by deleting the PROCLOCK table
entries and immediately creating new ones.  This was safe when that code
was written, but since we invented partitioning of the shared lock table,
it's not safe --- another process could steal away the PROCLOCK entry in
the short interval when it's on the freelist.  Then, if we were otherwise
out of shared memory, PostPrepare_Locks would have to PANIC, since it's
too late to back out of the PREPARE at that point.

Fix by inventing a dynahash.c function to atomically update a hashtable
entry's key.  (This might possibly have other uses in future.)

This is an ancient bug that in principle we ought to back-patch, but the
odds of someone hitting it in the field seem really tiny, because (a) the
risk window is small, and (b) nobody runs servers with maxed-out lock
tables for long, because they'll be getting non-PANIC out-of-memory errors
anyway.  So fixing it in HEAD seems sufficient, at least until the new
code has gotten some testing.
2013-01-13 22:20:22 -05:00
Peter Eisentraut
9d2cd99a60 Make spelling more uniform 2013-01-13 21:42:03 -05:00
Tom Lane
24dd0502a1 Update comments for elog_start().
Forgot I was going to do this as part of the previous patch ...
2013-01-13 18:50:48 -05:00
Tom Lane
31f38f28b0 Redesign the planner's handling of index-descent cost estimation.
Historically we've used a couple of very ad-hoc fudge factors to try to
get the right results when indexes of different sizes would satisfy a
query with the same number of index leaf tuples being visited.  In
commit 21a39de580 I tweaked one of these
fudge factors, with results that proved disastrous for larger indexes.
Commit bf01e34b55 fudged it some more,
but still with not a lot of principle behind it.

What seems like a better way to address these issues is to explicitly model
index-descent costs, since that's what's really at stake when considering
diferent indexes with similar leaf-page-level costs.  We tried that once
long ago, and found that charging random_page_cost per page descended
through was way too much, because upper btree levels tend to stay in cache
in real-world workloads.  However, there's still CPU costs to think about,
and the previous fudge factors can be seen as a crude attempt to account
for those costs.  So this patch replaces those fudge factors with explicit
charges for the number of tuple comparisons needed to descend the index
tree, plus a small charge per page touched in the descent.  The cost
multipliers are chosen so that the resulting charges are in the vicinity of
the historical (pre-9.2) fudge factors for indexes of up to about a million
tuples, while not ballooning unreasonably beyond that, as the old fudge
factor did (even more so in 9.2).

To make this work accurately for btree indexes, add some code that allows
extraction of the known root-page height from a btree.  There's no
equivalent number readily available for other index types, but we can use
the log of the number of index pages as an approximate substitute.

This seems like too much of a behavioral change to risk back-patching,
but it should improve matters going forward.  In 9.2 I'll just revert
the fudge-factor change.
2013-01-11 12:56:58 -05:00
Peter Eisentraut
49e7a26d67 Make some spelling more consistent 2013-01-05 08:25:21 -05:00
Tom Lane
94afbd5831 Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path.  And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead.  This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before.  This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.

By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.

An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones.  This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.

Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)

Heikki Linnakangas and Tom Lane
2013-01-04 17:42:19 -05:00
Alvaro Herrera
84f6fb81b8 Fix IsUnderPostmaster/EXEC_BACKEND confusion 2013-01-02 18:39:20 -03:00
Alvaro Herrera
15658911d9 Set MaxBackends only on bootstrap and standalone modes
... not on auxiliary processes.  I managed to overlook the fact that I
had disabled assertions on my HEAD checkout long ago.

Hopefully this will turn the buildfarm green again, and put an end to
today's silliness.
2013-01-02 17:57:56 -03:00
Alvaro Herrera
dfbba2c86c Make sure MaxBackends is always set
Auxiliary and bootstrap processes weren't getting it, causing initdb to
fail completely.
2013-01-02 14:39:11 -03:00
Alvaro Herrera
cdbc0ca48c Fix background workers for EXEC_BACKEND
Commit da07a1e8 was broken for EXEC_BACKEND because I failed to realize
that the MaxBackends recomputation needed to be duplicated by
subprocesses in SubPostmasterMain.  However, instead of having the value
be recomputed at all, it's better to assign the correct value at
postmaster initialization time, and have it be propagated to exec'ed
backends via BackendParameters.

MaxBackends stays as zero until after modules in
shared_preload_libraries have had a chance to register bgworkers, since
the value is going to be untrustworthy till that's finished.

Heikki Linnakangas and Álvaro Herrera
2013-01-02 12:01:14 -03:00
Bruce Momjian
bd61a623ac Update copyrights for 2013
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
2013-01-01 17:15:01 -05:00
Tom Lane
2ffa740be9 Fix ruleutils to cope with conflicts from adding/dropping/renaming columns.
In commit 11e131854f, we improved the
rule/view dumping code so that it would produce valid query representations
even if some of the tables involved in a query had been renamed since the
query was parsed.  This patch extends that idea to fix problems that occur
when individual columns are renamed, or added or dropped.  As before, the
core of the fix is to assign unique new aliases when a name conflict has
been created.  This is complicated by the JOIN USING feature, which
requires the same column alias to be used in both input relations, but we
can handle that with a sufficiently complex approach to assigning aliases.

A fortiori, this patch takes care of situations where the query didn't have
unique column names to begin with, such as in a recent complaint from Bryan
Nuse.  (Because of expansion of "SELECT *", re-parsing a dumped query can
require column name uniqueness even though the original text did not.)
2012-12-31 15:13:26 -05:00
Tom Lane
3f88b08003 Fix some minor issues in view pretty-printing.
Code review for commit 2f582f76b1: don't use
a static variable for what ought to be a deparse_context field, fix
non-multibyte-safe test for spaces, avoid useless and potentially O(N^2)
(though admittedly with a very small constant) calculations of wrap
positions when we aren't going to wrap.
2012-12-24 17:52:19 -05:00
Simon Riggs
ae9aba69a8 Keep rd_newRelfilenodeSubid across overflow.
Teach RelationCacheInvalidate() to keep rd_newRelfilenodeSubid across rel cache
message overflows, so that behaviour is now fully deterministic.

Noah Misch
2012-12-24 16:43:22 +00:00
Tom Lane
6919b7e329 Fix failure to ignore leftover temp tables after a server crash.
During crash recovery, we remove disk files belonging to temporary tables,
but the system catalog entries for such tables are intentionally not
cleaned up right away.  Instead, the first backend that uses a temp schema
is expected to clean out any leftover objects therein.  This approach
requires that we be careful to ignore leftover temp tables (since any
actual access attempt would fail), *even if their BackendId matches our
session*, if we have not yet established use of the session's corresponding
temp schema.  That worked fine in the past, but was broken by commit
debcec7dc3 which incorrectly removed the
rd_islocaltemp relcache flag.  Put it back, and undo various changes
that substituted tests like "rel->rd_backend == MyBackendId" for use
of a state-aware flag.  Per trouble report from Heikki Linnakangas.

Back-patch to 9.1 where the erroneous change was made.  In the back
branches, be careful to add rd_islocaltemp in a spot in the struct that
was alignment padding before, so as not to break existing add-on code.
2012-12-17 20:15:32 -05:00
Tom Lane
c299477229 Fix filling of postmaster.pid in bootstrap/standalone mode.
We failed to ever fill the sixth line (LISTEN_ADDR), which caused the
attempt to fill the seventh line (SHMEM_KEY) to fail, so that the shared
memory key never got added to the file in standalone mode.  This has been
broken since we added more content to our lock files in 9.1.

To fix, tweak the logic in CreateLockFile to add an empty LISTEN_ADDR
line in standalone mode.  This is a tad grotty, but since that function
already knows almost everything there is to know about the contents of
lock files, it doesn't seem that it's any better to hack it elsewhere.

It's not clear how significant this bug really is, since a standalone
backend should never have any children and thus it seems not critical
to be able to check the nattch count of the shmem segment externally.
But I'm going to back-patch the fix anyway.

This problem had escaped notice because of an ancient (and in hindsight
pretty dubious) decision to suppress LOG-level messages by default in
standalone mode; so that the elog(LOG) complaint in AddToDataDirLockFile
that should have warned of the problem didn't do anything.  Fixing that
is material for a separate patch though.
2012-12-16 15:02:49 -05:00
Andrew Dunstan
3717f0837b Tidy up from frontend Assert change.
Quiet compiler warnings noted by Peter Eisentraut.
2012-12-16 12:22:57 -05:00
Tom Lane
691c5ebf79 Add defenses against integer overflow in dynahash numbuckets calculations.
The dynahash code requires the number of buckets in a hash table to fit
in an int; but since we calculate the desired hash table size dynamically,
there are various scenarios where we might calculate too large a value.
The resulting overflow can lead to infinite loops, division-by-zero
crashes, etc.  I (tgl) had previously installed some defenses against that
in commit 299d171652, but that covered only one
call path.  Moreover it worked by limiting the request size to work_mem,
but in a 64-bit machine it's possible to set work_mem high enough that the
problem appears anyway.  So let's fix the problem at the root by installing
limits in the dynahash.c functions themselves.

Trouble report and patch by Jeff Davis.
2012-12-11 22:09:05 -05:00
Tom Lane
a99c42f291 Support automatically-updatable views.
This patch makes "simple" views automatically updatable, without the need
to create either INSTEAD OF triggers or INSTEAD rules.  "Simple" views
are those classified as updatable according to SQL-92 rules.  The rewriter
transforms INSERT/UPDATE/DELETE commands on such views directly into an
equivalent command on the underlying table, which will generally have
noticeably better performance than is possible with either triggers or
user-written rules.  A view that has INSTEAD OF triggers or INSTEAD rules
continues to operate the same as before.

For the moment, security_barrier views are not considered simple.
Also, we do not support WITH CHECK OPTION.  These features may be
added in future.

Dean Rasheed, reviewed by Amit Kapila
2012-12-08 18:26:21 -05:00
Alvaro Herrera
da07a1e856 Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code.  They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.

Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on.  Modules can install more than one
bgworker, if necessary.

Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration.  This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests.  Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)

The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).

In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing.  The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).

More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.

An elementary sample module is supplied.

Author: Álvaro Herrera

This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.

Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 17:47:30 -03:00
Simon Riggs
8de72b66a2 COPY FREEZE and mark committed on fresh tables.
When a relfilenode is created in this subtransaction or
a committed child transaction and it cannot otherwise
be seen by our own process, mark tuples committed ahead
of transaction commit for all COPY commands in same
transaction. If FREEZE specified on COPY
and pre-conditions met then rows will also be frozen.
Both options designed to avoid revisiting rows after commit,
increasing performance of subsequent commands after
data load and upgrade. pg_restore changes later.

Simon Riggs, review comments from Heikki Linnakangas, Noah Misch and design
input from Tom Lane, Robert Haas and Kevin Grittner
2012-12-01 12:54:20 +00:00
Tom Lane
3c84046490 Fix assorted bugs in CREATE/DROP INDEX CONCURRENTLY.
Commit 8cb53654db, which introduced DROP
INDEX CONCURRENTLY, managed to break CREATE INDEX CONCURRENTLY via a poor
choice of catalog state representation.  The pg_index state for an index
that's reached the final pre-drop stage was the same as the state for an
index just created by CREATE INDEX CONCURRENTLY.  This meant that the
(necessary) change to make RelationGetIndexList ignore about-to-die indexes
also made it ignore freshly-created indexes; which is catastrophic because
the latter do need to be considered in HOT-safety decisions.  Failure to
do so leads to incorrect index entries and subsequently wrong results from
queries depending on the concurrently-created index.

To fix, add an additional boolean column "indislive" to pg_index, so that
the freshly-created and about-to-die states can be distinguished.  (This
change obviously is only possible in HEAD.  This patch will need to be
back-patched, but in 9.2 we'll use a kluge consisting of overloading the
formerly-impossible state of indisvalid = true and indisready = false.)

In addition, change CREATE/DROP INDEX CONCURRENTLY so that the pg_index
flag changes they make without exclusive lock on the index are made via
heap_inplace_update() rather than a normal transactional update.  The
latter is not very safe because moving the pg_index tuple could result in
concurrent SnapshotNow scans finding it twice or not at all, thus possibly
resulting in index corruption.  This is a pre-existing bug in CREATE INDEX
CONCURRENTLY, which was copied into the DROP code.

In addition, fix various places in the code that ought to check to make
sure that the indexes they are manipulating are valid and/or ready as
appropriate.  These represent bugs that have existed since 8.2, since
a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid
index behind, and we ought not try to do anything that might fail with
such an index.

Also fix RelationReloadIndexInfo to ensure it copies all the pg_index
columns that are allowed to change after initial creation.  Previously we
could have been left with stale values of some fields in an index relcache
entry.  It's not clear whether this actually had any user-visible
consequences, but it's at least a bug waiting to happen.

In addition, do some code and docs review for DROP INDEX CONCURRENTLY;
some cosmetic code cleanup but mostly addition and revision of comments.

This will need to be back-patched, but in a noticeably different form,
so I'm committing it to HEAD before working on the back-patch.

Problem reported by Amit Kapila, diagnosis by Pavan Deolassee,
fix by Tom Lane and Andres Freund.
2012-11-28 21:26:01 -05:00
Alvaro Herrera
1577b46b7c Split out rmgr rm_desc functions into their own files
This is necessary (but not sufficient) to have them compilable outside
of a backend environment.
2012-11-28 13:01:15 -03:00
Heikki Linnakangas
1f67078ea3 Add OpenTransientFile, with automatic cleanup at end-of-xact.
Files opened with BasicOpenFile or PathNameOpenFile are not automatically
cleaned up on error. That puts unnecessary burden on callers that only want
to keep the file open for a short time. There is AllocateFile, but that
returns a buffered FILE * stream, which in many cases is not the nicest API
to work with. So add function called OpenTransientFile, which returns a
unbuffered fd that's cleaned up like the FILE* returned by AllocateFile().

This plugs a few rare fd leaks in error cases:

1. copy_file() - fixed by by using OpenTransientFile instead of BasicOpenFile
2. XLogFileInit() - fixed by adding close() calls to the error cases. Can't
   use OpenTransientFile here because the fd is supposed to persist over
   transaction boundaries.
3. lo_import/lo_export - fixed by using OpenTransientFile instead of
   PathNameOpenFile.

In addition to plugging those leaks, this replaces many BasicOpenFile() calls
with OpenTransientFile() that were not leaking, because the code meticulously
closed the file on error. That wasn't strictly necessary, but IMHO it's good
for robustness.

The same leaks exist in older versions, but given the rarity of the issues,
I'm not backpatching this. Not yet, anyway - it might be good to backpatch
later, after this mechanism has had some more testing in master branch.
2012-11-27 10:25:50 +02:00
Heikki Linnakangas
5cb0e33597 Speed up operations on numeric, mostly by avoiding palloc() overhead.
In many functions, a NumericVar was initialized from an input Numeric, to be
passed as input to a calculation function. When the NumericVar is not
modified, the digits array of the NumericVar can point directly to the digits
array in the original Numeric, and we can avoid a palloc() and memcpy(). Add
init_var_from_num() function to initialize a var like that.

Remove dscale argument from get_str_from_var(), as all the callers just
passed the dscale of the variable. That means that the rounding it used to
do was not actually necessary, and get_str_from_var() no longer scribbles on
its input. That makes it safer in general, and allows us to use the new
init_var_from_num() function in e.g numeric_out().

Also modified numericvar_to_int8() to no scribble on its input either. It
creates a temporary copy to avoid that. To compensate, the callers no longer
need to create a temporary copy, so the net # of pallocs is the same, but this
is nicer.

In the passing, use a constant for the number 10 in get_str_from_var_sci(),
when calculating 10^exponent. Saves a palloc() and some cycles to convert
integer 10 to numeric.

Original patch by Kyotaro HORIGUCHI, with further changes by me. Reviewed
by Pavel Stehule.
2012-11-21 15:53:35 +02:00
Tom Lane
1f7cb5c309 Improve handling of INT_MIN / -1 and related cases.
Some platforms throw an exception for this division, rather than returning
a necessarily-overflowed result.  Since we were testing for overflow after
the fact, an exception isn't nice.  We can avoid the problem by treating
division by -1 as negation.

Add some regression tests so that we'll find out if any compilers try to
optimize away the overflow check conditions.

This ought to be back-patched, but I'm going to see what the buildfarm
reports about the regression tests first.

Per discussion with Xi Wang, though this is different from the patch he
submitted.
2012-11-19 12:24:25 -05:00
Tom Lane
b6e3798f3a Limit values of archive_timeout, post_auth_delay, auth_delay.milliseconds.
The previous definitions of these GUC variables allowed them to range
up to INT_MAX, but in point of fact the underlying code would suffer
overflows or other errors with large values.  Reduce the maximum values
to something that won't misbehave.  There's no apparent value in working
harder than this, since very large delays aren't sensible for any of
these.  (Note: the risk with archive_timeout is that if we're late
checking the state, the timestamp difference it's being compared to
might overflow.  So we need some amount of slop; the choice of INT_MAX/2
is arbitrary.)

Per followup investigation of bug #7670.  Although this isn't a very
significant fix, might as well back-patch.
2012-11-18 17:15:06 -05:00
Tom Lane
d038966ddb Fix syslogger to not fail when log_rotation_age exceeds 2^31 milliseconds.
We need to avoid calling WaitLatch with timeouts exceeding INT_MAX.
Fortunately a simple clamp will do the trick, since no harm is done if
the wait times out before it's really time to rotate the log file.
Per bug #7670 (probably bug #7545 is the same thing, too).

In passing, fix bogus definition of log_rotation_age's maximum value in
guc.c --- it was numerically right, but only because MINS_PER_HOUR and
SECS_PER_MINUTE have the same value.

Back-patch to 9.2.  Before that, syslogger wasn't using WaitLatch.
2012-11-18 16:16:39 -05:00
Tom Lane
a235b85a0b Fix the int8 and int2 cases of (minimum possible integer) % (-1).
The correct answer for this (or any other case with arg2 = -1) is zero,
but some machines throw a floating-point exception instead of behaving
sanely.  Commit f9ac414c35 dealt with this
in int4mod, but overlooked the fact that it also happens in int8mod
(at least on my Linux x86_64 machine).  Protect int2mod as well; it's
not clear whether any machines fail there (mine does not) but since the
test is so cheap it seems better safe than sorry.  While at it, simplify
the original guard in int4mod: we need only check for arg2 == -1, we
don't need to check arg1 explicitly.

Xi Wang, with some editing by me.
2012-11-14 17:30:00 -05:00
Tom Lane
273986bf0d Fix memory leaks in record_out() and record_send().
record_out() leaks memory: it fails to free the strings returned by the
per-column output functions, and also is careless about detoasted values.
This results in a query-lifespan memory leakage when returning composite
values to the client, because printtup() runs the output functions in the
query-lifespan memory context.  Fix it to handle these issues the same way
printtup() does.  Also fix a similar leakage in record_send().

(At some point we might want to try to run output functions in
shorter-lived memory contexts, so that we don't need a zero-leakage policy
for them.  But that would be a significantly more invasive patch, which
doesn't seem like material for back-patching.)

In passing, use appendStringInfoCharMacro instead of appendStringInfoChar
in the innermost data-copying loop of record_out, to try to shave a few
cycles from this function's runtime.

Per trouble report from Carlos Henrique Reimer.  Back-patch to all
supported versions.
2012-11-13 14:45:26 -05:00
Heikki Linnakangas
dbdf9679d7 Use correct text domain for translating errcontext() messages.
errcontext() is typically used in an error context callback function, not
within an ereport() invocation like e.g errmsg and errdetail are. That means
that the message domain that the TEXTDOMAIN magic in ereport() determines
is not the right one for the errcontext() calls. The message domain needs to
be determined by the C file containing the errcontext() call, not the file
containing the ereport() call.

Fix by turning errcontext() into a macro that passes the TEXTDOMAIN to use
for the errcontext message. "errcontext" was used in a few places as a
variable or struct field name, I had to rename those out of the way, now
that errcontext is a macro.

We've had this problem all along, but this isn't doesn't seem worth
backporting. It's a fairly minor issue, and turning errcontext from a
function to a macro requires at least a recompile of any external code that
calls errcontext().
2012-11-12 17:07:29 +02:00
Heikki Linnakangas
add6c3179a Make the streaming replication protocol messages architecture-independent.
We used to send structs wrapped in CopyData messages, which works as long as
the client and server agree on things like endianess, timestamp format and
alignment. That's good enough for running a standby server, which has to run
on the same platform anyway, but it's useful for tools like pg_receivexlog
to work across platforms.

This breaks protocol compatibility of streaming replication, but we never
promised that to be compatible across versions, anyway.
2012-11-07 19:09:13 +02:00
Tom Lane
bf01e34b55 Tweak genericcostestimate's fudge factor for index size.
To provide some bias against using a large index when a small one would do
as well, genericcostestimate adds a "fudge factor", which for a long time
was random_page_cost * index_pages/10000.  However, this can grow to be the
dominant term in indexscan cost estimates when the index involved is large
enough, a behavior that was never intended.  Change to a ln(1 + n/10000)
formulation, which has nearly the same behavior up to a few hundred pages
but tails off significantly thereafter.  (A log curve seems correct on
first principles, since what we're trying to account for here is index
descent costs, which are typically logarithmic.)  Per bug #7619 from Niko
Kiirala.

Possibly this change should get back-patched, but I'm hesitant to mess with
cost estimates in stable branches.
2012-10-24 16:25:40 -04:00
Tom Lane
4e32f8cd14 Fix hash_search to avoid corruption of the hash table on out-of-memory.
An out-of-memory error during expand_table() on a palloc-based hash table
would leave a partially-initialized entry in the table.  This would not be
harmful for transient hash tables, since they'd get thrown away anyway at
transaction abort.  But for long-lived hash tables, such as the relcache
hash, this would effectively corrupt the table, leading to crash or other
misbehavior later.

To fix, rearrange the order of operations so that table enlargement is
attempted before we insert a new entry, rather than after adding it
to the hash table.

Problem discovered by Hitoshi Harada, though this is a bit different
from his proposed patch.
2012-10-19 15:24:03 -04:00
Tom Lane
0d6895051a Fix ruleutils to print "INSERT INTO foo DEFAULT VALUES" correctly.
Per bug #7615 from Marko Tiikkaja.  Apparently nobody ever tried this
case before ...
2012-10-19 13:39:51 -04:00
Tom Lane
002191a1a3 Further cleanup of catcache.c ilist changes.
Remove useless duplicate initialization of bucket headers, don't use a
dlist_mutable_iter in a performance-critical path that doesn't need it,
make some other cosmetic changes for consistency's sake.
2012-10-18 19:30:43 -04:00
Tom Lane
dc5aeca168 Remove unnecessary "head" arguments from some dlist/slist functions.
dlist_delete, dlist_insert_after, dlist_insert_before, slist_insert_after
do not need access to the list header, and indeed insisting on that negates
one of the main advantages of a doubly-linked list.

In consequence, revert addition of "cache_bucket" field to CatCTup.
2012-10-18 19:04:20 -04:00
Alvaro Herrera
a66ee69add Embedded list interface
Provide a common implementation of embedded singly-linked and
doubly-linked lists.  "Embedded" in the sense that the nodes'
next/previous pointers exist within some larger struct; this design
choice reduces memory allocation overhead.

Most of the implementation uses inlineable functions (where supported),
for performance.

Some existing uses of both types of lists have been converted to the new
code, for demonstration purposes.  Other uses can (and probably will) be
converted in the future.  Since dllist.c is unused after this conversion,
it has been removed.

Author: Andres Freund
Some tweaks by me
Reviewed by Tom Lane, Peter Geoghegan
2012-10-17 11:31:20 -03:00
Bruce Momjian
22cc3b35f4 When outputting the session id in log_line_prefix (%c) or in CSV log
output mode, cause the hex digits after the period to always be at least
four hex digits, with zero-padding.
2012-10-16 12:37:59 -04:00
Tom Lane
8b728e5c6e Fix oversight in new code for printing rangetable aliases.
In commit 11e131854f, I missed the case of
a CTE RTE that doesn't have a user-defined alias, but does have an
alias assigned by set_rtable_names().  Per report from Peter Eisentraut.

While at it, refactor slightly to reduce code duplication.
2012-10-12 16:14:43 -04:00
Tom Lane
71e58dcfb9 Make equal() ignore CoercionForm fields for better planning with casts.
This change ensures that the planner will see implicit and explicit casts
as equivalent for all purposes, except in the minority of cases where
there's actually a semantic difference (as reflected by having a 3-argument
cast function).  In particular, this fixes cases where the EquivalenceClass
machinery failed to consider two references to a varchar column as
equivalent if one was implicitly cast to text but the other was explicitly
cast to text, as seen in bug #7598 from Vaclav Juza.  We have had similar
bugs before in other parts of the planner, so I think it's time to fix this
problem at the core instead of continuing to band-aid around it.

Remove set_coercionform_dontcare(), which represents the band-aid
previously in use for allowing matching of index and constraint expressions
with inconsistent cast labeling.  (We can probably get rid of
COERCE_DONTCARE altogether, but I don't think removing that enum value in
back branches would be wise; it's possible there's third party code
referring to it.)

Back-patch to 9.2.  We could go back further, and might want to once this
has been tested more; but for the moment I won't risk destabilizing plan
choices in long-since-stable branches.
2012-10-12 12:11:22 -04:00
Heikki Linnakangas
6f60fdd701 Improve replication connection timeouts.
Rename replication_timeout to wal_sender_timeout, and add a new setting
called wal_receiver_timeout that does the same at the walreceiver side.
There was previously no timeout in walreceiver, so if the network went down,
for example, the walreceiver could take a long time to notice that the
connection was lost. Now with the two settings, both sides of a replication
connection will detect a broken connection similarly.

It is no longer necessary to manually set wal_receiver_status_interval to
a value smaller than the timeout. Both wal sender and receiver now
automatically send a "ping" message if more than 1/2 of the configured
timeout has elapsed, and it hasn't received any messages from the other end.

Amit Kapila, heavily edited by me.
2012-10-11 17:48:08 +03:00
Peter Eisentraut
8521d13194 Refactor flex and bison make rules
Numerous flex and bison make rules have appeared in the source tree
over time, and they are all virtually identical, so we can replace
them by pattern rules with some variables for customization.

Users of pgxs will also be able to benefit from this.
2012-10-11 06:57:04 -04:00
Tom Lane
26fe56481c Code review for 64-bit-large-object patch.
Fix broken-on-bigendian-machines byte-swapping functions, add missed update
of alternate regression expected file, improve error reporting, remove some
unnecessary code, sync testlo64.c with current testlo.c (it seems to have
been cloned from a very old copy of that), assorted cosmetic improvements.
2012-10-08 18:24:32 -04:00
Alvaro Herrera
878daf2e72 Fix thinko in previous commit
Since postgres.h includes palloc.h, definitions that affect the latter
must be present before the former is included.

Per buildfarm results
2012-10-08 18:33:08 -03:00
Alvaro Herrera
976fa10d20 Add support for easily declaring static inline functions
We already had those, but they forced modules to spell out the function
bodies twice.  Eliminate some duplicates we had already grown.

Extracted from a somewhat larger patch from Andres Freund.
2012-10-08 16:28:01 -03:00
Andrew Dunstan
33a7101281 Quiet a few MSC compiler warnings. 2012-10-07 17:31:10 -04:00
Tatsuo Ishii
461ef73f09 Add API for 64-bit large object access. Now users can access up to
4TB large objects (standard 8KB BLCKSZ case).  For this purpose new
libpq API lo_lseek64, lo_tell64 and lo_truncate64 are added.  Also
corresponding new backend functions lo_lseek64, lo_tell64 and
lo_truncate64 are added. inv_api.c is changed to handle 64-bit
offsets.

Patch contributed by Nozomi Anzai (backend side) and Yugo Nagata
(frontend side, docs, regression tests and example program). Reviewed
by Kohei Kaigai. Committed by Tatsuo Ishii with minor editings.
2012-10-07 08:36:48 +09:00
Tom Lane
1f91c8ca1d Avoid planner crash/Assert failure with joins to unflattened subqueries.
examine_simple_variable supposed that any RTE_SUBQUERY rel it gets pointed
at must have been planned already.  However, this isn't a safe assumption
because we must do selectivity estimation while generating indexscan paths,
and that code might look at join clauses involving a rel that the loop in
set_base_rel_sizes() hasn't reached yet.  The simplest fix is to play dumb
in such a situation, that is give up trying to extract any stats for the
Var.  This could possibly be improved by making a separate pass over the
RTE list to plan each unflattened subquery before we start the main
planning work --- but that would be pretty invasive and it doesn't seem
worth it, for now at least.  (We couldn't just break set_base_rel_sizes()
into two loops: the prescan would need to handle all subquery rels in the
query, not only those in the current join subproblem.)

This bug was introduced in commit 1cb108efb0,
although I think that subsequent changes may have exposed it more than it
was originally.  Per bug #7580 from Maxim Boguk.
2012-10-03 13:37:53 -04:00
Tom Lane
09ac603c36 Work around unportable behavior of malloc(0) and realloc(NULL, 0).
On some platforms these functions return NULL, rather than the more common
practice of returning a pointer to a zero-sized block of memory.  Hack our
various wrapper functions to hide the difference by substituting a size
request of 1.  This is probably not so important for the callers, who
should never touch the block anyway if they asked for size 0 --- but it's
important for the wrapper functions themselves, which mistakenly treated
the NULL result as an out-of-memory failure.  This broke at least pg_dump
for the case of no user-defined aggregates, as per report from
Matthew Carrington.

Back-patch to 9.2 to fix the pg_dump issue.  Given the lack of previous
complaints, it seems likely that there is no live bug in previous releases,
even though some of these functions were in place before that.
2012-10-02 17:32:42 -04:00
Heikki Linnakangas
0899556e92 Fix access past end of string in date parsing.
This affects date_in(), and a couple of other funcions that use DecodeDate().

Hitoshi Harada
2012-10-02 10:43:48 +03:00
Alvaro Herrera
ae90ffada4 Have pg_terminate/cancel_backend not ERROR on non-existent processes
This worked fine for superusers, but not for ordinary users trying to
cancel their own processes.  Tweak the order the checks are done in so
that we correctly return SIGNAL_BACKEND_ERROR (which current callers
know to ignore without erroring out) so that an ordinary user can loop
through a resultset without fearing that a process might exit in the
middle of said looping -- causing the remaining processes to go
unsignalled.

Incidentally, the last in-core caller of IsBackendPid() is now gone.
However, the function is exported and must remain in place, because
there are plenty of callers in external modules.

Author: Josh Kupershmidt

Reviewed by Noah Misch
2012-09-27 12:29:51 -03:00
Heikki Linnakangas
2a0c81a12c Add support for include_dir in config file.
This allows easily splitting configuration into many files, deployed in a
directory.

Magnus Hagander, Greg Smith, Selena Deckelmann, reviewed by Noah Misch.
2012-09-24 18:07:53 +03:00
Tom Lane
11e131854f Improve ruleutils.c's heuristics for dealing with rangetable aliases.
The previous scheme had bugs in some corner cases involving tables that had
been renamed since a view was made.  This could result in dumped views that
failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd
Albin, as well as in some pgsql-hackers discussion back in January.  Also,
its behavior for printing EXPLAIN plans was sometimes confusing because of
willingness to use the same alias for multiple RTEs (it was Ashutosh
Bapat's complaint about that aspect that started the January thread).

To fix, ensure that each RTE in the query has a unique unqualified alias,
by modifying the alias if necessary (we add "_" and digits as needed to
create a non-conflicting name).  Then we can just print its variables with
that alias, avoiding the confusing and bug-prone scheme of sometimes
schema-qualifying variable names.  In EXPLAIN, it proves to be expedient to
take the further step of only assigning such aliases to RTEs that are
actually referenced in the query, since the planner has a habit of
generating extra RTEs with the same alias in situations such as
inheritance-tree expansion.

Although this fixes a bug of very long standing, I'm hesitant to back-patch
such a noticeable behavioral change.  My experiments while creating a
regression test convinced me that actually incorrect output (as opposed to
confusing output) occurs only in very narrow cases, which is backed up by
the lack of previous complaints from the field.  So we may be better off
living with it in released branches; and in any case it'd be smart to let
this ripen awhile in HEAD before we consider back-patching it.
2012-09-21 19:03:10 -04:00
Heikki Linnakangas
7c45e3a3c6 Parse pg_ident.conf when it's loaded, keeping it in memory in parsed format.
Similar changes were done to pg_hba.conf earlier already, this commit makes
pg_ident.conf to behave the same as pg_hba.conf.

This has two user-visible effects. First, if pg_ident.conf contains multiple
errors, the whole file is parsed at postmaster startup time and all the
errors are immediately reported. Before this patch, the file was parsed and
the errors were reported only when someone tries to connect using an
authentication method that uses the file, and the parsing stopped on first
error. Second, if you SIGHUP to reload the config files, and the new
pg_ident.conf file contains an error, the error is logged but the old file
stays in effect.

Also, regular expressions in pg_ident.conf are now compiled only once when
the file is loaded, rather than every time the a user is authenticated. That
should speed up authentication if you have a lot of regexps in the file.

Amit Kapila
2012-09-21 17:54:39 +03:00
Heikki Linnakangas
9d5e9730e5 Fix obsolete comment.
load_hba and load_ident load stuff in a separate memory context nowadays,
not in the current memory context.
2012-09-21 15:22:56 +03:00
Tom Lane
3f828fae62 Fix array_typanalyze to work for domains over arrays.
Not sure how we missed this case, but we did.  Per bug #7551 from
Diego de Lima.
2012-09-18 00:31:40 -04:00
Tom Lane
d2286a98ef Allow embedded spaces without quoting in unix_socket_directories entries.
This fix removes an unnecessary incompatibility with the old behavior of
the unix_socket_directory parameter.  Since pathnames with embedded spaces
are fairly popular on some platforms, the incompatibility could be
significant in practice.  We'll still strip unquoted leading/trailing
spaces, however.

No docs update since the documentation already implied that it worked
like this.

Per bug #7514 from Murray Cumming.
2012-09-06 11:43:51 -04:00
Bruce Momjian
015722fb36 Fix to_date() and to_timestamp() to allow specification of the day of
the week via ISO or Gregorian designations.  The fix is to store the
day-of-week consistently as 1-7, Sunday = 1.

Fixes bug reported by Marc Munro
2012-09-03 22:52:44 -04:00
Tom Lane
58a031f920 Make configure probe for mbstowcs_l as well as wcstombs_l.
We previously supposed that any given platform would supply both or neither
of these functions, so that one configure test would be sufficient.  It now
appears that at least on AIX this is not the case ... which is likely an
AIX bug, but nonetheless we need to cope with it.  So use separate tests.
Per bug #6758; thanks to Andrew Hastie for doing the followup testing
needed to confirm what was happening.

Backpatch to 9.1, where we began using these functions.
2012-08-31 14:17:56 -04:00
Alvaro Herrera
c219d9b0a5 Split tuple struct defs from htup.h to htup_details.h
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.

I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h.  In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
2012-08-30 16:52:35 -04:00
Bruce Momjian
381a9ed66d Remove configure flag --disable-shared, as it is no longer used by any
port.  The last use was QNX, per Peter Eisentraut.
2012-08-30 16:26:53 -04:00
Heikki Linnakangas
3e6eb0dd0a Fix division by zero in the new range type histogram creation.
Report and analysis by Matthias.
2012-08-30 20:29:11 +03:00
Tom Lane
d1a4db8d25 Improve EXPLAIN's ability to cope with LATERAL references in plans.
push_child_plan/pop_child_plan didn't bother to adjust the "ancestors"
list of parent plan nodes when descending to a child plan node.  I think
this was okay when it was written, but it's not okay in the presence of
LATERAL references, since a subplan node could easily be returning a
LATERAL value back up to the same nestloop node that provides the value.
Per changed regression test results, the omission led to failure to
interpret Param nodes that have perfectly good interpretations.
2012-08-30 12:56:50 -04:00
Bruce Momjian
3825963e7f Report postmaster.pid file as empty if it is empty, rather than
reporting in contains invalid data.
2012-08-29 17:05:22 -04:00
Alvaro Herrera
21c09e99dc Split heapam_xlog.h from heapam.h
The heapam XLog functions are used by other modules, not all of which
are interested in the rest of the heapam API.  With this, we let them
get just the XLog stuff in which they are interested and not pollute
them with unrelated includes.

Also, since heapam.h no longer requires xlog.h, many files that do
include heapam.h no longer get xlog.h automatically, including a few
headers.  This is useful because heapam.h is getting pulled in by
execnodes.h, which is in turn included by a lot of files.
2012-08-28 19:02:00 -04:00
Alvaro Herrera
fda0594fc2 remove catcache.h from syscache.h
Instead, place a forward struct declaration for struct catclist in
syscache.h.  This reduces header proliferation somewhat.
2012-08-28 18:36:39 -04:00
Alvaro Herrera
45326c5a11 Split resowner.h
This lets files that are mere users of ResourceOwner not automatically
include the headers for stuff that is managed by the resowner mechanism.
2012-08-28 18:02:07 -04:00
Heikki Linnakangas
918eee0c49 Collect and use histograms of lower and upper bounds for range types.
This enables selectivity estimation of the <<, >>, &<, &> and && operators,
as well as the normal inequality operators: <, <=, >=, >. "range @> element"
is also supported, but the range-variant @> and <@ operators are not,
because they cannot be sensibly estimated with lower and upper bound
histograms alone. We would need to make some assumption about the lengths of
the ranges for that. Alexander's patch included a separate histogram of
lengths for that, but I left that out of the patch for simplicity. Hopefully
that will be added as a followup patch.

The fraction of empty ranges is also calculated and used in estimation.

Alexander Korotkov, heavily modified by me.
2012-08-27 15:58:46 +03:00
Bruce Momjian
3e1a373e2b Allow text timezone designations, e.g. "America/Chicago", when using the
ISO "T" timestamptz format.
2012-08-25 17:44:53 -04:00
Tom Lane
7abaa6b9d3 Fix issues with checks for unsupported transaction states in Hot Standby.
The GUC check hooks for transaction_read_only and transaction_isolation
tried to check RecoveryInProgress(), so as to disallow setting read/write
mode or serializable isolation level (respectively) in hot standby
sessions.  However, GUC check hooks can be called in many situations where
we're not connected to shared memory at all, resulting in a crash in
RecoveryInProgress().  Among other cases, this results in EXEC_BACKEND
builds crashing during child process start if default_transaction_isolation
is serializable, as reported by Heikki Linnakangas.  Protect those calls
by silently allowing any setting when not inside a transaction; which is
okay anyway since these GUCs are always reset at start of transaction.

Also, add a check to GetSerializableTransactionSnapshot() to complain
if we are in hot standby.  We need that check despite the one in
check_XactIsoLevel() because default_transaction_isolation could be
serializable.  We don't want to complain any sooner than this in such
cases, since that would prevent running transactions at all in such a
state; but a transaction can be run, if SET TRANSACTION ISOLATION is done
before setting a snapshot.  Per report some months ago from Robert Haas.

Back-patch to 9.1, since these problems were introduced by the SSI patch.

Kevin Grittner and Tom Lane, with ideas from Heikki Linnakangas
2012-08-24 13:09:04 -04:00
Tom Lane
ec8a0135c3 Fix cascading privilege revoke to notice when privileges are still held.
If we revoke a grant option from some role X, but X still holds the option
via another grant, we should not recursively revoke the privilege from
role(s) Y that X had granted it to.  This was supposedly fixed as one
aspect of commit 4b2dafcc0b, but I must not
have tested it, because in fact that code never worked: it forgot to shift
the grant-option bits back over when masking the bits being revoked.

Per bug #6728 from Daniel German.  Back-patch to all active branches,
since this has been wrong since 8.0.
2012-08-23 17:25:10 -04:00
Tom Lane
092d7ded29 Allow OLD and NEW in multi-row VALUES within rules.
Now that we have LATERAL, it's fairly painless to allow this case, which
was left as a TODO in the original multi-row VALUES implementation.
2012-08-19 14:12:16 -04:00
Tom Lane
470d0b9789 Check LIBXML_VERSION instead of testing in configure script.
We had put a test for libxml2's xmlStructuredErrorContext variable in
configure, but of course that doesn't work on Windows builds.  The next
best alternative seems to be to test the LIBXML_VERSION symbol provided
by xmlversion.h.

Per report from Talha Bin Rizwan, though this fixes it in a different way
than his proposed patch.
2012-08-17 00:05:26 -04:00
Tom Lane
56ba337e6f Suppress possibly-uninitialized-variable warning. 2012-08-16 12:04:07 -04:00
Heikki Linnakangas
317dd55a9c Add SP-GiST support for range types.
The implementation is a quad-tree, largely copied from the quad-tree
implementation for points. The lower and upper bound of ranges are the 2d
coordinates, with some extra code to handle empty ranges.

I left out the support for adjacent operator, -|-, from the original patch.
Not because there was necessarily anything wrong with it, but it was more
complicated than the other operators, and I only have limited time for
reviewing. That will follow as a separate patch.

Alexander Korotkov, reviewed by Jeff Davis and me.
2012-08-16 14:30:45 +03:00
Bruce Momjian
083b9133aa On second thought, explain why date_trunc("week") on interval values is
not supported in the error message, rather than the docs.
2012-08-15 16:48:05 -04:00
Tom Lane
17351fce4e Prevent access to external files/URLs via XML entity references.
xml_parse() would attempt to fetch external files or URLs as needed to
resolve DTD and entity references in an XML value, thus allowing
unprivileged database users to attempt to fetch data with the privileges
of the database server.  While the external data wouldn't get returned
directly to the user, portions of it could be exposed in error messages
if the data didn't parse as valid XML; and in any case the mere ability
to check existence of a file might be useful to an attacker.

The ideal solution to this would still allow fetching of references that
are listed in the host system's XML catalogs, so that documents can be
validated according to installed DTDs.  However, doing that with the
available libxml2 APIs appears complex and error-prone, so we're not going
to risk it in a security patch that necessarily hasn't gotten wide review.
So this patch merely shuts off all access, causing any external fetch to
silently expand to an empty string.  A future patch may improve this.

In HEAD and 9.2, also suppress warnings about undefined entities, which
would otherwise occur as a result of not loading referenced DTDs.  Previous
branches don't show such warnings anyway, due to different error handling
arrangements.

Credit to Noah Misch for first reporting the problem, and for much work
towards a solution, though this simplistic approach was not his preference.
Also thanks to Daniel Veillard for consultation.

Security: CVE-2012-3489
2012-08-14 18:31:16 -04:00
Bruce Momjian
03bda4535e Revert "commit_delay" change; just add comment that we don't have
a microsecond specification.
2012-08-14 16:26:08 -04:00
Bruce Momjian
e74727440c Add pg_settings units display for "commit_delay" (ms).
Also remove unnecessary units designation in postgresql.conf.sample.
2012-08-14 16:16:45 -04:00
Tom Lane
c9b0cbe98b Support having multiple Unix-domain sockets per postmaster.
Replace unix_socket_directory with unix_socket_directories, which is a list
of socket directories, and adjust postmaster's code to allow zero or more
Unix-domain sockets to be created.

This is mostly a straightforward change, but since the Unix sockets ought
to be created after the TCP/IP sockets for safety reasons (better chance
of detecting a port number conflict), AddToDataDirLockFile needs to be
fixed to support out-of-order updates of data directory lockfile lines.
That's a change that had been foreseen to be necessary someday anyway.

Honza Horak, reviewed and revised by Tom Lane
2012-08-10 17:27:15 -04:00
Robert Haas
21786db81f Fix cache flush hazard in event trigger cache.
Bug spotted by Jeff Davis using -DCLOBBER_CACHE_ALWAYS.
2012-08-08 16:38:37 -04:00
Bruce Momjian
2751740ab5 Add additional C comments for to_date/to_char() fixes. 2012-08-08 13:27:01 -04:00
Tom Lane
5ebaaa4944 Implement SQL-standard LATERAL subqueries.
This patch implements the standard syntax of LATERAL attached to a
sub-SELECT in FROM, and also allows LATERAL attached to a function in FROM,
since set-returning function calls are expected to be one of the principal
use-cases.

The main change here is a rewrite of the mechanism for keeping track of
which relations are visible for column references while the FROM clause is
being scanned.  The parser "namespace" lists are no longer lists of bare
RTEs, but are lists of ParseNamespaceItem structs, which carry an RTE
pointer as well as some visibility-controlling flags.  Aside from
supporting LATERAL correctly, this lets us get rid of the ancient hacks
that required rechecking subqueries and JOIN/ON and function-in-FROM
expressions for invalid references after they were initially parsed.
Invalid column references are now always correctly detected on sight.

In passing, remove assorted parser error checks that are now dead code by
virtue of our having gotten rid of add_missing_from, as well as some
comments that are obsolete for the same reason.  (It was mainly
add_missing_from that caused so much fudging here in the first place.)

The planner support for this feature is very minimal, and will be improved
in future patches.  It works well enough for testing purposes, though.

catversion bump forced due to new field in RangeTblEntry.
2012-08-07 19:02:54 -04:00
Robert Haas
eea65943c6 Fix memory leaks in event trigger code.
Spotted by Jeff Davis.
2012-08-07 17:00:16 -04:00
Bruce Momjian
ac78c4178b Fix to_char(), to_date(), and to_timestamp() to handle negative/BC
century specifications just like positive/AD centuries.  Previously the
behavior was either wrong or inconsistent with positive/AD handling.

Centuries without years now always assume the first year of the century,
which is now documented.
2012-08-07 13:34:44 -04:00
Alvaro Herrera
3a42a3ffd8 Fix redundant wording 2012-08-07 11:43:51 -04:00
Alvaro Herrera
f5f8e7169f Make strings identical 2012-08-06 12:45:08 -04:00
Tom Lane
3152bf722f Fix bugs with parsing signed hh:mm and hh:mm:ss fields in interval input.
DecodeInterval() failed to honor the "range" parameter (the special SQL
syntax for indicating which fields appear in the literal string) if the
time was signed.  This seems inappropriate, so make it work like the
not-signed case.  The inconsistency was introduced in my commit
f867339c01, which as noted in its log message
was only really focused on making SQL-compliant literals work per spec.
Including a sign here is not per spec, but if we're going to allow it
then it's reasonable to expect it to work like the not-signed case.

Also, remove bogus setting of tmask, which caused subsequent processing to
think that what had been given was a timezone and not an hh:mm(:ss) field,
thus confusing checks for redundant fields.  This seems to be an aboriginal
mistake in Lockhart's commit 2cf1642461.

Add regression test cases to illustrate the changed behaviors.

Back-patch as far as 8.4, where support for spec-compliant interval
literals was added.

Range problem reported and diagnosed by Amit Kapila, tmask problem by me.
2012-08-03 17:40:43 -04:00
Alvaro Herrera
d7b47e5155 Change syntax of new CHECK NO INHERIT constraints
The initially implemented syntax, "CHECK NO INHERIT (expr)" was not
deemed very good, so switch to "CHECK (expr) NO INHERIT" instead.  This
way it looks similar to SQL-standards compliant constraint attribute.

Backport to 9.2 where the new syntax and feature was introduced.

Per discussion.
2012-07-24 16:01:32 -04:00
Robert Haas
3a0e4d36eb Make new event trigger facility actually do something.
Commit 3855968f32 added syntax, pg_dump,
psql support, and documentation, but the triggers didn't actually fire.
With this commit, they now do.  This is still a pretty basic facility
overall because event triggers do not get a whole lot of information
about what the user is trying to do unless you write them in C; and
there's still no option to fire them anywhere except at the very
beginning of the execution sequence, but it's better than nothing,
and a good building block for future work.

Along the way, add a regression test for ALTER LARGE OBJECT, since
testing of event triggers reveals that we haven't got one.

Dimitri Fontaine and Robert Haas
2012-07-20 11:39:01 -04:00
Heikki Linnakangas
a7a4add6c4 Refactor the way code is shared between some range type functions.
Functions like range_eq, range_before etc. are exposed at the SQL-level, but
they're also used internally by the GiST consistent support function. The
code sharing was done by a hack, TrickFunctionCall2, which relied on the
knowledge that all the functions used fn_extra the same way. This commit
splits the functions into internal versions that take a TypeCacheEntry as
argument, and thin wrappers to expose the functions at the SQL-level. The
internal versions can then be called directly and in a less hacky way from
the GiST consistent function.

This is just cosmetic, but backpatch to 9.2 anyway, to avoid having a
different version of this code in the 9.2 branch. That would make
backpatching fixes in this area more difficult.

Alexander Korotkov
2012-07-18 23:14:56 +03:00
Robert Haas
3855968f32 Syntax support and documentation for event triggers.
They don't actually do anything yet; that will get fixed in a
follow-on commit.  But this gets the basic infrastructure in place,
including CREATE/ALTER/DROP EVENT TRIGGER; support for COMMENT,
SECURITY LABEL, and ALTER EXTENSION .. ADD/DROP EVENT TRIGGER;
pg_dump and psql support; and documentation for the anticipated
initial feature set.

Dimitri Fontaine, with review and a bunch of additional hacking by me.
Thom Brown extensively reviewed earlier versions of this patch set,
but there's not a whole lot of that code left in this commit, as it
turns out.
2012-07-18 10:16:16 -04:00
Alvaro Herrera
f34c68f096 Introduce timeout handling framework
Management of timeouts was getting a little cumbersome; what we
originally had was more than enough back when we were only concerned
about deadlocks and query cancel; however, when we added timeouts for
standby processes, the code got considerably messier.  Since there are
plans to add more complex timeouts, this seems a good time to introduce
a central timeout handling module.

External modules register their timeout handlers during process
initialization, and later enable and disable them as they see fit using
a simple API; timeout.c is in charge of keeping track of which timeouts
are in effect at any time, installing a common SIGALRM signal handler,
and calling setitimer() as appropriate to ensure timely firing of
external handlers.

timeout.c additionally supports pluggable modules to add their own
timeouts, though this capability isn't exercised anywhere yet.

Additionally, as of this commit, walsender processes are aware of
timeouts; we had a preexisting bug there that made those ignore SIGALRM,
thus being subject to unhandled deadlocks, particularly during the
authentication phase.  This has already been fixed in back branches in
commit 0bf8eb2a, which see for more details.

Main author: Zoltán Böszörményi
Some review and cleanup by Álvaro Herrera
Extensive reworking by Tom Lane
2012-07-16 22:55:33 -04:00
Peter Eisentraut
dd16f9480a Remove unreachable code
The Solaris Studio compiler warns about these instances, unlike more
mainstream compilers such as gcc.  But manual inspection showed that
the code is clearly not reachable, and we hope no worthy compiler will
complain about removing this code.
2012-07-16 22:15:03 +03:00
Tom Lane
54fd196ffc Prevent corner-case core dump in rfree().
rfree() failed to cope with the case that pg_regcomp() had initialized the
regex_t struct but then failed to allocate any memory for re->re_guts (ie,
the first malloc call in pg_regcomp() failed).  It would try to touch the
guts struct anyway, and thus dump core.  This is a sufficiently narrow
corner case that it's not surprising it's never been seen in the field;
but still a bug is a bug, so patch all active branches.

Noted while investigating whether we need to call pg_regfree after a
failure return from pg_regcomp.  Other than this bug, it turns out we
don't, so adjust comments appropriately.
2012-07-15 13:27:54 -04:00
Peter Eisentraut
a84bf4922e Avoid extra newlines in XML mapping in table forest mode
found by P. Broennimann
2012-07-12 23:52:50 +03:00
Tom Lane
84a42560c8 Add array_remove() and array_replace() functions.
These functions support removing or replacing array element value(s)
matching a given search value.  Although intended mainly to support a
future array-foreign-key feature, they seem useful in their own right.

Marco Nenciarini and Gabriele Bartolini, reviewed by Alex Hunsaker
2012-07-11 13:59:35 -04:00
Tom Lane
60e9c224a1 Fix ASCII case in pg_wchar2mule_with_len.
Also some cosmetic improvements for wchar-to-mblen patch.
2012-07-10 15:59:39 -04:00
Tom Lane
628cbb50ba Re-implement extraction of fixed prefixes from regular expressions.
To generate btree-indexable conditions from regex WHERE conditions (such as
WHERE indexed_col ~ '^foo'), we need to be able to identify any fixed
prefix that a regex might have; that is, find any string that must be a
prefix of all strings satisfying the regex.  We used to do that with
entirely ad-hoc code that looked at the source text of the regex.  It
didn't know very much about regex syntax, which mostly meant that it would
fail to identify some optimizable cases; but Viktor Rosenfeld reported that
it would produce actively wrong answers for quantified parenthesized
subexpressions, such as '^(foo)?bar'.  Rather than trying to extend the
ad-hoc code to cover this, let's get rid of it altogether in favor of
identifying prefixes by examining the compiled form of a regex.

To do this, I've added a new entry point "pg_regprefix" to the regex library;
hopefully it is defined in a sufficiently general fashion that it can remain
in the library when/if that code gets split out as a standalone project.

Since this bug has been there for a very long time, this fix needs to get
back-patched.  However it depends on some other recent commits (particularly
the addition of wchar-to-database-encoding conversion), so I'll commit this
separately and then go to work on back-porting the necessary fixes.
2012-07-10 14:54:37 -04:00
Tom Lane
00dac6000d Refactor pattern_fixed_prefix() to avoid dealing in incomplete patterns.
Previously, pattern_fixed_prefix() was defined to return whatever fixed
prefix it could extract from the pattern, plus the "rest" of the pattern.
That definition was sensible for LIKE patterns, but not so much for
regexes, where reconstituting a valid pattern minus the prefix could be
quite tricky (certainly the existing code wasn't doing that correctly).
Since the only thing that callers ever did with the "rest" of the pattern
was to pass it to like_selectivity() or regex_selectivity(), let's cut out
the middle-man and just have pattern_fixed_prefix's subroutines do this
directly.  Then pattern_fixed_prefix can return a simple selectivity
number, and the question of how to cope with partial patterns is removed
from its API specification.

While at it, adjust the API spec so that callers who don't actually care
about the pattern's selectivity (which is a lot of them) can pass NULL for
the selectivity pointer to skip doing the work of computing a selectivity
estimate.

This patch is only an API refactoring that doesn't actually change any
processing, other than allowing a little bit of useless work to be skipped.
However, it's necessary infrastructure for my upcoming fix to regex prefix
extraction, because after that change there won't be any simple way to
identify the "rest" of the regex, not even to the low level of fidelity
needed by regex_selectivity.  We can cope with that if regex_fixed_prefix
and regex_selectivity communicate directly, but not if we have to work
within the old API.  Hence, back-patch to all active branches.
2012-07-09 23:22:55 -04:00
Tom Lane
e7ef6d7e24 Fix planner to pass correct collation to operator selectivity estimators.
We can do this without creating an API break for estimation functions
by passing the collation using the existing fmgr functionality for
passing an input collation as a hidden parameter.

The need for this was foreseen at the outset, but we didn't get around to
making it happen in 9.1 because of the decision to sort all pg_statistic
histograms according to the database's default collation.  That meant that
selectivity estimators generally need to use the default collation too,
even if they're estimating for an operator that will do something
different.  The reason it's suddenly become more interesting is that
regexp interpretation also uses a collation (for its LC_TYPE not LC_COLLATE
property), and we no longer want to use the wrong collation when examining
regexps during planning.  It's not that the selectivity estimate is likely
to change much from this; rather that we are thinking of caching compiled
regexps during planner estimation, and we won't get the intended benefit
if we cache them with a different collation than the executor will use.

Back-patch to 9.1, both because the regexp change is likely to get
back-patched and because we might as well get this right in all
collation-supporting branches, in case any third-party code wants to
rely on getting the collation.  The patch turns out to be minuscule
now that I've done it ...
2012-07-08 23:51:08 -04:00
Bruce Momjian
3c9b406420 Run updated copyright.pl on HEAD and 9.2 trees, updating the psql
\copyright output to 2012.

Backpatch to 9.2.
2012-07-06 12:28:18 -04:00
Robert Haas
f6a05fd973 Fix failure of new wchar->mb functions to advance from pointer.
Bug spotted by Tom Lane.
2012-07-05 23:47:53 -04:00
Bruce Momjian
042d9ffc28 Run newly-configured perltidy script on Perl files.
Run on HEAD and 9.2.
2012-07-04 21:47:49 -04:00
Robert Haas
72dd6291f2 Add wchar -> mb conversion routines.
This is infrastructure for Alexander Korotkov's work on indexing regular
expression searches.

Alexander Korotkov, with a bit of further hackery on the MULE conversion
by me
2012-07-04 17:10:10 -04:00
Tom Lane
09022de1f5 Improve documentation about MULE encoding.
This commit improves the comments in pg_wchar.h and creates #define symbols
for some formerly hard-coded values.  No substantive code changes.

Tatsuo Ishii and Tom Lane
2012-07-04 00:29:57 -04:00
Peter Eisentraut
2b44306315 Assorted message style improvements 2012-07-02 21:12:46 +03:00
Tom Lane
41f4a0ab78 Fix to_date's handling of year 519.
A thinko in commit 029dfdf115 caused the year
519 to be handled differently from either adjacent year, which was not the
intention AFAICS.  Report and diagnosis by Marc Cousin.

In passing, remove redundant re-tests of year value.
2012-07-02 11:35:35 -04:00
Tom Lane
9ad45c18b6 Fix race condition in enum value comparisons.
When (re) loading the typcache comparison cache for an enum type's values,
use an up-to-date MVCC snapshot, not the transaction's existing snapshot.
This avoids problems if we encounter an enum OID that was created since our
transaction started.  Per report from Andres Freund and diagnosis by Robert
Haas.

To ensure this is safe even if enum comparison manages to get invoked
before we've set a transaction snapshot, tweak GetLatestSnapshot to
redirect to GetTransactionSnapshot instead of throwing error when
FirstSnapshotSet is false.  The existing uses of GetLatestSnapshot (in
ri_triggers.c) don't care since they couldn't be invoked except in a
transaction that's already done some work --- but it seems just conceivable
that this might not be true of enums, especially if we ever choose to use
enums in system catalogs.

Note that the comparable coding in enum_endpoint and enum_range_internal
remains GetTransactionSnapshot; this is perhaps debatable, but if we
changed it those functions would have to be marked volatile, which doesn't
seem attractive.

Back-patch to 9.1 where ALTER TYPE ADD VALUE was added.
2012-07-01 17:12:49 -04:00
Tom Lane
fa188b5ef5 Remove inappropriate semicolons after function definitions.
Solaris Studio warns about this, and some compilers might think it's an
outright syntax error.
2012-06-30 17:29:39 -04:00
Robert Haas
c5b3451a8e Add missing space in event_source GUC description.
This has apparently been wrong since event_source was added.

Alexander Lakhin
2012-06-28 08:15:50 -04:00
Robert Haas
c60ca19de9 Allow pg_terminate_backend() to be used on backends with matching role.
A similar change was made previously for pg_cancel_backend, so now it
all matches again.

Dan Farina, reviewed by Fujii Masao, Noah Misch, and Jeff Davis,
with slight kibitzing on the doc changes by me.
2012-06-26 16:16:52 -04:00
Alvaro Herrera
77ed0c6950 Tighten up includes in sinvaladt.h, twophase.h, proc.h
Remove proc.h from sinvaladt.h and twophase.h; also replace xlog.h in
proc.h with xlogdefs.h.
2012-06-25 18:40:40 -04:00
Peter Eisentraut
eeece9e609 Unify calling conventions for postgres/postmaster sub-main functions
There was a wild mix of calling conventions: Some were declared to
return void and didn't return, some returned an int exit code, some
claimed to return an exit code, which the callers checked, but
actually never returned, and so on.

Now all of these functions are declared to return void and decorated
with attribute noreturn and don't return.  That's easiest, and most
code already worked that way.
2012-06-25 21:30:12 +03:00
Peter Eisentraut
b8b2e3b2de Replace int2/int4 in C code with int16/int32
The latter was already the dominant use, and it's preferable because
in C the convention is that intXX means XX bits.  Therefore, allowing
mixed use of int2, int4, int8, int16, int32 is obviously confusing.

Remove the typedefs for int2 and int4 for now.  They don't seem to be
widely used outside of the PostgreSQL source tree, and the few uses
can probably be cleaned up by the time this ships.
2012-06-25 01:51:46 +03:00
Heikki Linnakangas
eeb6f37d89 Add a small cache of locks owned by a resource owner in ResourceOwner.
This speeds up reassigning locks to the parent owner, when the transaction
holds a lot of locks, but only a few of them belong to the current resource
owner. This is particularly helps pg_dump when dumping a large number of
objects.

The cache can hold up to 15 locks in each resource owner. After that, the
cache is marked as overflowed, and we fall back to the old method of
scanning the whole local lock table. The tradeoff here is that the cache has
to be scanned whenever a lock is released, so if the cache is too large,
lock release becomes more expensive. 15 seems enough to cover pg_dump, and
doesn't have much impact on lock release.

Jeff Janes, reviewed by Amit Kapila and Heikki Linnakangas.
2012-06-21 15:30:26 +03:00
Tom Lane
dfd9c116cc Remove incomplete/incorrect support for zero-column foreign keys.
The original coding in ri_triggers.c had partial support for the concept of
zero-column foreign key constraints.  But this is not defined in the SQL
standard, nor was it ever allowed by any other part of Postgres, nor was it
very fully implemented even here (eg there was no support for preventing
PK-table deletions that would violate the constraint).  Doesn't seem very
useful to carry 100-plus lines of code for a corner case that no one is
interested in making work.  Instead, just add a check that the column list
read from pg_constraint is non-empty.
2012-06-20 20:15:02 -04:00
Tom Lane
0ce4459a36 Increase MAX_SYSCACHE_CALLBACKS from 20 to 32.
By my count there are 18 callers of CacheRegisterSyscacheCallback in the
core code in HEAD, so we are potentially leaving as few as 2 slots for any
add-on code to use (though possibly not all these callers would actually
activate in any particular session).  That doesn't seem like a lot of
headroom, so let's pump it up a little.
2012-06-20 19:47:37 -04:00
Tom Lane
45ba424f33 Cache the results of ri_FetchConstraintInfo in a backend-local cache.
Extracting data from pg_constraint turned out to take as much as 10% of the
runtime in a bulk-update case where the foreign key column wasn't changing,
because we did it over again for each tuple.  Fix that by maintaining a
backend-local cache of the results.  This is really a pretty small patch,
but converting the trigger functions to work with pointers rather than
local struct variables requires a lot of mechanical changes.
2012-06-20 17:24:14 -04:00
Tom Lane
cfa0f4255b Improve tests for whether we can skip queueing RI enforcement triggers.
During an update of a PK row, we can skip firing the RI trigger if any old
key value is NULL, because then the row could not have had any matching
rows in the FK table.  Conversely, during an update of an FK row, the
outcome is determined if any new key value is NULL.  In either case it
becomes unnecessary to compare individual key values.

This patch was inspired by discussion of Vik Reykja's patch to use IS NOT
DISTINCT semantics for the key comparisons.  In the event there is no need
for that and so this patch looks nothing like his, but he should still get
credit for having re-opened consideration of the trigger skip logic.
2012-06-19 20:07:33 -04:00
Tom Lane
fe3db74002 Share RI trigger code between NO ACTION and RESTRICT cases.
These triggers are identical except for whether ri_Check_Pk_Match is to be
called, so factor out the common code to save a couple hundred lines.

Also, eliminate null-column checks in ri_Check_Pk_Match, since they're
duplicate with the calling functions and require unnecessary complication
in its API statement.

Simplify the way code is shared between RI_FKey_check_ins and
RI_FKey_check_upd, too.
2012-06-19 14:31:54 -04:00
Tom Lane
48756be9cf Improve comments about why SET DEFAULT triggers must recheck for matches.
I was confused about this, so try to make it clearer for the next person.

(This seems like a fairly inefficient way of dealing with a corner case,
but I don't have a better idea offhand.  Maybe if there were a way to turn
off the RI_FKey_keyequal_upd_fk event filter temporarily?)
2012-06-18 22:45:07 -04:00
Tom Lane
e8c9fd5fdf Allow ON UPDATE/DELETE SET DEFAULT plans to be cached.
Once upon a time, somebody was worried that cached RI plans wouldn't get
remade with new default values after ALTER TABLE ... SET DEFAULT, so they
didn't allow caching of plans for ON UPDATE/DELETE SET DEFAULT actions.
That time is long gone, though (and even at the time I doubt this was the
greatest hazard posed by ALTER TABLE...).  So allow these triggers to cache
their plans just like the others.

The cache_plan argument to ri_PlanCheck is now vestigial, since there
are no callers that don't pass "true"; but I left it alone in case there
is any future need for it.
2012-06-18 19:37:23 -04:00
Tom Lane
03a5ba24b0 Remove derived fields from RI_QueryKey, and do a bit of other cleanup.
We really only need the foreign key constraint's OID and the query type
code to uniquely identify each plan we are caching for FK checks.  The
other stuff that was in the struct had no business being used as part of
a hash key, and was all just being copied from struct RI_ConstraintInfo
anyway.  Get rid of the unnecessary fields, and readjust various function
APIs to make them use RI_ConstraintInfo not RI_QueryKey as info source.

I'd be surprised if this makes any measurable performance difference,
but it certainly feels cleaner.
2012-06-18 18:50:29 -04:00
Tom Lane
f9429746c9 Update SQL spec references in ri_triggers code to match SQL:2008.
Now that what we're implementing isn't SQL92, we probably shouldn't cite
chapter and verse in that spec anymore.  Also fix some comments that
talked about MATCH FULL but in fact were in code that's also used for
MATCH SIMPLE.

No code changes in this commit, just comments.
2012-06-18 12:19:38 -04:00
Tom Lane
c75be2ad60 Change ON UPDATE SET NULL/SET DEFAULT referential actions to meet SQL spec.
Previously, when executing an ON UPDATE SET NULL or SET DEFAULT action for
a multicolumn MATCH SIMPLE foreign key constraint, we would set only those
referencing columns corresponding to referenced columns that were changed.
This is what the SQL92 standard said to do --- but more recent versions
of the standard say that all referencing columns should be set to null or
their default values, no matter exactly which referenced columns changed.
At least for SET DEFAULT, that is clearly saner behavior.  It's somewhat
debatable whether it's an improvement for SET NULL, but it appears that
other RDBMS systems read the spec this way.  So let's do it like that.

This is a release-notable behavioral change, although considering that
our documentation already implied it was done this way, the lack of
complaints suggests few people use such cases.
2012-06-18 12:12:52 -04:00
Tom Lane
f5297bdfe4 Refer to the default foreign key match style as MATCH SIMPLE internally.
Previously we followed the SQL92 wording, "MATCH <unspecified>", but since
SQL99 there's been a less awkward way to refer to the default style.

In addition to the code changes, pg_constraint.confmatchtype now stores
this match style as 's' (SIMPLE) rather than 'u' (UNSPECIFIED).  This
doesn't affect pg_dump or psql because they use pg_get_constraintdef()
to reconstruct foreign key definitions.  But other client-side code might
examine that column directly, so this change will have to be marked as
an incompatibility in the 9.3 release notes.
2012-06-17 20:16:44 -04:00
Robert Haas
d2c86a1ccd Remove RELKIND_UNCATALOGED.
This may have been important at some point in the past, but it no
longer does anything useful.

Review by Tom Lane.
2012-06-14 09:47:30 -04:00
Tom Lane
80edfd7659 Revisit error message details for JSON input parsing.
Instead of identifying error locations only by line number (which could
be entirely unhelpful with long input lines), provide a fragment of the
input text too, placing this info in a new CONTEXT entry.  Make the
error detail messages conform more closely to style guidelines, fix
failure to expose some of them for translation, ensure compiler can
check formats against supplied parameters.
2012-06-13 19:43:35 -04:00
Tom Lane
f871ef74a5 Minor code review for json.c.
Improve commenting, conform to project style for use of ++ etc.
No functional changes.
2012-06-12 16:23:45 -04:00
Robert Haas
36b7e3da17 Mark JSON error detail messages for translation.
Per gripe from Tom Lane.
2012-06-12 10:41:38 -04:00
Bruce Momjian
927d61eeff Run pgindent on 9.2 source tree in preparation for first 9.3
commit-fest.
2012-06-10 15:20:04 -04:00
Tom Lane
3dd8e59681 Fix bogus handling of control characters in json_lex_string().
The original coding misbehaved if "char" is signed, and also made the
extremely poor decision to print control characters literally when trying
to complain about them.  Report and patch by Shigeru Hanada.

In passing, also fix core dump risk in report_parse_error() should the
parse state be something other than what it expects.
2012-06-04 20:43:57 -04:00
Tom Lane
33c6eaf78e Ignore SECURITY DEFINER and SET attributes for a PL's call handler.
It's not very sensible to set such attributes on a handler function;
but if one were to do so, fmgr.c went into infinite recursion because
it would call fmgr_security_definer instead of the handler function proper.
There is no way for fmgr_security_definer to know that it ought to call the
handler and not the original function referenced by the FmgrInfo's fn_oid,
so it tries to do the latter, causing the whole process to start over
again.

Ordinarily such misconfiguration of a procedural language's handler could
be written off as superuser error.  However, because we allow non-superuser
database owners to create procedural languages and the handler for such a
language becomes owned by the database owner, it is possible for a database
owner to crash the backend, which ideally shouldn't be possible without
superuser privileges.  In 9.2 and up we will adjust things so that the
handler functions are always owned by superusers, but in existing branches
this is a minor security fix.

Problem noted by Noah Misch (after several of us had failed to detect
it :-().  This is CVE-2012-2655.
2012-05-30 23:27:57 -04:00
Tom Lane
cd0ff9c0f4 Expand the allowed range of timezone offsets to +/-15:59:59 from Greenwich.
We used to only allow offsets less than +/-13 hours, then it was +/14,
then it was +/-15.  That's still not good enough though, as per today's bug
report from Patric Bechtel.  This time I actually looked through the Olson
timezone database to find the largest offsets used anywhere.  The winners
are Asia/Manila, at -15:56:00 until 1844, and America/Metlakatla, at
+15:13:42 until 1867.  So we'd better allow offsets less than +/-16 hours.

Given the history, we are way overdue to have some greppable #define
symbols controlling this, so make some ... and also remove an obsolete
comment that didn't get fixed the last time.

Back-patch to all supported branches.
2012-05-30 19:58:35 -04:00
Tom Lane
d3b97d1488 Fix string truncation to be multibyte-aware in text_name and bpchar_name.
Previously, casts to name could generate invalidly-encoded results.

Also, make these functions match namein() more exactly, by consistently
using palloc0() instead of ad-hoc zeroing code.

Back-patch to all supported branches.

Karl Schnaitter and Tom Lane
2012-05-25 17:34:51 -04:00
Tom Lane
efae4653c9 Update woefully-obsolete comment.
The accurate info about what's in a lock file has been in miscadmin.h
for some time, so let's just make this comment point there instead of
maintaining a duplicative copy.
2012-05-21 22:11:00 -04:00
Peter Eisentraut
f1f6737e15 Fix incorrect logic in JSON number lexer
Detectable by gcc -Wlogical-op.

Add two regression test cases that would previously allow incorrect
values to pass.
2012-05-20 02:24:46 +03:00
Peter Eisentraut
c8e086795a Remove whitespace from end of lines
pgindent and perltidy should clean up the rest.
2012-05-15 22:19:41 +03:00
Heikki Linnakangas
f15c2eae9c Remove unnecessary pg_verifymbstr() calls from tsvector/query in functions.
The input should've been validated well before it hits the input function.
Doing so again is a waste of cycles.
2012-05-14 14:30:32 +03:00
Heikki Linnakangas
9e4637bf89 Update comments that became out-of-date with the PGXACT struct.
When the "hot" members of PGPROC were split off to separate PGXACT structs,
many PGPROC fields referred to in comments were moved to PGXACT, but the
comments were neglected in the commit. Mostly this is just a search/replace
of PGPROC with PGXACT, but the way the dummy PGPROC entries are created for
prepared transactions changed more, making some of the comments totally
bogus.

Noah Misch
2012-05-14 10:28:55 +03:00
Peter Eisentraut
6bf1e7668d Small punctuation editing of postgresql.conf.sample 2012-05-14 04:50:39 +03:00
Simon Riggs
b762e8f50b Remove extraneous #include "storage/proc.h" 2012-05-11 14:46:46 +01:00
Simon Riggs
b06679e012 Ensure age() returns a stable value rather than the latest value 2012-05-11 14:36:24 +01:00
Simon Riggs
5829387381 Avoid xid error from age() function when run on Hot Standby 2012-05-09 13:56:24 +01:00
Bruce Momjian
ebcaa5fcde Remove BSD/OS (BSDi) port. There are no known users upgrading to
Postgres 9.2, and perhaps no existing users either.
2012-05-03 10:58:44 -04:00
Robert Haas
0038110421 Avoid repeated CLOG access from heap_hot_search_buffer.
At the time we check whether the tuple is dead to all running
transactions, we've already verified that it isn't visible to our
scan, setting hint bits if appropriate.  So there's no need to
recheck CLOG for the all-dead test we do just a moment later.
So, add HeapTupleIsSurelyDead() to test the appropriate condition
under the assumption that all relevant hit bits are already set.

Review by Tom Lane.
2012-05-02 12:40:07 -04:00
Heikki Linnakangas
f291ccd43e Remove duplicate words in comments.
Found these with grep -r "for for ".
2012-05-02 10:20:27 +03:00
Tom Lane
50c2d6a1a6 Kill some remaining references to SVR4 and univel.
Both terms still appear in a few places, but I thought it best to leave
those alone in context.
2012-05-02 00:29:17 -04:00
Tom Lane
809e7e21af Converge all SQL-level statistics timing values to float8 milliseconds.
This patch adjusts the core statistics views to match the decision already
taken for pg_stat_statements, that values representing elapsed time should
be represented as float8 and measured in milliseconds.  By using float8,
we are no longer tied to a specific maximum precision of timing data.
(Internally, it's still microseconds, but we could now change that without
needing changes at the SQL level.)

The columns affected are
pg_stat_bgwriter.checkpoint_write_time
pg_stat_bgwriter.checkpoint_sync_time
pg_stat_database.blk_read_time
pg_stat_database.blk_write_time
pg_stat_user_functions.total_time
pg_stat_user_functions.self_time
pg_stat_xact_user_functions.total_time
pg_stat_xact_user_functions.self_time

The first four of these are new in 9.2, so there is no compatibility issue
from changing them.  The others require a release note comment that they
are now double precision (and can show a fractional part) rather than
bigint as before; also their underlying statistics functions now match
the column definitions, instead of returning bigint microseconds.
2012-04-30 14:03:33 -04:00
Tom Lane
1dd89eadcd Rename I/O timing statistics columns to blk_read_time and blk_write_time.
This seems more consistent with the pre-existing choices for names of
other statistics columns.  Rename assorted internal identifiers to match.
2012-04-29 18:13:33 -04:00
Tom Lane
309c64745e Rename track_iotiming GUC to track_io_timing.
This spelling seems significantly more readable to me.
2012-04-29 16:23:54 -04:00
Peter Eisentraut
81107282a5 Change return type of ExceptionalCondition to void and mark it noreturn
In ancient times, it was thought that this wouldn't work because of
TrapMacro/AssertMacro, but changing those to use a comma operator
appears to work without compiler warnings.
2012-04-29 21:20:14 +03:00
Tom Lane
d6f7d4fdc5 Fix printing of whole-row Vars at top level of a SELECT targetlist.
Normally whole-row Vars are printed as "tabname.*".  However, that does not
work at top level of a targetlist, because per SQL standard the parser will
think that the "*" should result in column-by-column expansion; which is
not at all what a whole-row Var implies.  We used to just print the table
name in such cases, which works most of the time; but it fails if the table
name matches a column name available anywhere in the FROM clause.  This
could lead for instance to a view being interpreted differently after dump
and reload.  Adding parentheses doesn't fix it, but there is a reasonably
simple kluge we can use instead: attach a no-op cast, so that the "*" isn't
syntactically at top level anymore.  This makes the printing of such
whole-row Vars a lot more consistent with other Vars, and may indeed fix
more cases than just the reported one; I'm suspicious that cases involving
schema qualification probably didn't work properly before, either.

Per bug report and fix proposal from Abbas Butt, though this patch is quite
different in detail from his.

Back-patch to all supported versions.
2012-04-27 19:49:18 -04:00
Robert Haas
5d4b60f2f2 Lots of doc corrections.
Josh Kupershmidt
2012-04-23 22:43:09 -04:00
Robert Haas
85efd5f065 Reduce hash size for compute_array_stats, compute_tsvector_stats.
The size is only a hint, but a big hint chews up a lot of memory without
apparently improving performance much.

Analysis and patch by Noah Misch.
2012-04-23 22:05:41 -04:00
Alvaro Herrera
09ff76fcdb Recast "ONLY" column CHECK constraints as NO INHERIT
The original syntax wasn't universally loved, and it didn't allow its
usage in CREATE TABLE, only ALTER TABLE.  It now works everywhere, and
it also allows using ALTER TABLE ONLY to add an uninherited CHECK
constraint, per discussion.

The pg_constraint column has accordingly been renamed connoinherit.

This commit partly reverts some of the changes in
61d81bd28d, particularly some pg_dump and
psql bits, because now pg_get_constraintdef includes the necessary NO
INHERIT within the constraint definition.

Author: Nikhil Sontakke
Some tweaks by me
2012-04-20 23:56:57 -03:00
Robert Haas
293ec33c32 Remove bogus comment from HeapTupleSatisfiesNow.
This has been wrong for a really long time.  We don't use two-phase
locking to protect against serialization anomalies.

Per discussion on pgsql-hackers about 2011-03-07; original report
by Dan Ports.
2012-04-18 11:50:45 -04:00
Robert Haas
ea6a2d8d47 Rename synchronous_commit='write' to 'remote_write'.
Fujii Masao, per discussion on pgsql-hackers
2012-04-14 10:53:22 -04:00
Robert Haas
4a2d7ad76f pg_size_pretty(numeric)
The output of the new pg_xlog_location_diff function is of type numeric,
since it could theoretically overflow an int8 due to signedness; this
provides a convenient way to format such values.

Fujii Masao, with some beautification by me.
2012-04-14 08:07:25 -04:00
Peter Eisentraut
c0cc526e8b Rename bytea_agg to string_agg and add delimiter argument
Per mailing list discussion, we would like to keep the bytea functions
parallel to the text functions, so rename bytea_agg to string_agg,
which already exists for text.

Also, to satisfy the rule that we don't want aggregate functions of
the same name with a different number of arguments, add a delimiter
argument, just like string_agg for text already has.
2012-04-13 21:36:59 +03:00
Tom Lane
3769fa5fc6 Make pg_tablespace_location(0) return the database's default tablespace.
This definition is convenient when applying the function to the
reltablespace column of pg_class, since that's what zero means there;
and it doesn't interfere with any other plausible use of the function.
Per gripe from Bruce Momjian.
2012-04-10 21:43:14 -04:00
Tom Lane
0d9819f7e3 Measure epoch of timestamp-without-time-zone from local not UTC midnight.
This patch reverts commit 191ef2b407
and thereby restores the pre-7.3 behavior of EXTRACT(EPOCH FROM
timestamp-without-tz).  Per discussion, the more recent behavior was
misguided on a couple of grounds: it makes it hard to get a
non-timezone-aware epoch value for a timestamp, and it makes this one
case dependent on the value of the timezone GUC, which is incompatible
with having timestamp_part() labeled as immutable.

The other behavior is still available (in all releases) by explicitly
casting the timestamp to timestamp with time zone before applying EXTRACT.

This will need to be called out as an incompatible change in the 9.2
release notes.  Although having mutable behavior in a function marked
immutable is clearly a bug, we're not going to back-patch such a change.
2012-04-10 12:04:42 -04:00
Tom Lane
65fd91333e Fix an Assert that turns out to be reachable after all.
estimate_num_groups() gets unhappy with
	create table empty();
	select * from empty except select * from empty e2;
I can't see any actual use-case for such a query (and the table is illegal
per SQL spec), but it seems like a good idea that it not cause an assert
failure.
2012-04-09 11:58:24 -04:00
Tom Lane
95b9c333b2 Further adjustment of comment about qsort_tuple. 2012-04-07 17:48:40 -04:00
Tom Lane
17b985b1a0 Fix broken comparetup_datum code.
Commit 337b6f5ecf contained the entirely
fanciful assumption that it had made comparetup_datum unreachable.
Reported and patched by Takashi Yamamoto.

Fix up some not terribly accurate/useful comments from that commit, too.
2012-04-06 16:58:50 -04:00
Simon Riggs
8cb53654db Add DROP INDEX CONCURRENTLY [IF EXISTS], uses ShareUpdateExclusiveLock 2012-04-06 10:21:40 +01:00
Robert Haas
b736aef2ec Publish checkpoint timing information to pg_stat_bgwriter.
Greg Smith, Peter Geoghegan, and Robert Haas
2012-04-05 14:04:37 -04:00
Robert Haas
644828908f Expose track_iotiming data via the statistics collector.
Ants Aasma's original patch to add timing information for buffer I/O
requests exposed this data at the relation level, which was judged too
costly.  I've here exposed it at the database level instead.
2012-04-05 11:40:24 -04:00
Robert Haas
40b9b95769 New GUC, track_iotiming, to track I/O timings.
Currently, the only way to see the numbers this gathers is via
EXPLAIN (ANALYZE, BUFFERS), but the plan is to add visibility through
the stats collector and pg_stat_statements in subsequent patches.

Ants Aasma, reviewed by Greg Smith, with some further changes by me.
2012-03-27 14:55:02 -04:00
Tom Lane
c7cea267de Replace empty locale name with implied value in CREATE DATABASE and initdb.
setlocale() accepts locale name "" as meaning "the locale specified by the
process's environment variables".  Historically we've accepted that for
Postgres' locale settings, too.  However, it's fairly unsafe to store an
empty string in a new database's pg_database.datcollate or datctype fields,
because then the interpretation could vary across postmaster restarts,
possibly resulting in index corruption and other unpleasantness.

Instead, we should expand "" to whatever it means at the moment of calling
CREATE DATABASE, which we can do by saving the value returned by
setlocale().

For consistency, make initdb set up the initial lc_xxx parameter values the
same way.  initdb was already doing the right thing for empty locale names,
but it did not replace non-empty names with setlocale results.  On a
platform where setlocale chooses to canonicalize the spellings of locale
names, this would result in annoying inconsistency.  (It seems that popular
implementations of setlocale don't do such canonicalization, which is a
pity, but the POSIX spec certainly allows it to be done.)  The same risk
of inconsistency leads me to not venture back-patching this, although it
could certainly be seen as a longstanding bug.

Per report from Jeff Davis, though this is not his proposed patch.
2012-03-25 21:47:22 -04:00
Tom Lane
0339047bc9 Code review for protransform patches.
Fix loss of previous expression-simplification work when a transform
function fires: we must not simply revert to untransformed input tree.
Instead build a dummy FuncExpr node to pass to the transform function.
This has the additional advantage of providing a simpler, more uniform
API for transform functions.

Move documentation to a somewhat less buried spot, relocate some
poorly-placed code, be more wary of null constants and invalid typmod
values, add an opr_sanity check on protransform function signatures,
and some other minor cosmetic adjustments.

Note: although this patch touches pg_proc.h, no need for catversion
bump, because the changes are cosmetic and don't actually change the
intended catalog contents.
2012-03-23 17:29:57 -04:00
Peter Eisentraut
0e85abd658 Clean up compiler warnings from unused variables with asserts disabled
For those variables only used when asserts are enabled, use a new
macro PG_USED_FOR_ASSERTS_ONLY, which expands to
__attribute__((unused)) when asserts are not enabled.
2012-03-21 23:33:10 +02:00
Tom Lane
f70f095c90 Allow new relmapper entries when allow_system_table_mods is true.
This restores the pre-9.0 situation that it's possible to add new indexes
on pg_class and other mapped-but-not-shared catalogs, so long as you broke
the glass and flipped the big red Dont-Touch-Me switch.  As before, there
are a lot of gotchas, and you'd have to be pretty desperate to try this
on a production database; but there doesn't seem to be a reason for
relmapper.c to be preventing such things all by itself.  Per
experimentation with a case suggested by Cody Cutrer.
2012-03-21 14:09:39 -04:00
Robert Haas
aefa6d163e Add some CHECK_FOR_INTERRUPTS() calls to the heap-sort call path.
I broke this in commit 337b6f5ecf, which
among other things arranged for quicksorts to CHECK_FOR_INTERRUPTS()
slightly less frequently.  Sadly, it also arranged for heapsorts to
CHECK_FOR_INTERRUPTS() much less frequently.  Repair.
2012-03-20 21:26:39 -04:00
Tom Lane
9dbf2b7d75 Restructure SELECT INTO's parsetree representation into CreateTableAsStmt.
Making this operation look like a utility statement seems generally a good
idea, and particularly so in light of the desire to provide command
triggers for utility statements.  The original choice of representing it as
SELECT with an IntoClause appendage had metastasized into rather a lot of
places, unfortunately, so that this patch is a great deal more complicated
than one might at first expect.

In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
subcommands required restructuring some EXPLAIN-related APIs.  Add-on code
that calls ExplainOnePlan or ExplainOneUtility, or uses
ExplainOneQuery_hook, will need adjustment.

Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
which formerly were accepted though undocumented, are no longer accepted.
The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
The CREATE RULE case doesn't seem to have much real-world use (since the
rule would work only once before failing with "table already exists"),
so we'll not bother with that one.

Both SELECT INTO and CREATE TABLE AS still return a command tag of
"SELECT nnnn".  There was some discussion of returning "CREATE TABLE nnnn",
but for the moment backwards compatibility wins the day.

Andres Freund and Tom Lane
2012-03-19 21:38:12 -04:00
Peter Eisentraut
693ff85d47 backend: Fix minor memory leak in configuration file processing
Just for consistency with the other code paths.

found by Coverity
2012-03-16 20:34:59 +02:00
Peter Eisentraut
eb990a2b9e Add const qualifier to tzn returned by timestamp2tm()
The tzn value might come from tm->tm_zone, which libc declares as
const, so it's prudent that the upper layers know about this as well.
2012-03-15 21:17:19 +02:00
Peter Eisentraut
531e60aec0 Remove unused tzn arguments for timestamp2tm() 2012-03-15 21:13:35 +02:00
Peter Eisentraut
ad4fb0d0d2 Improve EncodeDateTime and EncodeTimeOnly APIs
Use an explicit argument to tell whether to include the time zone in
the output, rather than using some undocumented pointer magic.
2012-03-14 23:03:34 +02:00
Peter Eisentraut
bad250f4f3 Use correct sizeof operand in qsort call
Probably no practical impact, since all pointers ought to have the
same size, but it was wrong nonetheless.  Found by Coverity.
2012-03-12 20:56:13 +02:00
Peter Eisentraut
c9f310d377 Add comment for missing break in switch
For clarity, following other sites, and to silence Coverity.
2012-03-12 20:55:09 +02:00
Tom Lane
66a7e6bae9 Improve estimation of IN/NOT IN by assuming array elements are distinct.
In constructs such as "x IN (1,2,3,4)" and "x <> ALL(ARRAY[1,2,3,4])",
we formerly always used a general-purpose assumption that the probability
of success is independent for each comparison of "x" to an array element.
But in real-world usage of these constructs, that's a pretty poor
assumption; it's much saner to assume that the array elements are distinct
and so the match probabilities are disjoint.  Apply that assumption if the
operator appears to behave as equality (for ANY) or inequality (for ALL).
But fall back to the normal independent-probabilities calculation if this
yields an impossible result, ie probability > 1 or < 0.  We could protect
ourselves against bad estimates even more by explicitly checking for equal
array elements, but that is expensive and doesn't seem worthwhile: doing
it would amount to optimizing for poorly-written queries at the expense
of well-written ones.

Daniele Varrazzo and Tom Lane, after a suggestion by Ants Aasma
2012-03-07 22:59:49 -05:00
Tom Lane
d4bf3c9c94 Expose an API for calculating catcache hash values.
Now that cache invalidation callbacks get only a hash value, and not a
tuple TID (per commits 632ae6829f and
b5282aa893), the only way they can restrict
what they invalidate is to know what the hash values mean.  setrefs.c was
doing this via a hard-wired assumption but that seems pretty grotty, and
it'll only get worse as more cases come up.  So let's expose a calculation
function that takes the same parameters as SearchSysCache.  Per complaint
from Marko Kreen.
2012-03-07 14:51:13 -05:00
Tom Lane
19dbc34631 Add a hook for processing messages due to be sent to the server log.
Use-cases for this include custom log filtering rules and custom log
message transmission mechanisms (for instance, lossy log message
collection, which has been discussed several times recently).

As is our common practice for hooks, there's no regression test nor
user-facing documentation for this, though the author did exhibit a
sample module using the hook.

Martin Pihlak, reviewed by Marti Raudsepp
2012-03-06 15:35:41 -05:00
Tom Lane
80da9e68fd Rewrite GiST support code for rangetypes.
This patch installs significantly smarter penalty and picksplit functions
for ranges, making GiST indexes for them smaller and faster to search.

There is no on-disk format change, so no catversion bump, but you'd need
to REINDEX to get the benefits for any existing index.

Alexander Korotkov, reviewed by Jeff Davis
2012-03-04 22:50:06 -05:00
Tom Lane
e2eed78910 Remove useless "rough estimate" path from mcelem_array_contained_selec.
The code in this function that tried to cope with a missing count histogram
was quite ineffective for anything except a perfectly flat distribution.
Furthermore, since we were already punting for missing MCELEM slot, it's
rather useless to sweat over missing DECHIST: there are no cases where
ANALYZE will create the first but not the second.  So just simplify the
code by punting rather than pretending we can do something useful.
2012-03-04 16:03:38 -05:00
Tom Lane
4fb694aebc Improve histogram-filling loop in new compute_array_stats() code.
Do "frac" arithmetic in int64 to prevent overflow with large statistics
targets, and improve the comments so people have some chance of
understanding how it works.

Alexander Korotkov and Tom Lane
2012-03-04 15:40:16 -05:00
Tom Lane
0e5e167aae Collect and use element-frequency statistics for arrays.
This patch improves selectivity estimation for the array <@, &&, and @>
(containment and overlaps) operators.  It enables collection of statistics
about individual array element values by ANALYZE, and introduces
operator-specific estimators that use these stats.  In addition,
ScalarArrayOpExpr constructs of the forms "const = ANY/ALL (array_column)"
and "const <> ANY/ALL (array_column)" are estimated by treating them as
variants of the containment operators.

Since we still collect scalar-style stats about the array values as a
whole, the pg_stats view is expanded to show both these stats and the
array-style stats in separate columns.  This creates an incompatible change
in how stats for tsvector columns are displayed in pg_stats: the stats
about lexemes are now displayed in the array-related columns instead of the
original scalar-related columns.

There are a few loose ends here, notably that it'd be nice to be able to
suppress either the scalar-style stats or the array-element stats for
columns for which they're not useful.  But the patch is in good enough
shape to commit for wider testing.

Alexander Korotkov, reviewed by Noah Misch and Nathan Boley
2012-03-03 20:20:57 -05:00
Peter Eisentraut
6688d2878e Add COLLATION FOR expression
reviewed by Jaime Casanova
2012-03-02 21:12:16 +02:00
Tom Lane
5c02a00d44 Move CRC tables to libpgport, and provide them in a separate include file.
This makes it much more convenient to build tools for Postgres that are
separately compiled and require a matching CRC implementation.

To prevent multiple copies of the CRC polynomial tables being introduced
into the postgres binaries, they are now included in the static library
libpgport that is mainly meant for replacement system functions.  That
seems like a bit of a kludge, but there's no better place.

This cleans up building of the tools pg_controldata and pg_resetxlog,
which previously had to build their own copies of pg_crc.o.

In the future, external programs that need access to the CRC tables can
include the tables directly from the new header file pg_crc_tables.h.

Daniel Farina, reviewed by Abhijit Menon-Sen and Tom Lane
2012-02-28 19:53:39 -05:00
Peter Eisentraut
973e9fb294 Add const qualifiers where they are accidentally cast away
This only produces warnings under -Wcast-qual, but it's more correct
and consistent in any case.
2012-02-28 12:42:08 +02:00
Alvaro Herrera
cb3a7c2b95 ALTER TABLE: skip FK validation when it's safe to do so
We already skip rewriting the table in these cases, but we still force a
whole table scan to validate the data.  This can be skipped, and thus
we can make the whole ALTER TABLE operation just do some catalog touches
instead of scanning the table, when these two conditions hold:

(a) Old and new pg_constraint.conpfeqop match exactly.  This is actually
stronger than needed; we could loosen things by way of operator
families, but it'd require a lot more effort.

(b) The functions, if any, implementing a cast from the foreign type to
the primary opcintype are the same.  For this purpose, we can consider a
binary coercion equivalent to an exact type match.  When the opcintype
is polymorphic, require that the old and new foreign types match
exactly.  (Since ri_triggers.c does use the executor, the stronger check
for polymorphic types is no mere future-proofing.  However, no core type
exercises its necessity.)

Author: Noah Misch

Committer's note: catalog version bumped due to change of the Constraint
node.  I can't actually find any way to have such a node in a stored
rule, but given that we have "out" support for them, better be safe.
2012-02-27 19:10:24 -03:00
Peter Eisentraut
9cfd800aab Add some enumeration commas, for consistency 2012-02-24 11:04:45 +02:00
Andrew Dunstan
0c9e5d5e0d Correctly handle NULLs in JSON output.
Error reported by David Wheeler.
2012-02-23 23:44:16 -05:00
Peter Eisentraut
a445cb92ef Add parameters for controlling locations of server-side SSL files
This allows changing the location of the files that were previously
hard-coded to server.crt, server.key, root.crt, root.crl.

server.crt and server.key continue to be the default settings and are
thus required to be present by default if SSL is enabled.  But the
settings for the server-side CA and CRL are now empty by default, and
if they are set, the files are required to be present.  This replaces
the previous behavior of ignoring the functionality if the files were
not found.
2012-02-22 23:40:46 +02:00
Andrew Dunstan
6b044cb810 Fix typo, noticed by Will Crawford. 2012-02-21 11:03:51 -05:00
Andrew Dunstan
83fcaffea2 Fix a couple of cases of JSON output.
First, as noted by Itagaki Takahiro, a datum of type JSON doesn't
need to be escaped. Second, ensure that numeric output not in
the form of a legal JSON number is quoted and escaped.
2012-02-20 15:01:03 -05:00
Andrew Dunstan
2f582f76b1 Improve pretty printing of viewdefs.
Some line feeds are added to target lists and from lists to make
them more readable. By default they wrap at 80 columns if possible,
but the wrap column is also selectable - if 0 it wraps after every
item.

Andrew Dunstan, reviewed by Hitoshi Harada.
2012-02-19 11:43:46 -05:00
Tom Lane
4767bc8ff2 Improve statistics estimation to make some use of DISTINCT in sub-queries.
Formerly, we just punted when trying to estimate stats for variables coming
out of sub-queries using DISTINCT, on the grounds that whatever stats we
might have for underlying table columns would be inapplicable.  But if the
sub-query has only one DISTINCT column, we can consider its output variable
as being unique, which is useful information all by itself.  The scope of
this improvement is pretty narrow, but it costs nearly nothing, so we might
as well do it.  Per discussion with Andres Freund.

This patch differs from the draft I submitted yesterday in updating various
comments about vardata.isunique (to reflect its extended meaning) and in
tweaking the interaction with security_barrier views.  There does not seem
to be a reason why we can't use this sort of knowledge even when the
sub-query is such a view.
2012-02-16 17:34:00 -05:00
Tom Lane
4bfe68dfab Run a portal's cleanup hook immediately when pushing it to FAILED state.
This extends the changes of commit 6252c4f9e2
so that we run the cleanup hook earlier for failure cases as well as
success cases.  As before, the point is to avoid an assertion failure from
an Assert I added in commit a874fe7b4c, which
was meant to check that no user-written code can be called during portal
cleanup.  This fixes a case reported by Pavan Deolasee in which the Assert
could be triggered during backend exit (see the new regression test case),
and also prevents the possibility that the cleanup hook is run after
portions of the portal's state have already been recycled.  That doesn't
really matter in current usage, but it foreseeably could matter in the
future.

Back-patch to 9.1 where the Assert in question was added.
2012-02-15 16:19:01 -05:00
Robert Haas
edec8c8e00 Fix VPATH builds, broken by my recent commit to speed up tuplesorting.
The relevant commit is 337b6f5ecf.
2012-02-15 15:53:53 -05:00
Robert Haas
337b6f5ecf Speed up in-memory tuplesorting.
Per recent work by Peter Geoghegan, it's significantly faster to
tuplesort on a single sortkey if ApplySortComparator is inlined into
quicksort rather reached via a function pointer.  It's also faster
in general to have a version of quicksort which is specialized for
sorting SortTuple objects rather than objects of arbitrary size and
type.  This requires a couple of additional copies of the quicksort
logic, which in this patch are generate using a Perl script.  There
might be some benefit in adding further specializations here too,
but thus far it's not clear that those gains are worth their weight
in code footprint.
2012-02-15 12:13:32 -05:00