Commit Graph

23506 Commits

Author SHA1 Message Date
Peter Eisentraut b344c651fb Make init-po and update-po recursive make targets
This is for convenience, now that adding recursive targets is much
easier than it used to be when the NLS stuff was initially added.
2012-06-29 14:01:54 +03:00
Tom Lane ae90128dc5 Fix NOTIFY to cope with I/O problems, such as out-of-disk-space.
The LISTEN/NOTIFY subsystem got confused if SimpleLruZeroPage failed,
which would typically happen as a result of a write() failure while
attempting to dump a dirty pg_notify page out of memory.  Subsequently,
all attempts to send more NOTIFY messages would fail with messages like
"Could not read from file "pg_notify/nnnn" at offset nnnnn: Success".
Only restarting the server would clear this condition.  Per reports from
Kevin Grittner and Christoph Berg.

Back-patch to 9.0, where the problem was introduced during the
LISTEN/NOTIFY rewrite.
2012-06-29 00:51:34 -04:00
Tom Lane c1494b7330 Provide MAP_FAILED if sys/mman.h doesn't.
On old HPUX this has to be #defined to -1.  It might be that other values
are required on other dinosaur systems, but we'll worry about that when
and if we get reports.
2012-06-28 14:19:20 -04:00
Heikki Linnakangas 8f85667a86 Update outdated commit; xlp_rem_len field is in page header now.
Spotted by Amit Kapila
2012-06-28 20:35:18 +03:00
Peter Eisentraut dcd5af6c34 Further fix install program detection
The $(or) make function was introduced in GNU make 3.81, so the
previous coding didn't work in 3.80.  Write it differently, and
improve the variable naming to make more sense in the new coding.
2012-06-28 20:07:02 +03:00
Robert Haas 39715af23a Fix broken mmap failure-detection code, and improve error message.
Per an observation by Thom Brown that my previous commit made an
overly large shmem allocation crash the server, on Linux.
2012-06-28 12:57:22 -04:00
Robert Haas b0fc0df936 Dramatically reduce System V shared memory consumption.
Except when compiling with EXEC_BACKEND, we'll now allocate only a tiny
amount of System V shared memory (as an interlock to protect the data
directory) and allocate the rest as anonymous shared memory via mmap.
This will hopefully spare most users the hassle of adjusting operating
system parameters before being able to start PostgreSQL with a
reasonable value for shared_buffers.

There are a bunch of documentation updates needed here, and we might
need to adjust some of the HINT messages related to shared memory as
well.  But it's not 100% clear how portable this is, so before we
write the documentation, let's give it a spin on the buildfarm and
see what turns red.
2012-06-28 11:05:16 -04:00
Robert Haas c5b3451a8e Add missing space in event_source GUC description.
This has apparently been wrong since event_source was added.

Alexander Lakhin
2012-06-28 08:15:50 -04:00
Tom Lane bde689f809 Make UtilityContainsQuery recurse until it finds a non-utility Query.
The callers of UtilityContainsQuery want it to return a non-utility Query
if it returns anything at all.  However, since we made CREATE TABLE
AS/SELECT INTO into a utility command instead of a variant of SELECT,
a command like "EXPLAIN SELECT INTO" results in two nested utility
statements.  So what we need UtilityContainsQuery to do is drill down
to the bottom non-utility Query.

I had thought of this possibility in setrefs.c, and fixed it there by
looping around the UtilityContainsQuery call; but overlooked that the call
sites in plancache.c have a similar issue.  In those cases it's
notationally inconvenient to provide an external loop, so let's redefine
UtilityContainsQuery as recursing down to a non-utility Query instead.

Noted by Rushabh Lathia.  This is a somewhat cleaned-up version of his
proposed patch.
2012-06-27 23:18:30 -04:00
Peter Eisentraut f786715412 Fix install program detection
configure handles INSTALL as a substitution variable specially, and
apparently it gets confused when it's set to empty.  Use INSTALL_
instead as a workaround to avoid the issue.
2012-06-27 21:22:41 +03:00
Heikki Linnakangas a8f97b39c7 Fix two more neglected comments, still referring to log/seg.
Fujii Masao
2012-06-27 19:11:26 +03:00
Heikki Linnakangas ec786c6c81 I neglected many comments in the log+seg -> 64-bit segno patch. Fix.
Reported by Amit Kapila.
2012-06-27 17:53:53 +03:00
Peter Eisentraut 9db7ccae20 Use system install program when available and usable
In a3176dac22 we switched to using
install-sh unconditionally, because the configure check
AC_PROG_INSTALL would pick up any random program named install, which
has caused failure reports
(http://archives.postgresql.org/pgsql-hackers/2001-03/msg00312.php).
Now the configure check is much improved and should avoid false
positives.  It has also been shown that using a system install program
can significantly reduce "make install" times, so it's worth trying.
2012-06-27 13:40:51 +03:00
Robert Haas c60ca19de9 Allow pg_terminate_backend() to be used on backends with matching role.
A similar change was made previously for pg_cancel_backend, so now it
all matches again.

Dan Farina, reviewed by Fujii Masao, Noah Misch, and Jeff Davis,
with slight kibitzing on the doc changes by me.
2012-06-26 16:16:52 -04:00
Robert Haas b79ab00144 When LWLOCK_STATS is defined, count spindelays.
When LWLOCK_STATS is *not* defined, the only change is that
SpinLockAcquire now returns the number of delays.

Patch by me, review by Jeff Janes.
2012-06-26 16:06:07 -04:00
Tom Lane 757773602c Cope with smaller-than-normal BLCKSZ setting in SPGiST indexes on text.
The original coding failed miserably for BLCKSZ of 4K or less, as reported
by Josh Kupershmidt.  With the present design for text indexes, a given
inner tuple could have up to 256 labels (requiring either 3K or 4K bytes
depending on MAXALIGN), which means that we can't positively guarantee no
failures for smaller blocksizes.  But we can at least make it behave sanely
so long as there are few enough labels to fit on a page.  Considering that
btree is also more prone to "index tuple too large" failures when BLCKSZ is
small, it's not clear that we should expend more work than this on this
case.
2012-06-26 14:36:25 -04:00
Robert Haas 0caa0d04db Make DROP FUNCTION hint more informative.
If you decide you want to take the hint, this gives you something you
can paste right back to the server.

Dean Rasheed
2012-06-26 13:33:23 -04:00
Robert Haas 76837c1507 Reduce use of heavyweight locking inside hash AM.
Avoid using LockPage(rel, 0, lockmode) to protect against changes to
the bucket mapping.  Instead, an exclusive buffer content lock is now
viewed as sufficient permission to modify the metapage, and a shared
buffer content lock is used when such modifications need to be
prevented.  This more relaxed locking regimen makes it possible that,
when we're busy getting a heavyweight bucket on the bucket we intend
to search or insert into, a bucket split might occur underneath us.
To compenate for that possibility, we use a loop-and-retry system:
release the metapage content lock, acquire the heavyweight lock on the
target bucket, and then reacquire the metapage content lock and check
that the bucket mapping has not changed.   Normally it hasn't, and
we're done.  But if by chance it has, we simply unlock the metapage,
release the heavyweight lock we acquired previously, lock the new
bucket, and loop around again.  Even in the worst case we cannot loop
very many times here, since we don't split the same bucket again until
we've split all the other buckets, and 2^N gets big pretty fast.

This results in greatly improved concurrency, because we're
effectively replacing two lwlock acquire-and-release cycles in
exclusive mode (on one of the lock manager locks) with a single
acquire-and-release cycle in shared mode (on the metapage buffer
content lock).  Testing shows that it's still not quite as good as
btree; for that, we'd probably have to find some way of getting rid
of the heavyweight bucket locks as well, which does not appear
straightforward.

Patch by me, review by Jeff Janes.
2012-06-26 06:56:10 -04:00
Heikki Linnakangas 038f3a0509 Fix pg_upgrade, broken by the xlogid/segno -> 64-bit int refactoring.
The xlogid + segno representation of a particular WAL segment doesn't make
much sense in pg_resetxlog anymore, now that we don't use that anywhere
else. Use the WAL filename instead, since that's a convenient way to name a
particular WAL segment.

I did this partially for pg_resetxlog in the original xlogid/segno -> uint64
patch, but I neglected pg_upgrade and the docs. This should now be more
complete.
2012-06-26 07:49:02 +03:00
Tom Lane 8a504a3639 Make pg_dump emit more accurate dependency information.
While pg_dump has included dependency information in archive-format output
ever since 7.3, it never made any large effort to ensure that that
information was actually useful.  In particular, in common situations where
dependency chains include objects that aren't separately emitted in the
dump, the dependencies shown for objects that were emitted would reference
the dump IDs of these un-dumped objects, leaving no clue about which other
objects the visible objects indirectly depend on.  So far, parallel
pg_restore has managed to avoid tripping over this misfeature, but only
by dint of some crude hacks like not trusting dependency information in
the pre-data section of the archive.

It seems prudent to do something about this before it rises up to bite us,
so instead of emitting the "raw" dependencies of each dumped object,
recursively search for its actual dependencies among the subset of objects
that are being dumped.

Back-patch to 9.2, since that code hasn't yet diverged materially from
HEAD.  At some point we might need to back-patch further, but right now
there are no known cases where this is actively necessary.  (The one known
case, bug #6699, is fixed in a different way by my previous patch.)  Since
this patch depends on 9.2 changes that made TOC entries be marked before
output commences as to whether they'll be dumped, back-patching further
would require additional surgery; and as of now there's no evidence that
it's worth the risk.
2012-06-25 21:21:18 -04:00
Tom Lane a1ef01fe16 Improve pg_dump's dependency-sorting logic to enforce section dump order.
As of 9.2, with the --section option, it is very important that the concept
of "pre data", "data", and "post data" sections of the output be honored
strictly; else a dump divided into separate sectional files might be
unrestorable.  However, the dependency-sorting logic knew nothing of
sections and would happily select output orderings that didn't fit that
structure.  Doing so was mostly harmless before 9.2, but now we need to be
sure it doesn't do that.  To fix, create dummy objects representing the
section boundaries and add dependencies between them and all the normal
objects.  (This might sound expensive but it seems to only add a percent or
two to pg_dump's runtime.)

This also fixes a problem introduced in 9.1 by the feature that allows
incomplete GROUP BY lists when a primary key is given in GROUP BY.
That means that views can depend on primary key constraints.  Previously,
pg_dump would deal with that by simply emitting the primary key constraint
before the view definition (and hence before the data section of the
output).  That's bad enough for simple serial restores, where creating an
index before the data is loaded works, but is undesirable for speed
reasons.  But it could lead to outright failure of parallel restores, as
seen in bug #6699 from Joe Van Dyk.  That happened because pg_restore would
switch into parallel mode as soon as it reached the constraint, and then
very possibly would try to emit the view definition before the primary key
was committed (as a consequence of another bug that causes the view not to
be correctly marked as depending on the constraint).  Adding the section
boundary constraints forces the dependency-sorting code to break the view
into separate table and rule declarations, allowing the rule, and hence the
primary key constraint it depends on, to revert to their intended location
in the post-data section.  This also somewhat accidentally works around the
bogus-dependency-marking problem, because the rule will be correctly shown
as depending on the constraint, so parallel pg_restore will now do the
right thing.  (We will fix the bogus-dependency problem for real in a
separate patch, but that patch is not easily back-portable to 9.1, so the
fact that this patch is enough to dodge the only known symptom is
fortunate.)

Back-patch to 9.1, except for the hunk that adds verification that the
finished archive TOC list is in correct section order; the place where
it was convenient to add that doesn't exist in 9.1.
2012-06-25 21:21:17 -04:00
Alvaro Herrera 77ed0c6950 Tighten up includes in sinvaladt.h, twophase.h, proc.h
Remove proc.h from sinvaladt.h and twophase.h; also replace xlog.h in
proc.h with xlogdefs.h.
2012-06-25 18:40:40 -04:00
Peter Eisentraut eeece9e609 Unify calling conventions for postgres/postmaster sub-main functions
There was a wild mix of calling conventions: Some were declared to
return void and didn't return, some returned an int exit code, some
claimed to return an exit code, which the callers checked, but
actually never returned, and so on.

Now all of these functions are declared to return void and decorated
with attribute noreturn and don't return.  That's easiest, and most
code already worked that way.
2012-06-25 21:30:12 +03:00
Robert Haas c7d47abd04 Fix typo in DEBUG message, introduced by recent WAL refactoring.
Fujii Masao
2012-06-25 14:00:35 -04:00
Robert Haas a6427f1f47 Unbreak pg_resetxlog -l.
Fujii Masao
2012-06-25 13:58:38 -04:00
Robert Haas 2dfa87bcb6 Remove sanity test in XRecOffIsValid.
Commit 061e7efb1b changed the rules
for splitting xlog records across pages, but neglected to update this
test.  It's possible that there's some better action here than just
removing the test completely, but this at least appears to get some
of the things that are currently broken (like initdb on MacOS X)
working again.
2012-06-25 12:14:43 -04:00
Kevin Grittner 5c7f954d31 Fix warning for 64-bit literal on 32-bit build. 2012-06-25 07:25:00 -05:00
Peter Eisentraut b8b2e3b2de Replace int2/int4 in C code with int16/int32
The latter was already the dominant use, and it's preferable because
in C the convention is that intXX means XX bits.  Therefore, allowing
mixed use of int2, int4, int8, int16, int32 is obviously confusing.

Remove the typedefs for int2 and int4 for now.  They don't seem to be
widely used outside of the PostgreSQL source tree, and the few uses
can probably be cleaned up by the time this ships.
2012-06-25 01:51:46 +03:00
Heikki Linnakangas 7eb8c78514 I missed some references to xlogid/xrecoff in Win32-only code. Fix. 2012-06-24 22:14:31 +03:00
Heikki Linnakangas 0687a26002 Use UINT64CONST for 64-bit integer constants.
Peter Eisentraut advised me that UINT64CONST is the proper way to do that,
not LL suffix.
2012-06-24 21:56:45 +03:00
Heikki Linnakangas a218e23a08 Oops. Remove stray paren.
I didn't notice this on my laptop as I don't HAVE_FSYNC_WRITETHROUGH.
2012-06-24 20:03:57 +03:00
Heikki Linnakangas 96ff85e2dd Use LL suffix for 64-bit constants.
Per warning from buildfarm member 'locust'. At least I think this what's
making it upset.
2012-06-24 20:01:55 +03:00
Heikki Linnakangas 0ab9d1c4b3 Replace XLogRecPtr struct with a 64-bit integer.
This simplifies code that needs to do arithmetic on XLogRecPtrs.

To avoid changing on-disk format of data pages, the LSN on data pages is
still stored in the old format. That should keep pg_upgrade happy. However,
we have XLogRecPtrs embedded in the control file, and in the structs that
are sent over the replication protocol, so this changes breaks compatibility
of pg_basebackup and server. I didn't do anything about this in this patch,
per discussion on -hackers, the right thing to do would to be to change the
replication protocol to be architecture-independent, so that you could use
a newer version of pg_receivexlog, for example, against an older server
version.
2012-06-24 19:19:45 +03:00
Heikki Linnakangas 061e7efb1b Allow WAL record header to be split across pages.
This saves a few bytes of WAL space, but the real motivation is to make it
predictable how much WAL space a record requires, as it no longer depends
on whether we need to waste the last few bytes at end of WAL page because
the header doesn't fit.

The total length field of WAL record, xl_tot_len, is moved to the beginning
of the WAL record header, so that it is still always found on the first page
where a WAL record begins.

Bump WAL version number again as this is an incompatible change.
2012-06-24 18:35:56 +03:00
Heikki Linnakangas 20ba5ca64c Move WAL continuation record information to WAL page header.
The continuation record only contained one field, xl_rem_len, so it makes
things simpler to just include it in the WAL page header. This wastes four
bytes on pages that don't begin with a continuation from previos page, plus
four bytes on every page, because of padding.

The motivation of this is to make it easier to calculate how much space a
WAL record needs. Before this patch, it depended on how many page boundaries
the record crosses. The motivation of that, in turn, is to separate the
allocation of space in the WAL from the copying of the record data to the
allocated space. Keeping the calculation of space required simple helps to
keep the critical section of allocating the space from WAL short. But that's
not included in this patch yet.

Bump WAL version number again, as this is an incompatible change.
2012-06-24 18:35:30 +03:00
Heikki Linnakangas dfda6ebaec Don't waste the last segment of each 4GB logical log file.
The comments claimed that wasting the last segment made it easier to do
calculations with XLogRecPtrs, because you don't have problems representing
last-byte-position-plus-1 that way. In my experience, however, it only made
things more complicated, because the there was two ways to represent the
boundary at the beginning of a logical log file: logid = n+1 and xrecoff = 0,
or as xlogid = n and xrecoff = 4GB - XLOG_SEG_SIZE. Some functions were
picky about which representation was used.

Also, use a 64-bit segment number instead of the log/seg combination, to
point to a certain WAL segment. We assume that all platforms have a working
64-bit integer type nowadays.

This is an incompatible change in WAL format, so bumping WAL version number.
2012-06-24 18:35:29 +03:00
Tom Lane d14241c2cf Fix memory leak in ARRAY(SELECT ...) subqueries.
Repeated execution of an uncorrelated ARRAY_SUBLINK sub-select (which
I think can only happen if the sub-select is embedded in a larger,
correlated subquery) would leak memory for the duration of the query,
due to not reclaiming the array generated in the previous execution.
Per bug #6698 from Armando Miraglia.  Diagnosis and fix idea by Heikki,
patch itself by me.

This has been like this all along, so back-patch to all supported versions.
2012-06-21 17:27:19 -04:00
Alvaro Herrera 68d0e3cbf9 Repair comment mangled by a pgindent run long ago 2012-06-21 15:37:05 -04:00
Heikki Linnakangas eeb6f37d89 Add a small cache of locks owned by a resource owner in ResourceOwner.
This speeds up reassigning locks to the parent owner, when the transaction
holds a lot of locks, but only a few of them belong to the current resource
owner. This is particularly helps pg_dump when dumping a large number of
objects.

The cache can hold up to 15 locks in each resource owner. After that, the
cache is marked as overflowed, and we fall back to the old method of
scanning the whole local lock table. The tradeoff here is that the cache has
to be scanned whenever a lock is released, so if the cache is too large,
lock release becomes more expensive. 15 seems enough to cover pg_dump, and
doesn't have much impact on lock release.

Jeff Janes, reviewed by Amit Kapila and Heikki Linnakangas.
2012-06-21 15:30:26 +03:00
Tom Lane dfd9c116cc Remove incomplete/incorrect support for zero-column foreign keys.
The original coding in ri_triggers.c had partial support for the concept of
zero-column foreign key constraints.  But this is not defined in the SQL
standard, nor was it ever allowed by any other part of Postgres, nor was it
very fully implemented even here (eg there was no support for preventing
PK-table deletions that would violate the constraint).  Doesn't seem very
useful to carry 100-plus lines of code for a corner case that no one is
interested in making work.  Instead, just add a check that the column list
read from pg_constraint is non-empty.
2012-06-20 20:15:02 -04:00
Tom Lane 0ce4459a36 Increase MAX_SYSCACHE_CALLBACKS from 20 to 32.
By my count there are 18 callers of CacheRegisterSyscacheCallback in the
core code in HEAD, so we are potentially leaving as few as 2 slots for any
add-on code to use (though possibly not all these callers would actually
activate in any particular session).  That doesn't seem like a lot of
headroom, so let's pump it up a little.
2012-06-20 19:47:37 -04:00
Tom Lane 45ba424f33 Cache the results of ri_FetchConstraintInfo in a backend-local cache.
Extracting data from pg_constraint turned out to take as much as 10% of the
runtime in a bulk-update case where the foreign key column wasn't changing,
because we did it over again for each tuple.  Fix that by maintaining a
backend-local cache of the results.  This is really a pretty small patch,
but converting the trigger functions to work with pointers rather than
local struct variables requires a lot of mechanical changes.
2012-06-20 17:24:14 -04:00
Tom Lane cfa0f4255b Improve tests for whether we can skip queueing RI enforcement triggers.
During an update of a PK row, we can skip firing the RI trigger if any old
key value is NULL, because then the row could not have had any matching
rows in the FK table.  Conversely, during an update of an FK row, the
outcome is determined if any new key value is NULL.  In either case it
becomes unnecessary to compare individual key values.

This patch was inspired by discussion of Vik Reykja's patch to use IS NOT
DISTINCT semantics for the key comparisons.  In the event there is no need
for that and so this patch looks nothing like his, but he should still get
credit for having re-opened consideration of the trigger skip logic.
2012-06-19 20:07:33 -04:00
Alvaro Herrera 11b335ac4c pg_dump: Fix verbosity level in LO progress messages
In passing, reword another instance of the same message that was
gratuitously different.

Author: Josh Kupershmidt
after a bug report by Bosco Rama
2012-06-19 17:20:23 -04:00
Tom Lane fe3db74002 Share RI trigger code between NO ACTION and RESTRICT cases.
These triggers are identical except for whether ri_Check_Pk_Match is to be
called, so factor out the common code to save a couple hundred lines.

Also, eliminate null-column checks in ri_Check_Pk_Match, since they're
duplicate with the calling functions and require unnecessary complication
in its API statement.

Simplify the way code is shared between RI_FKey_check_ins and
RI_FKey_check_upd, too.
2012-06-19 14:31:54 -04:00
Tom Lane 48756be9cf Improve comments about why SET DEFAULT triggers must recheck for matches.
I was confused about this, so try to make it clearer for the next person.

(This seems like a fairly inefficient way of dealing with a corner case,
but I don't have a better idea offhand.  Maybe if there were a way to turn
off the RI_FKey_keyequal_upd_fk event filter temporarily?)
2012-06-18 22:45:07 -04:00
Tom Lane e8c9fd5fdf Allow ON UPDATE/DELETE SET DEFAULT plans to be cached.
Once upon a time, somebody was worried that cached RI plans wouldn't get
remade with new default values after ALTER TABLE ... SET DEFAULT, so they
didn't allow caching of plans for ON UPDATE/DELETE SET DEFAULT actions.
That time is long gone, though (and even at the time I doubt this was the
greatest hazard posed by ALTER TABLE...).  So allow these triggers to cache
their plans just like the others.

The cache_plan argument to ri_PlanCheck is now vestigial, since there
are no callers that don't pass "true"; but I left it alone in case there
is any future need for it.
2012-06-18 19:37:23 -04:00
Tom Lane 03a5ba24b0 Remove derived fields from RI_QueryKey, and do a bit of other cleanup.
We really only need the foreign key constraint's OID and the query type
code to uniquely identify each plan we are caching for FK checks.  The
other stuff that was in the struct had no business being used as part of
a hash key, and was all just being copied from struct RI_ConstraintInfo
anyway.  Get rid of the unnecessary fields, and readjust various function
APIs to make them use RI_ConstraintInfo not RI_QueryKey as info source.

I'd be surprised if this makes any measurable performance difference,
but it certainly feels cleaner.
2012-06-18 18:50:29 -04:00
Peter Eisentraut e1e97e9313 pg_dump: Add missing newlines at end of messages 2012-06-18 23:57:00 +03:00
Tom Lane f9429746c9 Update SQL spec references in ri_triggers code to match SQL:2008.
Now that what we're implementing isn't SQL92, we probably shouldn't cite
chapter and verse in that spec anymore.  Also fix some comments that
talked about MATCH FULL but in fact were in code that's also used for
MATCH SIMPLE.

No code changes in this commit, just comments.
2012-06-18 12:19:38 -04:00
Tom Lane c75be2ad60 Change ON UPDATE SET NULL/SET DEFAULT referential actions to meet SQL spec.
Previously, when executing an ON UPDATE SET NULL or SET DEFAULT action for
a multicolumn MATCH SIMPLE foreign key constraint, we would set only those
referencing columns corresponding to referenced columns that were changed.
This is what the SQL92 standard said to do --- but more recent versions
of the standard say that all referencing columns should be set to null or
their default values, no matter exactly which referenced columns changed.
At least for SET DEFAULT, that is clearly saner behavior.  It's somewhat
debatable whether it's an improvement for SET NULL, but it appears that
other RDBMS systems read the spec this way.  So let's do it like that.

This is a release-notable behavioral change, although considering that
our documentation already implied it was done this way, the lack of
complaints suggests few people use such cases.
2012-06-18 12:12:52 -04:00
Tom Lane f5297bdfe4 Refer to the default foreign key match style as MATCH SIMPLE internally.
Previously we followed the SQL92 wording, "MATCH <unspecified>", but since
SQL99 there's been a less awkward way to refer to the default style.

In addition to the code changes, pg_constraint.confmatchtype now stores
this match style as 's' (SIMPLE) rather than 'u' (UNSPECIFIED).  This
doesn't affect pg_dump or psql because they use pg_get_constraintdef()
to reconstruct foreign key definitions.  But other client-side code might
examine that column directly, so this change will have to be marked as
an incompatibility in the 9.3 release notes.
2012-06-17 20:16:44 -04:00
Peter Eisentraut bb7520cc26 Make documentation of --help and --version options more consistent
Before, some places didn't document the short options (-? and -V),
some documented both, some documented nothing, and they were listed in
various orders.  Now this is hopefully more consistent and complete.
2012-06-18 02:46:59 +03:00
Tom Lane 9e18eacbdf Fix stats collector to recover nicely when system clock goes backwards.
Formerly, if the system clock went backwards, the stats collector would
fail to update the stats file any more until the clock reading again
exceeds whatever timestamp was last written into the stats file.  Such
glitches in the clock's behavior are not terribly unlikely on machines
not using NTP.  Such a scenario has been observed to cause regression test
failures in the buildfarm, and it could have bad effects on the behavior
of autovacuum, so it seems prudent to install some defenses.

We could directly detect the clock going backwards by adding
GetCurrentTimestamp calls in the stats collector's main loop, but that
would hurt performance on platforms where GetCurrentTimestamp is expensive.
To minimize the performance hit in normal cases, adopt a more complicated
scheme wherein backends check for clock skew when reading the stats file,
and if they see it, signal the stats collector by sending an extra stats
inquiry message.  The stats collector does an extra GetCurrentTimestamp
only when it receives an inquiry with an apparently out-of-order
timestamp.

To avoid unnecessary GetCurrentTimestamp calls, expand the inquiry messages
to carry the backend's current clock reading as well as its stats cutoff
time.  The latter, being intentionally slightly in-the-past, would trigger
more clock rechecks than we need if it were used for this purpose.

We might want to backpatch this change at some point, but let's let it
shake out in the buildfarm for awhile first.
2012-06-17 17:11:49 -04:00
Bruce Momjian 47463a8098 Remove 'for' loop perltidy argument, and move args to perltidyrc file.
Backpatch to 9.2.

Per suggestion from Noah Misch
2012-06-16 10:12:50 -04:00
Bruce Momjian 0acd978259 In pgindent, suppress reading the perltidy RC file using --noprofile. 2012-06-15 22:50:02 -04:00
Bruce Momjian d6e0207437 Update pgindent Perl indentation instructions based on feedback from
Àlvaro and Noah Misch.

Backpatch to 9.2.
2012-06-15 22:43:23 -04:00
Peter Eisentraut 15b1918e7d Improve reporting of permission errors for array types
Because permissions are assigned to element types, not array types,
complaining about permission denied on an array type would be
misleading to users.  So adjust the reporting to refer to the element
type instead.

In order not to duplicate the required logic in two dozen places,
refactor the permission denied reporting for types a bit.

pointed out by Yeb Havinga during the review of the type privilege
feature
2012-06-15 22:55:03 +03:00
Peter Eisentraut d933092e0a Add more message pluralization
Even though we can't do much about the case with multiple plurals in
one sentence, we can fix the other cases.
2012-06-15 02:02:02 +03:00
Robert Haas 8507c2f856 Improve readability and error messages in pg_backup_start_time.
Gurjeet Singh, with corrections by me.
2012-06-14 15:20:08 -04:00
Robert Haas 68de499bda New SQL functons pg_backup_in_progress() and pg_backup_start_time()
Darold Gilles, reviewed by Gabriele Bartolini and others, rebased by
Marco Nenciarini.  Stylistic cleanup and OID fixes by me.
2012-06-14 13:25:43 -04:00
Robert Haas cd80073445 During transaction cleanup, release locks before deleting files.
There's no need to hold onto the locks until the files are needed,
and by doing it this way, we reduce the impact on other backends who
may be awaiting locks we hold.

Noah Misch
2012-06-14 10:19:33 -04:00
Robert Haas 6cd015bea3 Add new function log_newpage_buffer.
When I implemented the ginbuildempty() function as part of
implementing unlogged tables, I falsified the note in the header
comment for log_newpage.  Although we could fix that up by changing
the comment, it seems cleaner to add a new function which is
specifically intended to handle this case.  So do that.
2012-06-14 10:11:16 -04:00
Robert Haas a475c60367 Remove misplaced sanity check from heap_create().
Even when allow_system_table_mods is not set, we allow creation of any
type of SQL object in pg_catalog, except for relations.  And you can
get relations into pg_catalog, too, by initially creating them in some
other schema and then moving them with ALTER .. SET SCHEMA.  So this
restriction, which prevents relations (only) from being created in
pg_catalog directly, is fairly pointless.  If we need a safety mechanism
for this, it should be placed further upstream, so that it affects all
SQL objects uniformly, and picks up both CREATE and SET SCHEMA.

For now, just rip it out, per discussion with Tom Lane.
2012-06-14 09:58:53 -04:00
Robert Haas d2c86a1ccd Remove RELKIND_UNCATALOGED.
This may have been important at some point in the past, but it no
longer does anything useful.

Review by Tom Lane.
2012-06-14 09:47:30 -04:00
Robert Haas 7582e0be78 Make \conninfo print SSL information.
Alastair Turner, per suggestion from Bruce Momjian.
2012-06-14 09:43:14 -04:00
Tom Lane 80491a1983 Add 9.2 branch to git_changelog's list. 2012-06-13 22:23:31 -04:00
Tom Lane f32609db72 Flesh out RELEASE_CHANGES instructions for branching in git.
We have this info in the wiki, but it should be here too.
2012-06-13 22:11:06 -04:00
Tom Lane 357c549334 Stamp library minor versions for 9.3.
This includes fixing the MSVC copy of ecpg/preproc's version info, which
seems to have been overlooked repeatedly.  Can't we fix that so there are
not two copies??
2012-06-13 22:06:26 -04:00
Tom Lane bed88fceac Stamp HEAD as 9.3devel.
Let the hacking begin ...
2012-06-13 20:03:02 -04:00
Tom Lane 80edfd7659 Revisit error message details for JSON input parsing.
Instead of identifying error locations only by line number (which could
be entirely unhelpful with long input lines), provide a fragment of the
input text too, placing this info in a new CONTEXT entry.  Make the
error detail messages conform more closely to style guidelines, fix
failure to expose some of them for translation, ensure compiler can
check formats against supplied parameters.
2012-06-13 19:43:35 -04:00
Tom Lane b8b69d8990 Revert "Reduce checkpoints and WAL traffic on low activity database server"
This reverts commit 18fb9d8d21.  Per
discussion, it does not seem like a good idea to allow committed changes to
go un-checkpointed indefinitely, as could happen in a low-traffic server;
that makes us entirely reliant on the WAL stream with no redundancy that
might aid data recovery in case of disk failure.

This re-introduces the original problem of hot-standby setups generating a
small continuing stream of WAL traffic even when idle, but there are other
ways to address that without compromising crash recovery, so we'll revisit
that issue in a future release cycle.
2012-06-13 18:48:44 -04:00
Tom Lane c3bc76bdb0 Deprecate use of GLOBAL and LOCAL in temp table creation.
Aside from adjusting the documentation to say that these are deprecated,
we now report a warning (not an error) for use of GLOBAL, since it seems
fairly likely that we might change that to request SQL-spec-compliant temp
table behavior in the foreseeable future.  Although our handling of LOCAL
is equally nonstandard, there is no evident interest in ever implementing
SQL modules, and furthermore some other products interpret LOCAL as
behaving the same way we do.  So no expectation of change and no warning
for LOCAL; but it still seems a good idea to deprecate writing it.

Noah Misch
2012-06-13 17:48:42 -04:00
Tom Lane 93f4d7f806 Support Linux's oom_score_adj API as well as the older oom_adj API.
The simplest way to handle this is just to copy-and-paste the relevant
code block in fork_process.c, so that's what I did. (It's possible that
something more complicated would be useful to packagers who want to work
with either the old or the new API; but at this point the number of such
people is rapidly approaching zero, so let's just get the minimal thing
done.)  Update relevant documentation as well.
2012-06-13 15:35:52 -04:00
Peter Eisentraut c0a6f9c84b Improve documentation of postgres -C option
Clarify help (s/return/print/), and explain that this option is for
use by other programs, not for user-facing use (it does not print
units).
2012-06-13 13:41:25 +03:00
Tom Lane f871ef74a5 Minor code review for json.c.
Improve commenting, conform to project style for use of ++ etc.
No functional changes.
2012-06-12 16:23:45 -04:00
Robert Haas 36b7e3da17 Mark JSON error detail messages for translation.
Per gripe from Tom Lane.
2012-06-12 10:41:38 -04:00
Tom Lane 51e61b04f8 Ensure pg_ctl behaves sanely when data directory is not specified.
Commit aaa6e1def2 introduced multiple hazards
in the case where pg_ctl is executed with neither a -D switch nor any
PGDATA environment variable.  It would dump core on machines which are
unforgiving about printf("%s", NULL), or failing that possibly give a
rather unhelpful complaint about being unable to execute "postgres -C",
rather than the logically prior complaint about not being told where the
data directory is.

Edmund Horner's report suggests that there is another, Windows-specific
hazard here, but I'm not the person to fix that; it would in any case only
be significant when trying to use a config-only PGDATA pointer.
2012-06-11 22:47:16 -04:00
Tom Lane bf0945e863 Fix pg_dump output to a named tar-file archive.
"pg_dump -Ft -f filename ..." got broken by my recent commit
4317e0246c, which I fear I only tested
in the output-to-stdout variant.

Report and fix by Muhammad Asif Naeem.
2012-06-11 21:55:48 -04:00
Peter Eisentraut 7d754961f7 pg_receivexlog: Rename option --dir to --directory
getopt_long() allows abbreviating long options, so we might as well
give the option the full name, and users can abbreviate it how they
like.

Do some general polishing of the --help output at the same time.
2012-06-12 00:55:27 +03:00
Magnus Hagander 3595a71e9c Prevent non-streaming replication connections from being selected sync slave
This prevents a pg_basebackup backup session that just does a base
backup (no xlog involved at all) from becoming the synchronous slave
and thus blocking all access while it runs.

Also fixes the problem when a higher priority slave shows up it would
become the sync standby before it has reached the STREAMING state, by
making sure we can only switch to a walsender that's actually STREAMING.

Fujii Masao
2012-06-11 15:17:38 +02:00
Magnus Hagander 9af34cdec8 Revert behaviour of -x/--xlog to 9.1 semantics
To replace it, add -X/--xlog-method that allows the specification
of fetch or stream.

Do this to avoid unnecessary backwards-incompatiblity. Spotted and
suggested by Peter Eisentraut.
2012-06-11 14:58:35 +02:00
Bruce Momjian 927d61eeff Run pgindent on 9.2 source tree in preparation for first 9.3
commit-fest.
2012-06-10 15:20:04 -04:00
Bruce Momjian 60801944fa Update pgindent install instructions and update typedef list. 2012-06-10 15:15:31 -04:00
Magnus Hagander a0b4c5a20a Fix pg_basebackup/pg_receivexlog for floating point timestamps
Since the replication protocol deals with TimestampTz, we need to
care for the floating point case as well in the frontend tools.

Fujii Masao, with changes from Magnus Hagander
2012-06-10 12:12:36 +02:00
Magnus Hagander 7c1abc00fa Error message capitalization fix 2012-06-10 12:02:52 +02:00
Peter Eisentraut 8570114dc1 Make include files work without having to include other ones first 2012-06-10 12:46:14 +03:00
Simon Riggs 28ac797287 Revert error message on GLOBAL/LOCAL pending further discussion 2012-06-10 08:41:01 +01:00
Simon Riggs 72335a2015 Add ERROR msg for GLOBAL/LOCAL TEMP is not yet implemented 2012-06-09 16:35:26 +01:00
Simon Riggs 3725570539 Fix bug in early startup of Hot Standby with subtransactions.
When HS startup is deferred because of overflowed subtransactions, ensure
that we re-initialize KnownAssignedXids for when both existing and incoming
snapshots have non-zero qualifying xids.

Fixes bug #6661 reported by Valentine Gogichashvili.

Analysis and fix by Andres Freund
2012-06-08 17:34:04 +01:00
Robert Haas 3b5548a3d5 When using libpq URI syntax, error out on invalid parameter names.
Dan Farina
2012-06-08 08:47:24 -04:00
Tom Lane ece01aae47 Scan the buffer pool just once, not once per fork, during relation drop.
This provides a speedup of about 4X when NBuffers is large enough.
There is also a useful reduction in sinval traffic, since we
only do CacheInvalidateSmgr() once not once per fork.

Simon Riggs, reviewed and somewhat revised by Tom Lane
2012-06-07 17:43:11 -04:00
Peter Eisentraut 5d0109bd27 Message style improvements 2012-06-07 23:54:59 +03:00
Tom Lane e8d029a30b Do unlocked prechecks in bufmgr.c loops that scan the whole buffer pool.
DropRelFileNodeBuffers, DropDatabaseBuffers, FlushRelationBuffers, and
FlushDatabaseBuffers have to scan the whole shared_buffers pool because
we have no index structure that would find the target buffers any more
efficiently than that.  This gets expensive with large NBuffers.  We can
shave some cycles from these loops by prechecking to see if the current
buffer is interesting before we acquire the buffer header lock.
Ordinarily such a test would be unsafe, but in these cases it should be
safe because we are already assuming that the caller holds a lock that
prevents any new target pages from being loaded into the buffer pool
concurrently.  Therefore, no buffer tag should be changing to a value of
interest, only away from a value of interest.  So a false negative match
is impossible, while a false positive is safe because we'll recheck after
acquiring the buffer lock.  Initial testing says that this speeds these
loops by a factor of 2X to 3X on common Intel hardware.

Patch for DropRelFileNodeBuffers by Jeff Janes (based on an idea of
Heikki's); extended to the remaining sequential scans by Tom Lane
2012-06-07 16:46:26 -04:00
Simon Riggs 2c8a4e9be2 Wake WALSender to reduce data loss at failover for async commit.
WALSender now woken up after each background flush by WALwriter, avoiding
multi-second replication delay for an all-async commit workload.
Replication delay reduced from 7s with default settings to 200ms and often
much less, allowing significantly reduced data loss at failover.

Andres Freund and Simon Riggs
2012-06-07 19:22:47 +01:00
Robert Haas b50991eedb Fix more crash-safe visibility map bugs, and improve comments.
In lazy_scan_heap, we could issue bogus warnings about incorrect
information in the visibility map, because we checked the visibility
map bit before locking the heap page, creating a race condition.  Fix
by rechecking the visibility map bit before we complain.  Rejigger
some related logic so that we rely on the possibly-outdated
all_visible_according_to_vm value as little as possible.

In heap_multi_insert, it's not safe to clear the visibility map bit
before beginning the critical section.  The visibility map is not
crash-safe unless we treat clearing the bit as a critical operation.
Specifically, if the transaction were to error out after we set the
bit and before entering the critical section, we could end up writing
the heap page to disk (with the bit cleared) and crashing before the
visibility map page made it to disk.  That would be bad.  heap_insert
has this correct, but somehow the order of operations got rearranged
when heap_multi_insert was added.

Also, add some more comments to visibilitymap_test, lazy_scan_heap,
and IndexOnlyNext, expounding on concurrency issues.

Per extensive code review by Andres Freund, and further review by Tom
Lane, who also made the original report about the bogus warnings.
2012-06-07 12:48:13 -04:00
Magnus Hagander 92135ea0ed Use strerror(errno) instead of %m
Found by Fujii Masao
2012-06-05 15:52:08 +02:00
Tom Lane 3dd8e59681 Fix bogus handling of control characters in json_lex_string().
The original coding misbehaved if "char" is signed, and also made the
extremely poor decision to print control characters literally when trying
to complain about them.  Report and patch by Shigeru Hanada.

In passing, also fix core dump risk in report_parse_error() should the
parse state be something other than what it expects.
2012-06-04 20:43:57 -04:00
Tom Lane d73b7f973d Fix memory leaks in failure paths in buildACLCommands and parseAclItem.
This is currently only cosmetic, since all the call sites just curl up
and die in event of a failure return.  It might be important for some
future use-case, though, and in any case it quiets warnings from the
clang static analyzer (as reported by Anna Zaks).

Josh Kupershmidt
2012-06-03 11:52:52 -04:00
Simon Riggs d3abbbebe5 Avoid early reuse of btree pages, causing incorrect query results.
When we allowed read-only transactions to skip assigning XIDs
we introduced the possibility that a fully deleted btree page
could be reused. This broke the index link sequence which could
then lead to indexscans silently returning fewer rows than would
have been correct. The actual incidence of silent errors from
this is thought to be very low because of the exact workload
required and locking pre-conditions. Fix is to remove pages only
if index page opaque->btpo.xact precedes RecentGlobalXmin.

Noah Misch, reviewed by Simon Riggs
2012-06-01 12:21:45 +01:00
Simon Riggs 055c352abb After any checkpoint, close all smgr files handles in bgwriter 2012-06-01 09:24:53 +01:00
Simon Riggs a297d64d92 Checkpointer starts before bgwriter to avoid missing fsync requests.
Noted while testing Hot Standby startup.
2012-06-01 08:25:17 +01:00
Simon Riggs 1ec6a2bbc9 Provide interim statistics while in mid-checkpoint.
Re-implements similar functionality in 9.1 and previously which
was removed during split of checkpointer and bgwriter.

Requested/spotted by Magnus Hagander
2012-06-01 08:19:06 +01:00
Tom Lane 4bec93ac0f Stamp 9.2beta2. 2012-05-31 19:16:55 -04:00
Tom Lane a04dc87db1 Improve comment for GetStableLatestTransactionId(). 2012-05-31 11:20:02 -04:00
Simon Riggs a2b516dab9 Only throw recovery conflicts when InHotStandby. Bug fix to recent
patch to allow Index Only Scans on Hot Standby.

Bug report from Jaime Casanova
2012-05-31 13:11:47 +01:00
Tom Lane c8105e62bb Update time zone data files to tzdata release 2012c.
DST law changes in Antarctica, Armenia, Chile, Cuba, Falkland Islands,
Gaza, Haiti, Hebron, Morocco, Syria, Tokelau Islands.
Historical corrections for Canada.
2012-05-31 00:47:57 -04:00
Tom Lane ad0009e7be Force PL and range-type support functions to be owned by a superuser.
We allow non-superusers to create procedural languages (with restrictions)
and range datatypes.  Previously, the automatically-created support
functions for these objects ended up owned by the creating user.  This
represents a rather considerable security hazard, because the owning user
might be able to alter a support function's definition in such a way as to
crash the server, inject trojan-horse SQL code, or even execute arbitrary
C code directly.  It appears that right now the only actually exploitable
problem is the infinite-recursion bug fixed in the previous patch for
CVE-2012-2655.  However, it's not hard to imagine that future additions of
more ALTER FUNCTION capability might unintentionally open up new hazards.
To forestall future problems, cause these support functions to be owned by
the bootstrap superuser, not the user creating the parent object.
2012-05-30 23:47:57 -04:00
Tom Lane 33c6eaf78e Ignore SECURITY DEFINER and SET attributes for a PL's call handler.
It's not very sensible to set such attributes on a handler function;
but if one were to do so, fmgr.c went into infinite recursion because
it would call fmgr_security_definer instead of the handler function proper.
There is no way for fmgr_security_definer to know that it ought to call the
handler and not the original function referenced by the FmgrInfo's fn_oid,
so it tries to do the latter, causing the whole process to start over
again.

Ordinarily such misconfiguration of a procedural language's handler could
be written off as superuser error.  However, because we allow non-superuser
database owners to create procedural languages and the handler for such a
language becomes owned by the database owner, it is possible for a database
owner to crash the backend, which ideally shouldn't be possible without
superuser privileges.  In 9.2 and up we will adjust things so that the
handler functions are always owned by superusers, but in existing branches
this is a minor security fix.

Problem noted by Noah Misch (after several of us had failed to detect
it :-().  This is CVE-2012-2655.
2012-05-30 23:27:57 -04:00
Tom Lane cd0ff9c0f4 Expand the allowed range of timezone offsets to +/-15:59:59 from Greenwich.
We used to only allow offsets less than +/-13 hours, then it was +/14,
then it was +/-15.  That's still not good enough though, as per today's bug
report from Patric Bechtel.  This time I actually looked through the Olson
timezone database to find the largest offsets used anywhere.  The winners
are Asia/Manila, at -15:56:00 until 1844, and America/Metlakatla, at
+15:13:42 until 1867.  So we'd better allow offsets less than +/-16 hours.

Given the history, we are way overdue to have some greppable #define
symbols controlling this, so make some ... and also remove an obsolete
comment that didn't get fixed the last time.

Back-patch to all supported branches.
2012-05-30 19:58:35 -04:00
Robert Haas 07ab1383e3 Fix two more bugs in fast-path relation locking.
First, the previous code failed to account for the fact that, during Hot
Standby operation, the startup process takes AccessExclusiveLocks on
relations without setting MyDatabaseId.  This resulted in fast path
strong lock counts failing to be incremented with the startup process
took locks, which in turn allowed conflicting lock requests to succeed
when they should not have.  Report by Erik Rijkers, diagnosis by Heikki
Linnakangas.

Second, LockReleaseAll() failed to honor the allLocks and lockmethodid
restrictions with respect to fast-path locks.  It's not clear to me
whether this produces any user-visible breakage at the moment, but it's
certainly wrong.  Rearrange order of operations in LockReleaseAll to fix.
Noted by Tom Lane.
2012-05-30 16:17:46 -04:00
Heikki Linnakangas d1996ed5e8 Change the way parent pages are tracked during buffered GiST build.
We used to mimic the way a stack is constructed when descending the tree
during normal GiST inserts, but that was quite complicated during a buffered
build. It was also wrong: in GiST, the left-to-right relationships on
different levels might not match each other, so that when you know the
parent of a child page, you won't necessarily find the parent of the page to
the right of the child page by following the rightlinks at the parent level.
This sometimes led to "could not re-find parent" errors while building a
GiST index.

We now use a simple hash table to track the parent of every internal page.
Whenever a page is split, and downlinks are moved from one page to another,
we update the hash table accordingly. This is also better for performance
than the old method, as we never need to move right to re-find the parent
page, which could take a significant amount of time for buffers that were
created much earlier in the index build.
2012-05-30 12:05:57 +03:00
Heikki Linnakangas be02b16826 Delete the temporary file used in buffered GiST build, after the build.
There were two bugs here: We forgot to call gistFreeBuildBuffers() function
at the end of build, and we passed interXact == true to BufFileCreateTemp,
so the file wasn't automatically cleaned up at end-of-transaction either.
2012-05-30 12:05:57 +03:00
Tom Lane 4317e0246c Rewrite --section option to decouple it from --schema-only/--data-only.
The initial implementation of pg_dump's --section option supposed that the
existing --schema-only and --data-only options could be made equivalent to
--section settings.  This is wrong, though, due to dubious but long since
set-in-stone decisions about where to dump SEQUENCE SET items, as seen in
bug report from Martin Pitt.  (And I'm not totally convinced there weren't
other bugs, either.)  Undo that coupling and instead drive --section
filtering off current-section state tracked as we scan through the TOC
list to call _tocEntryRequired().

To make sure those decisions don't shift around and hopefully save a few
cycles, run _tocEntryRequired() only once per TOC entry and save the result
in a new TOC field.  This required minor rejiggering of ACL handling but
also allows a far cleaner implementation of inhibit_data_for_failed_table.

Also, to ensure that pg_dump and pg_restore have the same behavior with
respect to the --section switches, add _tocEntryRequired() filtering to
WriteToc() and WriteDataChunks(), rather than trying to implement section
filtering in an entirely orthogonal way in dumpDumpableObject().  This
required adjusting the handling of the special ENCODING and STDSTRINGS
items, but they were pretty weird before anyway.

Minor other code review for the patch, too.
2012-05-29 23:22:14 -04:00
Heikki Linnakangas 4bc6fb57f7 Fix integer overflow bug in GiST buffering build calculations.
The result of (maintenance_work_mem * 1024) / BLCKSZ doesn't fit in a signed
32-bit integer, if maintenance_work_mem >= 2GB. Use double instead. And
while we're at it, write the calculations in an easier to understand form,
with the intermediary steps written out and commented.
2012-05-29 22:27:42 +03:00
Tom Lane 2755abf386 Teach AbortOutOfAnyTransaction to clean up partially-started transactions.
AbortOutOfAnyTransaction failed to do anything if the state it saw on
entry corresponded to failing partway through StartTransaction.  I fixed
AbortCurrentTransaction to cope with that case way back in commit
60b2444cc3, but evidently overlooked that
AbortOutOfAnyTransaction should do likewise.

Back-patch to all supported branches.  It's not clear that this omission
has any more-than-cosmetic consequences, but it's also not clear that it
doesn't, so back-patching seems the least risky choice.
2012-05-28 23:57:06 -04:00
Tom Lane c89bdf7690 Eliminate some more O(N^2) behaviors in pg_dump/pg_restore.
This patch fixes three places (which AFAICT is all of them) where runtime
was O(N^2) in the number of TOC entries, by using an index array to replace
linear searches of the TOC list.  This performance issue is a bit less bad
than those recently fixed, because it depends on the number of items dumped
not the number in the source database, so the problem can be dodged by
doing partial dumps.

The previous coding already had an instance of one of the two index arrays
needed, but it was only calculated in parallel-restore cases; now we need
it all the time.  I also chose to move the arrays into the ArchiveHandle
data structure, to make this code a bit more ready for the day that we
try to sling multiple ArchiveHandles around in pg_dump or pg_restore.

Since we still need some server-side work before pg_dump can really cope
nicely with tens of thousands of tables, there's probably little point in
back-patching.
2012-05-28 20:38:28 -04:00
Peter Eisentraut 2d612abd4d libpq: URI parsing fixes
Drop special handling of host component with slashes to mean
Unix-domain socket.  Specify it as separate parameter or using
percent-encoding now.

Allow omitting username, password, and port even if the corresponding
designators are present in URI.

Handle percent-encoding in query parameter keywords.

Alex Shulgin

some documentation improvements by myself
2012-05-28 22:44:34 +03:00
Peter Eisentraut 388d251679 Update SQL features list
Set E081 Basic Privileges to supported, since by the letter of it, we
support it, even though not all possible forms of USAGE privileges are
implemented.
2012-05-27 23:34:16 +03:00
Peter Eisentraut 8e497c731b psql: Remove notice about readline from --version output
This was from a time when readline support wasn't standard.  And it
doesn't help analyzing current line editing library problems.
2012-05-27 22:48:20 +03:00
Peter Eisentraut 27314d32a8 Suppress -Wunused-result warning about write()
This is related to aa90e148ca, but this
code is only used under -DLINUX_OOM_ADJ, so it was apparently
overlooked then.
2012-05-27 22:35:01 +03:00
Peter Eisentraut a8b92b6090 PL/Perl: Avoid compiler warning from clang
Use SvREFCNT_inc_simple_void() instead of SvREFCNT_inc() to avoid
warning about unused return value.
2012-05-27 22:30:34 +03:00
Magnus Hagander 16282ae688 Make pg_recievexlog by default loop on connection failures
Avoids the need for an external script in the most common
scenario. Behavior can be overridden using the -n/--noloop
commandline parameter.
2012-05-27 11:05:24 +02:00
Tom Lane 532fe28dad Prevent synchronized scanning when systable_beginscan chooses a heapscan.
The only interesting-for-performance case wherein we force heapscan here
is when we're rebuilding the relcache init file, and the only such case
that is likely to be examining a catalog big enough to be syncscanned is
RelationBuildTupleDesc.  But the early-exit optimization in that code gets
broken if we start the scan at a random place within the catalog, so that
allowing syncscan is actually a big deoptimization if pg_attribute is large
(at least for the normal case where the rows for core system catalogs have
never been changed since initdb).  Hence, prevent syncscan here.  Per my
testing pursuant to complaints from Jeff Frost and Greg Sabino Mullane,
though neither of them seem to have actually hit this specific problem.

Back-patch to 8.3, where syncscan was introduced.
2012-05-26 19:09:52 -04:00
Tom Lane d3b97d1488 Fix string truncation to be multibyte-aware in text_name and bpchar_name.
Previously, casts to name could generate invalidly-encoded results.

Also, make these functions match namein() more exactly, by consistently
using palloc0() instead of ad-hoc zeroing code.

Back-patch to all supported branches.

Karl Schnaitter and Tom Lane
2012-05-25 17:34:51 -04:00
Tom Lane 73cc7d3b24 Use binary search instead of brute-force scan in findNamespace().
The previous coding presented a significant bottleneck when dumping
databases containing many thousands of schemas, since the total time
spent searching would increase roughly as O(N^2) in the number of objects.
Noted by Jeff Janes, though I rewrote his proposed patch to use the
existing findObjectByOid infrastructure.

Since this is a longstanding performance bug, backpatch to all supported
versions.
2012-05-25 14:35:37 -04:00
Magnus Hagander 31d965819b Fix base backup streaming xlog from standby
When backing up from a standby server, the backup process
will not automatically switch xlog segment. So we must
accept a partially transferred xlog file in this case, but
rename it into position anyway.

In passing, merge the two callbacks for segment end and
stop stream into a single callback, since their implementations
were close to identical, and rename this callback to
reflect that it stops streaming rather than continues it.

Patch by Magnus Hagander, review by Fujii Masao
2012-05-25 11:36:22 +02:00
Tom Lane 2a4c46e0ba Fix array overrun in regex code.
zaptreesubs() was coded to unconditionally reset a capture subre's
corresponding pmatch[] entry.  However, in regexes without backrefs, that
array is caller-supplied and might not have as many entries as the regex
has capturing parens.  So check the array length and do nothing if there
is no corresponding entry, much as subset() does.  Failure to check this
resulted in a stack clobber in the case reported by Marko Kreen.

This bug appears to have been latent in the regex library from the
beginning.  It was not exposed because find() called dissect() not
cdissect(), and the dissect() code path didn't ever call zaptreesubs()
(formerly zapmem()).  When I unified dissect() and cdissect() in commit
4dd78bf37a, the problem was exposed.

Now that I've seen this, I'm rather suspicious that we might need to
back-patch it; but will refrain for now, for lack of evidence that
the case can be hit in the previous coding.
2012-05-24 13:56:16 -04:00
Magnus Hagander 77f93cb32d Add missing PQfinish() calls
Fujii Masao
2012-05-23 21:52:23 +02:00
Tom Lane ed962fd712 Ensure that seqscans check for interrupts at least once per page.
If a seqscan encounters many consecutive pages containing only dead tuples,
it can remain in the loop in heapgettup for a long time, and there was no
CHECK_FOR_INTERRUPTS anywhere in that loop.  This meant there were
real-world situations where a query would be effectively uncancelable for
long stretches.  Add a check placed to occur once per page, which should be
enough to provide reasonable response time without adding any measurable
overhead.

Report and patch by Merlin Moncure (though I tweaked it a bit).
Back-patch to all supported branches.
2012-05-22 19:42:05 -04:00
Robert Haas 8fbe5a317d Fix error message for COMMENT/SECURITY LABEL ON COLUMN xxx IS 'yyy'
When the column name is an unqualified name, rather than table.column,
the error message complains about too many dotted names, which is
wrong.  Report by Peter Eisentraut based on examination of the
sepgsql regression test output, but the problem also affects COMMENT.
New wording as suggested by Tom Lane.
2012-05-22 11:23:36 -04:00
Robert Haas 304aa339b2 Prevent pg_basebackup when integer_datetimes flag doesn't match.
Magnus Hagander, reviewed by Fujii Masao, with slight wording changes
by me.
2012-05-22 10:02:47 -04:00
Robert Haas 219c024c64 Repair out-of-date information in src/backend/storage/buffer/README.
In commit d526575f89, we changed things so
that buffer usage counts are incremented when the buffer is pinned, rather
than when it is unpinned, but the README file didn't get the memo.

Report by Amit Kapila.
2012-05-22 09:32:09 -04:00
Tom Lane b94ce6e807 Move postmaster's RemovePgTempFiles call to a less randomly chosen place.
There is no reason to do this as early as possible in postmaster startup,
and good reason not to do it until we have completely created the
postmaster's lock file, namely that it might contribute to pg_ctl thinking
that postmaster startup has timed out.  (This would require a rather
unusual amount of time to be spent scanning temp file directories, but we
have at least one field report of it happening reproducibly.)

Back-patch to 9.1.  Before that, pg_ctl didn't wait for additional info to
be added to the lock file, so it wasn't a problem.

Note that this is not a complete fix to the slow-start issue in 9.1,
because we still had identify_system_timezone being run during postmaster
start in 9.1.  But that's at least a reasonably well-defined delay, with
an easy workaround if needed, whereas the temp-files scan is not so
predictable and cannot be avoided.
2012-05-21 22:50:30 -04:00
Tom Lane efae4653c9 Update woefully-obsolete comment.
The accurate info about what's in a lock file has been in miscadmin.h
for some time, so let's just make this comment point there instead of
maintaining a duplicative copy.
2012-05-21 22:11:00 -04:00
Peter Eisentraut cdf8bcb8d9 pg_ctl: Sort signal list in --help output
The list was neither logical nor numerical nor alphabetical.  Let's go
with alphabetical.
2012-05-21 20:12:30 +03:00
Peter Eisentraut 4c39a09089 libpq: Add missing file to GETTEXT_FILES list
For the record, fe-print.c is also missing, but it's sort of
deprecated, and the string internationalization there has some issues,
and it doesn't seem worth fixing that.  So let's leave that out.
2012-05-21 20:08:50 +03:00
Peter Eisentraut f1f6737e15 Fix incorrect logic in JSON number lexer
Detectable by gcc -Wlogical-op.

Add two regression test cases that would previously allow incorrect
values to pass.
2012-05-20 02:24:46 +03:00
Peter Eisentraut 2273a50364 Realign some --help output to have better spacing between columns 2012-05-18 20:34:14 +03:00
Heikki Linnakangas 1d27dcf578 Fix bug in gistRelocateBuildBuffersOnSplit().
When we create a temporary copy of the old node buffer, in stack, we mustn't
leak that into any of the long-lived data structures. Before this patch,
when we called gistPopItupFromNodeBuffer(), it got added to the array of
"loaded buffers". After gistRelocateBuildBuffersOnSplit() exits, the
pointer added to the loaded buffers array points to garbage. Often that goes
unnotied, because when we go through the array of loaded buffers to unload
them, buffers with a NULL pageBuffer are ignored, which can often happen by
accident even if the pointer points to garbage.

This patch fixes that by marking the temporary copy in stack explicitly as
temporary, and refrain from adding buffers marked as temporary to the array
of loaded buffers.

While we're at it, initialize nodeBuffer->pageBlocknum to InvalidBlockNumber
and improve comments a bit. This isn't strictly necessary, but makes
debugging easier.
2012-05-18 19:38:32 +03:00
Peter Eisentraut 939ec9b8a4 Update SQL features/conformance information to SQL:2011 2012-05-17 09:50:04 +03:00
Peter Eisentraut be6d1c88a4 Change COLLATION keyword category
It was changed from unreserved to reserved as part of the COLLATION
FOR syntax, but it turns out that type_func_name_keyword is
sufficient.
2012-05-16 20:19:44 +03:00
Tom Lane 488c6dd170 Improve error message for ALTER COLUMN TYPE coercion failure.
Per recent discussion, the error message for this was actually a trifle
inaccurate, since it said "cannot be cast" which might be incorrect.
Adjust that wording, and add a HINT suggesting that a USING clause might
be needed.
2012-05-16 07:28:25 -04:00
Heikki Linnakangas 6593c5b5dc Fix bug in freespace calculation in heap_multi_insert().
If the amount of freespace on page was less than the amount reserved by
fillfactor, the calculation would underflow.

This fixes bug #6643 reported by Tomonari Katsumata.
2012-05-16 14:13:06 +03:00
Peter Eisentraut c8e086795a Remove whitespace from end of lines
pgindent and perltidy should clean up the rest.
2012-05-15 22:19:41 +03:00
Peter Eisentraut 8afb026e57 Remove stray nbsp character 2012-05-15 21:38:59 +03:00
Heikki Linnakangas d2495f272c Fix bug in to_tsquery().
We were using memcpy() to copy to a possibly overlapping memory region,
which is a no-no. Use memmove() instead.
2012-05-15 19:27:34 +03:00
Tom Lane 9b63e9869f In pgstat.c, use a timeout in WaitLatchOrSocket only on Windows.
We have no need for a timeout here really, but some broken products from
Redmond seem to lose FD_READ events occasionally, and waking up and
retrying the recv() is the only known way to work around that.  Perhaps
somebody will be motivated to figure out a better answer here; but not I.
2012-05-14 23:51:34 -04:00
Tom Lane 5a2bb06012 Revert "Add some temporary instrumentation to pgstat.c."
This reverts commit 7d88bb73f7.
That instrumentation has served its purpose.
2012-05-14 23:08:10 -04:00
Tom Lane f667747b6d Put back AC_REQUIRE([AC_STRUCT_TM]).
The BSD-ish members of the buildfarm all seem to think removing this
was a bad idea.  It looks to me like it resulted in omitting the system
header inclusion necessary to detect the fields of struct tm correctly.
2012-05-14 23:06:48 -04:00
Tom Lane e42a21b9e6 Assert that WaitLatchOrSocket callers cannot wait only for writability.
Since we have chosen to report socket EOF and error conditions via the
WL_SOCKET_READABLE flag bit, it's unsafe to wait only for
WL_SOCKET_WRITEABLE; the caller would never be notified of the socket
condition, and in some of these implementations WaitLatchOrSocket would
busy-wait until something else happens.  Add this restriction to the API
specification, and add Asserts to check that callers don't try to do that.

At some point we might want to consider adjusting the API to relax this
restriction, but until we have an actual use case for waiting on a
write-only socket, it seems premature to design a solution.
2012-05-14 16:12:28 -04:00
Peter Eisentraut ff4628f37a Remove unused AC_DEFINE symbols
ENABLE_DTRACE            unused as of a7b7b07af3
HAVE_ERR_SET_MARK        unused as of 4ed4b6c54e
HAVE_FCVT                unused as of 4553e1d80f
HAVE_STRUCT_SOCKADDR_UN  unused as of b4cea00a1f
HAVE_SYSCONF             unused as of f83356c7f5
TM_IN_SYS_TIME           never used, obsolescent per Autoconf documentation
2012-05-14 22:51:21 +03:00
Tom Lane d461d0502b For testing purposes, reinsert a timeout in pgstat.c's wait call.
Test results from buildfarm members mastodon/narwhal (Windows Server 2003)
make it look like that platform just plain loses FD_READ events
occasionally, and the only reason our previous coding seemed to work was
that it timed out every couple of seconds and retried the whole operation.
Try to verify this by reinserting a finite timeout into the pgstat loop.
This isn't meant to be a permanent patch either, just to confirm or
disprove a theory.
2012-05-14 15:03:14 -04:00
Tom Lane f1ca51549e Force pgwin32_recv into nonblock mode when called from pgstat.c.
This should get rid of the usage of pgwin32_waitforsinglesocket entirely,
and perhaps thereby remove the race condition that's evidently still
present on some versions of Windows.  The previous arrangement was a bit
unsafe anyway, since waiting at the recv() would not allow pgstat to notice
postmaster death.
2012-05-14 10:57:07 -04:00
Heikki Linnakangas f15c2eae9c Remove unnecessary pg_verifymbstr() calls from tsvector/query in functions.
The input should've been validated well before it hits the input function.
Doing so again is a waste of cycles.
2012-05-14 14:30:32 +03:00
Heikki Linnakangas 9e4637bf89 Update comments that became out-of-date with the PGXACT struct.
When the "hot" members of PGPROC were split off to separate PGXACT structs,
many PGPROC fields referred to in comments were moved to PGXACT, but the
comments were neglected in the commit. Mostly this is just a search/replace
of PGPROC with PGXACT, but the way the dummy PGPROC entries are created for
prepared transactions changed more, making some of the comments totally
bogus.

Noah Misch
2012-05-14 10:28:55 +03:00
Peter Eisentraut 64f09ca386 Remove leftovers of BeOS port
These should have been removed when the BeOS port was removed in
44f9021223.
2012-05-14 04:50:39 +03:00
Peter Eisentraut 6bf1e7668d Small punctuation editing of postgresql.conf.sample 2012-05-14 04:50:39 +03:00
Peter Eisentraut 2a7f636640 pg_ctl: Improve --help output
All other --help output has = signs between long options and their
arguments, so do it here as well.
2012-05-14 04:50:39 +03:00
Tom Lane 7d88bb73f7 Add some temporary instrumentation to pgstat.c.
Log main-loop blocking events and the results of inquiry messages.
This is to get some clarity as to what's happening on those Windows
buildfarm members that still don't like the latch-ified stats collector.
This bulks up the postmaster log a tad, so I won't leave it in place for
long.
2012-05-13 21:11:31 -04:00
Tom Lane b8347138e9 Fix DROP TABLESPACE to unlink symlink when directory is not there.
If the tablespace directory is missing entirely, we allow DROP TABLESPACE
to go through, on the grounds that it should be possible to clean up the
catalog entry in such a situation.  However, we forgot that the pg_tblspc
symlink might still be there.  We should try to remove the symlink too
(but not fail if it's no longer there), since not doing so can lead to
weird behavior subsequently, as per report from Michael Nolan.

There was some discussion of adding dependency links to prevent DROP
TABLESPACE when the catalogs still contain references to the tablespace.
That might be worth doing too, but it's an orthogonal question, and in
any case wouldn't be back-patchable.

Back-patch to 9.0, which is as far back as the logic looks like this.
We could possibly do something similar in 8.x, but given the lack of
reports I'm not sure it's worth the trouble, and anyway the case could
not arise in the form the logic is meant to cover (namely, a post-DROP
transaction rollback having resurrected the pg_tablespace entry after
some or all of the filesystem infrastructure is gone).
2012-05-13 18:06:52 -04:00
Tom Lane 966970ed63 Re-revert stats collector latch changes.
This reverts commit cb2f2873d6, restoring
the latch-ified stats collector logic.  We'll soon see if this works any
better on the Windows buildfarm machines.
2012-05-13 14:44:39 -04:00
Tom Lane b85427f227 Attempt to fix some issues in our Windows socket code.
Make sure WaitLatchOrSocket regards FD_CLOSE as a read-ready condition.
We might want to tweak this further, but it was surely wrong as-is.

Make pgwin32_waitforsinglesocket detach its private event object from the
passed socket before returning.  I suspect that failure to do so leads
to race conditions when other code (such as WaitLatchOrSocket) attaches
a different event object to the same socket.  Moreover, the existing
coding meant that repeated calls to pgwin32_waitforsinglesocket would
perform ResetEvent on an event actively connected to a socket, which
is rumored to be an unsafe practice; the WSAEventSelect documentation
appears to recommend against this, though it does not say not to do it
in so many words.

Also, uniformly use the coding pattern "WSAEventSelect(s, NULL, 0)" to
detach events from sockets, rather than passing the event in the second
parameter.  The WSAEventSelect documentation says that the second parameter
is ignored if the third is 0, so theoretically this should make no
difference.  However, elsewhere on the same reference page the use of NULL
in this context is recommended, and I have found suggestions on the net
that some versions of Windows have bugs with a non-NULL second parameter
in this usage.

Some other mostly-cosmetic cleanup, such as using the right one of
WSAGetLastError and GetLastError for reporting errors from these functions.
2012-05-13 14:35:40 -04:00
Tom Lane fd350ef395 Fix bogus declaration of local variable.
rc should be an int here, not a pgsocket.  Fairly harmless as long as
pgsocket is an integer type, but nonetheless wrong.  Error introduced
in commit 87091cb1f1.
2012-05-13 00:30:32 -04:00
Tom Lane 398b240151 Avoid unnecessary process wakeups in the log collector.
syslogger was coded to wake up once per second whether there was anything
useful to do or not.  As part of our campaign to reduce the server's idle
power consumption, change it to use a latch for waiting.  Now, in the
absence of any data to log or any signals to service, it will only wake up
at the programmed logfile rotation times (if any).
2012-05-12 19:21:54 -04:00
Tom Lane 31ad655364 Fix WaitLatchOrSocket to handle EOF on socket correctly.
When using poll(), EOF on a socket is reported with the POLLHUP not
POLLIN flag (at least on Linux).  WaitLatchOrSocket failed to check
this bit, causing it to go into a busy-wait loop if EOF occurs.
We earlier fixed the same mistake in the test for the state of the
postmaster_alive socket, but missed it for the caller-supplied socket.
Fortunately, this error is new in 9.2, since 9.1 only had a select()
based code path not a poll() based one.
2012-05-12 16:36:47 -04:00
Simon Riggs 867540b49c Ensure backwards compatibility for GetStableLatestTransactionId() 2012-05-12 13:26:10 +01:00
Peter Eisentraut afe86a9e73 Fix obsolescent C declaration syntax
gcc -Wextra/-Wold-style-declaration thinks that "inline" should go
before the function return type.
2012-05-12 12:52:02 +03:00
Tom Lane d0c231d132 Cosmetic adjustments for postmaster's handling of checkpointer.
Correct some comments, order some operations a bit more consistently.
No functional changes.
2012-05-11 17:46:37 -04:00
Peter Eisentraut 2cfb1c6f77 PL/Python: Adjust the regression tests for Python 3.3
The string representation of ImportError changed.  Remove printing
that; it's not necessary for the test.

The order in which members of a dict are printed changed.  But this
was always implementation-dependent, so we have just been lucky for a
long time.  Do the printing the hard way to ensure sorted order.
2012-05-11 23:04:47 +03:00
Robert Haas 1331cc6c1a Prevent loss of init fork when truncating an unlogged table.
Fixes bug #6635, reported by Akira Kurosawa.
2012-05-11 09:48:56 -04:00
Simon Riggs b762e8f50b Remove extraneous #include "storage/proc.h" 2012-05-11 14:46:46 +01:00
Simon Riggs b06679e012 Ensure age() returns a stable value rather than the latest value 2012-05-11 14:36:24 +01:00
Heikki Linnakangas 3652d72dd4 On GiST page split, release the locks on child pages before recursing up.
When inserting the downlinks for a split gist page, we used hold the locks
on the child pages until the insertion into the parent - and recursively its
parent if it had to be split too - were all completed. Change that so that
the locks on child pages are released after the insertion in the immediate
parent is done, before recursing further up the tree.

This reduces the number of lwlocks that are held simultaneously. Holding
many locks is bad for concurrency, and in extreme cases you can even hit
the limit of 100 simultaneously held lwlocks in a backend. If you're really
unlucky, you can hit the limit while in a critical section, which brings
down the whole system.

This fixes bug #6629 reported by Tom Forbes. Backpatch to 9.1. The page
splitting code was rewritten in 9.1, and the old code did not have this
problem.
2012-05-11 12:35:28 +03:00
Bruce Momjian ee24de4001 Revert catalog bump; was post-beta1, and unnecessary. 2012-05-10 18:44:47 -04:00
Bruce Momjian d2fe836cd2 Update comment for 'name' data type to say 63 "bytes".
Catalog version bump so everyone has the same comment for beta1.
2012-05-10 18:40:40 -04:00
Tom Lane f70fa835e0 Stamp 9.2beta1. 2012-05-10 18:35:09 -04:00
Tom Lane cb2f2873d6 Temporarily revert stats collector latch changes so we can ship beta1.
This patch reverts commit 49340037ee and some
follow-on tweaking in pgstat.c.  While the basic scheme of latch-ifying the
stats collector seems sound enough, it's failing on most Windows buildfarm
members for unknown reasons, and there's no time left to debug that before
9.2beta1.  Better to ship a beta version without this improvement.  I hope
to re-revert this once beta1 is out, though.
2012-05-10 17:26:33 -04:00
Tom Lane f40022f1ad Make WaitLatch's WL_POSTMASTER_DEATH result trustworthy; simplify callers.
Per a suggestion from Peter Geoghegan, make WaitLatch responsible for
verifying that the WL_POSTMASTER_DEATH bit it returns is truthful (by
testing PostmasterIsAlive).  Then simplify its callers, who no longer
need to do that for themselves.  Remove weasel wording about falsely-set
result bits from WaitLatch's API contract.
2012-05-10 14:34:53 -04:00
Peter Eisentraut a97207b690 PL/Python: Fix slicing support for result objects for Python 3
The old way of implementing slicing support by implementing
PySequenceMethods.sq_slice no longer works in Python 3.  You now have
to implement PyMappingMethods.mp_subscript.  Do this by simply
proxying the call to the wrapped list of result dictionaries.
Consolidate some of the subscripting regression tests.

Jan Urbański
2012-05-10 20:40:30 +03:00
Peter Eisentraut 1540d3bf4d PL/Python: Update incorrect comment
Jan Urbański
2012-05-10 20:40:30 +03:00
Tom Lane ada8fa08fc Fix Windows implementation of PGSemaphoreLock.
The original coding failed to reset ImmediateInterruptOK before returning,
which would potentially allow a subsequent query-cancel interrupt to be
accepted at an unsafe point.  This is a really nasty bug since it's so hard
to predict the consequences, but they could be unpleasant.

Also, ensure that signal handlers are serviced before this function
returns, even if the semaphore is already set.  This should make the
behavior more like Unix.

Back-patch to all supported versions.
2012-05-10 13:36:14 -04:00
Tom Lane 8ebc908c57 Improve Windows implementation of WaitLatch/WaitLatchOrSocket.
Ensure that signal handlers are serviced before this function returns.
This should make the behavior more like Unix.  Also, add some more
error checking, and make some other cosmetic improvements.

No back-patch since it's not clear whether this is fixing any live bug
that would affect 9.1.  I'm more concerned about 9.2 anyway given our
considerable recent expansions in the usage of WaitLatch.
2012-05-10 13:26:47 -04:00
Peter Eisentraut 1d158d7f98 Python 2.2 is no longer supported
It was already on its last legs, and it turns out that it was
accidentally broken in commit 89e850e6fd
and no one cared.  So remove the rest the support for it and update
the documentation to indicate that Python 2.3 is now required.
2012-05-10 20:02:57 +03:00
Magnus Hagander f33c5d471c Only attempt to show collations on servers >= 9.1.
Show a proper error message instead of a SQL error.

Josh Kupershmidt
2012-05-10 09:12:26 +02:00
Heikki Linnakangas 60a3dffb72 Fix outdated comment.
Multi-insert records observe XLOG_HEAP_INIT_PAGE flag too, as Andres Freund
pointed out.
2012-05-10 09:55:48 +03:00
Joe Conway b58bacdacb PL/pgSQL RETURN NEXT was leaking converted tuples, causing
out of memory when looping through large numbers of rows.
Flag the converted tuples to be freed. Complaint and patch
by Joe.
2012-05-09 22:57:19 -07:00
Tom Lane fd71421b01 Improve tests for postmaster death in auxiliary processes.
In checkpointer and walwriter, avoid calling PostmasterIsAlive unless
WaitLatch has reported WL_POSTMASTER_DEATH.  This saves a kernel call per
iteration of the process's outer loop, which is not all that much, but a
cycle shaved is a cycle earned.  I had already removed the unconditional
PostmasterIsAlive calls in bgwriter and pgstat in previous patches, but
forgot that WL_POSTMASTER_DEATH is supposed to be treated as untrustworthy
(per comment in unix_latch.c); so adjust those two cases to match.

There are a few other places where the same idea might be applied, but only
after substantial code rearrangement, so I didn't bother.
2012-05-10 00:54:56 -04:00
Tom Lane d3ae406f54 Further tweaking of nomenclature in checkpointer.c.
Get rid of some more naming choices that only make sense if you know that
this code used to be in the bgwriter, as well as some stray comments
referencing the bgwriter.
2012-05-10 00:01:10 -04:00
Tom Lane 6308ba05a7 Improve control logic for bgwriter hibernation mode.
Commit 6d90eaaa89 added a hibernation mode
to the bgwriter to reduce the server's idle-power consumption.  However,
its interaction with the detailed behavior of BgBufferSync's feedback
control loop wasn't very well thought out.  That control loop depends
primarily on the rate of buffer allocation, not the rate of buffer
dirtying, so the hibernation mode has to be designed to operate only when
no new buffer allocations are happening.  Also, the check for whether the
system is effectively idle was not quite right and would fail to detect
a constant low level of activity, thus allowing the bgwriter to go into
hibernation mode in a way that would let the cycle time vary quite a bit,
possibly further confusing the feedback loop.  To fix, move the wakeup
support from MarkBufferDirty and SetBufferCommitInfoNeedsSave into
StrategyGetBuffer, and prevent the bgwriter from entering hibernation mode
unless no buffer allocations have happened recently.

In addition, fix the delaying logic to remove the problem of possibly not
responding to signals promptly, which was basically caused by trying to use
the process latch's is_set flag for multiple purposes.  I can't prove it
but I'm suspicious that that hack was responsible for the intermittent
"postmaster does not shut down" failures we've been seeing in the buildfarm
lately.  In any case it did nothing to improve the readability or
robustness of the code.

In passing, express the hibernation sleep time as a multiplier on
BgWriterDelay, not a constant.  I'm not sure whether there's any value in
exposing the longer sleep time as an independently configurable setting,
but we can at least make it act like this for little extra code.
2012-05-09 23:37:10 -04:00
Peter Eisentraut 5d39807a00 Add make dependency so that postgres.bki is rebuilt in major version change
Every time since the current rule for postgres.bki was put in place
when we change the major version, people complain that their tests
fail in strange ways.  This is because the version number in
postgres.bki is not updated, because it has no dependency for that.
And you can't even force the rebuild manually if you don't happen to
know which file has the problem.  Fix that now before it will happen
again.

The only remaining problem with switching major versions, as far as
the regression tests are concerned, is that contrib needs to be
rebuilt.  But that's easily invoked, and in any case the failure modes
are more friendly if you forget that.
2012-05-09 20:45:56 +03:00
Simon Riggs 8f28789bff Rename BgWriterShmem/Request to CheckpointerShmem/Request 2012-05-09 14:23:45 +01:00
Simon Riggs bbd3ec9dce Rename BgWriterCommLock to CheckpointerCommLock 2012-05-09 14:11:48 +01:00
Simon Riggs 5829387381 Avoid xid error from age() function when run on Hot Standby 2012-05-09 13:56:24 +01:00
Tom Lane acd4c7d58b Fix an issue in recent walwriter hibernation patch.
Users of asynchronous-commit mode expect there to be a guaranteed maximum
delay before an async commit's WAL records get flushed to disk.  The
original version of the walwriter hibernation patch broke that.  Add an
extra shared-memory flag to allow async commits to kick the walwriter out
of hibernation mode, without adding any noticeable overhead in cases where
no action is needed.
2012-05-08 23:06:40 -04:00
Tom Lane 49340037ee Reduce idle power consumption of stats collector process.
Latch-ify the stats collector, so that it does not need an arbitrary wakeup
cycle to check for postmaster death.  The incremental savings in idle power
is pretty marginal, since we only had it waking every two seconds; but I
believe that this patch may also improve the collector's performance under
load, by reducing the number of kernel calls made per message when messages
are arriving constantly (we now avoid a select/poll call except when we
need to sleep).  The change also reduces the time needed for a normal
database shutdown on platforms where signals don't interrupt select().
2012-05-08 21:26:46 -04:00
Tom Lane 5461564a9d Reduce idle power consumption of walwriter and checkpointer processes.
This patch modifies the walwriter process so that, when it has not found
anything useful to do for many consecutive wakeup cycles, it extends its
sleep time to reduce the server's idle power consumption.  It reverts to
normal as soon as it's done any successful flushes.  It's still true that
during any async commit, backends check for completed, unflushed pages of
WAL and signal the walwriter if there are any; so that in practice the
walwriter can get awakened and returned to normal operation sooner than the
sleep time might suggest.

Also, improve the checkpointer so that it uses a latch and a computed delay
time to not wake up at all except when it has something to do, replacing a
previous hardcoded 0.5 sec wakeup cycle.  This also is primarily useful for
reducing the server's power consumption when idle.

In passing, get rid of the dedicated latch for signaling the walwriter in
favor of using its procLatch, since that comports better with possible
generic signal handlers using that latch.  Also, fix a pre-existing bug
with failure to save/restore errno in walwriter's signal handlers.

Peter Geoghegan, somewhat simplified by Tom
2012-05-08 20:03:26 -04:00
Peter Eisentraut db84ba65ab psql: Add variable to control keyword case in tab completion
This adds the variable COMP_KEYWORD_CASE, which controls in what case
keywords are completed.  This is partially to let users configure the
change from commit 69f4f1c357, but it
also offers more behaviors than were available before.
2012-05-08 21:06:08 +03:00
Peter Eisentraut 3420b241a7 Fix dependency tracking for src/port/%_srv.o files
Because they use their own compilation rule, they don't use the
dependency tracking logic from Makefile.global.  To make sure that
dependency tracking works anyway for the *_srv.o files, depend on
their *.o siblings as well, which do have proper dependencies.  It's a
hack that might fail someday if there is a *_srv.o without a
corresponding *.o, but it works for now (and those would probably go
into src/backend/port/ anyway).
2012-05-08 20:10:50 +03:00
Peter Eisentraut dcb2c58381 Fix misleading comments
Josh Kupershmidt
2012-05-08 19:35:22 +03:00
Peter Eisentraut 3284e03d5d Remove strdup, strtol, strtoul from libpgport
These should not be needed anymore, at least after the recent port
removals.  So let's see whether we can do without them.
2012-05-07 23:10:28 +03:00
Peter Eisentraut d7b2cd9d40 Fix pg_config.h make rule
According to the Autoconf documentation, there should be a make rule

pg_config.h: stamp-h

so that with the right setup around this, a change in pg_config.h.in
will trigger a rebuild of everything that depends on pg_config.h.  But
this doesn't always work, sometimes you need to run make twice to get
everything up to date after a change of pg_config.h.in.

The fix is to write the rule as

pg_config.h: stamp-h ;

instead (with an empty command instead of no command).  This is what
Automake-generated makefiles effectively do, so it seems safe to be on
this side.

It's not actually clear why this is (apparently) more correct.  It's
been posted to
<http://lists.gnu.org/archive/html/help-make/2012-04/msg00058.html>
without response so far.
2012-05-07 21:28:38 +03:00
Magnus Hagander 916d589a10 Make "unexpected EOF" messages DEBUG1 unless in an open transaction
"Unexpected EOF on client connection" without an open transaction
is mostly noise, so turn it into DEBUG1. With an open transaction it's
still indicating a problem, so keep those as ERROR, and change the message
to indicate that it happened in a transaction.
2012-05-07 18:50:44 +02:00
Tom Lane 71b9549d05 Overdue code review for transaction-level advisory locks patch.
Commit 62c7bd31c8 had assorted problems, most
visibly that it broke PREPARE TRANSACTION in the presence of session-level
advisory locks (which should be ignored by PREPARE), as per a recent
complaint from Stephen Rees.  More abstractly, the patch made the
LockMethodData.transactional flag not merely useless but outright
dangerous, because in point of fact that flag no longer tells you anything
at all about whether a lock is held transactionally.  This fix therefore
removes that flag altogether.  We now rely entirely on the convention
already in use in lock.c that transactional lock holds must be owned by
some ResourceOwner, while session holds are never so owned.  Setting the
locallock struct's owner link to NULL thus denotes a session hold, and
there is no redundant marker for that.

PREPARE TRANSACTION now works again when there are session-level advisory
locks, and it is also able to transfer transactional advisory locks to the
prepared transaction, but for implementation reasons it throws an error if
we hold both types of lock on a single lockable object.  Perhaps it will be
worth improving that someday.

Assorted other minor cleanup and documentation editing, as well.

Back-patch to 9.1, except that in the 9.1 branch I did not remove the
LockMethodData.transactional flag for fear of causing an ABI break for
any external code that might be examining those structs.
2012-05-04 17:44:31 -04:00
Bruce Momjian ebcaa5fcde Remove BSD/OS (BSDi) port. There are no known users upgrading to
Postgres 9.2, and perhaps no existing users either.
2012-05-03 10:58:44 -04:00
Bruce Momjian 7490c48f1e Mark git_changelog examples with the proper executable names. 2012-05-02 20:42:44 -04:00
Robert Haas 8e0c5195df Add missing parenthesis in comment. 2012-05-02 14:30:58 -04:00
Peter Eisentraut e6c2e8cb87 PL/Python: Improve test coverage
Add test cases for inline handler of plython2u (when using that
language name), and for result object element assignment.  There is
now at least one test case for every top-level functionality, except
plpy.Fatal (annoying to use in regression tests) and result object
slice retrieval and slice assignment (which are somewhat broken).
2012-05-02 21:09:03 +03:00
Peter Eisentraut 52aa334fcd PL/Python: Fix crash in functions returning SETOF and using SPI
Allocate PLyResultObject.tupdesc in TopMemoryContext, because its
lifetime is the lifetime of the Python object and it shouldn't be
freed by some other memory context, such as one controlled by SPI.  We
trust that the Python object will clean up its own memory.

Before, this would crash the included regression test case by trying
to use memory that was already freed.

reported by Asif Naeem, analysis by Tom Lane
2012-05-02 20:59:51 +03:00
Peter Eisentraut e9605a039b Even more duplicate word removal, in the spirit of the season 2012-05-02 20:56:03 +03:00
Robert Haas 0038110421 Avoid repeated CLOG access from heap_hot_search_buffer.
At the time we check whether the tuple is dead to all running
transactions, we've already verified that it isn't visible to our
scan, setting hint bits if appropriate.  So there's no need to
recheck CLOG for the all-dead test we do just a moment later.
So, add HeapTupleIsSurelyDead() to test the appropriate condition
under the assumption that all relevant hit bits are already set.

Review by Tom Lane.
2012-05-02 12:40:07 -04:00
Robert Haas 1b4998fd44 Further corrections from the department of redundancy department.
Thom Brown
2012-05-02 11:11:25 -04:00
Robert Haas e01e66f808 More duplicate word removal. 2012-05-02 09:28:16 -04:00
Heikki Linnakangas f291ccd43e Remove duplicate words in comments.
Found these with grep -r "for for ".
2012-05-02 10:20:27 +03:00
Tom Lane 50c2d6a1a6 Kill some remaining references to SVR4 and univel.
Both terms still appear in a few places, but I thought it best to leave
those alone in context.
2012-05-02 00:29:17 -04:00
Robert Haas 9b7a84f2a4 Tweak psql to print row counts when \x auto chooses non-expanded output.
Noah Misch
2012-05-01 16:05:01 -04:00
Peter Eisentraut f2f9439fbf Remove dead ports
Remove the following ports:

- dgux
- nextstep
- sunos4
- svr4
- ultrix4
- univel

These are obsolete and not worth rescuing.  In most cases, there is
circumstantial evidence that they wouldn't work anymore anyway.
2012-05-01 22:11:12 +03:00
Tom Lane 809e7e21af Converge all SQL-level statistics timing values to float8 milliseconds.
This patch adjusts the core statistics views to match the decision already
taken for pg_stat_statements, that values representing elapsed time should
be represented as float8 and measured in milliseconds.  By using float8,
we are no longer tied to a specific maximum precision of timing data.
(Internally, it's still microseconds, but we could now change that without
needing changes at the SQL level.)

The columns affected are
pg_stat_bgwriter.checkpoint_write_time
pg_stat_bgwriter.checkpoint_sync_time
pg_stat_database.blk_read_time
pg_stat_database.blk_write_time
pg_stat_user_functions.total_time
pg_stat_user_functions.self_time
pg_stat_xact_user_functions.total_time
pg_stat_xact_user_functions.self_time

The first four of these are new in 9.2, so there is no compatibility issue
from changing them.  The others require a release note comment that they
are now double precision (and can show a fractional part) rather than
bigint as before; also their underlying statistics functions now match
the column definitions, instead of returning bigint microseconds.
2012-04-30 14:03:33 -04:00
Peter Eisentraut 26471a51fc Mark ReThrowError() with attribute noreturn
All related functions were already so marked.
2012-04-30 20:23:41 +03:00
Robert Haas 0d2235a25b Remove duplicate word in comment.
Noted by Peter Geoghegan.
2012-04-30 13:14:46 -04:00
Bruce Momjian f33fe47a91 Add comments suggesting usage of git_changelog to generate release notes. 2012-04-30 11:05:34 -04:00
Tom Lane 1dd89eadcd Rename I/O timing statistics columns to blk_read_time and blk_write_time.
This seems more consistent with the pre-existing choices for names of
other statistics columns.  Rename assorted internal identifiers to match.
2012-04-29 18:13:33 -04:00
Tom Lane 309c64745e Rename track_iotiming GUC to track_io_timing.
This spelling seems significantly more readable to me.
2012-04-29 16:23:54 -04:00
Peter Eisentraut 81107282a5 Change return type of ExceptionalCondition to void and mark it noreturn
In ancient times, it was thought that this wouldn't work because of
TrapMacro/AssertMacro, but changing those to use a comma operator
appears to work without compiler warnings.
2012-04-29 21:20:14 +03:00
Peter Eisentraut 2227bb9c94 Simplify makefile rule
Instead of writing out the .c -> .o rule, use the default one, so that
dependency tracking can be used.
2012-04-29 21:20:14 +03:00
Tom Lane cdbad241f4 Clear I/O timing counters after sending them to the stats collector.
This oversight caused the reported times to accumulate in an O(N^2)
fashion the longer a backend runs.
2012-04-28 15:11:13 -04:00
Tom Lane d6f7d4fdc5 Fix printing of whole-row Vars at top level of a SELECT targetlist.
Normally whole-row Vars are printed as "tabname.*".  However, that does not
work at top level of a targetlist, because per SQL standard the parser will
think that the "*" should result in column-by-column expansion; which is
not at all what a whole-row Var implies.  We used to just print the table
name in such cases, which works most of the time; but it fails if the table
name matches a column name available anywhere in the FROM clause.  This
could lead for instance to a view being interpreted differently after dump
and reload.  Adding parentheses doesn't fix it, but there is a reasonably
simple kluge we can use instead: attach a no-op cast, so that the "*" isn't
syntactically at top level anymore.  This makes the printing of such
whole-row Vars a lot more consistent with other Vars, and may indeed fix
more cases than just the reported one; I'm suspicious that cases involving
schema qualification probably didn't work properly before, either.

Per bug report and fix proposal from Abbas Butt, though this patch is quite
different in detail from his.

Back-patch to all supported versions.
2012-04-27 19:49:18 -04:00
Bruce Momjian 993ce4e6c9 Add options to git_changelog for use in major release note creation:
--details-after
	--master-only
	--oldest-first
2012-04-27 17:15:41 -04:00
Tom Lane 537b266953 Fix syslogger's rotation disable/re-enable logic.
If it fails to open a new log file, the syslogger assumes there's something
wrong with its parameters (such as log_directory), and stops attempting
automatic time-based or size-based log file rotations.  Sending it SIGHUP
is supposed to start that up again.  However, the original coding for that
was really bogus, involving clobbering a couple of GUC variables and hoping
that SIGHUP processing would restore them.  Get rid of that technique in
favor of maintaining a separate flag showing we've turned rotation off.
Per report from Mark Kirkwood.

Also, the syslogger will automatically attempt to create the log_directory
directory if it doesn't exist, but that was only happening at startup.
For consistency and ease of use, it should do the same whenever the value
of log_directory is changed by SIGHUP.

Back-patch to all supported branches.
2012-04-27 00:12:42 -04:00
Robert Haas 3424bff90f Prevent index-only scans from returning wrong answers under Hot Standby.
The alternative of disallowing index-only scans in HS operation was
discussed, but the consensus was that it was better to treat marking
a page all-visible as a recovery conflict for snapshots that could still
fail to see XIDs on that page.  We may in the future try to soften this,
so that we simply force index scans to do heap fetches in cases where
this may be an issue, rather than throwing a hard conflict.
2012-04-26 20:00:21 -04:00
Tom Lane 7c85aa39fc Fix oversight in recent parameterized-path patch.
bitmap_scan_cost_est() has to be able to cope with a BitmapOrPath, but
I'd taken a shortcut that didn't work for that case.  Noted by Heikki.
Add some regression tests since this area is evidently under-covered.
2012-04-26 14:17:44 -04:00
Peter Eisentraut ba3e4157a7 PL/Python: Accept strings in functions returning composite types
Before 9.1, PL/Python functions returning composite types could return
a string and it would be parsed using record_in.  The 9.1 changes made
PL/Python only expect dictionaries, tuples, or objects supporting
getattr as output of composite functions, resulting in a regression
and a confusing error message, as the strings were interpreted as
sequences and the code for transforming lists to database tuples was
used.  Fix this by treating strings separately as before, before
checking for the other types.

The reason why it's important to support string to database tuple
conversion is that trigger functions on tables with composite columns
get the composite row passed in as a string (from record_out).
Without supporting converting this back using record_in, this makes it
impossible to implement pass-through behavior for these columns, as
PL/Python no longer accepts strings for composite values.

A better solution would be to fix the code that transforms composite
inputs into Python objects to produce dictionaries that would then be
correctly interpreted by the Python->PostgreSQL counterpart code.  But
that would be too invasive to backpatch to 9.1, and it is too late in
the 9.2 cycle to attempt it.  It should be revisited in the future,
though.

Reported as bug #6559 by Kirill Simonov.

Jan Urbański
2012-04-26 21:03:48 +03:00
Peter Eisentraut cc71ceab57 psql: Tab completion updates
Add/complete support for:

- ALTER DOMAIN / VALIDATE CONSTRAINT
- ALTER DOMAIN / RENAME
- ALTER DOMAIN / RENAME CONSTRAINT
- ALTER TABLE / RENAME CONSTRAINT
2012-04-26 20:07:40 +03:00
Tom Lane d6d5f67b5b Modify create_index regression test to avoid intermittent failures.
We have been seeing intermittent buildfarm failures due to a query
sometimes not using an index-only scan plan, because a background
auto-ANALYZE prevented the table's all-visible bits from being set
immediately, thereby causing the estimated cost of an index-only scan
to go up considerably.  Adjust the test case so that a bitmap index scan is
preferred instead, which serves equally well for the purpose the test case
is actually meant for.  (Of course, it would be better to eliminate the
interference from auto-ANALYZE, but I see no low-risk way to do that,
so any such fix will have to be left for 9.3 or later.)
2012-04-25 22:57:48 -04:00
Tom Lane 9fa82c9809 Fix planner's handling of RETURNING lists in writable CTEs.
setrefs.c failed to do "rtoffset" adjustment of Vars in RETURNING lists,
which meant they were left with the wrong varnos when the RETURNING list
was in a subquery.  That was never possible before writable CTEs, of
course, but now it's broken.  The executor fails to notice any problem
because ExecEvalVar just references the ecxt_scantuple for any normal
varno; but EXPLAIN breaks when the varno is wrong, as illustrated in a
recent complaint from Bartosz Dmytrak.

Since the eventual rtoffset of the subquery is not known at the time
we are preparing its plan node, the previous scheme of executing
set_returning_clause_references() at that time cannot handle this
adjustment.  Fortunately, it turns out that we don't really need to do it
that way, because all the needed information is available during normal
setrefs.c execution; we just have to dig it out of the ModifyTable node.
So, do that, and get rid of the kluge of early setrefs processing of
RETURNING lists.  (This is a little bit of a cheat in the case of inherited
UPDATE/DELETE, because we are not passing a "root" struct that corresponds
exactly to what the subplan was built with.  But that doesn't matter, and
anyway this is less ugly than early setrefs processing was.)

Back-patch to 9.1, where the problem became possible to hit.
2012-04-25 20:20:33 -04:00
Tom Lane c62b8eaae1 Fix edge-case behavior of pg_next_dst_boundary().
Due to rather sloppy thinking (on my part, I'm afraid) about the
appropriate behavior for boundary conditions, pg_next_dst_boundary() gave
undefined, platform-dependent results when the input time is exactly the
last recorded DST transition time for the specified time zone, as a result
of fetching values one past the end of its data arrays.

Change its specification to be that it always finds the next DST boundary
*after* the input time, and adjust code to match that.  The sole existing
caller, DetermineTimeZoneOffset, doesn't actually care about this
distinction, since it always uses a probe time earlier than the instant
that it does care about.  So it seemed best to me to change the API to make
the result=1 and result=0 cases more consistent, specifically to ensure
that the "before" outputs always describe the state at the given time,
rather than hacking the code to obey the previous API comment exactly.

Per bug #6605 from Sergey Burladyan.  Back-patch to all supported versions.
2012-04-25 17:26:10 -04:00
Robert Haas ca1e1a8da1 Remove prototype for nonexistent function. 2012-04-25 15:32:15 -04:00
Tom Lane 9873001e6d Another trivial comment-typo fix. 2012-04-25 14:28:58 -04:00
Peter Eisentraut 65ca8e68b7 PL/Python: Improve error messages 2012-04-25 21:11:59 +03:00
Peter Eisentraut 8bd44677df entab: Improve makefile
A few simplifications and stylistic improvements, found while grepping
around for makefile problems elsewhere.
2012-04-24 21:20:55 +03:00
Robert Haas 3ce7f18e92 Casts to or from a domain type are ignored; warn and document.
Prohibiting this outright would break dumps taken from older versions
that contain such casts, which would create far more pain than is
justified here.

Per report by Jaime Casanova and subsequent discussion.
2012-04-24 09:20:53 -04:00
Robert Haas 5d4b60f2f2 Lots of doc corrections.
Josh Kupershmidt
2012-04-23 22:43:09 -04:00
Robert Haas 7ab9b2f3b7 Rearrange lazy_scan_heap to avoid visibility map race conditions.
We must set the visibility map bit before releasing our exclusive lock
on the heap page; otherwise, someone might clear the heap page bit
before we set the visibility map bit, leading to a situation where the
visibility map thinks the page is all-visible but it's really not.

This problem has existed since 8.4, but it wasn't critical before we
had index-only scans, since the worst case scenario was that the page
wouldn't get vacuumed until the next scan_all vacuum.

Along the way, a couple of minor, related improvements: (1) if we
pause the heap scan to do an index vac cycle, release any visibility
map page we're holding, since really long-running pins are not good
for a variety of reasons; and (2) warn if we see a page that's marked
all-visible in the visibility map but not on the page level, since
that should never happen any more (it was allowed in previous
releases, but not in 9.2).
2012-04-23 22:08:06 -04:00
Robert Haas 85efd5f065 Reduce hash size for compute_array_stats, compute_tsvector_stats.
The size is only a hint, but a big hint chews up a lot of memory without
apparently improving performance much.

Analysis and patch by Noah Misch.
2012-04-23 22:05:41 -04:00
Peter Eisentraut 48658a1b81 Fix some typos
Josh Kupershmidt
2012-04-22 19:23:47 +03:00
Tom Lane 33e99153e9 Use fuzzy not exact cost comparison for the final tie-breaker in add_path.
Instead of an exact cost comparison, use a fuzzy comparison with 1e-10
delta after all other path metrics have proved equal.  This is to avoid
having platform-specific roundoff behaviors determine the choice when
two paths are really the same to our cost estimators.  Adjust the
recently-added test case that made it obvious we had a problem here.
2012-04-21 00:51:14 -04:00
Alvaro Herrera 09ff76fcdb Recast "ONLY" column CHECK constraints as NO INHERIT
The original syntax wasn't universally loved, and it didn't allow its
usage in CREATE TABLE, only ALTER TABLE.  It now works everywhere, and
it also allows using ALTER TABLE ONLY to add an uninherited CHECK
constraint, per discussion.

The pg_constraint column has accordingly been renamed connoinherit.

This commit partly reverts some of the changes in
61d81bd28d, particularly some pg_dump and
psql bits, because now pg_get_constraintdef includes the necessary NO
INHERIT within the constraint definition.

Author: Nikhil Sontakke
Some tweaks by me
2012-04-20 23:56:57 -03:00
Tom Lane 1f03630011 Adjust join_search_one_level's handling of clauseless joins.
For an initial relation that lacks any join clauses (that is, it has to be
cartesian-product-joined to the rest of the query), we considered only
cartesian joins with initial rels appearing later in the initial-relations
list.  This creates an undesirable dependency on FROM-list order.  We would
never fail to find a plan, but perhaps we might not find the best available
plan.  Noted while discussing the logic with Amit Kapila.

Improve the comments a bit in this area, too.

Arguably this is a bug fix, but given the lack of complaints from the
field I'll refrain from back-patching.
2012-04-20 20:10:46 -04:00
Tom Lane 5b7b5518d0 Revise parameterized-path mechanism to fix assorted issues.
This patch adjusts the treatment of parameterized paths so that all paths
with the same parameterization (same set of required outer rels) for the
same relation will have the same rowcount estimate.  We cache the rowcount
estimates to ensure that property, and hopefully save a few cycles too.
Doing this makes it practical for add_path_precheck to operate without
a rowcount estimate: it need only assume that paths with different
parameterizations never dominate each other, which is close enough to
true anyway for coarse filtering, because normally a more-parameterized
path should yield fewer rows thanks to having more join clauses to apply.

In add_path, we do the full nine yards of comparing rowcount estimates
along with everything else, so that we can discard parameterized paths that
don't actually have an advantage.  This fixes some issues I'd found with
add_path rejecting parameterized paths on the grounds that they were more
expensive than not-parameterized ones, even though they yielded many fewer
rows and hence would be cheaper once subsequent joining was considered.

To make the same-rowcounts assumption valid, we have to require that any
parameterized path enforce *all* join clauses that could be obtained from
the particular set of outer rels, even if not all of them are useful for
indexing.  This is required at both base scans and joins.  It's a good
thing anyway since the net impact is that join quals are checked at the
lowest practical level in the join tree.  Hence, discard the original
rather ad-hoc mechanism for choosing parameterization joinquals, and build
a better one that has a more principled rule for when clauses can be moved.
The original rule was actually buggy anyway for lack of knowledge about
which relations are part of an outer join's outer side; getting this right
requires adding an outer_relids field to RestrictInfo.
2012-04-19 15:53:47 -04:00
Robert Haas 293ec33c32 Remove bogus comment from HeapTupleSatisfiesNow.
This has been wrong for a really long time.  We don't use two-phase
locking to protect against serialization anomalies.

Per discussion on pgsql-hackers about 2011-03-07; original report
by Dan Ports.
2012-04-18 11:50:45 -04:00