Commit Graph

23760 Commits

Author SHA1 Message Date
Tom Lane c1793f2e0c In SPGiST replay, do conflict resolution before modifying the page.
In yesterday's commit 962e0cc71e, I added the
ResolveRecoveryConflictWithSnapshot call in the wrong place.  I correctly
put it before spgRedoVacuumRedirect itself would modify the index page ---
but not before RestoreBkpBlocks, so replay of a record with a full-page
image would modify the page before kicking off any conflicting HS
transactions.  Oops.
2012-08-03 15:23:14 -04:00
Tom Lane 962e0cc71e Fix race conditions associated with SPGiST redirection tuples.
The correct test for whether a redirection tuple is removable is whether
tuple's xid < RecentGlobalXmin, not OldestXmin; the previous coding
failed to protect index searches being done in concurrent transactions that
have no XID.  This mirrors the recent fix in btree's page recycling logic
made in commit d3abbbebe5.

Also, WAL-log the newest XID of any removed redirection tuple on an index
page, and apply ResolveRecoveryConflictWithSnapshot during InHotStandby WAL
replay.  This protects against concurrent Hot Standby transactions possibly
needing to see the redirection tuple(s).

Per my query of 2012-03-12 and subsequent discussion.
2012-08-02 15:34:14 -04:00
Tom Lane 41b9c8452b Replace libpq's "row processor" API with a "single row" mode.
After taking awhile to digest the row-processor feature that was added to
libpq in commit 92785dac2e, we've concluded
it is over-complicated and too hard to use.  Leave the core infrastructure
changes in place (that is, there's still a row processor function inside
libpq), but remove the exposed API pieces, and instead provide a "single
row" mode switch that causes PQgetResult to return one row at a time in
separate PGresult objects.

This approach incurs more overhead than proper use of a row processor
callback would, since construction of a PGresult per row adds extra cycles.
However, it is far easier to use and harder to break.  The single-row mode
still affords applications the primary benefit that the row processor API
was meant to provide, namely not having to accumulate large result sets in
memory before processing them.  Preliminary testing suggests that we can
probably buy back most of the extra cycles by micro-optimizing construction
of the extra results, but that task will be left for another day.

Marko Kreen
2012-08-02 13:10:30 -04:00
Tom Lane f6ce81f55a Fix WITH attached to a nested set operation (UNION/INTERSECT/EXCEPT).
Parse analysis neglected to cover the case of a WITH clause attached to an
intermediate-level set operation; it only handled WITH at the top level
or WITH attached to a leaf-level SELECT.  Per report from Adam Mackler.

In HEAD, I rearranged the order of SelectStmt's fields to put withClause
with the other fields that can appear on non-leaf SelectStmts.  In back
branches, leave it alone to avoid a possible ABI break for third-party
code.

Back-patch to 8.4 where WITH support was added.
2012-07-31 17:56:21 -04:00
Tom Lane b76356ac22 Fix syslogger so that log_truncate_on_rotation works in the first rotation.
In the original coding of the log rotation stuff, we did not bother to make
the truncation logic work for the very first rotation after postmaster
start (or after a syslogger crash and restart).  It just always appended
in that case.  It did not seem terribly important at the time, but we've
recently had two separate complaints from people who expected it to work
unsurprisingly.  (Both users tend to restart the postmaster about as often
as a log rotation is configured to happen, which is maybe not typical use,
but still...)  Since the initial log file is opened in the postmaster,
fixing this requires passing down some more state to the syslogger child
process.

It's always been like this, so back-patch to all supported branches.
2012-07-31 14:36:54 -04:00
Alvaro Herrera 2f29f011c8 pg_basebackup: stylistic adjustments
The most user-visible part of this is to change the long options
--statusint and --noloop to --status-interval and --no-loop,
respectively, per discussion.

Also, consistently enclose file names in double quotes, per our
conventions; and consistently use the term "transaction log file" to
talk about WAL segments.  (Someday we may need to go over this
terminology and make it consistent across the whole source code.)

Finally, reflow the code to better fit in 80 columns, and have pgindent
fix it up some more.
2012-07-31 11:02:39 -04:00
Tom Lane 9ae8ebe0b2 Improve reporting of error situations in find_other_exec().
This function suppressed any stderr output from the called program, which
is unnecessary in the normal case and unhelpful in error cases.  It also
gave a rather opaque message along the lines of "fgets failure: Success"
in case the called program failed to return anything on stdout.  Since
we've seen multiple reports of people not understanding what's wrong when
pg_ctl reports this, improve the message.

Back-patch to all active branches.
2012-07-27 19:31:13 -04:00
Tom Lane 26b438694c Only allow autovacuum to be auto-canceled by a directly blocked process.
In the original coding of the autovacuum cancel feature, commit
acac68b2bc, an autovacuum process was
considered a target for cancellation if it was found to hard-block any
process examined in the deadlock search.  This patch tightens the test so
that the autovacuum must directly hard-block the current process.  This
should make the behavior more predictable in general, and in particular
it ensures that an autovacuum will not be canceled with less than
deadlock_timeout grace period.  In the old coding, it was possible for an
autovacuum to be canceled almost instantly, given unfortunate timing of two
or more other processes' lock attempts.

This also justifies the logging methodology in the recent commit
d7318d43d891bd63e82dcfc27948113ed7b1db80; without this restriction, that
patch isn't providing enough information to see the connection of the
canceling process to the autovacuum.  Like that one, patch all the way
back.
2012-07-26 14:29:22 -04:00
Robert Haas d20cdd31c0 Tab complete table names after ALTER TABLE x [NO] INHERIT.
Jeff Janes
2012-07-26 10:16:55 -04:00
Robert Haas d7318d43d8 Log a better message when canceling autovacuum.
The old message was at DEBUG2, so typically it didn't show up in the
log at all.  As a result, in most cases where autovacuum was canceled,
the only information that was logged was the table being vacuumed,
with no indication as to what problem caused the cancel.  Crank up
the level to LOG and add some more details to assist with debugging.

Back-patch all the way, per discussion on pgsql-hackers.
2012-07-26 09:19:03 -04:00
Tom Lane af026b5d9b Fix longstanding crash-safety bug with newly-created-or-reset sequences.
If a crash occurred immediately after the first nextval() call for a serial
column, WAL replay would restore the sequence to a state in which it
appeared that no nextval() had been done, thus allowing the first sequence
value to be returned again by the next nextval() call; as reported in
bug #6748 from Xiangming Mei.

More generally, the problem would occur if an ALTER SEQUENCE was executed
on a freshly created or reset sequence.  (The manifestation with serial
columns was introduced in 8.2 when we added an ALTER SEQUENCE OWNED BY step
to serial column creation.)  The cause is that sequence creation attempted
to save one WAL entry by writing out a WAL record that made it appear that
the first nextval() had already happened (viz, with is_called = true),
while marking the sequence's in-database state with log_cnt = 1 to show
that the first nextval() need not emit a WAL record.  However, ALTER
SEQUENCE would emit a new WAL entry reflecting the actual in-database state
(with is_called = false).  Then, nextval would allocate the first sequence
value and set is_called = true, but it would trust the log_cnt value and
not emit any WAL record.  A crash at this point would thus restore the
sequence to its post-ALTER state, causing the next nextval() call to return
the first sequence value again.

To fix, get rid of the idea of logging an is_called status different from
reality.  This means that the first nextval-driven WAL record will happen
at the first nextval call not the second, but the marginal cost of that is
pretty negligible.  In addition, make sure that ALTER SEQUENCE resets
log_cnt to zero in any case where it touches sequence parameters that
affect future nextval results.  This will result in some user-visible
changes in the contents of a sequence's log_cnt column, as reflected in the
patch's regression test changes; but no application should be depending on
that anyway, since it was already true that log_cnt changes rather
unpredictably depending on checkpoint timing.

In addition, make some basically-cosmetic improvements to get rid of
sequence.c's undesirable intimacy with page layout details.  It was always
really trying to WAL-log the contents of the sequence tuple, so we should
have it do that directly using a HeapTuple's t_data and t_len, rather than
backing into it with some magic assumptions about where the tuple would be
on the sequence's page.

Back-patch to all supported branches.
2012-07-25 17:42:23 -04:00
Alvaro Herrera 58f17dcf83 Add translator comments to module names 2012-07-25 00:02:49 -04:00
Alvaro Herrera d7b47e5155 Change syntax of new CHECK NO INHERIT constraints
The initially implemented syntax, "CHECK NO INHERIT (expr)" was not
deemed very good, so switch to "CHECK (expr) NO INHERIT" instead.  This
way it looks similar to SQL-standards compliant constraint attribute.

Backport to 9.2 where the new syntax and feature was introduced.

Per discussion.
2012-07-24 16:01:32 -04:00
Peter Eisentraut d61d9aa750 Update information schema to SQL:2011
This is just a section renumbering for now.  Some details might be
filled in later.
2012-07-23 22:32:56 +03:00
Tom Lane b71258af56 Fix name collision between concurrent regression tests.
Commit f5bcd398ad introduced a test using
a table named "circles" in inherit.sql.  Unfortunately, the concurrently
executed constraints test was already using that table name, so the
parallel regression tests would sometimes fail.  Rename table to dodge
the problem.  Per buildfarm.
2012-07-22 00:01:19 -04:00
Tom Lane 2d46a57ddc Improve copydir() code for the case that fsync is off.
We should avoid calling sync_file_range or posix_fadvise in this case,
since (a) we don't really care if the data gets synced, and might as
well save the kernel calls; (b) at least on Linux we know that the
kernel might block us until it's scheduled the write.

Also, avoid making a useless second traversal of the directory tree
if we're not actually going to call fsync(2) after all.
2012-07-21 20:10:29 -04:00
Tom Lane 2c4f5b4bc5 Use --nosync during make check's initdb call.
We left this out of commit b966dd6c42
so as to get some more buildfarm testing of the new fsync code in initdb.
But since no problems have turned up, it's probably time to save the
cycles.
2012-07-21 19:56:22 -04:00
Tom Lane 1f115d98b9 Suppress volatile-related warning seen in some compilers.
Antique versions of gcc complain about vars that are initialized outside
PG_TRY and then modified within it.  Rather than marking the var volatile,
expend one more line of code.
2012-07-21 19:39:03 -04:00
Tom Lane 31c7c642b6 Account for SRFs in targetlists in planner rowcount estimates.
We made use of the ROWS estimate for set-returning functions used in FROM,
but not for those used in SELECT targetlists; which is a bit of an
oversight considering there are common usages that require the latter
approach.  Improve that.  (I had initially thought it might be worth
folding this into cost_qual_eval, but after investigation concluded that
that wouldn't be very helpful, so just do it separately.)  Per complaint
from David Johnston.

Back-patch to 9.2, but not further, for fear of destabilizing plan choices
in existing releases.
2012-07-21 17:45:07 -04:00
Robert Haas ed0af33247 Revert temporary patch to debug Windows breakage.
This reverts commit 0a248208a0.
2012-07-20 22:31:19 -04:00
Robert Haas 0635c0b524 Repair plpgsql_validator breakage.
Commit 3a0e4d36eb arranged to
reference stack-allocated variables after they were out of scope.
That's no good, so let's arrange to not do that after all.
2012-07-20 21:28:26 -04:00
Andrew Dunstan a1e5705c9f Remove now unneeded results file for disabled prepared transactions case. 2012-07-20 16:30:34 -04:00
Robert Haas 0a248208a0 Temporary patch to try to debug why event trigger patch broke Windows.
Apologies for the ugliness.
2012-07-20 16:22:11 -04:00
Andrew Dunstan ae55d9fbe3 Remove prepared transactions from main isolation test schedule.
There is no point in running this test when prepared transactions are disabled,
which is the default. New make targets that include the test are provided. This
will save some useless waste of cycles on buildfarm machines.

Backpatch to 9.1 where these tests were introduced.
2012-07-20 15:51:40 -04:00
Peter Eisentraut 8ca03aa414 pg_dump: Simplify mkdir() error checking
mkdir() can check for errors itself.  We don't need to code that
ourselves again.
2012-07-20 22:34:11 +03:00
Alvaro Herrera f5bcd398ad connoinherit may be true only for CHECK constraints
The code was setting it true for other constraints, which is
bogus.  Doing so caused bogus catalog entries for such constraints, and
in particular caused an error to be raised when trying to drop a
constraint of types other than CHECK from a table that has children,
such as reported in bug #6712.

In 9.2, additionally ignore connoinherit=true for other constraint
types, to avoid having to force initdb; existing databases might already
contain bogus catalog entries.

Includes a catversion bump (in HEAD only).

Bug report from Miroslav Šulc
Analysis from Amit Kapila and Noah Misch; Amit also contributed the patch.
2012-07-20 14:08:07 -04:00
Tom Lane 8e617e29aa Fix whole-row Var evaluation to cope with resjunk columns (again).
When a whole-row Var is reading the result of a subquery, we need it to
ignore any "resjunk" columns that the subquery might have evaluated for
GROUP BY or ORDER BY purposes.  We've hacked this area before, in commit
68e40998d0, but that fix only covered
whole-row Vars of named composite types, not those of RECORD type; and it
was mighty klugy anyway, since it just assumed without checking that any
extra columns in the result must be resjunk.  A proper fix requires getting
hold of the subquery's targetlist so we can actually see which columns are
resjunk (whereupon we can use a JunkFilter to get rid of them).  So bite
the bullet and add some infrastructure to make that possible.

Per report from Andrew Dunstan and additional testing by Merlin Moncure.
Back-patch to all supported branches.  In 8.3, also back-patch commit
292176a118, which for some reason I had
not done at the time, but it's a prerequisite for this change.
2012-07-20 13:10:58 -04:00
Robert Haas 3a0e4d36eb Make new event trigger facility actually do something.
Commit 3855968f32 added syntax, pg_dump,
psql support, and documentation, but the triggers didn't actually fire.
With this commit, they now do.  This is still a pretty basic facility
overall because event triggers do not get a whole lot of information
about what the user is trying to do unless you write them in C; and
there's still no option to fire them anywhere except at the very
beginning of the execution sequence, but it's better than nothing,
and a good building block for future work.

Along the way, add a regression test for ALTER LARGE OBJECT, since
testing of event triggers reveals that we haven't got one.

Dimitri Fontaine and Robert Haas
2012-07-20 11:39:01 -04:00
Tom Lane be86e3dd5b Rethink checkpointer's fsync-request table representation.
Instead of having one hash table entry per relation/fork/segment, just have
one per relation, and use bitmapsets to represent which specific segments
need to be fsync'd.  This eliminates the need to scan the whole hash table
to implement FORGET_RELATION_FSYNC, which fixes the O(N^2) behavior
recently demonstrated by Jeff Janes for cases involving lots of TRUNCATE or
DROP TABLE operations during a single checkpoint cycle.  Per an idea from
Robert Haas.

(FORGET_DATABASE_FSYNC still sucks, but since dropping a database is a
pretty expensive operation anyway, we'll live with that.)

In passing, improve the delayed-unlink code: remove the pass over the list
in mdpreckpt, since it wasn't doing anything for us except supporting a
useless Assert in mdpostckpt, and fix mdpostckpt so that it will absorb
fsync requests every so often when clearing a large backlog of deletion
requests.
2012-07-19 19:28:22 -04:00
Tom Lane 3072b7bade Send only one FORGET_RELATION_FSYNC request when dropping a relation.
We were sending one per fork, but a little bit of refactoring allows us
to send just one request with forknum == InvalidForkNumber.  This not only
reduces pressure on the shared-memory request queue, but saves repeated
traversals of the checkpointer's hash table.
2012-07-19 13:07:33 -04:00
Heikki Linnakangas a7a4add6c4 Refactor the way code is shared between some range type functions.
Functions like range_eq, range_before etc. are exposed at the SQL-level, but
they're also used internally by the GiST consistent support function. The
code sharing was done by a hack, TrickFunctionCall2, which relied on the
knowledge that all the functions used fn_extra the same way. This commit
splits the functions into internal versions that take a TypeCacheEntry as
argument, and thin wrappers to expose the functions at the SQL-level. The
internal versions can then be called directly and in a less hacky way from
the GiST consistent function.

This is just cosmetic, but backpatch to 9.2 anyway, to avoid having a
different version of this code in the 9.2 branch. That would make
backpatching fixes in this area more difficult.

Alexander Korotkov
2012-07-18 23:14:56 +03:00
Tom Lane 80e373c3a8 Fix statistics breakage from bgwriter/checkpointer process split.
ForwardFsyncRequest() supposed that it could only be called in regular
backends, which used to be true; but since the splitup of bgwriter and
checkpointer, it is also called in the bgwriter.  We do not want to count
such calls in pg_stat_bgwriter.buffers_backend statistics, so fix things
so that they aren't.

(It's worth noting here that this implies an alarmingly large increase in
the expected amount of cross-process fsync request traffic, which may well
mean that the process splitup was not such a hot idea.)
2012-07-18 15:40:31 -04:00
Tom Lane 4a9c30a8a1 Fix management of pendingOpsTable in auxiliary processes.
mdinit() was misusing IsBootstrapProcessingMode() to decide whether to
create an fsync pending-operations table in the current process.  This led
to creating a table not only in the startup and checkpointer processes as
intended, but also in the bgwriter process, not to mention other auxiliary
processes such as walwriter and walreceiver.  Creation of the table in the
bgwriter is fatal, because it absorbs fsync requests that should have gone
to the checkpointer; instead they just sit in bgwriter local memory and are
never acted on.  So writes performed by the bgwriter were not being fsync'd
which could result in data loss after an OS crash.  I think there is no
live bug with respect to walwriter and walreceiver because those never
perform any writes of shared buffers; but the potential is there for
future breakage in those processes too.

To fix, make AuxiliaryProcessMain() export the current process's
AuxProcType as a global variable, and then make mdinit() test directly for
the types of aux process that should have a pendingOpsTable.  Having done
that, we might as well also get rid of the random bool flags such as
am_walreceiver that some of the aux processes had grown.  (Note that we
could not have fixed the bug by examining those variables in mdinit(),
because it's called from BaseInit() which is run by AuxiliaryProcessMain()
before entering any of the process-type-specific code.)

Back-patch to 9.2, where the problem was introduced by the split-up of
bgwriter and checkpointer processes.  The bogus pendingOpsTable exists
in walwriter and walreceiver processes in earlier branches, but absent
any evidence that it causes actual problems there, I'll leave the older
branches alone.
2012-07-18 15:28:10 -04:00
Robert Haas 3855968f32 Syntax support and documentation for event triggers.
They don't actually do anything yet; that will get fixed in a
follow-on commit.  But this gets the basic infrastructure in place,
including CREATE/ALTER/DROP EVENT TRIGGER; support for COMMENT,
SECURITY LABEL, and ALTER EXTENSION .. ADD/DROP EVENT TRIGGER;
pg_dump and psql support; and documentation for the anticipated
initial feature set.

Dimitri Fontaine, with review and a bunch of additional hacking by me.
Thom Brown extensively reviewed earlier versions of this patch set,
but there's not a whole lot of that code left in this commit, as it
turns out.
2012-07-18 10:16:16 -04:00
Tom Lane 73b796a52c Improve coding around the fsync request queue.
In all branches back to 8.3, this patch fixes a questionable assumption in
CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue that there are
no uninitialized pad bytes in the request queue structs.  This would only
cause trouble if (a) there were such pad bytes, which could happen in 8.4
and up if the compiler makes enum ForkNumber narrower than 32 bits, but
otherwise would require not-currently-planned changes in the widths of
other typedefs; and (b) the kernel has not uniformly initialized the
contents of shared memory to zeroes.  Still, it seems a tad risky, and we
can easily remove any risk by pre-zeroing the request array for ourselves.
In addition to that, we need to establish a coding rule that struct
RelFileNode can't contain any padding bytes, since such structs are copied
into the request array verbatim.  (There are other places that are assuming
this anyway, it turns out.)

In 9.1 and up, the risk was a bit larger because we were also effectively
assuming that struct RelFileNodeBackend contained no pad bytes, and with
fields of different types in there, that would be much easier to break.
However, there is no good reason to ever transmit fsync or delete requests
for temp files to the bgwriter/checkpointer, so we can revert the request
structs to plain RelFileNode, getting rid of the padding risk and saving
some marginal number of bytes and cycles in fsync queue manipulation while
we are at it.  The savings might be more than marginal during deletion of
a temp relation, because the old code transmitted an entirely useless but
nonetheless expensive-to-process ForgetRelationFsync request to the
background process, and also had the background process perform the file
deletion even though that can safely be done immediately.

In addition, make some cleanup of nearby comments and small improvements to
the code in CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue.
2012-07-17 16:56:54 -04:00
Peter Eisentraut 71f2dd2321 PL/Python: Remove PLy_result_ass_item
It is apparently no longer used after the new slicing support was
implemented (a97207b690), so let's
remove the dead code and see if anything cares.
2012-07-17 23:26:49 +03:00
Alvaro Herrera 65558995a2 Remove recently added PL/Perl encoding tests
These only pass cleanly on UTF8 and SQL_ASCII encodings, besides the
Japanese encoding in which they were originally written, which is clearly
not good enough.  Since the functionality they test has not ever been
tested from PL/Perl, the best answer seems to be to remove the new tests
completely.

Per buildfarm results and ensuing discussion.
2012-07-17 13:26:25 -04:00
Tom Lane 57b9bdda39 Put back storage/proc.h in postmaster.c.
I took this out thinking it wasn't needed anymore, but the EXEC_BACKEND
code still needs it.  Per buildfarm.
2012-07-17 10:14:06 -04:00
Alvaro Herrera f34c68f096 Introduce timeout handling framework
Management of timeouts was getting a little cumbersome; what we
originally had was more than enough back when we were only concerned
about deadlocks and query cancel; however, when we added timeouts for
standby processes, the code got considerably messier.  Since there are
plans to add more complex timeouts, this seems a good time to introduce
a central timeout handling module.

External modules register their timeout handlers during process
initialization, and later enable and disable them as they see fit using
a simple API; timeout.c is in charge of keeping track of which timeouts
are in effect at any time, installing a common SIGALRM signal handler,
and calling setitimer() as appropriate to ensure timely firing of
external handlers.

timeout.c additionally supports pluggable modules to add their own
timeouts, though this capability isn't exercised anywhere yet.

Additionally, as of this commit, walsender processes are aware of
timeouts; we had a preexisting bug there that made those ignore SIGALRM,
thus being subject to unhandled deadlocks, particularly during the
authentication phase.  This has already been fixed in back branches in
commit 0bf8eb2a, which see for more details.

Main author: Zoltán Böszörményi
Some review and cleanup by Álvaro Herrera
Extensive reworking by Tom Lane
2012-07-16 22:55:33 -04:00
Peter Eisentraut dd16f9480a Remove unreachable code
The Solaris Studio compiler warns about these instances, unlike more
mainstream compilers such as gcc.  But manual inspection showed that
the code is clearly not reachable, and we hope no worthy compiler will
complain about removing this code.
2012-07-16 22:15:03 +03:00
Peter Eisentraut a76c857eba Add comment why seemingly dead code is necessary 2012-07-16 22:08:04 +03:00
Tom Lane c92be3c059 Avoid pre-determining index names during CREATE TABLE LIKE parsing.
Formerly, when trying to copy both indexes and comments, CREATE TABLE LIKE
had to pre-assign names to indexes that had comments, because it made up an
explicit CommentStmt command to apply the comment and so it had to know the
name for the index.  This creates bad interactions with other indexes, as
shown in bug #6734 from Daniele Varrazzo: the preassignment logic couldn't
take any other indexes into account so it could choose a conflicting name.

To fix, add a field to IndexStmt that allows it to carry a comment to be
assigned to the new index.  (This isn't a user-exposed feature of CREATE
INDEX, only an internal option.)  Now we don't need preassignment of index
names in any situation.

I also took the opportunity to refactor DefineIndex to accept the IndexStmt
as such, rather than passing all its fields individually in a mile-long
parameter list.

Back-patch to 9.2, but no further, because it seems too dangerous to change
IndexStmt or DefineIndex's API in released branches.  The bug exists back
to 9.0 where CREATE TABLE LIKE grew the ability to copy comments, but given
the lack of prior complaints we'll just let it go unfixed before 9.2.
2012-07-16 13:25:18 -04:00
Tom Lane 54fd196ffc Prevent corner-case core dump in rfree().
rfree() failed to cope with the case that pg_regcomp() had initialized the
regex_t struct but then failed to allocate any memory for re->re_guts (ie,
the first malloc call in pg_regcomp() failed).  It would try to touch the
guts struct anyway, and thus dump core.  This is a sufficiently narrow
corner case that it's not surprising it's never been seen in the field;
but still a bug is a bug, so patch all active branches.

Noted while investigating whether we need to call pg_regfree after a
failure return from pg_regcomp.  Other than this bug, it turns out we
don't, so adjust comments appropriately.
2012-07-15 13:27:54 -04:00
Heikki Linnakangas 2686da9db2 Don't initialize TLI variable to -1, as TimeLineID is unsigned.
This was causing a compiler warning with Solaris compiler. Use 0 instead.
The variable is initialized just for the sake of tidyness  and/or debugging,
it's not used for anything before setting it to a real value.

Per report and suggestion from Peter Eisentraut.
2012-07-14 21:04:53 +03:00
Heikki Linnakangas 6c349a565a Print the name of the WAL file containing latest REDO ptr in pg_controldata.
This makes it easier to determine how far back you need to keep archived WAL
files, to restore from a backup.

Fujii Masao
2012-07-14 14:22:57 +03:00
Tom Lane b966dd6c42 Add fsync capability to initdb, and use sync_file_range() if available.
Historically we have not worried about fsync'ing anything during initdb
(in fact, initdb intentionally passes -F to each backend launch to prevent
it from fsync'ing).  But with filesystems getting more aggressive about
caching data, that's not such a good plan anymore.  Make initdb do a pass
over the finished data directory tree to fsync everything.  For testing
purposes, the -N/--nosync flag can be used to restore the old behavior.

Also, testing shows that on Linux, sync_file_range() is much faster than
posix_fadvise() for hinting to the kernel that an fsync is coming,
apparently because the latter blocks on a rather small request queue while
the former doesn't.  So use this function if available in initdb, and also
in the backend's pg_flush_data() (where it currently will affect only the
speed of CREATE DATABASE's cloning step).

We will later make pg_regress invoke initdb with the --nosync flag
to avoid slowing down cases such as "make check" in contrib.  But
let's not do so until we've shaken out any portability issues in this
patch.

Jeff Davis, reviewed by Andres Freund
2012-07-13 17:16:58 -04:00
Tom Lane 1a9405d265 Cosmetic cleanup of ginInsertValue().
Make it clearer that the passed stack mustn't be empty, and that we
are not supposed to fall off the end of the stack in the main loop.
Tighten the loop that extracts the root block number, too.

Markus Wanner and Tom Lane
2012-07-13 11:37:39 -04:00
Peter Eisentraut a84bf4922e Avoid extra newlines in XML mapping in table forest mode
found by P. Broennimann
2012-07-12 23:52:50 +03:00
Tom Lane a36088bcfa Skip text->binary conversion of unnecessary columns in contrib/file_fdw.
When reading from a text- or CSV-format file in file_fdw, the datatype
input routines can consume a significant fraction of the runtime.
Often, the query does not need all the columns, so we can get a useful
speed boost by skipping I/O conversion for unnecessary columns.

To support this, add a "convert_selectively" option to the core COPY code.
This is undocumented and not accessible from SQL (for now, anyway).

Etsuro Fujita, reviewed by KaiGai Kohei
2012-07-12 16:26:59 -04:00
Bruce Momjian 76720bdf1a Remove 'x =- 1' check for pgindent, not needed, per report from Andrew
Dunstan.
2012-07-12 14:37:47 -04:00
Magnus Hagander 058a050ec7 Fix memory and file descriptor leaks in pg_receivexlog/pg_basebackup
When the internal loop mode was added, freeing memory and closing
filedescriptors before returning became important, and a few cases
in the code missed that.

Fujii Masao
2012-07-12 13:33:58 +02:00
Tom Lane 84a42560c8 Add array_remove() and array_replace() functions.
These functions support removing or replacing array element value(s)
matching a given search value.  Although intended mainly to support a
future array-foreign-key feature, they seem useful in their own right.

Marco Nenciarini and Gabriele Bartolini, reviewed by Alex Hunsaker
2012-07-11 13:59:35 -04:00
Tom Lane 01215d61a7 Fix bogus macro definition.
Per buildfarm complaints.
2012-07-10 22:36:11 -04:00
Tatsuo Ishii 1c7a7faa5b Add comments about additional mule-internal charsets from emacs's
source code(lisp/international/mule-conf.el).  These charsets have not
been supported up to now anyway, so this is just for adding
commentary.  Also add mention that we follow emacs's implementation,
not xemacs's.
2012-07-11 08:10:50 +09:00
Tom Lane 60e9c224a1 Fix ASCII case in pg_wchar2mule_with_len.
Also some cosmetic improvements for wchar-to-mblen patch.
2012-07-10 15:59:39 -04:00
Alvaro Herrera 379607c9e8 plperl: Skip setting UTF8 flag when in SQL_ASCII encoding
When in SQL_ASCII encoding, strings passed around are not necessarily
UTF8-safe.  We had already fixed this in some places, but it looks like
we missed some.

I had to backpatch Peter Eisentraut's a8b92b60 to 9.1 in order for this
patch to cherry-pick more cleanly.

Patch from Alex Hunsaker, tweaked by Kyotaro HORIGUCHI and myself.

Some desultory cleanup and comment addition by me, during patch review.

Per bug report from Christoph Berg in
20120209102116.GA14429@msgid.df7cb.de
2012-07-10 15:15:16 -04:00
Alvaro Herrera fc4a8a6d74 perltidy adjustments to new file 2012-07-10 15:15:16 -04:00
Tom Lane 628cbb50ba Re-implement extraction of fixed prefixes from regular expressions.
To generate btree-indexable conditions from regex WHERE conditions (such as
WHERE indexed_col ~ '^foo'), we need to be able to identify any fixed
prefix that a regex might have; that is, find any string that must be a
prefix of all strings satisfying the regex.  We used to do that with
entirely ad-hoc code that looked at the source text of the regex.  It
didn't know very much about regex syntax, which mostly meant that it would
fail to identify some optimizable cases; but Viktor Rosenfeld reported that
it would produce actively wrong answers for quantified parenthesized
subexpressions, such as '^(foo)?bar'.  Rather than trying to extend the
ad-hoc code to cover this, let's get rid of it altogether in favor of
identifying prefixes by examining the compiled form of a regex.

To do this, I've added a new entry point "pg_regprefix" to the regex library;
hopefully it is defined in a sufficiently general fashion that it can remain
in the library when/if that code gets split out as a standalone project.

Since this bug has been there for a very long time, this fix needs to get
back-patched.  However it depends on some other recent commits (particularly
the addition of wchar-to-database-encoding conversion), so I'll commit this
separately and then go to work on back-porting the necessary fixes.
2012-07-10 14:54:37 -04:00
Tom Lane 00dac6000d Refactor pattern_fixed_prefix() to avoid dealing in incomplete patterns.
Previously, pattern_fixed_prefix() was defined to return whatever fixed
prefix it could extract from the pattern, plus the "rest" of the pattern.
That definition was sensible for LIKE patterns, but not so much for
regexes, where reconstituting a valid pattern minus the prefix could be
quite tricky (certainly the existing code wasn't doing that correctly).
Since the only thing that callers ever did with the "rest" of the pattern
was to pass it to like_selectivity() or regex_selectivity(), let's cut out
the middle-man and just have pattern_fixed_prefix's subroutines do this
directly.  Then pattern_fixed_prefix can return a simple selectivity
number, and the question of how to cope with partial patterns is removed
from its API specification.

While at it, adjust the API spec so that callers who don't actually care
about the pattern's selectivity (which is a lot of them) can pass NULL for
the selectivity pointer to skip doing the work of computing a selectivity
estimate.

This patch is only an API refactoring that doesn't actually change any
processing, other than allowing a little bit of useless work to be skipped.
However, it's necessary infrastructure for my upcoming fix to regex prefix
extraction, because after that change there won't be any simple way to
identify the "rest" of the regex, not even to the low level of fidelity
needed by regex_selectivity.  We can cope with that if regex_fixed_prefix
and regex_selectivity communicate directly, but not if we have to work
within the old API.  Hence, back-patch to all active branches.
2012-07-09 23:22:55 -04:00
Tom Lane e7ef6d7e24 Fix planner to pass correct collation to operator selectivity estimators.
We can do this without creating an API break for estimation functions
by passing the collation using the existing fmgr functionality for
passing an input collation as a hidden parameter.

The need for this was foreseen at the outset, but we didn't get around to
making it happen in 9.1 because of the decision to sort all pg_statistic
histograms according to the database's default collation.  That meant that
selectivity estimators generally need to use the default collation too,
even if they're estimating for an operator that will do something
different.  The reason it's suddenly become more interesting is that
regexp interpretation also uses a collation (for its LC_TYPE not LC_COLLATE
property), and we no longer want to use the wrong collation when examining
regexps during planning.  It's not that the selectivity estimate is likely
to change much from this; rather that we are thinking of caching compiled
regexps during planner estimation, and we won't get the intended benefit
if we cache them with a different collation than the executor will use.

Back-patch to 9.1, both because the regexp change is likely to get
back-patched and because we might as well get this right in all
collation-supporting branches, in case any third-party code wants to
rely on getting the collation.  The patch turns out to be minuscule
now that I've done it ...
2012-07-08 23:51:08 -04:00
Tom Lane c6aae3042b Simplify and document regex library's compact-NFA representation.
The previous coding abused the first element of a cNFA state's arcs list
to hold a per-state flag bit, which was confusing, undocumented, and not
even particularly efficient.  Get rid of that in favor of a separate
"stflags" vector.  Since there's only one bit in use, I chose to allocate a
char per state; we could possibly replace this with a bitmap at some point,
but that would make accesses a little slower.  It's already about 8X
smaller than before, so let's not get overly tense.

Also document the representation better than it was before, which is to say
not at all.

This patch is a byproduct of investigations towards extracting a "fixed
prefix" string from the compact-NFA representation of regex patterns.
Might need to back-patch it if we decide to back-patch that fix, but for
now it's just code cleanup so I'll just put it in HEAD.
2012-07-07 17:39:50 -04:00
Alvaro Herrera a184e4db83 Convert libpq regress script to Perl
This should ease its use on the Windows build environment.
2012-07-06 16:45:48 -04:00
Alvaro Herrera adb9b7d53b Update libpq test expected output
Commit 2b443063 changed wording for some of the error messages, but
neglected updating the regress output to match.
2012-07-06 16:45:47 -04:00
Bruce Momjian 3c9b406420 Run updated copyright.pl on HEAD and 9.2 trees, updating the psql
\copyright output to 2012.

Backpatch to 9.2.
2012-07-06 12:28:18 -04:00
Bruce Momjian d17c0135cd Have copyright.pl skip updating something that is just the current year,
to avoid producing dups, e.g. 2012-2012

Backpatch to 9.2.
2012-07-06 12:21:43 -04:00
Bruce Momjian 95203e0833 Modify copyright.pl so all lines are processed, not just the first
match, so files that contain embedded copyrights are updated, e.g.
pgsql/help.c.

Backpatch to 9.2.
2012-07-06 11:58:55 -04:00
Bruce Momjian 5198ae8992 Fix copyright.pl to properly skip the .git directory by adding a
basename() qualification.
2012-07-06 11:43:59 -04:00
Bruce Momjian b9eb808bf2 Fix spacing in copyright.pl after being run with missing regex slash
(now added).

Backpatch to 9.2.
2012-07-06 10:57:08 -04:00
Robert Haas f6a05fd973 Fix failure of new wchar->mb functions to advance from pointer.
Bug spotted by Tom Lane.
2012-07-05 23:47:53 -04:00
Tom Lane 8525419947 Don't try to trim "../" in join_path_components().
join_path_components() tried to remove leading ".." components from its
tail argument, but it was not nearly bright enough to do so correctly
unless the head argument was (a) absolute and (b) canonicalized.
Rather than try to fix that logic, let's just get rid of it: there is no
correctness reason to remove "..", and cosmetic concerns can be taken
care of by a subsequent canonicalize_path() call.  Per bug #6715 from
Greg Davidson.

Back-patch to all supported branches.  It appears that pre-9.2, this
function is only used with absolute paths as head arguments, which is why
we'd not noticed the breakage before.  However, third-party code might be
expecting this function to work in more general cases, so it seems wise
to back-patch.

In HEAD and 9.2, also make some minor cosmetic improvements to callers.
2012-07-05 17:16:11 -04:00
Heikki Linnakangas de479e2ed2 Revert part of the previous patch that avoided using PLy_elog().
That caused the plpython_unicode regression test to fail on SQL_ASCII
encoding, as evidenced by the buildfarm. The reason is that with the patch,
you don't get the detail in the error message that you got before. That
detail is actually very informative, so rather than just adjust the expected
output, let's revert that part of the patch for now to make the buildfarm
green again, and figure out some other way to avoid the recursion of
PLy_elog() that doesn't lose the detail.
2012-07-05 23:40:25 +03:00
Heikki Linnakangas b66de4c6d7 Fix mapping of PostgreSQL encodings to Python encodings.
Windows encodings, "win1252" and so forth, are named differently in Python,
like "cp1252". Also, if the PyUnicode_AsEncodedString() function call fails
for some reason, use a plain ereport(), not a PLy_elog(), to report that
error. That avoids recursion and crash, if PLy_elog() tries to call
PLyUnicode_Bytes() again.

This fixes bug reported by Asif Naeem. Backpatch down to 9.0, before that
plpython didn't even try these conversions.

Jan Urbański, with minor comment improvements by me.
2012-07-05 22:31:29 +03:00
Tom Lane fc548b2296 Remove support for using wait3() in place of waitpid().
All Unix-oid platforms that we currently support should have waitpid(),
since it's in V2 of the Single Unix Spec.  Our git history shows that
the wait3 code was added to support NextStep, which we officially dropped
support for as of 9.2.  So get rid of the configure test, and simplify the
macro spaghetti in reaper().  Per suggestion from Fujii Masao.
2012-07-05 14:00:40 -04:00
Magnus Hagander 3644a63984 Fix function argument tab completion for schema-qualified or quoted function names
Dean Rasheed, reviewed by Josh Kupershmidt
2012-07-05 14:06:55 +02:00
Bruce Momjian 539d38757a Fix missing regex slash that caused perltidy to get confused on
copyright.pl.

Backpatch to 9.2.
2012-07-04 21:58:48 -04:00
Bruce Momjian 042d9ffc28 Run newly-configured perltidy script on Perl files.
Run on HEAD and 9.2.
2012-07-04 21:47:49 -04:00
Robert Haas d7c734841b Reduce messages about implicit indexes and sequences to DEBUG1.
Per recent discussion on pgsql-hackers, these messages are too
chatty for most users.
2012-07-04 20:35:29 -04:00
Bruce Momjian 3e00d33261 Have pg_dump in binary-upgrade mode properly drop user-created
extensions that might exist in the new empty cluster databases, like
plpgsql.

Backpatch to 9.2.
2012-07-04 17:37:01 -04:00
Robert Haas 72dd6291f2 Add wchar -> mb conversion routines.
This is infrastructure for Alexander Korotkov's work on indexing regular
expression searches.

Alexander Korotkov, with a bit of further hackery on the MULE conversion
by me
2012-07-04 17:10:10 -04:00
Robert Haas f358428280 Increase the maximum initdb-configured value for shared_buffers to 128MB.
The old value of 32MB has been around for a very long time, and in the
meantime typical system memories have become vastly larger.  Also, now
that we no longer depend on being able to fit the entirety of our
shared memory segment into the system's limit on System V shared
memory, there's a much better chance of the higher limit actually
proving productive.

Per recent discussion on pgsql-hackers.
2012-07-04 15:55:21 -04:00
Magnus Hagander 10e0dd8f91 Remove duplicate, unnecessary, variable declaration 2012-07-04 16:17:30 +02:00
Magnus Hagander dbc6fcf35d Set the write location in the pg_receivexlog status messages
This makes it possible for the master to track how much data has
actually been written my pg_receivexlog - and not just how much
has been sent towards it.
2012-07-04 15:14:49 +02:00
Magnus Hagander 0c4b468692 Always treat a standby returning an an invalid flush location as async
This ensures that a standby such as pg_receivexlog will not be selected
as sync standby - which would cause the master to block waiting for
a location that could never happen.

Fujii Masao
2012-07-04 15:14:42 +02:00
Tom Lane 09022de1f5 Improve documentation about MULE encoding.
This commit improves the comments in pg_wchar.h and creates #define symbols
for some formerly hard-coded values.  No substantive code changes.

Tatsuo Ishii and Tom Lane
2012-07-04 00:29:57 -04:00
Alvaro Herrera 47a2adc83c Forgot an #include in the previous patch :-( 2012-07-03 16:40:15 -04:00
Alvaro Herrera 0c7b9dc7d0 Have REASSIGN OWNED work on extensions, too
Per bug #6593, REASSIGN OWNED fails when the affected role has created
an extension.  Even though the user related to the extension is not
nominally the owner, its OID appears on pg_shdepend and thus causes
problems when the user is to be dropped.

This commit adds code to change the "ownership" of the extension itself,
not of the contained objects.  This is fine because it's currently only
called from REASSIGN OWNED, which would also modify the ownership of the
contained objects.  However, this is not sufficient for a working ALTER
OWNER implementation extension.

Back-patch to 9.1, where extensions were introduced.

Bug #6593 reported by Emiliano Leporati.
2012-07-03 15:09:59 -04:00
Bruce Momjian b33385b89d Have copyright tool mention that certain files should be updated in back branches. 2012-07-03 12:02:17 -04:00
Robert Haas 6a77bff086 Remove misleading hints about reducing the System V request size.
Since the request size will now be ~48 bytes regardless of how
shared_buffers et. al. are set, much of this advice is no longer
relevant.
2012-07-03 10:07:47 -04:00
Robert Haas 3cf39e6ddb Fix a stupid bug I introduced into XLogFlush().
Commit f11e8be3e8 broke this; it was right
in Peter's original patch, but I messed it up before committing.
2012-07-02 15:33:59 -04:00
Robert Haas 3bb592bb20 Fix position of WalSndWakeupRequest call.
This avoids discriminating against wal_sync_method = open_sync or
open_datasync.

Fujii Masao, reviewed by Andres Freund
2012-07-02 14:44:10 -04:00
Peter Eisentraut 2b44306315 Assorted message style improvements 2012-07-02 21:12:46 +03:00
Tom Lane 41f4a0ab78 Fix to_date's handling of year 519.
A thinko in commit 029dfdf115 caused the year
519 to be handled differently from either adjacent year, which was not the
intention AFAICS.  Report and diagnosis by Marc Cousin.

In passing, remove redundant re-tests of year value.
2012-07-02 11:35:35 -04:00
Robert Haas 82cdd2df75 Work a little harder on comments for walsender wakeup patch.
Per gripe from Tom Lane.
2012-07-02 11:28:53 -04:00
Robert Haas f11e8be3e8 Make commit_delay much smarter.
Instead of letting every backend participating in a group commit wait
independently, have the first one that becomes ready to flush WAL wait
for the configured delay, and let all the others wait just long enough
for that first process to complete its flush.  This greatly increases
the chances of being able to configure a commit_delay setting that
actually improves performance.

As a side consequence of this change, commit_delay now affects all WAL
flushes, rather than just commits.  There was some discussion on
pgsql-hackers about whether to rename the GUC to, say, wal_flush_delay,
but in the absence of consensus I am leaving it alone for now.

Peter Geoghegan, with some changes, mostly to the documentation, by me.
2012-07-02 10:26:31 -04:00
Robert Haas f83b59997d Make walsender more responsive.
Per testing by Andres Freund, this improves replication performance
and reduces replication latency and latency jitter.  I was a bit
concerned about moving more work into XLogInsert, but testing seems
to show that it's not a problem in practice.

Along the way, improve comments for WaitLatchOrSocket.

Andres Freund.  Review and stylistic cleanup by me.
2012-07-02 09:41:01 -04:00
Tom Lane 9ad45c18b6 Fix race condition in enum value comparisons.
When (re) loading the typcache comparison cache for an enum type's values,
use an up-to-date MVCC snapshot, not the transaction's existing snapshot.
This avoids problems if we encounter an enum OID that was created since our
transaction started.  Per report from Andres Freund and diagnosis by Robert
Haas.

To ensure this is safe even if enum comparison manages to get invoked
before we've set a transaction snapshot, tweak GetLatestSnapshot to
redirect to GetTransactionSnapshot instead of throwing error when
FirstSnapshotSet is false.  The existing uses of GetLatestSnapshot (in
ri_triggers.c) don't care since they couldn't be invoked except in a
transaction that's already done some work --- but it seems just conceivable
that this might not be true of enums, especially if we ever choose to use
enums in system catalogs.

Note that the comparable coding in enum_endpoint and enum_range_internal
remains GetTransactionSnapshot; this is perhaps debatable, but if we
changed it those functions would have to be marked volatile, which doesn't
seem attractive.

Back-patch to 9.1 where ALTER TYPE ADD VALUE was added.
2012-07-01 17:12:49 -04:00
Tom Lane 39bfc94c86 Suppress compiler warnings in readfuncs.c.
Commit 7357558fc8 introduced "(void) token;"
into the READ_TEMP_LOCALS() macro, to suppress complaints from gcc 4.6
when the value of token was not used anywhere in a particular node-read
function.  However, this just moved the warning around: inspection of
buildfarm results shows that some compilers are now complaining that token
is being read before it's set.  Revert the READ_TEMP_LOCALS() macro change
and instead put "(void) token;" into READ_NODE_FIELD(), which is the
principal culprit for cases where the warning might occur.  In principle we
might need the same in READ_BITMAPSET_FIELD() and/or READ_LOCATION_FIELD(),
but it seems unlikely that a node would consist only of such fields, so
I'll leave them alone for now.
2012-06-30 22:27:49 -04:00
Tom Lane fa188b5ef5 Remove inappropriate semicolons after function definitions.
Solaris Studio warns about this, and some compilers might think it's an
outright syntax error.
2012-06-30 17:29:39 -04:00
Tom Lane 81e8264383 Declare AnonymousShmem pointer as "void *".
The original coding had it as "PGShmemHeader *", but that doesn't offer any
notational benefit because we don't dereference it.  And it was resulting
in compiler warnings on some platforms, notably buildfarm member
castoroides, where mmap() and munmap() are evidently declared to take and
return "char *".
2012-06-30 17:19:46 -04:00
Tom Lane 541ffa65c3 Prevent CREATE TABLE LIKE/INHERITS from (mis) copying whole-row Vars.
If a CHECK constraint or index definition contained a whole-row Var (that
is, "table.*"), an attempt to copy that definition via CREATE TABLE LIKE or
table inheritance produced incorrect results: the copied Var still claimed
to have the rowtype of the source table, rather than the created table.

For the LIKE case, it seems reasonable to just throw error for this
situation, since the point of LIKE is that the new table is not permanently
coupled to the old, so there's no reason to assume its rowtype will stay
compatible.  In the inheritance case, we should ideally allow such
constraints, but doing so will require nontrivial refactoring of CREATE
TABLE processing (because we'd need to know the OID of the new table's
rowtype before we adjust inherited CHECK constraints).  In view of the lack
of previous complaints, that doesn't seem worth the risk in a back-patched
bug fix, so just make it throw error for the inheritance case as well.

Along the way, replace change_varattnos_of_a_node() with a more robust
function map_variable_attnos(), which is capable of being extended to
handle insertion of ConvertRowtypeExpr whenever we get around to fixing
the inheritance case nicely, and in the meantime it returns a failure
indication to the caller so that a helpful message with some context can be
thrown.  Also, this code will do the right thing with subselects (if we
ever allow them in CHECK or indexes), and it range-checks varattnos before
using them to index into the map array.

Per report from Sergey Konoplev.  Back-patch to all supported branches.
2012-06-30 16:45:14 -04:00
Peter Eisentraut e4ffa86b57 initdb: Update check_need_password for new options
Change things so that something like initdb --auth-local=peer
--auth-host=md5 does not cause a "must specify a password" error,
like initdb -A md5 does.
2012-06-30 23:42:32 +03:00
Heikki Linnakangas 567787f216 Validate xlog record header before enlarging the work area to store it.
If the record header is garbled, we're now quite likely to notice it before
we try to make a bogus memory allocation and run out of memory. That can
still happen, if the xlog record is split across pages (we cannot verify
the record header until reading the next page in that scenario), but this
reduces the chances. An out-of-memory is treated as a corrupt record
anyway, so this isn't a correctness issue, just a case of giving a better
error message.

Per Amit Kapila's suggestion.
2012-06-30 23:14:35 +03:00
Tom Lane 42e2ce6ae3 Fix confusion between "size" and "AnonymousShmemSize".
Noted by Andres Freund.  Also improve a couple of comments.
2012-06-29 15:12:10 -04:00
Heikki Linnakangas 7a5c9ca93a Initialize shared memory copy of ckptXidEpoch correctly when not in recovery.
This bug was introduced by commit 20d98ab6e4,
so backpatch this to 9.0-9.2 like that one.

This fixes bug #6710, reported by Tarvi Pillessaar
2012-06-29 19:32:15 +03:00
Peter Eisentraut b344c651fb Make init-po and update-po recursive make targets
This is for convenience, now that adding recursive targets is much
easier than it used to be when the NLS stuff was initially added.
2012-06-29 14:01:54 +03:00
Tom Lane ae90128dc5 Fix NOTIFY to cope with I/O problems, such as out-of-disk-space.
The LISTEN/NOTIFY subsystem got confused if SimpleLruZeroPage failed,
which would typically happen as a result of a write() failure while
attempting to dump a dirty pg_notify page out of memory.  Subsequently,
all attempts to send more NOTIFY messages would fail with messages like
"Could not read from file "pg_notify/nnnn" at offset nnnnn: Success".
Only restarting the server would clear this condition.  Per reports from
Kevin Grittner and Christoph Berg.

Back-patch to 9.0, where the problem was introduced during the
LISTEN/NOTIFY rewrite.
2012-06-29 00:51:34 -04:00
Tom Lane c1494b7330 Provide MAP_FAILED if sys/mman.h doesn't.
On old HPUX this has to be #defined to -1.  It might be that other values
are required on other dinosaur systems, but we'll worry about that when
and if we get reports.
2012-06-28 14:19:20 -04:00
Heikki Linnakangas 8f85667a86 Update outdated commit; xlp_rem_len field is in page header now.
Spotted by Amit Kapila
2012-06-28 20:35:18 +03:00
Peter Eisentraut dcd5af6c34 Further fix install program detection
The $(or) make function was introduced in GNU make 3.81, so the
previous coding didn't work in 3.80.  Write it differently, and
improve the variable naming to make more sense in the new coding.
2012-06-28 20:07:02 +03:00
Robert Haas 39715af23a Fix broken mmap failure-detection code, and improve error message.
Per an observation by Thom Brown that my previous commit made an
overly large shmem allocation crash the server, on Linux.
2012-06-28 12:57:22 -04:00
Robert Haas b0fc0df936 Dramatically reduce System V shared memory consumption.
Except when compiling with EXEC_BACKEND, we'll now allocate only a tiny
amount of System V shared memory (as an interlock to protect the data
directory) and allocate the rest as anonymous shared memory via mmap.
This will hopefully spare most users the hassle of adjusting operating
system parameters before being able to start PostgreSQL with a
reasonable value for shared_buffers.

There are a bunch of documentation updates needed here, and we might
need to adjust some of the HINT messages related to shared memory as
well.  But it's not 100% clear how portable this is, so before we
write the documentation, let's give it a spin on the buildfarm and
see what turns red.
2012-06-28 11:05:16 -04:00
Robert Haas c5b3451a8e Add missing space in event_source GUC description.
This has apparently been wrong since event_source was added.

Alexander Lakhin
2012-06-28 08:15:50 -04:00
Tom Lane bde689f809 Make UtilityContainsQuery recurse until it finds a non-utility Query.
The callers of UtilityContainsQuery want it to return a non-utility Query
if it returns anything at all.  However, since we made CREATE TABLE
AS/SELECT INTO into a utility command instead of a variant of SELECT,
a command like "EXPLAIN SELECT INTO" results in two nested utility
statements.  So what we need UtilityContainsQuery to do is drill down
to the bottom non-utility Query.

I had thought of this possibility in setrefs.c, and fixed it there by
looping around the UtilityContainsQuery call; but overlooked that the call
sites in plancache.c have a similar issue.  In those cases it's
notationally inconvenient to provide an external loop, so let's redefine
UtilityContainsQuery as recursing down to a non-utility Query instead.

Noted by Rushabh Lathia.  This is a somewhat cleaned-up version of his
proposed patch.
2012-06-27 23:18:30 -04:00
Peter Eisentraut f786715412 Fix install program detection
configure handles INSTALL as a substitution variable specially, and
apparently it gets confused when it's set to empty.  Use INSTALL_
instead as a workaround to avoid the issue.
2012-06-27 21:22:41 +03:00
Heikki Linnakangas a8f97b39c7 Fix two more neglected comments, still referring to log/seg.
Fujii Masao
2012-06-27 19:11:26 +03:00
Heikki Linnakangas ec786c6c81 I neglected many comments in the log+seg -> 64-bit segno patch. Fix.
Reported by Amit Kapila.
2012-06-27 17:53:53 +03:00
Peter Eisentraut 9db7ccae20 Use system install program when available and usable
In a3176dac22 we switched to using
install-sh unconditionally, because the configure check
AC_PROG_INSTALL would pick up any random program named install, which
has caused failure reports
(http://archives.postgresql.org/pgsql-hackers/2001-03/msg00312.php).
Now the configure check is much improved and should avoid false
positives.  It has also been shown that using a system install program
can significantly reduce "make install" times, so it's worth trying.
2012-06-27 13:40:51 +03:00
Robert Haas c60ca19de9 Allow pg_terminate_backend() to be used on backends with matching role.
A similar change was made previously for pg_cancel_backend, so now it
all matches again.

Dan Farina, reviewed by Fujii Masao, Noah Misch, and Jeff Davis,
with slight kibitzing on the doc changes by me.
2012-06-26 16:16:52 -04:00
Robert Haas b79ab00144 When LWLOCK_STATS is defined, count spindelays.
When LWLOCK_STATS is *not* defined, the only change is that
SpinLockAcquire now returns the number of delays.

Patch by me, review by Jeff Janes.
2012-06-26 16:06:07 -04:00
Tom Lane 757773602c Cope with smaller-than-normal BLCKSZ setting in SPGiST indexes on text.
The original coding failed miserably for BLCKSZ of 4K or less, as reported
by Josh Kupershmidt.  With the present design for text indexes, a given
inner tuple could have up to 256 labels (requiring either 3K or 4K bytes
depending on MAXALIGN), which means that we can't positively guarantee no
failures for smaller blocksizes.  But we can at least make it behave sanely
so long as there are few enough labels to fit on a page.  Considering that
btree is also more prone to "index tuple too large" failures when BLCKSZ is
small, it's not clear that we should expend more work than this on this
case.
2012-06-26 14:36:25 -04:00
Robert Haas 0caa0d04db Make DROP FUNCTION hint more informative.
If you decide you want to take the hint, this gives you something you
can paste right back to the server.

Dean Rasheed
2012-06-26 13:33:23 -04:00
Robert Haas 76837c1507 Reduce use of heavyweight locking inside hash AM.
Avoid using LockPage(rel, 0, lockmode) to protect against changes to
the bucket mapping.  Instead, an exclusive buffer content lock is now
viewed as sufficient permission to modify the metapage, and a shared
buffer content lock is used when such modifications need to be
prevented.  This more relaxed locking regimen makes it possible that,
when we're busy getting a heavyweight bucket on the bucket we intend
to search or insert into, a bucket split might occur underneath us.
To compenate for that possibility, we use a loop-and-retry system:
release the metapage content lock, acquire the heavyweight lock on the
target bucket, and then reacquire the metapage content lock and check
that the bucket mapping has not changed.   Normally it hasn't, and
we're done.  But if by chance it has, we simply unlock the metapage,
release the heavyweight lock we acquired previously, lock the new
bucket, and loop around again.  Even in the worst case we cannot loop
very many times here, since we don't split the same bucket again until
we've split all the other buckets, and 2^N gets big pretty fast.

This results in greatly improved concurrency, because we're
effectively replacing two lwlock acquire-and-release cycles in
exclusive mode (on one of the lock manager locks) with a single
acquire-and-release cycle in shared mode (on the metapage buffer
content lock).  Testing shows that it's still not quite as good as
btree; for that, we'd probably have to find some way of getting rid
of the heavyweight bucket locks as well, which does not appear
straightforward.

Patch by me, review by Jeff Janes.
2012-06-26 06:56:10 -04:00
Heikki Linnakangas 038f3a0509 Fix pg_upgrade, broken by the xlogid/segno -> 64-bit int refactoring.
The xlogid + segno representation of a particular WAL segment doesn't make
much sense in pg_resetxlog anymore, now that we don't use that anywhere
else. Use the WAL filename instead, since that's a convenient way to name a
particular WAL segment.

I did this partially for pg_resetxlog in the original xlogid/segno -> uint64
patch, but I neglected pg_upgrade and the docs. This should now be more
complete.
2012-06-26 07:49:02 +03:00
Tom Lane 8a504a3639 Make pg_dump emit more accurate dependency information.
While pg_dump has included dependency information in archive-format output
ever since 7.3, it never made any large effort to ensure that that
information was actually useful.  In particular, in common situations where
dependency chains include objects that aren't separately emitted in the
dump, the dependencies shown for objects that were emitted would reference
the dump IDs of these un-dumped objects, leaving no clue about which other
objects the visible objects indirectly depend on.  So far, parallel
pg_restore has managed to avoid tripping over this misfeature, but only
by dint of some crude hacks like not trusting dependency information in
the pre-data section of the archive.

It seems prudent to do something about this before it rises up to bite us,
so instead of emitting the "raw" dependencies of each dumped object,
recursively search for its actual dependencies among the subset of objects
that are being dumped.

Back-patch to 9.2, since that code hasn't yet diverged materially from
HEAD.  At some point we might need to back-patch further, but right now
there are no known cases where this is actively necessary.  (The one known
case, bug #6699, is fixed in a different way by my previous patch.)  Since
this patch depends on 9.2 changes that made TOC entries be marked before
output commences as to whether they'll be dumped, back-patching further
would require additional surgery; and as of now there's no evidence that
it's worth the risk.
2012-06-25 21:21:18 -04:00
Tom Lane a1ef01fe16 Improve pg_dump's dependency-sorting logic to enforce section dump order.
As of 9.2, with the --section option, it is very important that the concept
of "pre data", "data", and "post data" sections of the output be honored
strictly; else a dump divided into separate sectional files might be
unrestorable.  However, the dependency-sorting logic knew nothing of
sections and would happily select output orderings that didn't fit that
structure.  Doing so was mostly harmless before 9.2, but now we need to be
sure it doesn't do that.  To fix, create dummy objects representing the
section boundaries and add dependencies between them and all the normal
objects.  (This might sound expensive but it seems to only add a percent or
two to pg_dump's runtime.)

This also fixes a problem introduced in 9.1 by the feature that allows
incomplete GROUP BY lists when a primary key is given in GROUP BY.
That means that views can depend on primary key constraints.  Previously,
pg_dump would deal with that by simply emitting the primary key constraint
before the view definition (and hence before the data section of the
output).  That's bad enough for simple serial restores, where creating an
index before the data is loaded works, but is undesirable for speed
reasons.  But it could lead to outright failure of parallel restores, as
seen in bug #6699 from Joe Van Dyk.  That happened because pg_restore would
switch into parallel mode as soon as it reached the constraint, and then
very possibly would try to emit the view definition before the primary key
was committed (as a consequence of another bug that causes the view not to
be correctly marked as depending on the constraint).  Adding the section
boundary constraints forces the dependency-sorting code to break the view
into separate table and rule declarations, allowing the rule, and hence the
primary key constraint it depends on, to revert to their intended location
in the post-data section.  This also somewhat accidentally works around the
bogus-dependency-marking problem, because the rule will be correctly shown
as depending on the constraint, so parallel pg_restore will now do the
right thing.  (We will fix the bogus-dependency problem for real in a
separate patch, but that patch is not easily back-portable to 9.1, so the
fact that this patch is enough to dodge the only known symptom is
fortunate.)

Back-patch to 9.1, except for the hunk that adds verification that the
finished archive TOC list is in correct section order; the place where
it was convenient to add that doesn't exist in 9.1.
2012-06-25 21:21:17 -04:00
Alvaro Herrera 77ed0c6950 Tighten up includes in sinvaladt.h, twophase.h, proc.h
Remove proc.h from sinvaladt.h and twophase.h; also replace xlog.h in
proc.h with xlogdefs.h.
2012-06-25 18:40:40 -04:00
Peter Eisentraut eeece9e609 Unify calling conventions for postgres/postmaster sub-main functions
There was a wild mix of calling conventions: Some were declared to
return void and didn't return, some returned an int exit code, some
claimed to return an exit code, which the callers checked, but
actually never returned, and so on.

Now all of these functions are declared to return void and decorated
with attribute noreturn and don't return.  That's easiest, and most
code already worked that way.
2012-06-25 21:30:12 +03:00
Robert Haas c7d47abd04 Fix typo in DEBUG message, introduced by recent WAL refactoring.
Fujii Masao
2012-06-25 14:00:35 -04:00
Robert Haas a6427f1f47 Unbreak pg_resetxlog -l.
Fujii Masao
2012-06-25 13:58:38 -04:00
Robert Haas 2dfa87bcb6 Remove sanity test in XRecOffIsValid.
Commit 061e7efb1b changed the rules
for splitting xlog records across pages, but neglected to update this
test.  It's possible that there's some better action here than just
removing the test completely, but this at least appears to get some
of the things that are currently broken (like initdb on MacOS X)
working again.
2012-06-25 12:14:43 -04:00
Kevin Grittner 5c7f954d31 Fix warning for 64-bit literal on 32-bit build. 2012-06-25 07:25:00 -05:00
Peter Eisentraut b8b2e3b2de Replace int2/int4 in C code with int16/int32
The latter was already the dominant use, and it's preferable because
in C the convention is that intXX means XX bits.  Therefore, allowing
mixed use of int2, int4, int8, int16, int32 is obviously confusing.

Remove the typedefs for int2 and int4 for now.  They don't seem to be
widely used outside of the PostgreSQL source tree, and the few uses
can probably be cleaned up by the time this ships.
2012-06-25 01:51:46 +03:00
Heikki Linnakangas 7eb8c78514 I missed some references to xlogid/xrecoff in Win32-only code. Fix. 2012-06-24 22:14:31 +03:00
Heikki Linnakangas 0687a26002 Use UINT64CONST for 64-bit integer constants.
Peter Eisentraut advised me that UINT64CONST is the proper way to do that,
not LL suffix.
2012-06-24 21:56:45 +03:00
Heikki Linnakangas a218e23a08 Oops. Remove stray paren.
I didn't notice this on my laptop as I don't HAVE_FSYNC_WRITETHROUGH.
2012-06-24 20:03:57 +03:00
Heikki Linnakangas 96ff85e2dd Use LL suffix for 64-bit constants.
Per warning from buildfarm member 'locust'. At least I think this what's
making it upset.
2012-06-24 20:01:55 +03:00
Heikki Linnakangas 0ab9d1c4b3 Replace XLogRecPtr struct with a 64-bit integer.
This simplifies code that needs to do arithmetic on XLogRecPtrs.

To avoid changing on-disk format of data pages, the LSN on data pages is
still stored in the old format. That should keep pg_upgrade happy. However,
we have XLogRecPtrs embedded in the control file, and in the structs that
are sent over the replication protocol, so this changes breaks compatibility
of pg_basebackup and server. I didn't do anything about this in this patch,
per discussion on -hackers, the right thing to do would to be to change the
replication protocol to be architecture-independent, so that you could use
a newer version of pg_receivexlog, for example, against an older server
version.
2012-06-24 19:19:45 +03:00
Heikki Linnakangas 061e7efb1b Allow WAL record header to be split across pages.
This saves a few bytes of WAL space, but the real motivation is to make it
predictable how much WAL space a record requires, as it no longer depends
on whether we need to waste the last few bytes at end of WAL page because
the header doesn't fit.

The total length field of WAL record, xl_tot_len, is moved to the beginning
of the WAL record header, so that it is still always found on the first page
where a WAL record begins.

Bump WAL version number again as this is an incompatible change.
2012-06-24 18:35:56 +03:00
Heikki Linnakangas 20ba5ca64c Move WAL continuation record information to WAL page header.
The continuation record only contained one field, xl_rem_len, so it makes
things simpler to just include it in the WAL page header. This wastes four
bytes on pages that don't begin with a continuation from previos page, plus
four bytes on every page, because of padding.

The motivation of this is to make it easier to calculate how much space a
WAL record needs. Before this patch, it depended on how many page boundaries
the record crosses. The motivation of that, in turn, is to separate the
allocation of space in the WAL from the copying of the record data to the
allocated space. Keeping the calculation of space required simple helps to
keep the critical section of allocating the space from WAL short. But that's
not included in this patch yet.

Bump WAL version number again, as this is an incompatible change.
2012-06-24 18:35:30 +03:00
Heikki Linnakangas dfda6ebaec Don't waste the last segment of each 4GB logical log file.
The comments claimed that wasting the last segment made it easier to do
calculations with XLogRecPtrs, because you don't have problems representing
last-byte-position-plus-1 that way. In my experience, however, it only made
things more complicated, because the there was two ways to represent the
boundary at the beginning of a logical log file: logid = n+1 and xrecoff = 0,
or as xlogid = n and xrecoff = 4GB - XLOG_SEG_SIZE. Some functions were
picky about which representation was used.

Also, use a 64-bit segment number instead of the log/seg combination, to
point to a certain WAL segment. We assume that all platforms have a working
64-bit integer type nowadays.

This is an incompatible change in WAL format, so bumping WAL version number.
2012-06-24 18:35:29 +03:00
Tom Lane d14241c2cf Fix memory leak in ARRAY(SELECT ...) subqueries.
Repeated execution of an uncorrelated ARRAY_SUBLINK sub-select (which
I think can only happen if the sub-select is embedded in a larger,
correlated subquery) would leak memory for the duration of the query,
due to not reclaiming the array generated in the previous execution.
Per bug #6698 from Armando Miraglia.  Diagnosis and fix idea by Heikki,
patch itself by me.

This has been like this all along, so back-patch to all supported versions.
2012-06-21 17:27:19 -04:00
Alvaro Herrera 68d0e3cbf9 Repair comment mangled by a pgindent run long ago 2012-06-21 15:37:05 -04:00
Heikki Linnakangas eeb6f37d89 Add a small cache of locks owned by a resource owner in ResourceOwner.
This speeds up reassigning locks to the parent owner, when the transaction
holds a lot of locks, but only a few of them belong to the current resource
owner. This is particularly helps pg_dump when dumping a large number of
objects.

The cache can hold up to 15 locks in each resource owner. After that, the
cache is marked as overflowed, and we fall back to the old method of
scanning the whole local lock table. The tradeoff here is that the cache has
to be scanned whenever a lock is released, so if the cache is too large,
lock release becomes more expensive. 15 seems enough to cover pg_dump, and
doesn't have much impact on lock release.

Jeff Janes, reviewed by Amit Kapila and Heikki Linnakangas.
2012-06-21 15:30:26 +03:00
Tom Lane dfd9c116cc Remove incomplete/incorrect support for zero-column foreign keys.
The original coding in ri_triggers.c had partial support for the concept of
zero-column foreign key constraints.  But this is not defined in the SQL
standard, nor was it ever allowed by any other part of Postgres, nor was it
very fully implemented even here (eg there was no support for preventing
PK-table deletions that would violate the constraint).  Doesn't seem very
useful to carry 100-plus lines of code for a corner case that no one is
interested in making work.  Instead, just add a check that the column list
read from pg_constraint is non-empty.
2012-06-20 20:15:02 -04:00
Tom Lane 0ce4459a36 Increase MAX_SYSCACHE_CALLBACKS from 20 to 32.
By my count there are 18 callers of CacheRegisterSyscacheCallback in the
core code in HEAD, so we are potentially leaving as few as 2 slots for any
add-on code to use (though possibly not all these callers would actually
activate in any particular session).  That doesn't seem like a lot of
headroom, so let's pump it up a little.
2012-06-20 19:47:37 -04:00
Tom Lane 45ba424f33 Cache the results of ri_FetchConstraintInfo in a backend-local cache.
Extracting data from pg_constraint turned out to take as much as 10% of the
runtime in a bulk-update case where the foreign key column wasn't changing,
because we did it over again for each tuple.  Fix that by maintaining a
backend-local cache of the results.  This is really a pretty small patch,
but converting the trigger functions to work with pointers rather than
local struct variables requires a lot of mechanical changes.
2012-06-20 17:24:14 -04:00
Tom Lane cfa0f4255b Improve tests for whether we can skip queueing RI enforcement triggers.
During an update of a PK row, we can skip firing the RI trigger if any old
key value is NULL, because then the row could not have had any matching
rows in the FK table.  Conversely, during an update of an FK row, the
outcome is determined if any new key value is NULL.  In either case it
becomes unnecessary to compare individual key values.

This patch was inspired by discussion of Vik Reykja's patch to use IS NOT
DISTINCT semantics for the key comparisons.  In the event there is no need
for that and so this patch looks nothing like his, but he should still get
credit for having re-opened consideration of the trigger skip logic.
2012-06-19 20:07:33 -04:00
Alvaro Herrera 11b335ac4c pg_dump: Fix verbosity level in LO progress messages
In passing, reword another instance of the same message that was
gratuitously different.

Author: Josh Kupershmidt
after a bug report by Bosco Rama
2012-06-19 17:20:23 -04:00
Tom Lane fe3db74002 Share RI trigger code between NO ACTION and RESTRICT cases.
These triggers are identical except for whether ri_Check_Pk_Match is to be
called, so factor out the common code to save a couple hundred lines.

Also, eliminate null-column checks in ri_Check_Pk_Match, since they're
duplicate with the calling functions and require unnecessary complication
in its API statement.

Simplify the way code is shared between RI_FKey_check_ins and
RI_FKey_check_upd, too.
2012-06-19 14:31:54 -04:00
Tom Lane 48756be9cf Improve comments about why SET DEFAULT triggers must recheck for matches.
I was confused about this, so try to make it clearer for the next person.

(This seems like a fairly inefficient way of dealing with a corner case,
but I don't have a better idea offhand.  Maybe if there were a way to turn
off the RI_FKey_keyequal_upd_fk event filter temporarily?)
2012-06-18 22:45:07 -04:00
Tom Lane e8c9fd5fdf Allow ON UPDATE/DELETE SET DEFAULT plans to be cached.
Once upon a time, somebody was worried that cached RI plans wouldn't get
remade with new default values after ALTER TABLE ... SET DEFAULT, so they
didn't allow caching of plans for ON UPDATE/DELETE SET DEFAULT actions.
That time is long gone, though (and even at the time I doubt this was the
greatest hazard posed by ALTER TABLE...).  So allow these triggers to cache
their plans just like the others.

The cache_plan argument to ri_PlanCheck is now vestigial, since there
are no callers that don't pass "true"; but I left it alone in case there
is any future need for it.
2012-06-18 19:37:23 -04:00
Tom Lane 03a5ba24b0 Remove derived fields from RI_QueryKey, and do a bit of other cleanup.
We really only need the foreign key constraint's OID and the query type
code to uniquely identify each plan we are caching for FK checks.  The
other stuff that was in the struct had no business being used as part of
a hash key, and was all just being copied from struct RI_ConstraintInfo
anyway.  Get rid of the unnecessary fields, and readjust various function
APIs to make them use RI_ConstraintInfo not RI_QueryKey as info source.

I'd be surprised if this makes any measurable performance difference,
but it certainly feels cleaner.
2012-06-18 18:50:29 -04:00
Peter Eisentraut e1e97e9313 pg_dump: Add missing newlines at end of messages 2012-06-18 23:57:00 +03:00
Tom Lane f9429746c9 Update SQL spec references in ri_triggers code to match SQL:2008.
Now that what we're implementing isn't SQL92, we probably shouldn't cite
chapter and verse in that spec anymore.  Also fix some comments that
talked about MATCH FULL but in fact were in code that's also used for
MATCH SIMPLE.

No code changes in this commit, just comments.
2012-06-18 12:19:38 -04:00
Tom Lane c75be2ad60 Change ON UPDATE SET NULL/SET DEFAULT referential actions to meet SQL spec.
Previously, when executing an ON UPDATE SET NULL or SET DEFAULT action for
a multicolumn MATCH SIMPLE foreign key constraint, we would set only those
referencing columns corresponding to referenced columns that were changed.
This is what the SQL92 standard said to do --- but more recent versions
of the standard say that all referencing columns should be set to null or
their default values, no matter exactly which referenced columns changed.
At least for SET DEFAULT, that is clearly saner behavior.  It's somewhat
debatable whether it's an improvement for SET NULL, but it appears that
other RDBMS systems read the spec this way.  So let's do it like that.

This is a release-notable behavioral change, although considering that
our documentation already implied it was done this way, the lack of
complaints suggests few people use such cases.
2012-06-18 12:12:52 -04:00
Tom Lane f5297bdfe4 Refer to the default foreign key match style as MATCH SIMPLE internally.
Previously we followed the SQL92 wording, "MATCH <unspecified>", but since
SQL99 there's been a less awkward way to refer to the default style.

In addition to the code changes, pg_constraint.confmatchtype now stores
this match style as 's' (SIMPLE) rather than 'u' (UNSPECIFIED).  This
doesn't affect pg_dump or psql because they use pg_get_constraintdef()
to reconstruct foreign key definitions.  But other client-side code might
examine that column directly, so this change will have to be marked as
an incompatibility in the 9.3 release notes.
2012-06-17 20:16:44 -04:00
Peter Eisentraut bb7520cc26 Make documentation of --help and --version options more consistent
Before, some places didn't document the short options (-? and -V),
some documented both, some documented nothing, and they were listed in
various orders.  Now this is hopefully more consistent and complete.
2012-06-18 02:46:59 +03:00
Tom Lane 9e18eacbdf Fix stats collector to recover nicely when system clock goes backwards.
Formerly, if the system clock went backwards, the stats collector would
fail to update the stats file any more until the clock reading again
exceeds whatever timestamp was last written into the stats file.  Such
glitches in the clock's behavior are not terribly unlikely on machines
not using NTP.  Such a scenario has been observed to cause regression test
failures in the buildfarm, and it could have bad effects on the behavior
of autovacuum, so it seems prudent to install some defenses.

We could directly detect the clock going backwards by adding
GetCurrentTimestamp calls in the stats collector's main loop, but that
would hurt performance on platforms where GetCurrentTimestamp is expensive.
To minimize the performance hit in normal cases, adopt a more complicated
scheme wherein backends check for clock skew when reading the stats file,
and if they see it, signal the stats collector by sending an extra stats
inquiry message.  The stats collector does an extra GetCurrentTimestamp
only when it receives an inquiry with an apparently out-of-order
timestamp.

To avoid unnecessary GetCurrentTimestamp calls, expand the inquiry messages
to carry the backend's current clock reading as well as its stats cutoff
time.  The latter, being intentionally slightly in-the-past, would trigger
more clock rechecks than we need if it were used for this purpose.

We might want to backpatch this change at some point, but let's let it
shake out in the buildfarm for awhile first.
2012-06-17 17:11:49 -04:00
Bruce Momjian 47463a8098 Remove 'for' loop perltidy argument, and move args to perltidyrc file.
Backpatch to 9.2.

Per suggestion from Noah Misch
2012-06-16 10:12:50 -04:00
Bruce Momjian 0acd978259 In pgindent, suppress reading the perltidy RC file using --noprofile. 2012-06-15 22:50:02 -04:00
Bruce Momjian d6e0207437 Update pgindent Perl indentation instructions based on feedback from
Àlvaro and Noah Misch.

Backpatch to 9.2.
2012-06-15 22:43:23 -04:00
Peter Eisentraut 15b1918e7d Improve reporting of permission errors for array types
Because permissions are assigned to element types, not array types,
complaining about permission denied on an array type would be
misleading to users.  So adjust the reporting to refer to the element
type instead.

In order not to duplicate the required logic in two dozen places,
refactor the permission denied reporting for types a bit.

pointed out by Yeb Havinga during the review of the type privilege
feature
2012-06-15 22:55:03 +03:00
Peter Eisentraut d933092e0a Add more message pluralization
Even though we can't do much about the case with multiple plurals in
one sentence, we can fix the other cases.
2012-06-15 02:02:02 +03:00
Robert Haas 8507c2f856 Improve readability and error messages in pg_backup_start_time.
Gurjeet Singh, with corrections by me.
2012-06-14 15:20:08 -04:00
Robert Haas 68de499bda New SQL functons pg_backup_in_progress() and pg_backup_start_time()
Darold Gilles, reviewed by Gabriele Bartolini and others, rebased by
Marco Nenciarini.  Stylistic cleanup and OID fixes by me.
2012-06-14 13:25:43 -04:00
Robert Haas cd80073445 During transaction cleanup, release locks before deleting files.
There's no need to hold onto the locks until the files are needed,
and by doing it this way, we reduce the impact on other backends who
may be awaiting locks we hold.

Noah Misch
2012-06-14 10:19:33 -04:00
Robert Haas 6cd015bea3 Add new function log_newpage_buffer.
When I implemented the ginbuildempty() function as part of
implementing unlogged tables, I falsified the note in the header
comment for log_newpage.  Although we could fix that up by changing
the comment, it seems cleaner to add a new function which is
specifically intended to handle this case.  So do that.
2012-06-14 10:11:16 -04:00
Robert Haas a475c60367 Remove misplaced sanity check from heap_create().
Even when allow_system_table_mods is not set, we allow creation of any
type of SQL object in pg_catalog, except for relations.  And you can
get relations into pg_catalog, too, by initially creating them in some
other schema and then moving them with ALTER .. SET SCHEMA.  So this
restriction, which prevents relations (only) from being created in
pg_catalog directly, is fairly pointless.  If we need a safety mechanism
for this, it should be placed further upstream, so that it affects all
SQL objects uniformly, and picks up both CREATE and SET SCHEMA.

For now, just rip it out, per discussion with Tom Lane.
2012-06-14 09:58:53 -04:00
Robert Haas d2c86a1ccd Remove RELKIND_UNCATALOGED.
This may have been important at some point in the past, but it no
longer does anything useful.

Review by Tom Lane.
2012-06-14 09:47:30 -04:00
Robert Haas 7582e0be78 Make \conninfo print SSL information.
Alastair Turner, per suggestion from Bruce Momjian.
2012-06-14 09:43:14 -04:00
Tom Lane 80491a1983 Add 9.2 branch to git_changelog's list. 2012-06-13 22:23:31 -04:00
Tom Lane f32609db72 Flesh out RELEASE_CHANGES instructions for branching in git.
We have this info in the wiki, but it should be here too.
2012-06-13 22:11:06 -04:00
Tom Lane 357c549334 Stamp library minor versions for 9.3.
This includes fixing the MSVC copy of ecpg/preproc's version info, which
seems to have been overlooked repeatedly.  Can't we fix that so there are
not two copies??
2012-06-13 22:06:26 -04:00
Tom Lane bed88fceac Stamp HEAD as 9.3devel.
Let the hacking begin ...
2012-06-13 20:03:02 -04:00
Tom Lane 80edfd7659 Revisit error message details for JSON input parsing.
Instead of identifying error locations only by line number (which could
be entirely unhelpful with long input lines), provide a fragment of the
input text too, placing this info in a new CONTEXT entry.  Make the
error detail messages conform more closely to style guidelines, fix
failure to expose some of them for translation, ensure compiler can
check formats against supplied parameters.
2012-06-13 19:43:35 -04:00
Tom Lane b8b69d8990 Revert "Reduce checkpoints and WAL traffic on low activity database server"
This reverts commit 18fb9d8d21.  Per
discussion, it does not seem like a good idea to allow committed changes to
go un-checkpointed indefinitely, as could happen in a low-traffic server;
that makes us entirely reliant on the WAL stream with no redundancy that
might aid data recovery in case of disk failure.

This re-introduces the original problem of hot-standby setups generating a
small continuing stream of WAL traffic even when idle, but there are other
ways to address that without compromising crash recovery, so we'll revisit
that issue in a future release cycle.
2012-06-13 18:48:44 -04:00
Tom Lane c3bc76bdb0 Deprecate use of GLOBAL and LOCAL in temp table creation.
Aside from adjusting the documentation to say that these are deprecated,
we now report a warning (not an error) for use of GLOBAL, since it seems
fairly likely that we might change that to request SQL-spec-compliant temp
table behavior in the foreseeable future.  Although our handling of LOCAL
is equally nonstandard, there is no evident interest in ever implementing
SQL modules, and furthermore some other products interpret LOCAL as
behaving the same way we do.  So no expectation of change and no warning
for LOCAL; but it still seems a good idea to deprecate writing it.

Noah Misch
2012-06-13 17:48:42 -04:00
Tom Lane 93f4d7f806 Support Linux's oom_score_adj API as well as the older oom_adj API.
The simplest way to handle this is just to copy-and-paste the relevant
code block in fork_process.c, so that's what I did. (It's possible that
something more complicated would be useful to packagers who want to work
with either the old or the new API; but at this point the number of such
people is rapidly approaching zero, so let's just get the minimal thing
done.)  Update relevant documentation as well.
2012-06-13 15:35:52 -04:00
Peter Eisentraut c0a6f9c84b Improve documentation of postgres -C option
Clarify help (s/return/print/), and explain that this option is for
use by other programs, not for user-facing use (it does not print
units).
2012-06-13 13:41:25 +03:00
Tom Lane f871ef74a5 Minor code review for json.c.
Improve commenting, conform to project style for use of ++ etc.
No functional changes.
2012-06-12 16:23:45 -04:00
Robert Haas 36b7e3da17 Mark JSON error detail messages for translation.
Per gripe from Tom Lane.
2012-06-12 10:41:38 -04:00
Tom Lane 51e61b04f8 Ensure pg_ctl behaves sanely when data directory is not specified.
Commit aaa6e1def2 introduced multiple hazards
in the case where pg_ctl is executed with neither a -D switch nor any
PGDATA environment variable.  It would dump core on machines which are
unforgiving about printf("%s", NULL), or failing that possibly give a
rather unhelpful complaint about being unable to execute "postgres -C",
rather than the logically prior complaint about not being told where the
data directory is.

Edmund Horner's report suggests that there is another, Windows-specific
hazard here, but I'm not the person to fix that; it would in any case only
be significant when trying to use a config-only PGDATA pointer.
2012-06-11 22:47:16 -04:00
Tom Lane bf0945e863 Fix pg_dump output to a named tar-file archive.
"pg_dump -Ft -f filename ..." got broken by my recent commit
4317e0246c, which I fear I only tested
in the output-to-stdout variant.

Report and fix by Muhammad Asif Naeem.
2012-06-11 21:55:48 -04:00
Peter Eisentraut 7d754961f7 pg_receivexlog: Rename option --dir to --directory
getopt_long() allows abbreviating long options, so we might as well
give the option the full name, and users can abbreviate it how they
like.

Do some general polishing of the --help output at the same time.
2012-06-12 00:55:27 +03:00
Magnus Hagander 3595a71e9c Prevent non-streaming replication connections from being selected sync slave
This prevents a pg_basebackup backup session that just does a base
backup (no xlog involved at all) from becoming the synchronous slave
and thus blocking all access while it runs.

Also fixes the problem when a higher priority slave shows up it would
become the sync standby before it has reached the STREAMING state, by
making sure we can only switch to a walsender that's actually STREAMING.

Fujii Masao
2012-06-11 15:17:38 +02:00
Magnus Hagander 9af34cdec8 Revert behaviour of -x/--xlog to 9.1 semantics
To replace it, add -X/--xlog-method that allows the specification
of fetch or stream.

Do this to avoid unnecessary backwards-incompatiblity. Spotted and
suggested by Peter Eisentraut.
2012-06-11 14:58:35 +02:00
Bruce Momjian 927d61eeff Run pgindent on 9.2 source tree in preparation for first 9.3
commit-fest.
2012-06-10 15:20:04 -04:00
Bruce Momjian 60801944fa Update pgindent install instructions and update typedef list. 2012-06-10 15:15:31 -04:00
Magnus Hagander a0b4c5a20a Fix pg_basebackup/pg_receivexlog for floating point timestamps
Since the replication protocol deals with TimestampTz, we need to
care for the floating point case as well in the frontend tools.

Fujii Masao, with changes from Magnus Hagander
2012-06-10 12:12:36 +02:00
Magnus Hagander 7c1abc00fa Error message capitalization fix 2012-06-10 12:02:52 +02:00
Peter Eisentraut 8570114dc1 Make include files work without having to include other ones first 2012-06-10 12:46:14 +03:00
Simon Riggs 28ac797287 Revert error message on GLOBAL/LOCAL pending further discussion 2012-06-10 08:41:01 +01:00
Simon Riggs 72335a2015 Add ERROR msg for GLOBAL/LOCAL TEMP is not yet implemented 2012-06-09 16:35:26 +01:00
Simon Riggs 3725570539 Fix bug in early startup of Hot Standby with subtransactions.
When HS startup is deferred because of overflowed subtransactions, ensure
that we re-initialize KnownAssignedXids for when both existing and incoming
snapshots have non-zero qualifying xids.

Fixes bug #6661 reported by Valentine Gogichashvili.

Analysis and fix by Andres Freund
2012-06-08 17:34:04 +01:00
Robert Haas 3b5548a3d5 When using libpq URI syntax, error out on invalid parameter names.
Dan Farina
2012-06-08 08:47:24 -04:00
Tom Lane ece01aae47 Scan the buffer pool just once, not once per fork, during relation drop.
This provides a speedup of about 4X when NBuffers is large enough.
There is also a useful reduction in sinval traffic, since we
only do CacheInvalidateSmgr() once not once per fork.

Simon Riggs, reviewed and somewhat revised by Tom Lane
2012-06-07 17:43:11 -04:00
Peter Eisentraut 5d0109bd27 Message style improvements 2012-06-07 23:54:59 +03:00
Tom Lane e8d029a30b Do unlocked prechecks in bufmgr.c loops that scan the whole buffer pool.
DropRelFileNodeBuffers, DropDatabaseBuffers, FlushRelationBuffers, and
FlushDatabaseBuffers have to scan the whole shared_buffers pool because
we have no index structure that would find the target buffers any more
efficiently than that.  This gets expensive with large NBuffers.  We can
shave some cycles from these loops by prechecking to see if the current
buffer is interesting before we acquire the buffer header lock.
Ordinarily such a test would be unsafe, but in these cases it should be
safe because we are already assuming that the caller holds a lock that
prevents any new target pages from being loaded into the buffer pool
concurrently.  Therefore, no buffer tag should be changing to a value of
interest, only away from a value of interest.  So a false negative match
is impossible, while a false positive is safe because we'll recheck after
acquiring the buffer lock.  Initial testing says that this speeds these
loops by a factor of 2X to 3X on common Intel hardware.

Patch for DropRelFileNodeBuffers by Jeff Janes (based on an idea of
Heikki's); extended to the remaining sequential scans by Tom Lane
2012-06-07 16:46:26 -04:00
Simon Riggs 2c8a4e9be2 Wake WALSender to reduce data loss at failover for async commit.
WALSender now woken up after each background flush by WALwriter, avoiding
multi-second replication delay for an all-async commit workload.
Replication delay reduced from 7s with default settings to 200ms and often
much less, allowing significantly reduced data loss at failover.

Andres Freund and Simon Riggs
2012-06-07 19:22:47 +01:00
Robert Haas b50991eedb Fix more crash-safe visibility map bugs, and improve comments.
In lazy_scan_heap, we could issue bogus warnings about incorrect
information in the visibility map, because we checked the visibility
map bit before locking the heap page, creating a race condition.  Fix
by rechecking the visibility map bit before we complain.  Rejigger
some related logic so that we rely on the possibly-outdated
all_visible_according_to_vm value as little as possible.

In heap_multi_insert, it's not safe to clear the visibility map bit
before beginning the critical section.  The visibility map is not
crash-safe unless we treat clearing the bit as a critical operation.
Specifically, if the transaction were to error out after we set the
bit and before entering the critical section, we could end up writing
the heap page to disk (with the bit cleared) and crashing before the
visibility map page made it to disk.  That would be bad.  heap_insert
has this correct, but somehow the order of operations got rearranged
when heap_multi_insert was added.

Also, add some more comments to visibilitymap_test, lazy_scan_heap,
and IndexOnlyNext, expounding on concurrency issues.

Per extensive code review by Andres Freund, and further review by Tom
Lane, who also made the original report about the bogus warnings.
2012-06-07 12:48:13 -04:00
Magnus Hagander 92135ea0ed Use strerror(errno) instead of %m
Found by Fujii Masao
2012-06-05 15:52:08 +02:00
Tom Lane 3dd8e59681 Fix bogus handling of control characters in json_lex_string().
The original coding misbehaved if "char" is signed, and also made the
extremely poor decision to print control characters literally when trying
to complain about them.  Report and patch by Shigeru Hanada.

In passing, also fix core dump risk in report_parse_error() should the
parse state be something other than what it expects.
2012-06-04 20:43:57 -04:00
Tom Lane d73b7f973d Fix memory leaks in failure paths in buildACLCommands and parseAclItem.
This is currently only cosmetic, since all the call sites just curl up
and die in event of a failure return.  It might be important for some
future use-case, though, and in any case it quiets warnings from the
clang static analyzer (as reported by Anna Zaks).

Josh Kupershmidt
2012-06-03 11:52:52 -04:00
Simon Riggs d3abbbebe5 Avoid early reuse of btree pages, causing incorrect query results.
When we allowed read-only transactions to skip assigning XIDs
we introduced the possibility that a fully deleted btree page
could be reused. This broke the index link sequence which could
then lead to indexscans silently returning fewer rows than would
have been correct. The actual incidence of silent errors from
this is thought to be very low because of the exact workload
required and locking pre-conditions. Fix is to remove pages only
if index page opaque->btpo.xact precedes RecentGlobalXmin.

Noah Misch, reviewed by Simon Riggs
2012-06-01 12:21:45 +01:00
Simon Riggs 055c352abb After any checkpoint, close all smgr files handles in bgwriter 2012-06-01 09:24:53 +01:00
Simon Riggs a297d64d92 Checkpointer starts before bgwriter to avoid missing fsync requests.
Noted while testing Hot Standby startup.
2012-06-01 08:25:17 +01:00
Simon Riggs 1ec6a2bbc9 Provide interim statistics while in mid-checkpoint.
Re-implements similar functionality in 9.1 and previously which
was removed during split of checkpointer and bgwriter.

Requested/spotted by Magnus Hagander
2012-06-01 08:19:06 +01:00
Tom Lane 4bec93ac0f Stamp 9.2beta2. 2012-05-31 19:16:55 -04:00
Tom Lane a04dc87db1 Improve comment for GetStableLatestTransactionId(). 2012-05-31 11:20:02 -04:00
Simon Riggs a2b516dab9 Only throw recovery conflicts when InHotStandby. Bug fix to recent
patch to allow Index Only Scans on Hot Standby.

Bug report from Jaime Casanova
2012-05-31 13:11:47 +01:00
Tom Lane c8105e62bb Update time zone data files to tzdata release 2012c.
DST law changes in Antarctica, Armenia, Chile, Cuba, Falkland Islands,
Gaza, Haiti, Hebron, Morocco, Syria, Tokelau Islands.
Historical corrections for Canada.
2012-05-31 00:47:57 -04:00
Tom Lane ad0009e7be Force PL and range-type support functions to be owned by a superuser.
We allow non-superusers to create procedural languages (with restrictions)
and range datatypes.  Previously, the automatically-created support
functions for these objects ended up owned by the creating user.  This
represents a rather considerable security hazard, because the owning user
might be able to alter a support function's definition in such a way as to
crash the server, inject trojan-horse SQL code, or even execute arbitrary
C code directly.  It appears that right now the only actually exploitable
problem is the infinite-recursion bug fixed in the previous patch for
CVE-2012-2655.  However, it's not hard to imagine that future additions of
more ALTER FUNCTION capability might unintentionally open up new hazards.
To forestall future problems, cause these support functions to be owned by
the bootstrap superuser, not the user creating the parent object.
2012-05-30 23:47:57 -04:00
Tom Lane 33c6eaf78e Ignore SECURITY DEFINER and SET attributes for a PL's call handler.
It's not very sensible to set such attributes on a handler function;
but if one were to do so, fmgr.c went into infinite recursion because
it would call fmgr_security_definer instead of the handler function proper.
There is no way for fmgr_security_definer to know that it ought to call the
handler and not the original function referenced by the FmgrInfo's fn_oid,
so it tries to do the latter, causing the whole process to start over
again.

Ordinarily such misconfiguration of a procedural language's handler could
be written off as superuser error.  However, because we allow non-superuser
database owners to create procedural languages and the handler for such a
language becomes owned by the database owner, it is possible for a database
owner to crash the backend, which ideally shouldn't be possible without
superuser privileges.  In 9.2 and up we will adjust things so that the
handler functions are always owned by superusers, but in existing branches
this is a minor security fix.

Problem noted by Noah Misch (after several of us had failed to detect
it :-().  This is CVE-2012-2655.
2012-05-30 23:27:57 -04:00
Tom Lane cd0ff9c0f4 Expand the allowed range of timezone offsets to +/-15:59:59 from Greenwich.
We used to only allow offsets less than +/-13 hours, then it was +/14,
then it was +/-15.  That's still not good enough though, as per today's bug
report from Patric Bechtel.  This time I actually looked through the Olson
timezone database to find the largest offsets used anywhere.  The winners
are Asia/Manila, at -15:56:00 until 1844, and America/Metlakatla, at
+15:13:42 until 1867.  So we'd better allow offsets less than +/-16 hours.

Given the history, we are way overdue to have some greppable #define
symbols controlling this, so make some ... and also remove an obsolete
comment that didn't get fixed the last time.

Back-patch to all supported branches.
2012-05-30 19:58:35 -04:00
Robert Haas 07ab1383e3 Fix two more bugs in fast-path relation locking.
First, the previous code failed to account for the fact that, during Hot
Standby operation, the startup process takes AccessExclusiveLocks on
relations without setting MyDatabaseId.  This resulted in fast path
strong lock counts failing to be incremented with the startup process
took locks, which in turn allowed conflicting lock requests to succeed
when they should not have.  Report by Erik Rijkers, diagnosis by Heikki
Linnakangas.

Second, LockReleaseAll() failed to honor the allLocks and lockmethodid
restrictions with respect to fast-path locks.  It's not clear to me
whether this produces any user-visible breakage at the moment, but it's
certainly wrong.  Rearrange order of operations in LockReleaseAll to fix.
Noted by Tom Lane.
2012-05-30 16:17:46 -04:00
Heikki Linnakangas d1996ed5e8 Change the way parent pages are tracked during buffered GiST build.
We used to mimic the way a stack is constructed when descending the tree
during normal GiST inserts, but that was quite complicated during a buffered
build. It was also wrong: in GiST, the left-to-right relationships on
different levels might not match each other, so that when you know the
parent of a child page, you won't necessarily find the parent of the page to
the right of the child page by following the rightlinks at the parent level.
This sometimes led to "could not re-find parent" errors while building a
GiST index.

We now use a simple hash table to track the parent of every internal page.
Whenever a page is split, and downlinks are moved from one page to another,
we update the hash table accordingly. This is also better for performance
than the old method, as we never need to move right to re-find the parent
page, which could take a significant amount of time for buffers that were
created much earlier in the index build.
2012-05-30 12:05:57 +03:00
Heikki Linnakangas be02b16826 Delete the temporary file used in buffered GiST build, after the build.
There were two bugs here: We forgot to call gistFreeBuildBuffers() function
at the end of build, and we passed interXact == true to BufFileCreateTemp,
so the file wasn't automatically cleaned up at end-of-transaction either.
2012-05-30 12:05:57 +03:00
Tom Lane 4317e0246c Rewrite --section option to decouple it from --schema-only/--data-only.
The initial implementation of pg_dump's --section option supposed that the
existing --schema-only and --data-only options could be made equivalent to
--section settings.  This is wrong, though, due to dubious but long since
set-in-stone decisions about where to dump SEQUENCE SET items, as seen in
bug report from Martin Pitt.  (And I'm not totally convinced there weren't
other bugs, either.)  Undo that coupling and instead drive --section
filtering off current-section state tracked as we scan through the TOC
list to call _tocEntryRequired().

To make sure those decisions don't shift around and hopefully save a few
cycles, run _tocEntryRequired() only once per TOC entry and save the result
in a new TOC field.  This required minor rejiggering of ACL handling but
also allows a far cleaner implementation of inhibit_data_for_failed_table.

Also, to ensure that pg_dump and pg_restore have the same behavior with
respect to the --section switches, add _tocEntryRequired() filtering to
WriteToc() and WriteDataChunks(), rather than trying to implement section
filtering in an entirely orthogonal way in dumpDumpableObject().  This
required adjusting the handling of the special ENCODING and STDSTRINGS
items, but they were pretty weird before anyway.

Minor other code review for the patch, too.
2012-05-29 23:22:14 -04:00
Heikki Linnakangas 4bc6fb57f7 Fix integer overflow bug in GiST buffering build calculations.
The result of (maintenance_work_mem * 1024) / BLCKSZ doesn't fit in a signed
32-bit integer, if maintenance_work_mem >= 2GB. Use double instead. And
while we're at it, write the calculations in an easier to understand form,
with the intermediary steps written out and commented.
2012-05-29 22:27:42 +03:00
Tom Lane 2755abf386 Teach AbortOutOfAnyTransaction to clean up partially-started transactions.
AbortOutOfAnyTransaction failed to do anything if the state it saw on
entry corresponded to failing partway through StartTransaction.  I fixed
AbortCurrentTransaction to cope with that case way back in commit
60b2444cc3, but evidently overlooked that
AbortOutOfAnyTransaction should do likewise.

Back-patch to all supported branches.  It's not clear that this omission
has any more-than-cosmetic consequences, but it's also not clear that it
doesn't, so back-patching seems the least risky choice.
2012-05-28 23:57:06 -04:00
Tom Lane c89bdf7690 Eliminate some more O(N^2) behaviors in pg_dump/pg_restore.
This patch fixes three places (which AFAICT is all of them) where runtime
was O(N^2) in the number of TOC entries, by using an index array to replace
linear searches of the TOC list.  This performance issue is a bit less bad
than those recently fixed, because it depends on the number of items dumped
not the number in the source database, so the problem can be dodged by
doing partial dumps.

The previous coding already had an instance of one of the two index arrays
needed, but it was only calculated in parallel-restore cases; now we need
it all the time.  I also chose to move the arrays into the ArchiveHandle
data structure, to make this code a bit more ready for the day that we
try to sling multiple ArchiveHandles around in pg_dump or pg_restore.

Since we still need some server-side work before pg_dump can really cope
nicely with tens of thousands of tables, there's probably little point in
back-patching.
2012-05-28 20:38:28 -04:00
Peter Eisentraut 2d612abd4d libpq: URI parsing fixes
Drop special handling of host component with slashes to mean
Unix-domain socket.  Specify it as separate parameter or using
percent-encoding now.

Allow omitting username, password, and port even if the corresponding
designators are present in URI.

Handle percent-encoding in query parameter keywords.

Alex Shulgin

some documentation improvements by myself
2012-05-28 22:44:34 +03:00
Peter Eisentraut 388d251679 Update SQL features list
Set E081 Basic Privileges to supported, since by the letter of it, we
support it, even though not all possible forms of USAGE privileges are
implemented.
2012-05-27 23:34:16 +03:00
Peter Eisentraut 8e497c731b psql: Remove notice about readline from --version output
This was from a time when readline support wasn't standard.  And it
doesn't help analyzing current line editing library problems.
2012-05-27 22:48:20 +03:00
Peter Eisentraut 27314d32a8 Suppress -Wunused-result warning about write()
This is related to aa90e148ca, but this
code is only used under -DLINUX_OOM_ADJ, so it was apparently
overlooked then.
2012-05-27 22:35:01 +03:00
Peter Eisentraut a8b92b6090 PL/Perl: Avoid compiler warning from clang
Use SvREFCNT_inc_simple_void() instead of SvREFCNT_inc() to avoid
warning about unused return value.
2012-05-27 22:30:34 +03:00
Magnus Hagander 16282ae688 Make pg_recievexlog by default loop on connection failures
Avoids the need for an external script in the most common
scenario. Behavior can be overridden using the -n/--noloop
commandline parameter.
2012-05-27 11:05:24 +02:00
Tom Lane 532fe28dad Prevent synchronized scanning when systable_beginscan chooses a heapscan.
The only interesting-for-performance case wherein we force heapscan here
is when we're rebuilding the relcache init file, and the only such case
that is likely to be examining a catalog big enough to be syncscanned is
RelationBuildTupleDesc.  But the early-exit optimization in that code gets
broken if we start the scan at a random place within the catalog, so that
allowing syncscan is actually a big deoptimization if pg_attribute is large
(at least for the normal case where the rows for core system catalogs have
never been changed since initdb).  Hence, prevent syncscan here.  Per my
testing pursuant to complaints from Jeff Frost and Greg Sabino Mullane,
though neither of them seem to have actually hit this specific problem.

Back-patch to 8.3, where syncscan was introduced.
2012-05-26 19:09:52 -04:00
Tom Lane d3b97d1488 Fix string truncation to be multibyte-aware in text_name and bpchar_name.
Previously, casts to name could generate invalidly-encoded results.

Also, make these functions match namein() more exactly, by consistently
using palloc0() instead of ad-hoc zeroing code.

Back-patch to all supported branches.

Karl Schnaitter and Tom Lane
2012-05-25 17:34:51 -04:00
Tom Lane 73cc7d3b24 Use binary search instead of brute-force scan in findNamespace().
The previous coding presented a significant bottleneck when dumping
databases containing many thousands of schemas, since the total time
spent searching would increase roughly as O(N^2) in the number of objects.
Noted by Jeff Janes, though I rewrote his proposed patch to use the
existing findObjectByOid infrastructure.

Since this is a longstanding performance bug, backpatch to all supported
versions.
2012-05-25 14:35:37 -04:00
Magnus Hagander 31d965819b Fix base backup streaming xlog from standby
When backing up from a standby server, the backup process
will not automatically switch xlog segment. So we must
accept a partially transferred xlog file in this case, but
rename it into position anyway.

In passing, merge the two callbacks for segment end and
stop stream into a single callback, since their implementations
were close to identical, and rename this callback to
reflect that it stops streaming rather than continues it.

Patch by Magnus Hagander, review by Fujii Masao
2012-05-25 11:36:22 +02:00
Tom Lane 2a4c46e0ba Fix array overrun in regex code.
zaptreesubs() was coded to unconditionally reset a capture subre's
corresponding pmatch[] entry.  However, in regexes without backrefs, that
array is caller-supplied and might not have as many entries as the regex
has capturing parens.  So check the array length and do nothing if there
is no corresponding entry, much as subset() does.  Failure to check this
resulted in a stack clobber in the case reported by Marko Kreen.

This bug appears to have been latent in the regex library from the
beginning.  It was not exposed because find() called dissect() not
cdissect(), and the dissect() code path didn't ever call zaptreesubs()
(formerly zapmem()).  When I unified dissect() and cdissect() in commit
4dd78bf37a, the problem was exposed.

Now that I've seen this, I'm rather suspicious that we might need to
back-patch it; but will refrain for now, for lack of evidence that
the case can be hit in the previous coding.
2012-05-24 13:56:16 -04:00
Magnus Hagander 77f93cb32d Add missing PQfinish() calls
Fujii Masao
2012-05-23 21:52:23 +02:00
Tom Lane ed962fd712 Ensure that seqscans check for interrupts at least once per page.
If a seqscan encounters many consecutive pages containing only dead tuples,
it can remain in the loop in heapgettup for a long time, and there was no
CHECK_FOR_INTERRUPTS anywhere in that loop.  This meant there were
real-world situations where a query would be effectively uncancelable for
long stretches.  Add a check placed to occur once per page, which should be
enough to provide reasonable response time without adding any measurable
overhead.

Report and patch by Merlin Moncure (though I tweaked it a bit).
Back-patch to all supported branches.
2012-05-22 19:42:05 -04:00
Robert Haas 8fbe5a317d Fix error message for COMMENT/SECURITY LABEL ON COLUMN xxx IS 'yyy'
When the column name is an unqualified name, rather than table.column,
the error message complains about too many dotted names, which is
wrong.  Report by Peter Eisentraut based on examination of the
sepgsql regression test output, but the problem also affects COMMENT.
New wording as suggested by Tom Lane.
2012-05-22 11:23:36 -04:00
Robert Haas 304aa339b2 Prevent pg_basebackup when integer_datetimes flag doesn't match.
Magnus Hagander, reviewed by Fujii Masao, with slight wording changes
by me.
2012-05-22 10:02:47 -04:00
Robert Haas 219c024c64 Repair out-of-date information in src/backend/storage/buffer/README.
In commit d526575f89, we changed things so
that buffer usage counts are incremented when the buffer is pinned, rather
than when it is unpinned, but the README file didn't get the memo.

Report by Amit Kapila.
2012-05-22 09:32:09 -04:00
Tom Lane b94ce6e807 Move postmaster's RemovePgTempFiles call to a less randomly chosen place.
There is no reason to do this as early as possible in postmaster startup,
and good reason not to do it until we have completely created the
postmaster's lock file, namely that it might contribute to pg_ctl thinking
that postmaster startup has timed out.  (This would require a rather
unusual amount of time to be spent scanning temp file directories, but we
have at least one field report of it happening reproducibly.)

Back-patch to 9.1.  Before that, pg_ctl didn't wait for additional info to
be added to the lock file, so it wasn't a problem.

Note that this is not a complete fix to the slow-start issue in 9.1,
because we still had identify_system_timezone being run during postmaster
start in 9.1.  But that's at least a reasonably well-defined delay, with
an easy workaround if needed, whereas the temp-files scan is not so
predictable and cannot be avoided.
2012-05-21 22:50:30 -04:00
Tom Lane efae4653c9 Update woefully-obsolete comment.
The accurate info about what's in a lock file has been in miscadmin.h
for some time, so let's just make this comment point there instead of
maintaining a duplicative copy.
2012-05-21 22:11:00 -04:00
Peter Eisentraut cdf8bcb8d9 pg_ctl: Sort signal list in --help output
The list was neither logical nor numerical nor alphabetical.  Let's go
with alphabetical.
2012-05-21 20:12:30 +03:00
Peter Eisentraut 4c39a09089 libpq: Add missing file to GETTEXT_FILES list
For the record, fe-print.c is also missing, but it's sort of
deprecated, and the string internationalization there has some issues,
and it doesn't seem worth fixing that.  So let's leave that out.
2012-05-21 20:08:50 +03:00
Peter Eisentraut f1f6737e15 Fix incorrect logic in JSON number lexer
Detectable by gcc -Wlogical-op.

Add two regression test cases that would previously allow incorrect
values to pass.
2012-05-20 02:24:46 +03:00
Peter Eisentraut 2273a50364 Realign some --help output to have better spacing between columns 2012-05-18 20:34:14 +03:00
Heikki Linnakangas 1d27dcf578 Fix bug in gistRelocateBuildBuffersOnSplit().
When we create a temporary copy of the old node buffer, in stack, we mustn't
leak that into any of the long-lived data structures. Before this patch,
when we called gistPopItupFromNodeBuffer(), it got added to the array of
"loaded buffers". After gistRelocateBuildBuffersOnSplit() exits, the
pointer added to the loaded buffers array points to garbage. Often that goes
unnotied, because when we go through the array of loaded buffers to unload
them, buffers with a NULL pageBuffer are ignored, which can often happen by
accident even if the pointer points to garbage.

This patch fixes that by marking the temporary copy in stack explicitly as
temporary, and refrain from adding buffers marked as temporary to the array
of loaded buffers.

While we're at it, initialize nodeBuffer->pageBlocknum to InvalidBlockNumber
and improve comments a bit. This isn't strictly necessary, but makes
debugging easier.
2012-05-18 19:38:32 +03:00
Peter Eisentraut 939ec9b8a4 Update SQL features/conformance information to SQL:2011 2012-05-17 09:50:04 +03:00
Peter Eisentraut be6d1c88a4 Change COLLATION keyword category
It was changed from unreserved to reserved as part of the COLLATION
FOR syntax, but it turns out that type_func_name_keyword is
sufficient.
2012-05-16 20:19:44 +03:00
Tom Lane 488c6dd170 Improve error message for ALTER COLUMN TYPE coercion failure.
Per recent discussion, the error message for this was actually a trifle
inaccurate, since it said "cannot be cast" which might be incorrect.
Adjust that wording, and add a HINT suggesting that a USING clause might
be needed.
2012-05-16 07:28:25 -04:00
Heikki Linnakangas 6593c5b5dc Fix bug in freespace calculation in heap_multi_insert().
If the amount of freespace on page was less than the amount reserved by
fillfactor, the calculation would underflow.

This fixes bug #6643 reported by Tomonari Katsumata.
2012-05-16 14:13:06 +03:00
Peter Eisentraut c8e086795a Remove whitespace from end of lines
pgindent and perltidy should clean up the rest.
2012-05-15 22:19:41 +03:00
Peter Eisentraut 8afb026e57 Remove stray nbsp character 2012-05-15 21:38:59 +03:00
Heikki Linnakangas d2495f272c Fix bug in to_tsquery().
We were using memcpy() to copy to a possibly overlapping memory region,
which is a no-no. Use memmove() instead.
2012-05-15 19:27:34 +03:00
Tom Lane 9b63e9869f In pgstat.c, use a timeout in WaitLatchOrSocket only on Windows.
We have no need for a timeout here really, but some broken products from
Redmond seem to lose FD_READ events occasionally, and waking up and
retrying the recv() is the only known way to work around that.  Perhaps
somebody will be motivated to figure out a better answer here; but not I.
2012-05-14 23:51:34 -04:00
Tom Lane 5a2bb06012 Revert "Add some temporary instrumentation to pgstat.c."
This reverts commit 7d88bb73f7.
That instrumentation has served its purpose.
2012-05-14 23:08:10 -04:00
Tom Lane f667747b6d Put back AC_REQUIRE([AC_STRUCT_TM]).
The BSD-ish members of the buildfarm all seem to think removing this
was a bad idea.  It looks to me like it resulted in omitting the system
header inclusion necessary to detect the fields of struct tm correctly.
2012-05-14 23:06:48 -04:00
Tom Lane e42a21b9e6 Assert that WaitLatchOrSocket callers cannot wait only for writability.
Since we have chosen to report socket EOF and error conditions via the
WL_SOCKET_READABLE flag bit, it's unsafe to wait only for
WL_SOCKET_WRITEABLE; the caller would never be notified of the socket
condition, and in some of these implementations WaitLatchOrSocket would
busy-wait until something else happens.  Add this restriction to the API
specification, and add Asserts to check that callers don't try to do that.

At some point we might want to consider adjusting the API to relax this
restriction, but until we have an actual use case for waiting on a
write-only socket, it seems premature to design a solution.
2012-05-14 16:12:28 -04:00
Peter Eisentraut ff4628f37a Remove unused AC_DEFINE symbols
ENABLE_DTRACE            unused as of a7b7b07af3
HAVE_ERR_SET_MARK        unused as of 4ed4b6c54e
HAVE_FCVT                unused as of 4553e1d80f
HAVE_STRUCT_SOCKADDR_UN  unused as of b4cea00a1f
HAVE_SYSCONF             unused as of f83356c7f5
TM_IN_SYS_TIME           never used, obsolescent per Autoconf documentation
2012-05-14 22:51:21 +03:00
Tom Lane d461d0502b For testing purposes, reinsert a timeout in pgstat.c's wait call.
Test results from buildfarm members mastodon/narwhal (Windows Server 2003)
make it look like that platform just plain loses FD_READ events
occasionally, and the only reason our previous coding seemed to work was
that it timed out every couple of seconds and retried the whole operation.
Try to verify this by reinserting a finite timeout into the pgstat loop.
This isn't meant to be a permanent patch either, just to confirm or
disprove a theory.
2012-05-14 15:03:14 -04:00
Tom Lane f1ca51549e Force pgwin32_recv into nonblock mode when called from pgstat.c.
This should get rid of the usage of pgwin32_waitforsinglesocket entirely,
and perhaps thereby remove the race condition that's evidently still
present on some versions of Windows.  The previous arrangement was a bit
unsafe anyway, since waiting at the recv() would not allow pgstat to notice
postmaster death.
2012-05-14 10:57:07 -04:00
Heikki Linnakangas f15c2eae9c Remove unnecessary pg_verifymbstr() calls from tsvector/query in functions.
The input should've been validated well before it hits the input function.
Doing so again is a waste of cycles.
2012-05-14 14:30:32 +03:00
Heikki Linnakangas 9e4637bf89 Update comments that became out-of-date with the PGXACT struct.
When the "hot" members of PGPROC were split off to separate PGXACT structs,
many PGPROC fields referred to in comments were moved to PGXACT, but the
comments were neglected in the commit. Mostly this is just a search/replace
of PGPROC with PGXACT, but the way the dummy PGPROC entries are created for
prepared transactions changed more, making some of the comments totally
bogus.

Noah Misch
2012-05-14 10:28:55 +03:00
Peter Eisentraut 64f09ca386 Remove leftovers of BeOS port
These should have been removed when the BeOS port was removed in
44f9021223.
2012-05-14 04:50:39 +03:00
Peter Eisentraut 6bf1e7668d Small punctuation editing of postgresql.conf.sample 2012-05-14 04:50:39 +03:00
Peter Eisentraut 2a7f636640 pg_ctl: Improve --help output
All other --help output has = signs between long options and their
arguments, so do it here as well.
2012-05-14 04:50:39 +03:00
Tom Lane 7d88bb73f7 Add some temporary instrumentation to pgstat.c.
Log main-loop blocking events and the results of inquiry messages.
This is to get some clarity as to what's happening on those Windows
buildfarm members that still don't like the latch-ified stats collector.
This bulks up the postmaster log a tad, so I won't leave it in place for
long.
2012-05-13 21:11:31 -04:00
Tom Lane b8347138e9 Fix DROP TABLESPACE to unlink symlink when directory is not there.
If the tablespace directory is missing entirely, we allow DROP TABLESPACE
to go through, on the grounds that it should be possible to clean up the
catalog entry in such a situation.  However, we forgot that the pg_tblspc
symlink might still be there.  We should try to remove the symlink too
(but not fail if it's no longer there), since not doing so can lead to
weird behavior subsequently, as per report from Michael Nolan.

There was some discussion of adding dependency links to prevent DROP
TABLESPACE when the catalogs still contain references to the tablespace.
That might be worth doing too, but it's an orthogonal question, and in
any case wouldn't be back-patchable.

Back-patch to 9.0, which is as far back as the logic looks like this.
We could possibly do something similar in 8.x, but given the lack of
reports I'm not sure it's worth the trouble, and anyway the case could
not arise in the form the logic is meant to cover (namely, a post-DROP
transaction rollback having resurrected the pg_tablespace entry after
some or all of the filesystem infrastructure is gone).
2012-05-13 18:06:52 -04:00
Tom Lane 966970ed63 Re-revert stats collector latch changes.
This reverts commit cb2f2873d6, restoring
the latch-ified stats collector logic.  We'll soon see if this works any
better on the Windows buildfarm machines.
2012-05-13 14:44:39 -04:00
Tom Lane b85427f227 Attempt to fix some issues in our Windows socket code.
Make sure WaitLatchOrSocket regards FD_CLOSE as a read-ready condition.
We might want to tweak this further, but it was surely wrong as-is.

Make pgwin32_waitforsinglesocket detach its private event object from the
passed socket before returning.  I suspect that failure to do so leads
to race conditions when other code (such as WaitLatchOrSocket) attaches
a different event object to the same socket.  Moreover, the existing
coding meant that repeated calls to pgwin32_waitforsinglesocket would
perform ResetEvent on an event actively connected to a socket, which
is rumored to be an unsafe practice; the WSAEventSelect documentation
appears to recommend against this, though it does not say not to do it
in so many words.

Also, uniformly use the coding pattern "WSAEventSelect(s, NULL, 0)" to
detach events from sockets, rather than passing the event in the second
parameter.  The WSAEventSelect documentation says that the second parameter
is ignored if the third is 0, so theoretically this should make no
difference.  However, elsewhere on the same reference page the use of NULL
in this context is recommended, and I have found suggestions on the net
that some versions of Windows have bugs with a non-NULL second parameter
in this usage.

Some other mostly-cosmetic cleanup, such as using the right one of
WSAGetLastError and GetLastError for reporting errors from these functions.
2012-05-13 14:35:40 -04:00
Tom Lane fd350ef395 Fix bogus declaration of local variable.
rc should be an int here, not a pgsocket.  Fairly harmless as long as
pgsocket is an integer type, but nonetheless wrong.  Error introduced
in commit 87091cb1f1.
2012-05-13 00:30:32 -04:00
Tom Lane 398b240151 Avoid unnecessary process wakeups in the log collector.
syslogger was coded to wake up once per second whether there was anything
useful to do or not.  As part of our campaign to reduce the server's idle
power consumption, change it to use a latch for waiting.  Now, in the
absence of any data to log or any signals to service, it will only wake up
at the programmed logfile rotation times (if any).
2012-05-12 19:21:54 -04:00
Tom Lane 31ad655364 Fix WaitLatchOrSocket to handle EOF on socket correctly.
When using poll(), EOF on a socket is reported with the POLLHUP not
POLLIN flag (at least on Linux).  WaitLatchOrSocket failed to check
this bit, causing it to go into a busy-wait loop if EOF occurs.
We earlier fixed the same mistake in the test for the state of the
postmaster_alive socket, but missed it for the caller-supplied socket.
Fortunately, this error is new in 9.2, since 9.1 only had a select()
based code path not a poll() based one.
2012-05-12 16:36:47 -04:00
Simon Riggs 867540b49c Ensure backwards compatibility for GetStableLatestTransactionId() 2012-05-12 13:26:10 +01:00
Peter Eisentraut afe86a9e73 Fix obsolescent C declaration syntax
gcc -Wextra/-Wold-style-declaration thinks that "inline" should go
before the function return type.
2012-05-12 12:52:02 +03:00
Tom Lane d0c231d132 Cosmetic adjustments for postmaster's handling of checkpointer.
Correct some comments, order some operations a bit more consistently.
No functional changes.
2012-05-11 17:46:37 -04:00
Peter Eisentraut 2cfb1c6f77 PL/Python: Adjust the regression tests for Python 3.3
The string representation of ImportError changed.  Remove printing
that; it's not necessary for the test.

The order in which members of a dict are printed changed.  But this
was always implementation-dependent, so we have just been lucky for a
long time.  Do the printing the hard way to ensure sorted order.
2012-05-11 23:04:47 +03:00
Robert Haas 1331cc6c1a Prevent loss of init fork when truncating an unlogged table.
Fixes bug #6635, reported by Akira Kurosawa.
2012-05-11 09:48:56 -04:00
Simon Riggs b762e8f50b Remove extraneous #include "storage/proc.h" 2012-05-11 14:46:46 +01:00
Simon Riggs b06679e012 Ensure age() returns a stable value rather than the latest value 2012-05-11 14:36:24 +01:00
Heikki Linnakangas 3652d72dd4 On GiST page split, release the locks on child pages before recursing up.
When inserting the downlinks for a split gist page, we used hold the locks
on the child pages until the insertion into the parent - and recursively its
parent if it had to be split too - were all completed. Change that so that
the locks on child pages are released after the insertion in the immediate
parent is done, before recursing further up the tree.

This reduces the number of lwlocks that are held simultaneously. Holding
many locks is bad for concurrency, and in extreme cases you can even hit
the limit of 100 simultaneously held lwlocks in a backend. If you're really
unlucky, you can hit the limit while in a critical section, which brings
down the whole system.

This fixes bug #6629 reported by Tom Forbes. Backpatch to 9.1. The page
splitting code was rewritten in 9.1, and the old code did not have this
problem.
2012-05-11 12:35:28 +03:00
Bruce Momjian ee24de4001 Revert catalog bump; was post-beta1, and unnecessary. 2012-05-10 18:44:47 -04:00
Bruce Momjian d2fe836cd2 Update comment for 'name' data type to say 63 "bytes".
Catalog version bump so everyone has the same comment for beta1.
2012-05-10 18:40:40 -04:00
Tom Lane f70fa835e0 Stamp 9.2beta1. 2012-05-10 18:35:09 -04:00
Tom Lane cb2f2873d6 Temporarily revert stats collector latch changes so we can ship beta1.
This patch reverts commit 49340037ee and some
follow-on tweaking in pgstat.c.  While the basic scheme of latch-ifying the
stats collector seems sound enough, it's failing on most Windows buildfarm
members for unknown reasons, and there's no time left to debug that before
9.2beta1.  Better to ship a beta version without this improvement.  I hope
to re-revert this once beta1 is out, though.
2012-05-10 17:26:33 -04:00
Tom Lane f40022f1ad Make WaitLatch's WL_POSTMASTER_DEATH result trustworthy; simplify callers.
Per a suggestion from Peter Geoghegan, make WaitLatch responsible for
verifying that the WL_POSTMASTER_DEATH bit it returns is truthful (by
testing PostmasterIsAlive).  Then simplify its callers, who no longer
need to do that for themselves.  Remove weasel wording about falsely-set
result bits from WaitLatch's API contract.
2012-05-10 14:34:53 -04:00
Peter Eisentraut a97207b690 PL/Python: Fix slicing support for result objects for Python 3
The old way of implementing slicing support by implementing
PySequenceMethods.sq_slice no longer works in Python 3.  You now have
to implement PyMappingMethods.mp_subscript.  Do this by simply
proxying the call to the wrapped list of result dictionaries.
Consolidate some of the subscripting regression tests.

Jan Urbański
2012-05-10 20:40:30 +03:00
Peter Eisentraut 1540d3bf4d PL/Python: Update incorrect comment
Jan Urbański
2012-05-10 20:40:30 +03:00
Tom Lane ada8fa08fc Fix Windows implementation of PGSemaphoreLock.
The original coding failed to reset ImmediateInterruptOK before returning,
which would potentially allow a subsequent query-cancel interrupt to be
accepted at an unsafe point.  This is a really nasty bug since it's so hard
to predict the consequences, but they could be unpleasant.

Also, ensure that signal handlers are serviced before this function
returns, even if the semaphore is already set.  This should make the
behavior more like Unix.

Back-patch to all supported versions.
2012-05-10 13:36:14 -04:00
Tom Lane 8ebc908c57 Improve Windows implementation of WaitLatch/WaitLatchOrSocket.
Ensure that signal handlers are serviced before this function returns.
This should make the behavior more like Unix.  Also, add some more
error checking, and make some other cosmetic improvements.

No back-patch since it's not clear whether this is fixing any live bug
that would affect 9.1.  I'm more concerned about 9.2 anyway given our
considerable recent expansions in the usage of WaitLatch.
2012-05-10 13:26:47 -04:00
Peter Eisentraut 1d158d7f98 Python 2.2 is no longer supported
It was already on its last legs, and it turns out that it was
accidentally broken in commit 89e850e6fd
and no one cared.  So remove the rest the support for it and update
the documentation to indicate that Python 2.3 is now required.
2012-05-10 20:02:57 +03:00
Magnus Hagander f33c5d471c Only attempt to show collations on servers >= 9.1.
Show a proper error message instead of a SQL error.

Josh Kupershmidt
2012-05-10 09:12:26 +02:00
Heikki Linnakangas 60a3dffb72 Fix outdated comment.
Multi-insert records observe XLOG_HEAP_INIT_PAGE flag too, as Andres Freund
pointed out.
2012-05-10 09:55:48 +03:00
Joe Conway b58bacdacb PL/pgSQL RETURN NEXT was leaking converted tuples, causing
out of memory when looping through large numbers of rows.
Flag the converted tuples to be freed. Complaint and patch
by Joe.
2012-05-09 22:57:19 -07:00
Tom Lane fd71421b01 Improve tests for postmaster death in auxiliary processes.
In checkpointer and walwriter, avoid calling PostmasterIsAlive unless
WaitLatch has reported WL_POSTMASTER_DEATH.  This saves a kernel call per
iteration of the process's outer loop, which is not all that much, but a
cycle shaved is a cycle earned.  I had already removed the unconditional
PostmasterIsAlive calls in bgwriter and pgstat in previous patches, but
forgot that WL_POSTMASTER_DEATH is supposed to be treated as untrustworthy
(per comment in unix_latch.c); so adjust those two cases to match.

There are a few other places where the same idea might be applied, but only
after substantial code rearrangement, so I didn't bother.
2012-05-10 00:54:56 -04:00
Tom Lane d3ae406f54 Further tweaking of nomenclature in checkpointer.c.
Get rid of some more naming choices that only make sense if you know that
this code used to be in the bgwriter, as well as some stray comments
referencing the bgwriter.
2012-05-10 00:01:10 -04:00
Tom Lane 6308ba05a7 Improve control logic for bgwriter hibernation mode.
Commit 6d90eaaa89 added a hibernation mode
to the bgwriter to reduce the server's idle-power consumption.  However,
its interaction with the detailed behavior of BgBufferSync's feedback
control loop wasn't very well thought out.  That control loop depends
primarily on the rate of buffer allocation, not the rate of buffer
dirtying, so the hibernation mode has to be designed to operate only when
no new buffer allocations are happening.  Also, the check for whether the
system is effectively idle was not quite right and would fail to detect
a constant low level of activity, thus allowing the bgwriter to go into
hibernation mode in a way that would let the cycle time vary quite a bit,
possibly further confusing the feedback loop.  To fix, move the wakeup
support from MarkBufferDirty and SetBufferCommitInfoNeedsSave into
StrategyGetBuffer, and prevent the bgwriter from entering hibernation mode
unless no buffer allocations have happened recently.

In addition, fix the delaying logic to remove the problem of possibly not
responding to signals promptly, which was basically caused by trying to use
the process latch's is_set flag for multiple purposes.  I can't prove it
but I'm suspicious that that hack was responsible for the intermittent
"postmaster does not shut down" failures we've been seeing in the buildfarm
lately.  In any case it did nothing to improve the readability or
robustness of the code.

In passing, express the hibernation sleep time as a multiplier on
BgWriterDelay, not a constant.  I'm not sure whether there's any value in
exposing the longer sleep time as an independently configurable setting,
but we can at least make it act like this for little extra code.
2012-05-09 23:37:10 -04:00
Peter Eisentraut 5d39807a00 Add make dependency so that postgres.bki is rebuilt in major version change
Every time since the current rule for postgres.bki was put in place
when we change the major version, people complain that their tests
fail in strange ways.  This is because the version number in
postgres.bki is not updated, because it has no dependency for that.
And you can't even force the rebuild manually if you don't happen to
know which file has the problem.  Fix that now before it will happen
again.

The only remaining problem with switching major versions, as far as
the regression tests are concerned, is that contrib needs to be
rebuilt.  But that's easily invoked, and in any case the failure modes
are more friendly if you forget that.
2012-05-09 20:45:56 +03:00
Simon Riggs 8f28789bff Rename BgWriterShmem/Request to CheckpointerShmem/Request 2012-05-09 14:23:45 +01:00
Simon Riggs bbd3ec9dce Rename BgWriterCommLock to CheckpointerCommLock 2012-05-09 14:11:48 +01:00
Simon Riggs 5829387381 Avoid xid error from age() function when run on Hot Standby 2012-05-09 13:56:24 +01:00
Tom Lane acd4c7d58b Fix an issue in recent walwriter hibernation patch.
Users of asynchronous-commit mode expect there to be a guaranteed maximum
delay before an async commit's WAL records get flushed to disk.  The
original version of the walwriter hibernation patch broke that.  Add an
extra shared-memory flag to allow async commits to kick the walwriter out
of hibernation mode, without adding any noticeable overhead in cases where
no action is needed.
2012-05-08 23:06:40 -04:00
Tom Lane 49340037ee Reduce idle power consumption of stats collector process.
Latch-ify the stats collector, so that it does not need an arbitrary wakeup
cycle to check for postmaster death.  The incremental savings in idle power
is pretty marginal, since we only had it waking every two seconds; but I
believe that this patch may also improve the collector's performance under
load, by reducing the number of kernel calls made per message when messages
are arriving constantly (we now avoid a select/poll call except when we
need to sleep).  The change also reduces the time needed for a normal
database shutdown on platforms where signals don't interrupt select().
2012-05-08 21:26:46 -04:00
Tom Lane 5461564a9d Reduce idle power consumption of walwriter and checkpointer processes.
This patch modifies the walwriter process so that, when it has not found
anything useful to do for many consecutive wakeup cycles, it extends its
sleep time to reduce the server's idle power consumption.  It reverts to
normal as soon as it's done any successful flushes.  It's still true that
during any async commit, backends check for completed, unflushed pages of
WAL and signal the walwriter if there are any; so that in practice the
walwriter can get awakened and returned to normal operation sooner than the
sleep time might suggest.

Also, improve the checkpointer so that it uses a latch and a computed delay
time to not wake up at all except when it has something to do, replacing a
previous hardcoded 0.5 sec wakeup cycle.  This also is primarily useful for
reducing the server's power consumption when idle.

In passing, get rid of the dedicated latch for signaling the walwriter in
favor of using its procLatch, since that comports better with possible
generic signal handlers using that latch.  Also, fix a pre-existing bug
with failure to save/restore errno in walwriter's signal handlers.

Peter Geoghegan, somewhat simplified by Tom
2012-05-08 20:03:26 -04:00
Peter Eisentraut db84ba65ab psql: Add variable to control keyword case in tab completion
This adds the variable COMP_KEYWORD_CASE, which controls in what case
keywords are completed.  This is partially to let users configure the
change from commit 69f4f1c357, but it
also offers more behaviors than were available before.
2012-05-08 21:06:08 +03:00
Peter Eisentraut 3420b241a7 Fix dependency tracking for src/port/%_srv.o files
Because they use their own compilation rule, they don't use the
dependency tracking logic from Makefile.global.  To make sure that
dependency tracking works anyway for the *_srv.o files, depend on
their *.o siblings as well, which do have proper dependencies.  It's a
hack that might fail someday if there is a *_srv.o without a
corresponding *.o, but it works for now (and those would probably go
into src/backend/port/ anyway).
2012-05-08 20:10:50 +03:00
Peter Eisentraut dcb2c58381 Fix misleading comments
Josh Kupershmidt
2012-05-08 19:35:22 +03:00
Peter Eisentraut 3284e03d5d Remove strdup, strtol, strtoul from libpgport
These should not be needed anymore, at least after the recent port
removals.  So let's see whether we can do without them.
2012-05-07 23:10:28 +03:00
Peter Eisentraut d7b2cd9d40 Fix pg_config.h make rule
According to the Autoconf documentation, there should be a make rule

pg_config.h: stamp-h

so that with the right setup around this, a change in pg_config.h.in
will trigger a rebuild of everything that depends on pg_config.h.  But
this doesn't always work, sometimes you need to run make twice to get
everything up to date after a change of pg_config.h.in.

The fix is to write the rule as

pg_config.h: stamp-h ;

instead (with an empty command instead of no command).  This is what
Automake-generated makefiles effectively do, so it seems safe to be on
this side.

It's not actually clear why this is (apparently) more correct.  It's
been posted to
<http://lists.gnu.org/archive/html/help-make/2012-04/msg00058.html>
without response so far.
2012-05-07 21:28:38 +03:00
Magnus Hagander 916d589a10 Make "unexpected EOF" messages DEBUG1 unless in an open transaction
"Unexpected EOF on client connection" without an open transaction
is mostly noise, so turn it into DEBUG1. With an open transaction it's
still indicating a problem, so keep those as ERROR, and change the message
to indicate that it happened in a transaction.
2012-05-07 18:50:44 +02:00
Tom Lane 71b9549d05 Overdue code review for transaction-level advisory locks patch.
Commit 62c7bd31c8 had assorted problems, most
visibly that it broke PREPARE TRANSACTION in the presence of session-level
advisory locks (which should be ignored by PREPARE), as per a recent
complaint from Stephen Rees.  More abstractly, the patch made the
LockMethodData.transactional flag not merely useless but outright
dangerous, because in point of fact that flag no longer tells you anything
at all about whether a lock is held transactionally.  This fix therefore
removes that flag altogether.  We now rely entirely on the convention
already in use in lock.c that transactional lock holds must be owned by
some ResourceOwner, while session holds are never so owned.  Setting the
locallock struct's owner link to NULL thus denotes a session hold, and
there is no redundant marker for that.

PREPARE TRANSACTION now works again when there are session-level advisory
locks, and it is also able to transfer transactional advisory locks to the
prepared transaction, but for implementation reasons it throws an error if
we hold both types of lock on a single lockable object.  Perhaps it will be
worth improving that someday.

Assorted other minor cleanup and documentation editing, as well.

Back-patch to 9.1, except that in the 9.1 branch I did not remove the
LockMethodData.transactional flag for fear of causing an ABI break for
any external code that might be examining those structs.
2012-05-04 17:44:31 -04:00
Bruce Momjian ebcaa5fcde Remove BSD/OS (BSDi) port. There are no known users upgrading to
Postgres 9.2, and perhaps no existing users either.
2012-05-03 10:58:44 -04:00
Bruce Momjian 7490c48f1e Mark git_changelog examples with the proper executable names. 2012-05-02 20:42:44 -04:00
Robert Haas 8e0c5195df Add missing parenthesis in comment. 2012-05-02 14:30:58 -04:00
Peter Eisentraut e6c2e8cb87 PL/Python: Improve test coverage
Add test cases for inline handler of plython2u (when using that
language name), and for result object element assignment.  There is
now at least one test case for every top-level functionality, except
plpy.Fatal (annoying to use in regression tests) and result object
slice retrieval and slice assignment (which are somewhat broken).
2012-05-02 21:09:03 +03:00
Peter Eisentraut 52aa334fcd PL/Python: Fix crash in functions returning SETOF and using SPI
Allocate PLyResultObject.tupdesc in TopMemoryContext, because its
lifetime is the lifetime of the Python object and it shouldn't be
freed by some other memory context, such as one controlled by SPI.  We
trust that the Python object will clean up its own memory.

Before, this would crash the included regression test case by trying
to use memory that was already freed.

reported by Asif Naeem, analysis by Tom Lane
2012-05-02 20:59:51 +03:00
Peter Eisentraut e9605a039b Even more duplicate word removal, in the spirit of the season 2012-05-02 20:56:03 +03:00
Robert Haas 0038110421 Avoid repeated CLOG access from heap_hot_search_buffer.
At the time we check whether the tuple is dead to all running
transactions, we've already verified that it isn't visible to our
scan, setting hint bits if appropriate.  So there's no need to
recheck CLOG for the all-dead test we do just a moment later.
So, add HeapTupleIsSurelyDead() to test the appropriate condition
under the assumption that all relevant hit bits are already set.

Review by Tom Lane.
2012-05-02 12:40:07 -04:00
Robert Haas 1b4998fd44 Further corrections from the department of redundancy department.
Thom Brown
2012-05-02 11:11:25 -04:00
Robert Haas e01e66f808 More duplicate word removal. 2012-05-02 09:28:16 -04:00
Heikki Linnakangas f291ccd43e Remove duplicate words in comments.
Found these with grep -r "for for ".
2012-05-02 10:20:27 +03:00
Tom Lane 50c2d6a1a6 Kill some remaining references to SVR4 and univel.
Both terms still appear in a few places, but I thought it best to leave
those alone in context.
2012-05-02 00:29:17 -04:00
Robert Haas 9b7a84f2a4 Tweak psql to print row counts when \x auto chooses non-expanded output.
Noah Misch
2012-05-01 16:05:01 -04:00
Peter Eisentraut f2f9439fbf Remove dead ports
Remove the following ports:

- dgux
- nextstep
- sunos4
- svr4
- ultrix4
- univel

These are obsolete and not worth rescuing.  In most cases, there is
circumstantial evidence that they wouldn't work anymore anyway.
2012-05-01 22:11:12 +03:00
Tom Lane 809e7e21af Converge all SQL-level statistics timing values to float8 milliseconds.
This patch adjusts the core statistics views to match the decision already
taken for pg_stat_statements, that values representing elapsed time should
be represented as float8 and measured in milliseconds.  By using float8,
we are no longer tied to a specific maximum precision of timing data.
(Internally, it's still microseconds, but we could now change that without
needing changes at the SQL level.)

The columns affected are
pg_stat_bgwriter.checkpoint_write_time
pg_stat_bgwriter.checkpoint_sync_time
pg_stat_database.blk_read_time
pg_stat_database.blk_write_time
pg_stat_user_functions.total_time
pg_stat_user_functions.self_time
pg_stat_xact_user_functions.total_time
pg_stat_xact_user_functions.self_time

The first four of these are new in 9.2, so there is no compatibility issue
from changing them.  The others require a release note comment that they
are now double precision (and can show a fractional part) rather than
bigint as before; also their underlying statistics functions now match
the column definitions, instead of returning bigint microseconds.
2012-04-30 14:03:33 -04:00
Peter Eisentraut 26471a51fc Mark ReThrowError() with attribute noreturn
All related functions were already so marked.
2012-04-30 20:23:41 +03:00
Robert Haas 0d2235a25b Remove duplicate word in comment.
Noted by Peter Geoghegan.
2012-04-30 13:14:46 -04:00
Bruce Momjian f33fe47a91 Add comments suggesting usage of git_changelog to generate release notes. 2012-04-30 11:05:34 -04:00
Tom Lane 1dd89eadcd Rename I/O timing statistics columns to blk_read_time and blk_write_time.
This seems more consistent with the pre-existing choices for names of
other statistics columns.  Rename assorted internal identifiers to match.
2012-04-29 18:13:33 -04:00
Tom Lane 309c64745e Rename track_iotiming GUC to track_io_timing.
This spelling seems significantly more readable to me.
2012-04-29 16:23:54 -04:00
Peter Eisentraut 81107282a5 Change return type of ExceptionalCondition to void and mark it noreturn
In ancient times, it was thought that this wouldn't work because of
TrapMacro/AssertMacro, but changing those to use a comma operator
appears to work without compiler warnings.
2012-04-29 21:20:14 +03:00
Peter Eisentraut 2227bb9c94 Simplify makefile rule
Instead of writing out the .c -> .o rule, use the default one, so that
dependency tracking can be used.
2012-04-29 21:20:14 +03:00
Tom Lane cdbad241f4 Clear I/O timing counters after sending them to the stats collector.
This oversight caused the reported times to accumulate in an O(N^2)
fashion the longer a backend runs.
2012-04-28 15:11:13 -04:00
Tom Lane d6f7d4fdc5 Fix printing of whole-row Vars at top level of a SELECT targetlist.
Normally whole-row Vars are printed as "tabname.*".  However, that does not
work at top level of a targetlist, because per SQL standard the parser will
think that the "*" should result in column-by-column expansion; which is
not at all what a whole-row Var implies.  We used to just print the table
name in such cases, which works most of the time; but it fails if the table
name matches a column name available anywhere in the FROM clause.  This
could lead for instance to a view being interpreted differently after dump
and reload.  Adding parentheses doesn't fix it, but there is a reasonably
simple kluge we can use instead: attach a no-op cast, so that the "*" isn't
syntactically at top level anymore.  This makes the printing of such
whole-row Vars a lot more consistent with other Vars, and may indeed fix
more cases than just the reported one; I'm suspicious that cases involving
schema qualification probably didn't work properly before, either.

Per bug report and fix proposal from Abbas Butt, though this patch is quite
different in detail from his.

Back-patch to all supported versions.
2012-04-27 19:49:18 -04:00
Bruce Momjian 993ce4e6c9 Add options to git_changelog for use in major release note creation:
--details-after
	--master-only
	--oldest-first
2012-04-27 17:15:41 -04:00
Tom Lane 537b266953 Fix syslogger's rotation disable/re-enable logic.
If it fails to open a new log file, the syslogger assumes there's something
wrong with its parameters (such as log_directory), and stops attempting
automatic time-based or size-based log file rotations.  Sending it SIGHUP
is supposed to start that up again.  However, the original coding for that
was really bogus, involving clobbering a couple of GUC variables and hoping
that SIGHUP processing would restore them.  Get rid of that technique in
favor of maintaining a separate flag showing we've turned rotation off.
Per report from Mark Kirkwood.

Also, the syslogger will automatically attempt to create the log_directory
directory if it doesn't exist, but that was only happening at startup.
For consistency and ease of use, it should do the same whenever the value
of log_directory is changed by SIGHUP.

Back-patch to all supported branches.
2012-04-27 00:12:42 -04:00
Robert Haas 3424bff90f Prevent index-only scans from returning wrong answers under Hot Standby.
The alternative of disallowing index-only scans in HS operation was
discussed, but the consensus was that it was better to treat marking
a page all-visible as a recovery conflict for snapshots that could still
fail to see XIDs on that page.  We may in the future try to soften this,
so that we simply force index scans to do heap fetches in cases where
this may be an issue, rather than throwing a hard conflict.
2012-04-26 20:00:21 -04:00
Tom Lane 7c85aa39fc Fix oversight in recent parameterized-path patch.
bitmap_scan_cost_est() has to be able to cope with a BitmapOrPath, but
I'd taken a shortcut that didn't work for that case.  Noted by Heikki.
Add some regression tests since this area is evidently under-covered.
2012-04-26 14:17:44 -04:00
Peter Eisentraut ba3e4157a7 PL/Python: Accept strings in functions returning composite types
Before 9.1, PL/Python functions returning composite types could return
a string and it would be parsed using record_in.  The 9.1 changes made
PL/Python only expect dictionaries, tuples, or objects supporting
getattr as output of composite functions, resulting in a regression
and a confusing error message, as the strings were interpreted as
sequences and the code for transforming lists to database tuples was
used.  Fix this by treating strings separately as before, before
checking for the other types.

The reason why it's important to support string to database tuple
conversion is that trigger functions on tables with composite columns
get the composite row passed in as a string (from record_out).
Without supporting converting this back using record_in, this makes it
impossible to implement pass-through behavior for these columns, as
PL/Python no longer accepts strings for composite values.

A better solution would be to fix the code that transforms composite
inputs into Python objects to produce dictionaries that would then be
correctly interpreted by the Python->PostgreSQL counterpart code.  But
that would be too invasive to backpatch to 9.1, and it is too late in
the 9.2 cycle to attempt it.  It should be revisited in the future,
though.

Reported as bug #6559 by Kirill Simonov.

Jan Urbański
2012-04-26 21:03:48 +03:00
Peter Eisentraut cc71ceab57 psql: Tab completion updates
Add/complete support for:

- ALTER DOMAIN / VALIDATE CONSTRAINT
- ALTER DOMAIN / RENAME
- ALTER DOMAIN / RENAME CONSTRAINT
- ALTER TABLE / RENAME CONSTRAINT
2012-04-26 20:07:40 +03:00
Tom Lane d6d5f67b5b Modify create_index regression test to avoid intermittent failures.
We have been seeing intermittent buildfarm failures due to a query
sometimes not using an index-only scan plan, because a background
auto-ANALYZE prevented the table's all-visible bits from being set
immediately, thereby causing the estimated cost of an index-only scan
to go up considerably.  Adjust the test case so that a bitmap index scan is
preferred instead, which serves equally well for the purpose the test case
is actually meant for.  (Of course, it would be better to eliminate the
interference from auto-ANALYZE, but I see no low-risk way to do that,
so any such fix will have to be left for 9.3 or later.)
2012-04-25 22:57:48 -04:00
Tom Lane 9fa82c9809 Fix planner's handling of RETURNING lists in writable CTEs.
setrefs.c failed to do "rtoffset" adjustment of Vars in RETURNING lists,
which meant they were left with the wrong varnos when the RETURNING list
was in a subquery.  That was never possible before writable CTEs, of
course, but now it's broken.  The executor fails to notice any problem
because ExecEvalVar just references the ecxt_scantuple for any normal
varno; but EXPLAIN breaks when the varno is wrong, as illustrated in a
recent complaint from Bartosz Dmytrak.

Since the eventual rtoffset of the subquery is not known at the time
we are preparing its plan node, the previous scheme of executing
set_returning_clause_references() at that time cannot handle this
adjustment.  Fortunately, it turns out that we don't really need to do it
that way, because all the needed information is available during normal
setrefs.c execution; we just have to dig it out of the ModifyTable node.
So, do that, and get rid of the kluge of early setrefs processing of
RETURNING lists.  (This is a little bit of a cheat in the case of inherited
UPDATE/DELETE, because we are not passing a "root" struct that corresponds
exactly to what the subplan was built with.  But that doesn't matter, and
anyway this is less ugly than early setrefs processing was.)

Back-patch to 9.1, where the problem became possible to hit.
2012-04-25 20:20:33 -04:00
Tom Lane c62b8eaae1 Fix edge-case behavior of pg_next_dst_boundary().
Due to rather sloppy thinking (on my part, I'm afraid) about the
appropriate behavior for boundary conditions, pg_next_dst_boundary() gave
undefined, platform-dependent results when the input time is exactly the
last recorded DST transition time for the specified time zone, as a result
of fetching values one past the end of its data arrays.

Change its specification to be that it always finds the next DST boundary
*after* the input time, and adjust code to match that.  The sole existing
caller, DetermineTimeZoneOffset, doesn't actually care about this
distinction, since it always uses a probe time earlier than the instant
that it does care about.  So it seemed best to me to change the API to make
the result=1 and result=0 cases more consistent, specifically to ensure
that the "before" outputs always describe the state at the given time,
rather than hacking the code to obey the previous API comment exactly.

Per bug #6605 from Sergey Burladyan.  Back-patch to all supported versions.
2012-04-25 17:26:10 -04:00
Robert Haas ca1e1a8da1 Remove prototype for nonexistent function. 2012-04-25 15:32:15 -04:00
Tom Lane 9873001e6d Another trivial comment-typo fix. 2012-04-25 14:28:58 -04:00
Peter Eisentraut 65ca8e68b7 PL/Python: Improve error messages 2012-04-25 21:11:59 +03:00
Peter Eisentraut 8bd44677df entab: Improve makefile
A few simplifications and stylistic improvements, found while grepping
around for makefile problems elsewhere.
2012-04-24 21:20:55 +03:00
Robert Haas 3ce7f18e92 Casts to or from a domain type are ignored; warn and document.
Prohibiting this outright would break dumps taken from older versions
that contain such casts, which would create far more pain than is
justified here.

Per report by Jaime Casanova and subsequent discussion.
2012-04-24 09:20:53 -04:00
Robert Haas 5d4b60f2f2 Lots of doc corrections.
Josh Kupershmidt
2012-04-23 22:43:09 -04:00
Robert Haas 7ab9b2f3b7 Rearrange lazy_scan_heap to avoid visibility map race conditions.
We must set the visibility map bit before releasing our exclusive lock
on the heap page; otherwise, someone might clear the heap page bit
before we set the visibility map bit, leading to a situation where the
visibility map thinks the page is all-visible but it's really not.

This problem has existed since 8.4, but it wasn't critical before we
had index-only scans, since the worst case scenario was that the page
wouldn't get vacuumed until the next scan_all vacuum.

Along the way, a couple of minor, related improvements: (1) if we
pause the heap scan to do an index vac cycle, release any visibility
map page we're holding, since really long-running pins are not good
for a variety of reasons; and (2) warn if we see a page that's marked
all-visible in the visibility map but not on the page level, since
that should never happen any more (it was allowed in previous
releases, but not in 9.2).
2012-04-23 22:08:06 -04:00
Robert Haas 85efd5f065 Reduce hash size for compute_array_stats, compute_tsvector_stats.
The size is only a hint, but a big hint chews up a lot of memory without
apparently improving performance much.

Analysis and patch by Noah Misch.
2012-04-23 22:05:41 -04:00
Peter Eisentraut 48658a1b81 Fix some typos
Josh Kupershmidt
2012-04-22 19:23:47 +03:00
Tom Lane 33e99153e9 Use fuzzy not exact cost comparison for the final tie-breaker in add_path.
Instead of an exact cost comparison, use a fuzzy comparison with 1e-10
delta after all other path metrics have proved equal.  This is to avoid
having platform-specific roundoff behaviors determine the choice when
two paths are really the same to our cost estimators.  Adjust the
recently-added test case that made it obvious we had a problem here.
2012-04-21 00:51:14 -04:00
Alvaro Herrera 09ff76fcdb Recast "ONLY" column CHECK constraints as NO INHERIT
The original syntax wasn't universally loved, and it didn't allow its
usage in CREATE TABLE, only ALTER TABLE.  It now works everywhere, and
it also allows using ALTER TABLE ONLY to add an uninherited CHECK
constraint, per discussion.

The pg_constraint column has accordingly been renamed connoinherit.

This commit partly reverts some of the changes in
61d81bd28d, particularly some pg_dump and
psql bits, because now pg_get_constraintdef includes the necessary NO
INHERIT within the constraint definition.

Author: Nikhil Sontakke
Some tweaks by me
2012-04-20 23:56:57 -03:00
Tom Lane 1f03630011 Adjust join_search_one_level's handling of clauseless joins.
For an initial relation that lacks any join clauses (that is, it has to be
cartesian-product-joined to the rest of the query), we considered only
cartesian joins with initial rels appearing later in the initial-relations
list.  This creates an undesirable dependency on FROM-list order.  We would
never fail to find a plan, but perhaps we might not find the best available
plan.  Noted while discussing the logic with Amit Kapila.

Improve the comments a bit in this area, too.

Arguably this is a bug fix, but given the lack of complaints from the
field I'll refrain from back-patching.
2012-04-20 20:10:46 -04:00
Tom Lane 5b7b5518d0 Revise parameterized-path mechanism to fix assorted issues.
This patch adjusts the treatment of parameterized paths so that all paths
with the same parameterization (same set of required outer rels) for the
same relation will have the same rowcount estimate.  We cache the rowcount
estimates to ensure that property, and hopefully save a few cycles too.
Doing this makes it practical for add_path_precheck to operate without
a rowcount estimate: it need only assume that paths with different
parameterizations never dominate each other, which is close enough to
true anyway for coarse filtering, because normally a more-parameterized
path should yield fewer rows thanks to having more join clauses to apply.

In add_path, we do the full nine yards of comparing rowcount estimates
along with everything else, so that we can discard parameterized paths that
don't actually have an advantage.  This fixes some issues I'd found with
add_path rejecting parameterized paths on the grounds that they were more
expensive than not-parameterized ones, even though they yielded many fewer
rows and hence would be cheaper once subsequent joining was considered.

To make the same-rowcounts assumption valid, we have to require that any
parameterized path enforce *all* join clauses that could be obtained from
the particular set of outer rels, even if not all of them are useful for
indexing.  This is required at both base scans and joins.  It's a good
thing anyway since the net impact is that join quals are checked at the
lowest practical level in the join tree.  Hence, discard the original
rather ad-hoc mechanism for choosing parameterization joinquals, and build
a better one that has a more principled rule for when clauses can be moved.
The original rule was actually buggy anyway for lack of knowledge about
which relations are part of an outer join's outer side; getting this right
requires adding an outer_relids field to RestrictInfo.
2012-04-19 15:53:47 -04:00
Robert Haas 293ec33c32 Remove bogus comment from HeapTupleSatisfiesNow.
This has been wrong for a really long time.  We don't use two-phase
locking to protect against serialization anomalies.

Per discussion on pgsql-hackers about 2011-03-07; original report
by Dan Ports.
2012-04-18 11:50:45 -04:00
Robert Haas 4a6fab03f2 Finish rename of FastPathStrongLocks to FastPathStrongRelationLocks.
Commit 8e5ac74c12 tried to do this renaming,
but I relied on gcc to tell me where I needed to make changes, instead of
grep.

Noted by Jeff Davis.
2012-04-18 11:29:34 -04:00
Robert Haas 53c5b869b4 Tighten up error recovery for fast-path locking.
The previous code could cause a backend crash after BEGIN; SAVEPOINT a;
LOCK TABLE foo (interrupted by ^C or statement timeout); ROLLBACK TO
SAVEPOINT a; LOCK TABLE foo, and might have leaked strong-lock counts
in other situations.

Report by Zoltán Böszörményi; patch review by Jeff Davis.
2012-04-18 11:17:30 -04:00
Robert Haas ab77b2da8b Fix incorrect comment in SetBufferCommitInfoNeedsSave().
Noah Misch spotted the fact that the old comment is in fact incorrect, due
to memory ordering hazards.
2012-04-18 10:55:40 -04:00
Robert Haas e93c0b820f After PageSetAllVisible, use MarkBufferDirty.
Previously, we used SetBufferCommitInfoNeedsSave, but that's really
intended for dirty-marks we can theoretically afford to lose, such as
hint bits.  As for 9.2, the PD_ALL_VISIBLE mustn't be lost in this
way, since we could then end up with a heap page that isn't
all-visible and a visibility map page that is all visible, causing
index-only scans to return wrong answers.
2012-04-18 10:49:37 -04:00
Robert Haas b5eccaef2c Fix copyfuncs/equalfuncs support for ReassignOwnedStmt.
Noah Misch
2012-04-18 10:45:18 -04:00
Robert Haas 53bbc681ca Fix various infelicities in node functions.
Mostly, this consists of adding support for fields which exist in the
structure but aren't handled by copy/equal/outfuncs; but the create
foreign table case can actually produce garbage output.

Noah Misch
2012-04-18 10:43:16 -04:00
Peter Eisentraut 1fd832ddff psql: Add tab completion for CREATE/ALTER ROLE name WITH
Previously, the use of the optional key word WITH was not supported.

Josh Kupershmidt
2012-04-18 16:55:01 +03:00
Andrew Dunstan 1b37a8c3cc Don't override arguments set via options with positional arguments.
A number of utility programs were rather careless about paremeters
that can be set via both an option argument and a positional
argument. This leads to results which can violate the Principal
Of Least Astonishment. These changes refuse to use positional
arguments to override settings that have been made via positional
arguments. The changes are backpatched to all live branches.
2012-04-17 18:30:34 -04:00
Heikki Linnakangas fe546f3da6 Don't wait for the commit record to be replicated if we wrote no WAL.
When using synchronous replication, we waited for the commit record to be
replicated, but if we our transaction didn't write any other WAL records,
that's not required because we don't even flush the WAL locally to disk in
that case. This lead to long waits when committing a transaction that only
modified a temporary table. Bug spotted by Thom Brown.
2012-04-17 16:28:31 +03:00
Peter Eisentraut a33fcd7e79 Fix typo
Kyotaro HORIGUCHI
2012-04-16 15:36:40 +03:00
Heikki Linnakangas 49440fff08 Install plpgsql.h to to include/server at "make install".
The header file is needed by any module that wants to use the PL/pgSQL
instrumentation plugin interface. Most notably, the pldebugger plugin needs
this. With this patch, it can be built using pgxs, without having the full
server source tree available.
2012-04-16 13:03:16 +03:00
Peter Eisentraut 0f48e06751 PL/Python: Improve documentation of nrows() method
Clarify that nrows() is the number of rows processed, versus the
number of rows returned, which can be obtained using len.  Also add
tests about that.
2012-04-16 11:30:32 +03:00
Peter Eisentraut c03523ed3f PL/Python: Fix crash when colnames() etc. called without result set
The result object methods colnames() etc. would crash when called
after a command that did not produce a result set.  Now they throw an
exception.

discovery and initial patch by Jean-Baptiste Quenot
2012-04-15 20:23:08 +03:00
Tatsuo Ishii 4efbb7d04f Add missing descriptions about '--timeout' and '--mode' to help
message. They are already implemented in the source code.
Suggestions about the message formatting from Tom Lane.
2012-04-15 09:17:12 +09:00
Robert Haas ea6a2d8d47 Rename synchronous_commit='write' to 'remote_write'.
Fujii Masao, per discussion on pgsql-hackers
2012-04-14 10:53:22 -04:00
Robert Haas 4a2d7ad76f pg_size_pretty(numeric)
The output of the new pg_xlog_location_diff function is of type numeric,
since it could theoretically overflow an int8 due to signedness; this
provides a convenient way to format such values.

Fujii Masao, with some beautification by me.
2012-04-14 08:07:25 -04:00
Tom Lane e54b10a62d Remove the "last ditch" code path in join_search_one_level().
So far as I can tell, it is no longer possible for this heuristic to do
anything useful, because the new weaker definition of
have_relevant_joinclause means that any relation with a joinclause must be
considered joinable to at least one other relation.  It would still be
possible for the code block to be entered, for example if there are join
order restrictions that prevent any join of the current level from being
formed; but in that case it's just a waste of cycles to attempt to form
cartesian joins, since the restrictions will still apply.

Furthermore, IMO the existence of this code path can mask bugs elsewhere;
we would have noticed the problem with cartesian joins a lot sooner if
this code hadn't compensated for it in the simplest case.

Accordingly, let's remove it and see what happens.  I'm committing this
separately from the prerequisite changes in have_relevant_joinclause,
just to make the question easier to revisit if there is some fault in
my logic.
2012-04-13 16:07:18 -04:00
Tom Lane e3ffd05b02 Weaken the planner's tests for relevant joinclauses.
We should be willing to cross-join two small relations if that allows us
to use an inner indexscan on a large relation (that is, the potential
indexqual for the large table requires both smaller relations).  This
worked in simple cases but fell apart as soon as there was a join clause
to a fourth relation, because the existence of any two-relation join clause
caused the planner to not consider clauseless joins between other base
relations.  The added regression test shows an example case adapted from
a recent complaint from Benoit Delbosc.

Adjust have_relevant_joinclause, have_relevant_eclass_joinclause, and
has_relevant_eclass_joinclause to consider that a join clause mentioning
three or more relations is sufficient grounds for joining any subset of
those relations, even if we have to do so via a cartesian join.  Since such
clauses are relatively uncommon, this shouldn't affect planning speed on
typical queries; in fact it should help a bit, because the latter two
functions in particular get significantly simpler.

Although this is arguably a bug fix, I'm not going to risk back-patching
it, since it might have currently-unforeseen consequences.
2012-04-13 16:07:17 -04:00
Peter Eisentraut c0cc526e8b Rename bytea_agg to string_agg and add delimiter argument
Per mailing list discussion, we would like to keep the bytea functions
parallel to the text functions, so rename bytea_agg to string_agg,
which already exists for text.

Also, to satisfy the rule that we don't want aggregate functions of
the same name with a different number of arguments, add a delimiter
argument, just like string_agg for text already has.
2012-04-13 21:36:59 +03:00
Peter Eisentraut 64e1309c76 Consistently quote encoding and locale names in messages 2012-04-13 20:37:07 +03:00
Robert Haas 61167bfaf2 Fix typo in comment. 2012-04-13 08:54:13 -04:00
Robert Haas 5630eddf1e Update lazy_scan_heap header comment.
The previous comment described how things worked in PostgreSQL 8.2
and prior.
2012-04-13 08:51:19 -04:00
Tom Lane 732bfa2448 Fix cost estimation for indexscan filter conditions.
cost_index's method for estimating per-tuple costs of evaluating filter
conditions (a/k/a qpquals) was completely wrong in the presence of derived
indexable conditions, such as range conditions derived from a LIKE clause.
This was largely masked in common cases as a result of all simple operator
clauses having about the same costs, but it could show up in a big way when
dealing with functional indexes containing expensive functions, as seen for
example in bug #6579 from Istvan Endredy.  Rejigger the calculation to give
sane answers when the indexquals aren't a subset of the baserestrictinfo
list.  As a side benefit, we now do the calculation properly for cases
involving join clauses (ie, parameterized indexscans), which we always
overestimated before.

There are still cases where this is an oversimplification, such as clauses
that can be dropped because they are implied by a partial index's
predicate.  But we've never accounted for that in cost estimates before,
and I'm not convinced it's worth the cycles to try to do so.
2012-04-11 20:24:17 -04:00
Tom Lane 880bfc3287 Silently ignore any nonexistent schemas that are listed in search_path.
Previously we attempted to throw an error or at least warning for missing
schemas, but this was done inconsistently because of implementation
restrictions (in many cases, GUC settings are applied outside transactions
so that we can't do system catalog lookups).  Furthermore, there were
exceptions to the rule even in the beginning, and we'd been poking more
and more holes in it as time went on, because it turns out that there are
lots of use-cases for having some irrelevant items in a common search_path
value.  It seems better to just adopt a philosophy similar to what's always
been done with Unix PATH settings, wherein nonexistent or unreadable
directories are silently ignored.

This commit also fixes the documentation to point out that schemas for
which the user lacks USAGE privilege are silently ignored.  That's always
been true but was previously not documented.

This is mostly in response to Robert Haas' complaint that 9.1 started to
throw errors or warnings for missing schemas in cases where prior releases
had not.  We won't adopt such a significant behavioral change in a back
branch, so something different will be needed in 9.1.
2012-04-11 12:02:50 -04:00
Alvaro Herrera b035cb9db7 Accept postgres:// URIs in libpq connection functions
postgres:// URIs are an attempt to "stop the bleeding" in this general
area that has been said to occur due to external projects adopting their
own syntaxes.  The syntaxes supported by this patch:

 postgres://[user[:pwd]@][unix-socket][:port[/dbname]][?param1=value1&...]
 postgres://[user[:pwd]@][net-location][:port][/dbname][?param1=value1&...]

should be enough to cover most interesting cases without having to
resort to "param=value" pairs, but those are provided for the cases that
need them regardless.

libpq documentation has been shuffled around a bit, to avoid stuffing
all the format details into the PQconnectdbParams description, which was
already a bit overwhelming.  The list of keywords has moved to its own
subsection, and the details on the URI format live in another subsection.

This includes a simple test program, as requested in discussion, to
ensure that interesting corner cases continue to work appropriately in
the future.

Author: Alexander Shulgin
Some tweaking by Álvaro Herrera, Greg Smith, Daniel Farina, Peter Eisentraut
Reviewed by Robert Haas, Alexey Klyukin (offlist), Heikki Linnakangas,
Marko Kreen, and others

Oh, it also supports postgresql:// but that's probably just an accident.
2012-04-11 04:33:51 -03:00
Tom Lane 3769fa5fc6 Make pg_tablespace_location(0) return the database's default tablespace.
This definition is convenient when applying the function to the
reltablespace column of pg_class, since that's what zero means there;
and it doesn't interfere with any other plausible use of the function.
Per gripe from Bruce Momjian.
2012-04-10 21:43:14 -04:00
Peter Eisentraut eb821b91c8 NLS: Initialize Project-Id-Version field by xgettext
Since xgettext provides options to do this now, we might as well use
them.
2012-04-10 21:26:17 +03:00
Peter Eisentraut 6b8c99c386 psql: Improve tab completion of WITH
Only match when WITH is the first word, as WITH may appear in many
other contexts.

Josh Kupershmidt
2012-04-10 20:35:39 +03:00
Tom Lane 0d9819f7e3 Measure epoch of timestamp-without-time-zone from local not UTC midnight.
This patch reverts commit 191ef2b407
and thereby restores the pre-7.3 behavior of EXTRACT(EPOCH FROM
timestamp-without-tz).  Per discussion, the more recent behavior was
misguided on a couple of grounds: it makes it hard to get a
non-timezone-aware epoch value for a timestamp, and it makes this one
case dependent on the value of the timezone GUC, which is incompatible
with having timestamp_part() labeled as immutable.

The other behavior is still available (in all releases) by explicitly
casting the timestamp to timestamp with time zone before applying EXTRACT.

This will need to be called out as an incompatible change in the 9.2
release notes.  Although having mutable behavior in a function marked
immutable is clearly a bug, we're not going to back-patch such a change.
2012-04-10 12:04:42 -04:00
Tom Lane 65fd91333e Fix an Assert that turns out to be reachable after all.
estimate_num_groups() gets unhappy with
	create table empty();
	select * from empty except select * from empty e2;
I can't see any actual use-case for such a query (and the table is illegal
per SQL spec), but it seems like a good idea that it not cause an assert
failure.
2012-04-09 11:58:24 -04:00
Tom Lane d515365a61 Don't bother copying empty support arrays in a zero-column MergeJoin.
The case could not arise when this code was originally written, but it can
now (since we made zero-column MergeJoins work for the benefit of FULL JOIN
ON TRUE).  I don't think there is any actual bug here, but we might as well
treat it consistently with other uses of COPY_POINTER_FIELD().  Per comment
from Ashutosh Bapat.
2012-04-09 11:41:54 -04:00
Robert Haas 3ae5133b1c Teach SLRU code to avoid replacing I/O-busy pages.
Patch by me; review by Tom Lane and others.
2012-04-08 23:05:55 -04:00
Heikki Linnakangas 03529a3ff9 set_stack_base() no longer needs to be called in PostgresMain.
This was a thinko in previous commit. Now that stack base pointer is now set
in PostmasterMain and SubPostmasterMain, it doesn't need to be set in
PostgresMain anymore.
2012-04-08 19:39:12 +03:00
Heikki Linnakangas ef3883d130 Do stack-depth checking in all postmaster children.
We used to only initialize the stack base pointer when starting up a regular
backend, not in other processes. In particular, autovacuum workers can run
arbitrary user code, and without stack-depth checking, infinite recursion
in e.g an index expression will bring down the whole cluster.

The comment about PL/Java using set_stack_base() is not yet true. As the
code stands, PL/java still modifies the stack_base_ptr variable directly.
However, it's been discussed in the PL/Java mailing list that it should be
changed to use the function, because PL/Java is currently oblivious to the
register stack used on Itanium. There's another issues with PL/Java, namely
that the stack base pointer it sets is not really the base of the stack, it
could be something close to the bottom of the stack. That's a separate issue
that might need some further changes to this code, but that's a different
story.

Backpatch to all supported releases.
2012-04-08 19:07:55 +03:00
Tom Lane 7feecedcce Fix incorrect make maintainer-clean rule. 2012-04-07 18:16:50 -04:00
Tom Lane 95b9c333b2 Further adjustment of comment about qsort_tuple. 2012-04-07 17:48:40 -04:00
Tom Lane a25ef7a5f6 Remove useless variable to suppress compiler warning. 2012-04-07 16:44:43 -04:00
Bruce Momjian d24ac36f4f Stamp libraries versions for 9.2 (better late than never). 2012-04-07 16:19:43 -04:00
Tom Lane 0ab4db52c0 Fix misleading output from gin_desc().
XLOG_GIN_UPDATE_META_PAGE and XLOG_GIN_DELETE_LISTPAGE records were printed
with a list link field labeled as "blkno", which was confusing, especially
when the link was empty (InvalidBlockNumber).  Print the metapage block
number instead, since that's what's actually being updated.  We could
include the link values too as a separate field, but not clear it's worth
the trouble.

Back-patch to 8.4 where the dubious code was added.
2012-04-06 18:10:21 -04:00
Tom Lane 17b985b1a0 Fix broken comparetup_datum code.
Commit 337b6f5ecf contained the entirely
fanciful assumption that it had made comparetup_datum unreachable.
Reported and patched by Takashi Yamamoto.

Fix up some not terribly accurate/useful comments from that commit, too.
2012-04-06 16:58:50 -04:00
Tom Lane cea49fe82f Dept of second thoughts: improve the API for AnalyzeForeignTable.
If we make the initially-called function return the table physical-size
estimate, acquire_inherited_sample_rows will be able to use that to
allocate numbers of samples among child tables, when the day comes that
we want to support foreign tables in inheritance trees.
2012-04-06 16:04:10 -04:00
Tom Lane 263d9de66b Allow statistics to be collected for foreign tables.
ANALYZE now accepts foreign tables and allows the table's FDW to control
how the sample rows are collected.  (But only manual ANALYZEs will touch
foreign tables, for the moment, since among other things it's not very
clear how to handle remote permissions checks in an auto-analyze.)

contrib/file_fdw is extended to support this.

Etsuro Fujita, reviewed by Shigeru Hanada, some further tweaking by me.
2012-04-06 15:02:35 -04:00
Simon Riggs 8cb53654db Add DROP INDEX CONCURRENTLY [IF EXISTS], uses ShareUpdateExclusiveLock 2012-04-06 10:21:40 +01:00
Robert Haas 21cc529698 checkopint -> checkpoint
Report by Guillaume Lelarge.
2012-04-05 21:37:33 -04:00
Robert Haas 662ca285a6 Put back code inadvertently deleted from exit_nicely.
Report by Andrew Dunstan.
2012-04-05 21:37:33 -04:00
Peter Eisentraut 05261ab624 NLS: Use msgmerge/xgettext --no-wrap and --sort-by-file
The option --no-wrap prevents wars with (most?) editors about proper
line wrapping.  --sort-by-file ensures consistent file order, for
easier diffing.
2012-04-05 22:28:13 +03:00