Commit Graph

21473 Commits

Author SHA1 Message Date
Magnus Hagander 4c8e20f815 Track walsender state in shared memory and expose in pg_stat_replication 2011-01-11 21:25:28 +01:00
Magnus Hagander 47a5f3e9da Add missing function prototype, for consistency 2011-01-11 21:12:12 +01:00
Tom Lane e6dce4e439 Adjust basebackup.c to suppress compiler warnings.
Some versions of gcc complain about "variable `tablespaces' might be
clobbered by `longjmp' or `vfork'" with the original coding.  Fix by
moving the PG_TRY block into a separate subroutine.
2011-01-11 13:41:13 -05:00
Tom Lane 9d1ac2f5fa Tweak create_index_paths()'s test for whether to consider a bitmap scan.
Per my note of a couple days ago, create_index_paths would refuse to
consider any path at all for GIN indexes if the selectivity estimate came
out as 1.0; not even if you tried to force it with enable_seqscan.  While
this isn't really a bad outcome in practice, it could be annoying for
testing purposes.  Adjust the test for "is this path only useful for
sorting" so that it doesn't fire on paths with nil pathkeys, which will
include all GIN paths.
2011-01-11 12:13:02 -05:00
Magnus Hagander b7ebda9d8c Reset walsender ps title in the main loop
When in streaming mode we can never get out, so it will never
be required, but after a base backup (or other operations)
we can get back to the loop, so the title needs to be cleared.
2011-01-11 10:04:54 +01:00
Magnus Hagander 2e36343f82 Set process title to indicate base backup is running 2011-01-10 21:53:18 +01:00
Heikki Linnakangas dc1305ce5f Leave temporary files out of streaming base backups. 2011-01-10 19:42:05 +02:00
Magnus Hagander 0eb59c4591 Backend support for streaming base backups
Add BASE_BACKUP command to walsender, allowing it to stream a
base backup to the client (in tar format). The syntax is still
far from ideal, that will be fixed in the switch to use a proper
grammar for walsender.

No client included yet, will come as a separate commit.

Magnus Hagander and Heikki Linnakangas
2011-01-10 14:04:19 +01:00
Magnus Hagander 4448917d51 Split pg_start_backup() and pg_stop_backup() into two pieces
Move the actual functionality into a separate function that's
easier to call internally, and change the SQL-callable function
to be a wrapper calling this.

Also create a pg_abort_backup() function, only callable internally,
that does only the most vital parts of pg_stop_backup(), making it
safe(r) to call from error handlers.
2011-01-09 21:00:28 +01:00
Heikki Linnakangas ca63029eac Fix crash in the new GiST insertion code, when an update splits the root page.
This bug was exercised by contrib/intarray/bench, as noted by Tom Lane.
2011-01-09 21:36:22 +02:00
Tom Lane 52fd2d65a3 Fix up core tsquery GIN support for new extractQuery API.
No need for the empty-prefix-match kluge to force a full scan anymore.
2011-01-09 14:34:50 -05:00
Tom Lane 304845075c Use array_contains_nulls instead of ARR_HASNULL on user-supplied arrays.
This applies the fix for bug #5784 to remaining places where we wish
to reject nulls in user-supplied arrays.  In all these places, there's
no reason not to allow a null bitmap to be present, so long as none of
the current elements are actually null.

I did not change some other places where we are looking at system catalog
entries or aggregate transition values, as the presence of a null bitmap
in such an array would be suspicious.
2011-01-09 13:09:07 -05:00
Magnus Hagander 361418be7c Ensure the directory for gram.h is created on win32
Result of bad testing of my last commit.
2011-01-09 17:01:15 +01:00
Magnus Hagander 3457514c2d Properly install gram.h on MSVC builds
This file is now needed by pgAdmin builds, which started
failing since it was missing in the installer builds.
2011-01-09 15:31:48 +01:00
Magnus Hagander db4d22d0ef Add pgreadlink() on Windows to read junction points
Add support for reading back information about the symbolic
links we've created with pgsymlink(), which are actually
Junction Points. Just like pgsymlink() can only create directory
symlinks, pgreadlink() can only read directory symlinks.
2011-01-09 15:09:19 +01:00
Michael Meskes 1066dbfb85 There is no need to have to identical functions in ecpg thus removing one of them. 2011-01-09 12:47:43 +01:00
Tom Lane adf328c0e1 Add array_contains_nulls() function in arrayfuncs.c.
This will support fixing contrib/intarray (and probably other places)
so that they don't have to fail on arrays that contain a null bitmap
but no live null entries.
2011-01-08 20:26:14 -05:00
Tom Lane 4d1b76e49e Fix up gincostestimate for new extractQuery API.
The only reason this wasn't crashing while testing the core anyarray
operators was that it was disabled for those cases because of passing the
wrong type information to get_opfamily_proc :-(.  So fix that too, and make
it insist on finding the support proc --- in hindsight, silently doing
nothing is not as sane a coping mechanism as all that.
2011-01-08 20:26:13 -05:00
Michael Meskes 833a2b57bc In ecpg's parser removed a fixed length limit for constants defining an array dimension. 2011-01-08 23:04:50 +01:00
Tom Lane 7e2f906201 Remove pg_am.amindexnulls.
The only use we have had for amindexnulls is in determining whether an
index is safe to cluster on; but since the addition of the amclusterable
flag, that usage is pretty redundant.

In passing, clean up assorted sloppiness from the last patch that touched
pg_am.h: Natts_pg_am was wrong, and ambuildempty was not documented.
2011-01-08 16:08:05 -05:00
Tom Lane 56a57473a9 Refactor GIN's handling of duplicate search entries.
The original coding could combine duplicate entries only when they
originated from the same qual condition.  In particular it could not
combine cases where multiple qual conditions all give rise to full-index
scan requests, which is an expensive case well worth optimizing.  Refactor
so that duplicates are recognized across all the quals.
2011-01-08 14:48:08 -05:00
Bruce Momjian d8d3d2a4f3 Fix pg_upgrade of large object permissions by preserving pg_auth.oid,
which is stored in pg_largeobject_metadata.

No backpatch to 9.0 because you can't migrate from 9.0 to 9.0 with the
same catversion (because of tablespace conflict), and a pre-9.0
migration to 9.0 has not large object permissions to migrate.
2011-01-07 21:59:29 -05:00
Bruce Momjian 2896c87ce4 Force pg_upgrade's to preserve pg_class.oid, not pg_class.relfilenode.
Toast tables have identical pg_class.oid and pg_class.relfilenode, but
for clarity it is good to preserve the pg_class.oid.

Update comments regarding what is preserved, and do some
variable/function renaming for clarity.
2011-01-07 21:26:13 -05:00
Tom Lane a032d50128 Fix the built-in GIN support procedure declarations in pg_proc.h.
Add more "internal" arguments so that these pg_proc entries reflect the
current preferred API.  This is purely a cosmetic change, since GIN doesn't
actually consult the pg_proc entry when calling a support function.
Accordingly, no catversion bump.
2011-01-07 20:40:48 -05:00
Tom Lane 73912e7fbd Fix GIN to support null keys, empty and null items, and full index scans.
Per my recent proposal(s).  Null key datums can now be returned by
extractValue and extractQuery functions, and will be stored in the index.
Also, placeholder entries are made for indexable items that are NULL or
contain no keys according to extractValue.  This means that the index is
now always complete, having at least one entry for every indexed heap TID,
and so we can get rid of the prohibition on full-index scans.  A full-index
scan is implemented much the same way as partial-match scans were already:
we build a bitmap representing all the TIDs found in the index, and then
drive the results off that.

Also, introduce a concept of a "search mode" that can be requested by
extractQuery when the operator requires matching to empty items (this is
just as cheap as matching to a single key) or requires a full index scan
(which is not so cheap, but it sure beats failing or giving wrong answers).
The behavior remains backward compatible for opclasses that don't return
any null keys or request a non-default search mode.

Using these features, we can now make the GIN index opclass for anyarray
behave in a way that matches the actual anyarray operators for &&, <@, @>,
and = ... which it failed to do before in assorted corner cases.

This commit fixes the core GIN code and ginarrayprocs.c, updates the
documentation, and adds some simple regression test cases for the new
behaviors using the array operators.  The tsearch and contrib GIN opclass
support functions still need to be looked over and probably fixed.

Another thing I intend to fix separately is that this is pretty inefficient
for cases where more than one scan condition needs a full-index search:
we'll run duplicate GinScanEntrys, each one of which builds a large bitmap.
There is some existing logic to merge duplicate GinScanEntrys but it needs
refactoring to make it work for entries belonging to different scan keys.

Note that most of gin.h has been split out into a new file gin_private.h,
so that gin.h doesn't export anything that's not supposed to be used by GIN
opclasses or the rest of the backend.  I did quite a bit of other code
beautification work as well, mostly fixing comments and choosing more
appropriate names for things.
2011-01-07 19:16:24 -05:00
Robert Haas 9b4271deb9 Document pg_stat_replication, bump catversion since that was overlooked.
Itagaki Takahiro, edited by me.
2011-01-07 11:06:55 -05:00
Robert Haas a9f72b4083 Improve recovery.conf.sample comments.
Jehan-Guillaume de Rorthais, with some additional wordsmithing by me.
2011-01-07 11:01:25 -05:00
Itagaki Takahiro a755ea33ae New system view pg_stat_replication displays activity of wal sender processes.
Itagaki Takahiro and Simon Riggs.
2011-01-07 20:35:38 +09:00
Bruce Momjian 46d28820b6 Improve C comments about backend variables set by pg_upgrade_support
functions.
2011-01-06 22:45:36 -05:00
Tom Lane 6c596c29a3 Update sequence_1.out for recent changes in sequence regression test. 2011-01-06 10:58:32 -05:00
Bruce Momjian 5cff5b5779 Clarify pg_upgrade's creation of the map file structure. Also clean
up pg_dump's calling of pg_upgrade_support functions.
2011-01-05 11:37:08 -05:00
Magnus Hagander 66a8a0428d Give superusers REPLIACTION permission by default
This can be overriden by using NOREPLICATION on the CREATE ROLE
statement, but by default they will have it, making it backwards
compatible and "less surprising" (given that superusers normally
override all checks).
2011-01-05 14:24:17 +01:00
Itagaki Takahiro 14158f25cd Improve psql tab completion for CREATE/ALTER ROLE [NO]REPLICATION.
Missing support for VALID UNTIL in CREATE ROLE is also added.
2011-01-04 17:56:01 +09:00
Robert Haas 7f60be72b0 Fix crash in ALTER OPERATOR CLASS/FAMILY .. SET SCHEMA.
In the previous coding, the parser emitted a List containing a C string,
which is no good, because copyObject() can't handle it.

Dimitri Fontaine
2011-01-03 22:08:55 -05:00
Robert Haas dc8a14311a Update comments in RecordTransactionCommit() to mention unlogged tables. 2011-01-03 10:29:22 -05:00
Magnus Hagander 77745cc7f1 Bump catversion, forgot in previous commit. 2011-01-03 12:50:30 +01:00
Magnus Hagander 40d9e94bd7 Add views and functions to monitor hot standby query conflicts
Add the view pg_stat_database_conflicts and a column to pg_stat_database,
and the underlying functions to provide the information.
2011-01-03 12:46:03 +01:00
Magnus Hagander c0e96b49e5 perltidy run on the MSVC build system
Forgot this with previuos commit, line it up so it's easier to
submit (readable) patches against the MSVC build system.
2011-01-03 10:44:56 +01:00
Peter Eisentraut 39b8843296 Implement remaining fields of information_schema.sequences view
Add new function pg_sequence_parameters that returns a sequence's start,
minimum, maximum, increment, and cycle values, and use that in the view.
(bug #5662; design suggestion by Tom Lane)

Also slightly adjust the view's column order and permissions after review of
SQL standard.
2011-01-02 15:15:21 +02:00
Robert Haas e657b55e66 Fix typo.
Noted by Magnus Hagander.
2011-01-02 07:26:10 -05:00
Robert Haas 0d692a0dc9 Basic foreign table support.
Foreign tables are a core component of SQL/MED.  This commit does
not provide a working SQL/MED infrastructure, because foreign tables
cannot yet be queried.  Support for foreign table scans will need to
be added in a future patch.  However, this patch creates the necessary
system catalog structure, syntax support, and support for ancillary
operations such as COMMENT and SECURITY LABEL.

Shigeru Hanada, heavily revised by Robert Haas
2011-01-01 23:48:11 -05:00
Robert Haas d7acf6cc4a Fix pg_dump support for security labels on columns.
Along the way, correct an erroneous comment.
2011-01-01 17:44:28 -05:00
Peter Eisentraut 6a208aa404 Allow casting a table's row type to the table's supertype if it's a typed table
This is analogous to the existing facility that allows casting a row type to a
supertable's row type.
2011-01-01 23:04:14 +02:00
Bruce Momjian 92a73d2190 Add #include <time.h> to pg_ctl.c to fix compiler warning. 2011-01-01 15:55:36 -05:00
Bruce Momjian 5d950e3b0c Stamp copyrights for year 2011. 2011-01-01 13:18:15 -05:00
Bruce Momjian 30aeda4394 Include the first valid listen address in pg_ctl to improve server start
"wait" detection and add postmaster start time to help determine if the
postmaster is actually using the specified data directory.
2010-12-31 17:25:02 -05:00
Tom Lane 39c8dd6620 Invert and rename flag variable to improve code readability.
No change in functionality.  Per discussion with Robert.
2010-12-31 11:59:38 -05:00
Tom Lane 7b46401557 Move symbols for ExecMergeJoin's state machine into nodeMergejoin.c.
There's no reason for these values to be known anywhere else.  After
doing this, executor/execdefs.h is vestigial and can be removed.
2010-12-30 22:12:40 -05:00
Tom Lane f4e4b32743 Support RIGHT and FULL OUTER JOIN in hash joins.
This is advantageous first because it allows us to hash the smaller table
regardless of the outer-join type, and second because hash join can be more
flexible than merge join in dealing with arbitrary join quals in a FULL
join.  For merge join all the join quals have to be mergejoinable, but hash
join will work so long as there's at least one hashjoinable qual --- the
others can be any condition.  (This is true essentially because we don't
keep per-inner-tuple match flags in merge join, while hash join can do so.)

To do this, we need a has-it-been-matched flag for each tuple in the
hashtable, not just one for the current outer tuple.  The key idea that
makes this practical is that we can store the match flag in the tuple's
infomask, since there are lots of bits there that are of no interest for a
MinimalTuple.  So we aren't increasing the size of the hashtable at all for
the feature.

To write this without turning the hash code into even more of a pile of
spaghetti than it already was, I rewrote ExecHashJoin in a state-machine
style, similar to ExecMergeJoin.  Other than that decision, it was pretty
straightforward.
2010-12-30 20:26:08 -05:00
Alvaro Herrera 55573990ca Avoid unnecessary public struct declaration in slru.h
Instead, declare a public wrapper of the sole function using it for
external callers, so that they don't have to always pass a NULL
argument.

Author: Kevin Grittner
2010-12-30 12:09:17 -03:00
Robert Haas d2bc1c9907 Bump XLOG_PAGE_MAGIC.
The unlogged tables patch (commit 53dbc27c62,
2010-12-29) should have done this, since it changes the format of an
XLOG_SMGR_CREATE record.
2010-12-29 07:19:21 -05:00
Robert Haas 53dbc27c62 Support unlogged tables.
The contents of an unlogged table are WAL-logged; thus, they are not
available on standby servers and are truncated whenever the database
system enters recovery.  Indexes on unlogged tables are also unlogged.
Unlogged GiST indexes are not currently supported.
2010-12-29 06:48:53 -05:00
Magnus Hagander 9b8aff8c19 Add REPLICATION privilege for ROLEs
This privilege is required to do Streaming Replication, instead of
superuser, making it possible to set up a SR slave that doesn't
have write permissions on the master.

Superuser privileges do NOT override this check, so in order to
use the default superuser account for replication it must be
explicitly granted the REPLICATION permissions. This is backwards
incompatible change, in the interest of higher default security.
2010-12-29 11:05:03 +01:00
Tom Lane f2ba1e994c Avoid unexpected conversion overflow in planner for distant date values.
The "date" type supports a wider range of dates than int64 timestamps do.
However, there is pre-int64-timestamp code in the planner that assumes that
all date values can be converted to timestamp with impunity.  Fortunately,
what we really need out of the conversion is always a double (float8)
value; so even when the date is out of timestamp's range it's possible to
produce a sane answer.  All we need is a code path that doesn't try to
force the result into int64.  Per trouble report from David Rericha.

Back-patch to all supported versions.  Although this is surely a corner
case, there's not much point in advertising a date range wider than
timestamp's if we will choke on such values in unexpected places.
2010-12-28 22:49:57 -05:00
Tom Lane 81a530a65e Fix ill-advised placement of PGRES_COPY_BOTH enum value.
It must be added at the end of the ExecStatusType enum to avoid ABI
breakage compared to previous libpq versions.  Noted by Magnus.
2010-12-28 11:02:10 -05:00
Bruce Momjian b4d3792daa Another fix for larger postmaster.pid files. 2010-12-28 09:34:46 -05:00
Bruce Momjian bada44a2a2 Fix code to properly pull out shared memory key now that the
postmaster.pid file is larger than in previous major versions.
This is a bug introduced when I added lines to the file recently.
2010-12-27 23:11:33 -05:00
Tom Lane f79136439f Remove -fno-operator-names switch from cpluspluscheck.
No longer needed now that bitand() and bitor() have been renamed.
2010-12-27 15:03:24 -05:00
Tom Lane 84fc571395 Rename the C functions bitand(), bitor() to bit_and(), bit_or().
This is to avoid use of the C++ keywords "bitand" and "bitor" in
the header file utils/varbit.h.  Note the functions' SQL-level
names are not changed, only their C-level names.

In passing, make some comments in varbit.c conform to project-standard
layout.
2010-12-27 14:57:41 -05:00
Tom Lane 8c61f81b31 Rearrange cpluspluscheck to check just one .h file at a time.
This is slower than the original coding but avoids the problem of
including files in an unpredictable order.  Aside from being more
trustworthy, we can get rid of some exclusions that were formerly
made for what turn out to be ordering or re-inclusion problems.

I also modified it to include libpq's exported files in the check.
ecpg should be included as well, but I'm unclear on which ecpg .h
files are meant to be included by clients.
2010-12-27 12:51:44 -05:00
Tom Lane 37b61a69f3 Fix failure of executor/hashjoin.h to compile standalone.
Noted while experimenting with cpluspluscheck.
2010-12-27 12:20:09 -05:00
Tom Lane a977db6f1c Tweak cpluspluscheck to avoid directly #include'ing gram.h.
gram.h has ordering dependencies, which are satisfied when it's included
from gramparse.h, but might not be if it's pulled in directly.
2010-12-27 11:36:52 -05:00
Tom Lane 275411912d Fix ill-chosen use of "private" as an argument and struct field name.
"private" is a keyword in C++, so this breaks the poorly-enforced policy
that header files should be include-able in C++ code.  Per report from
Craig Ringer and some investigation with cpluspluscheck.
2010-12-27 11:26:19 -05:00
Robert Haas 63676ebff4 Corrections to patch adding SQL/MED error codes.
My previous commit, 85cff3ce7f on
2010-12-25, failed to update errcodes.sgml or plerrcodes.h.  This patch
corrects that oversight, per a gripe from Tom Lane, and also corrects
a typographical error.
2010-12-26 21:35:25 -05:00
Andrew Dunstan a534728afb Only build in crashdump support on Windows if there's a working dbghelp.h. 2010-12-26 10:34:47 -05:00
Robert Haas 85cff3ce7f Add foreign data wrapper error code values for SQL/MED.
Extracted from a much larger patch by Shigeru Hanada.
2010-12-25 13:57:39 -05:00
Andrew Dunstan 04ee0db6b2 Allow vpath builds and regression tests to succeed on Mingw. Backpatch to release 8.4 - earlier releases would require more changes and it's not worth the trouble. 2010-12-24 13:31:28 -05:00
Bruce Momjian 5000472112 Remove quotes from boolean recovery.conf.sample parameters, now that the
quotes are not required.  This now matches postgresql.conf's
specification of booleans.
2010-12-24 11:51:51 -05:00
Bruce Momjian 075354ad1b Improve "pg_ctl -w start" server detection by writing the postmaster
port and socket directory into postmaster.pid, and have pg_ctl read from
that file, for use by PQping().
2010-12-24 09:45:52 -05:00
Michael Meskes 727a5a1620 Added rule to ecpg lexer to accept "Unicode surrogate pair in extended quoted
string". This is not really needed because the string gets copied to the output
untranslated anyway, but by adding this rule the lexer stays in sync with the
backend lexer.
2010-12-23 20:37:42 +01:00
Heikki Linnakangas 9de3aa65f0 Rewrite the GiST insertion logic so that we don't need the post-recovery
cleanup stage to finish incomplete inserts or splits anymore. There was two
reasons for the cleanup step:

1. When a new tuple was inserted to a leaf page, the downlink in the parent
needed to be updated to contain (ie. to be consistent with) the new key.
Updating the parent in turn might require recursively updating the parent of
the parent. We now handle that by updating the parent while traversing down
the tree, so that when we insert the leaf tuple, all the parents are already
consistent with the new key, and the tree is consistent at every step.

2. When a page is split, we need to insert the downlink for the new right
page(s), and update the downlink for the original page to not include keys
that moved to the right page(s). We now handle that by setting a new flag,
F_FOLLOW_RIGHT, on the non-rightmost pages in the split. When that flag is
set, scans always follow the rightlink, regardless of the NSN mechanism used
to detect concurrent page splits. That way the tree is consistent right after
split, even though the downlink is still missing. This is very similar to the
way B-tree splits are handled. When the downlink is inserted in the parent,
the flag is cleared. To keep the insertion algorithm simple, when an
insertion sees an incomplete split, indicated by the F_FOLLOW_RIGHT flag, it
finishes the split before doing anything else.

These changes allow removing the whole "invalid tuple" mechanism, but I
retained the scan code to still follow invalid tuples correctly. While we
don't create any such tuples anymore, we want to handle them gracefully in
case you pg_upgrade a GiST index that has them. If we encounter any on an
insert, though, we just throw an error saying that you need to REINDEX.

The issue that got me into doing this is that if you did a checkpoint while
an insert or split was in progress, and the checkpoint finishes quickly so
that there is no WAL record related to the insert between RedoRecPtr and the
checkpoint record, recovery from that checkpoint would not know to finish
the incomplete insert. IOW, we have the same issue we solved with the
rm_safe_restartpoint mechanism during normal operation too. It's highly
unlikely to happen in practice, and this fix is far too large to backpatch,
so we're just going to live with in previous versions, but this refactoring
fixes it going forward.

With this patch, you don't get the annoying
'index "FOO" needs VACUUM or REINDEX to finish crash recovery' notices
anymore if you crash at an unfortunate moment.
2010-12-23 16:21:47 +02:00
Magnus Hagander de9a4c27fe Add PQlibVersion() function to libpq
This function is like the PQserverVersion() function except
it returns the version of libpq, making it possible for a client
program or driver to determine which version of libpq is in
use at runtime, and not just at link time.

Suggested by Harald Armin Massa and several others.
2010-12-22 14:23:56 +01:00
Robert Haas 32ba2b5160 Use memcmp() rather than strncmp() when shorter string length is known.
It appears that this will be faster for all but the shortest strings;
at least one some platforms, memcmp() can use word-at-a-time comparisons.

Noah Misch, somewhat pared down.
2010-12-21 22:11:40 -05:00
Robert Haas c5160b7eec Fix typos.
Andreas Karlsson
2010-12-21 17:58:53 -05:00
Robert Haas 24ecde7742 Work around unfortunate getppid() behavior on BSD-ish systems.
On MacOS X, and apparently also on other BSD-derived systems, attaching
a debugger causes getppid() to return the pid of the debugging process
rather than the actual parent PID.  As a result, debugging the
autovacuum launcher, startup process, or WAL sender on such systems
causes it to exit, because the previous coding of PostmasterIsAlive()
detects postmaster death by testing whether getppid() == PostmasterPid.

Work around that behavior by checking the return value of getppid()
more carefully.  If it's PostmasterPid, the postmaster must be alive;
if it's 1, assume the postmaster is dead.  If it's any other value,
assume we've been debugged and fall through to the less-reliable
kill() test.

Review by Tom Lane.
2010-12-21 06:30:32 -05:00
Robert Haas f6a0863e3c Allow transactions that don't write WAL to commit asynchronously.
This case can arise if a transaction has written data, but only to
temporary tables.  Loss of the commit record in case of a crash won't
matter, because the temporary tables will be lost anyway.

Reviewed by Heikki Linnakangas and Simon Riggs.
2010-12-20 12:59:33 -05:00
Magnus Hagander d382828f6e Remove thread dumping constant that requires newer Platform SDK
Since we're not multithreaded it only provides marginally useful
information, and it does require a newer version of the Platform SDK
than we target. We may want to reconsider this in the future along
with a fix for MinGW.
2010-12-19 21:32:58 +01:00
Tom Lane 1b19e2c0ba Fix up handling of simple-form CASE with constant test expression.
eval_const_expressions() can replace CaseTestExprs with constants when
the surrounding CASE's test expression is a constant.  This confuses
ruleutils.c's heuristic for deparsing simple-form CASEs, leading to
Assert failures or "unexpected CASE WHEN clause" errors.  I had put in
a hack solution for that years ago (see commit
514ce7a331 of 2006-10-01), but bug #5794
from Peter Speck shows that that solution failed to cover all cases.

Fortunately, there's a much better way, which came to me upon reflecting
that Peter's "CASE TRUE WHEN" seemed pretty redundant: we can "simplify"
the simple-form CASE to the general form of CASE, by simply omitting the
constant test expression from the rebuilt CASE construct.  This is
intuitively valid because there is no need for the executor to evaluate
the test expression at runtime; it will never be referenced, because any
CaseTestExprs that would have referenced it are now replaced by constants.
This won't save a whole lot of cycles, since evaluating a Const is pretty
cheap, but a cycle saved is a cycle earned.  In any case it beats kluging
ruleutils.c still further.  So this patch improves const-simplification
and reverts the previous change in ruleutils.c.

Back-patch to all supported branches.  The bug exists in 8.1 too, but it's
out of warranty.
2010-12-19 15:30:44 -05:00
Tom Lane abc1026269 Fix erroneous parsing of tsquery input "... & !(subexpression) | ..."
After parsing a parenthesized subexpression, we must pop all pending
ANDs and NOTs off the stack, just like the case for a simple operand.
Per bug #5793.

Also fix clones of this routine in contrib/intarray and contrib/ltree,
where input of types query_int and ltxtquery had the same problem.

Back-patch to all supported versions.
2010-12-19 12:48:34 -05:00
Magnus Hagander dcb09b595f Support for collecting crash dumps on Windows
Add support for collecting "minidump" style crash dumps on
Windows, by setting up an exception handling filter. Crash
dumps will be generated in PGDATA/crashdumps if the directory
is created (the existance of the directory is used as on/off
switch for the generation of the dumps).

Craig Ringer and Magnus Hagander
2010-12-19 16:45:28 +01:00
Bruce Momjian 7e95337d58 Properly print the IP number and "localhost" for failed localhost
connections when the server is down, on Win32.
2010-12-18 11:26:17 -05:00
Magnus Hagander 4754dbf4c3 Make GUC variables for syslog and SSL always visible
Make the variables visible (but not used) even when
support is not compiled in.
2010-12-18 16:53:59 +01:00
Alvaro Herrera 3026027ec3 set_ps_display when calling functions via fastpath
This improves tag output by log_line_prefix
2010-12-17 18:51:22 -03:00
Alvaro Herrera b68193c0c7 Remove unnecessary definition for autovacuum in SignalSomeChildren. 2010-12-17 15:59:19 -03:00
Robert Haas 8bd4b89e24 Try to save a kernel call in ResolveRecoveryConflictWithVirtualXIDs.
If there's no work to be done, just exit quickly, before initialization.
2010-12-17 11:32:02 -05:00
Robert Haas 611fed3712 Reset 'ps' display just once when resolving VXID conflicts.
This prevents the word "waiting" from briefly disappearing from the ps
status line when ResolveRecoveryConflictWithVirtualXIDs begins a new
iteration of the outer loop.

Along the way, remove some useless pgstat_report_waiting() calls;
the startup process doesn't appear in pg_stat_activity.

Fujii Masao
2010-12-17 08:30:57 -05:00
Tom Lane 14ed7735f5 Improve comments around startup_hacks() code.
These comments were not updated when we added the EXEC_BACKEND
mechanism for Windows, even though it rendered them inaccurate.

Also unify two unnecessarily-separate #ifdef __alpha code blocks.
2010-12-16 17:57:57 -05:00
Tom Lane 61b53695fb Remove optreset from src/port/ implementations of getopt and getopt_long.
We don't actually need optreset, because we can easily fix the code to
ensure that it's cleanly restartable after having completed a scan over the
argv array; which is the only case we need to restart in.  Getting rid of
it avoids a class of interactions with the system libraries and allows
reversion of my change of yesterday in postmaster.c and postgres.c.

Back-patch to 8.4.  Before that the getopt code was a bit different anyway.
2010-12-16 16:23:05 -05:00
Alvaro Herrera cd1fefa973 Avoid clobbering errno, per comment from Tom. 2010-12-16 17:15:37 -03:00
Alvaro Herrera 83c759ea0e Fix inconsequential FILE pointer leakage 2010-12-16 16:45:11 -03:00
Alvaro Herrera e359b8496d Add some minor missing error checks 2010-12-16 12:23:07 -03:00
Alvaro Herrera 16ca75baeb Simplify SignalSomeChildren(BACKEND_TYPE_ALL) to SignalChildren() 2010-12-16 12:23:07 -03:00
Bruce Momjian 48da2b87e3 Fix crash caused by NULL lookup when reporting IP address of failed
libpq connection, per report from Magnus.  This happens only on GIT
master and only on Win32 because that is the platform where "" maps to
an IP address (localhost).
2010-12-16 10:13:43 -05:00
Tom Lane 5cdd65f324 Fix up getopt() reset management so it works on recent mingw.
The mingw people don't appear to care about compatibility with non-GNU
versions of getopt, so force use of our own copy of getopt on Windows.
Also, ensure that we make use of optreset when using our own copy.

Per report from Andrew Dunstan.  Back-patch to all versions supported
on Windows.
2010-12-15 23:50:41 -05:00
Robert Haas 290f1603b4 Some copy editing of pg_read_binary_file() patch. 2010-12-15 21:02:31 -05:00
Itagaki Takahiro 03db44eae3 Add pg_read_binary_file() and whole-file-at-once versions of pg_read_file().
One of the usages of the binary version is to read files in a different
encoding from the server encoding.

Dimitri Fontaine and Itagaki Takahiro.
2010-12-16 06:56:28 +09:00
Robert Haas 34c70c7ac4 Instrument checkpoint sync calls.
Greg Smith, reviewed by Jeff Janes
2010-12-14 09:26:19 -05:00
Robert Haas 9878e295dc Improved tab completion for views with triggers.
Allow INSERT INTO, UPDATE, and DELETE FROM to be completed with
either the name of a table (as before) or the name of a view with
an appropriate INSTEAD OF rule.

Along the way, allow CREATE TRIGGER to be completed with INSTEAD OF,
as well as BEFORE and AFTER.

David Fetter, reviewed by Itagaki Takahiro
2010-12-13 22:46:55 -05:00
Robert Haas d368e1a2a7 Allow plugins to suppress inlining and hook function entry/exit/abort.
This is intended as infrastructure to allow an eventual SE-Linux plugin to
support trusted procedures.

KaiGai Kohei
2010-12-13 19:15:53 -05:00
Tom Lane f5e4f743e6 Update time zone data files to tzdata release 2010o: DST law changes in
Fiji and Samoa.  Historical corrections for Hong Kong.
2010-12-13 12:45:31 -05:00
Robert Haas 5f7b58fad8 Generalize concept of temporary relations to "relation persistence".
This commit replaces pg_class.relistemp with pg_class.relpersistence;
and also modifies the RangeVar node type to carry relpersistence rather
than istemp.  It also removes removes rd_istemp from RelationData and
instead performs the correct computation based on relpersistence.

For clarity, we add three new macros: RelationNeedsWAL(),
RelationUsesLocalBuffers(), and RelationUsesTempNamespace(), so that we
can clarify the purpose of each check that previous depended on
rd_istemp.

This is intended as infrastructure for the upcoming unlogged tables
patch, as well as for future possible work on global temporary tables.
2010-12-13 12:34:26 -05:00
Tom Lane 0c90442355 Reset all database-level stats in pgstat_recv_resetcounter().
We were failing to zero out some pg_stat_database counters that have
been added since the initial pgstats coding.  This is a bug, but not
back-patching the fix since changing this behavior in a minor release
seems a cure worse than the disease.

Report and patch by Tomas Vondra.
2010-12-12 15:09:53 -05:00
Tom Lane 5132ad8bdf Make S_IRGRP etc available in mingw builds as well as MSVC.
(Hm, I wonder whether BCC defines them either...)

Also label dangling endifs a bit better in this area.
2010-12-12 13:43:44 -05:00
Tom Lane 1319002e2e Provide a complete set of file-permission-bit macros in win32.h.
My previous patch exposed the fact that we didn't have these.  Those
hard-wired octal constants were actually wrong on Windows, not just
inconsistent.
2010-12-11 13:11:18 -05:00
Robert Haas d3d414696f Allow bidirectional copy messages in streaming replication mode.
Fujii Masao.  Review by Alvaro Herrera, Tom Lane, and myself.
2010-12-11 09:27:37 -05:00
Magnus Hagander 20f3964291 Add required new port files to MSVC builds. 2010-12-11 14:19:08 +01:00
Tom Lane 671199929d Move a couple of initdb's subroutines into src/port/.
mkdir_p and check_data_dir will be useful in CREATE TABLESPACE, since we
have agreed that that command should handle subdirectory creation just like
initdb creates the PGDATA directory.  Push them into src/port/ so that they
are available to both initdb and the backend.  Rename to pg_mkdir_p and
pg_check_dir, just to be on the safe side.  Add FreeBSD's copyright notice
to pgmkdirp.c, since that's where the code came from originally (this
really should have been in initdb.c).  Very marginal code/comment cleanup.
2010-12-10 19:42:44 -05:00
Tom Lane 04f4e10cfc Use symbolic names not octal constants for file permission flags.
Purely cosmetic patch to make our coding standards more consistent ---
we were doing symbolic some places and octal other places.  This patch
fixes all C-coded uses of mkdir, chmod, and umask.  There might be some
other calls I missed.  Inconsistency noted while researching tablespace
directory permissions issue.
2010-12-10 17:35:33 -05:00
Tom Lane 244407a710 Fix efficiency problems in tuplestore_trim().
The original coding in tuplestore_trim() was only meant to work efficiently
in cases where each trim call deleted most of the tuples in the store.
Which, in fact, was the pattern of the original usage with a Material node
supporting mark/restore operations underneath a MergeJoin.  However,
WindowAgg now uses tuplestores and it has considerably less friendly
trimming behavior.  In particular it can attempt to trim one tuple at a
time off a large tuplestore.  tuplestore_trim() had O(N^2) runtime in this
situation because of repeatedly shifting its tuple pointer array.  Fix by
avoiding shifting the array until a reasonably large number of tuples have
been deleted.  This can waste some pointer space, but we do still reclaim
the tuples themselves, so the percentage wastage should be pretty small.

Per Jie Li's report of slow percent_rank() evaluation.  cume_dist() and
ntile() would certainly be affected as well, along with any other window
function that has a moving frame start and requires reading substantially
ahead of the current row.

Back-patch to 8.4, where window functions were introduced.  There's no
need to tweak it before that.
2010-12-10 11:33:38 -05:00
Tom Lane 663fc32e26 Eliminate O(N^2) behavior in parallel restore with many blobs.
With hundreds of thousands of TOC entries, the repeated searches in
reduce_dependencies() become the dominant cost.  Get rid of that searching
by constructing reverse-dependency lists, which we can do in O(N) time
during the fix_dependencies() preprocessing.  I chose to store the reverse
dependencies as DumpId arrays for consistency with the forward-dependency
representation, and keep the previously-transient tocsByDumpId[] array
around to locate actual TOC entry structs quickly from dump IDs.

While this fixes the slow case reported by Vlad Arkhipov, there is still
a potential for O(N^2) behavior with sufficiently many tables:
fix_dependencies itself, as well as mark_create_done and
inhibit_data_for_failed_table, are doing repeated searches to deal with
table-to-table-data dependencies.  Possibly this work could be extended
to deal with that, although the latter two functions are also used in
non-parallel restore where we currently don't run fix_dependencies.

Another TODO is that we fail to parallelize restore of multiple blobs
at all.  This appears to require changes in the archive format to fix.

Back-patch to 9.0 where the problem was reported.  8.4 has potential issues
as well; but since it doesn't create a separate TOC entry for each blob,
it's at much less risk of having enough TOC entries to cause real problems.
2010-12-09 13:03:11 -05:00
Simon Riggs 9975c683b1 Self review of previous patch. Fix assumption that xmax >= xmin. 2010-12-09 10:20:49 +00:00
Simon Riggs b9075a6d2f Reduce spurious Hot Standby conflicts from never-visible records.
Hot Standby conflicts only with tuples that were visible at
some point. So ignore tuples from aborted transactions or for
tuples updated/deleted during the inserting transaction when
generating the conflict transaction ids.

Following detailed analysis and test case by Noah Misch.
Original report covered btree delete records, correctly observed
by Heikki Linnakangas that this applies to other cases also.
Fix covers all sources of cleanup records via common code.
2010-12-09 09:41:47 +00:00
Tom Lane 576477e73c Force default wal_sync_method to be fdatasync on Linux.
Recent versions of the Linux system header files cause xlogdefs.h to
believe that open_datasync should be the default sync method, whereas
formerly fdatasync was the default on Linux.  open_datasync is a bad
choice, first because it doesn't actually outperform fdatasync (in fact
the reverse), and second because we try to use O_DIRECT with it, causing
failures on certain filesystems (e.g., ext4 with data=journal option).
This part of the patch is largely per a proposal from Marti Raudsepp.
More extensive changes are likely to follow in HEAD, but this is as much
change as we want to back-patch.

Also clean up confusing code and incorrect documentation surrounding the
fsync_writethrough option.  Those changes shouldn't result in any actual
behavioral change, but I chose to back-patch them anyway to keep the
branches looking similar in this area.

In 9.0 and HEAD, also do some copy-editing on the WAL Reliability
documentation section.

Back-patch to all supported branches, since any of them might get used
on modern Linux versions.
2010-12-08 20:01:09 -05:00
Simon Riggs e620ee35b2 Optimize commit_siblings in two ways to improve group commit.
First, avoid scanning the whole ProcArray once we know there
are at least commit_siblings active; second, skip the check
altogether if commit_siblings = 0.

Greg Smith
2010-12-08 18:48:03 +00:00
Heikki Linnakangas 5a031a5556 Fix bugs in the hot standby known-assigned-xids tracking logic. If there's
an old transaction running in the master, and a lot of transactions have
started and finished since, and a WAL-record is written in the gap between
the creating the running-xacts snapshot and WAL-logging it, recovery will fail
with "too many KnownAssignedXids" error. This bug was reported by
Joachim Wieland on Nov 19th.

In the same scenario, when fewer transactions have started so that all the
xids fit in KnownAssignedXids despite the first bug, a more serious bug
arises. We incorrectly initialize the clog code with the oldest still running
transaction, and when we see the WAL record belonging to a transaction with
an XID larger than one that committed already before the checkpoint we're
recovering from, we zero the clog page containing the already committed
transaction, leading to data loss.

In hindsight, trying to track xids in the known-assigned-xids array before
seeing the running-xacts record was too complicated. To fix that, hold
XidGenLock while the running-xacts snapshot is taken and WAL-logged. That
ensures that no transaction can begin or end in that gap, so that in recvoery
we know that the snapshot contains all transactions running at that point in
WAL.
2010-12-07 09:23:30 +01:00
Tom Lane 8b56928097 Add a stack overflow check to copyObject().
There are some code paths, such as SPI_execute(), where we invoke
copyObject() on raw parse trees before doing parse analysis on them.  Since
the bison grammar is capable of building heavily nested parsetrees while
itself using only minimal stack depth, this means that copyObject() can be
the front-line function that hits stack overflow before anything else does.
Accordingly, it had better have a check_stack_depth() call.  I did a bit of
performance testing and found that this slows down copyObject() by only a
few percent, so the hit ought to be negligible in the context of complete
processing of a query.

Per off-list report from Toshihide Katayama.  Back-patch to all supported
branches.
2010-12-06 22:55:43 -05:00
Andrew Dunstan af1a614ec6 Allow the low level COPY routines to read arbitrary numbers of fields.
This doesn't involve any user-visible change in behavior, but will be
useful when the COPY routines are exposed to allow their use by Foreign
Data Wrapper routines, which will be able to use these routines to read
irregular CSV files, for example.
2010-12-06 15:31:55 -05:00
Heikki Linnakangas 95e42a2c29 Fix two typos, by Fujii Masao. 2010-12-06 12:38:05 +01:00
Peter Eisentraut 951d786121 Put only single space after "Sort Method:", for consistency 2010-12-06 13:35:47 +02:00
Tom Lane d1001a78ce Reduce memory consumption inside inheritance_planner().
Avoid eating quite so much memory for large inheritance trees, by
reclaiming the space used by temporary copies of the original parsetree and
range table, as well as the workspace needed during planning.  The cost is
needing to copy the finished plan trees out of the child memory context.
Although this looks like it ought to slow things down, my testing shows
it actually is faster, apparently because fewer interactions with malloc()
are needed and/or we can do the work within a more readily cacheable amount
of memory.  That result might be platform-dependent, but I'll take it.

Per a gripe from John Papandriopoulos, in which it was pointed out that the
memory consumption actually grew as O(N^2) for sufficiently many child
tables, since we were creating N copies of the N-element range table.
2010-12-05 15:10:28 -05:00
Tom Lane d1f5a92e18 Fix two small bugs in new gistget.c logic.
1. Complain, rather than silently doing nothing, if an "invalid" tuple
is found on a leaf page.  Per off-list discussion with Heikki.

2. Fix oversight in code that removes a GISTSearchItem from the search
queue: we have to reset lastHeap if this was the last heap item in the
parent GISTSearchTreeItem.  Otherwise subsequent additions will do the
wrong thing.  This was probably masked in early testing because in typical
cases the parent item would now be completely empty and would be deleted on
next call.  You'd need a queued non-leaf page at exactly the same distance
as a heap tuple to expose the bug.
2010-12-04 13:47:08 -05:00
Peter Eisentraut 387e468b82 Make output width consistent for all ways of invoking a regression test
run_schedule() and run_single_test() were using different output widths, which
would show up in bigcheck/bigtest, for example.
2010-12-04 17:34:48 +02:00
Tom Lane e194a942f9 Update comment to match later code changes. 2010-12-04 03:21:49 -05:00
Tom Lane b576757d7e Add external documentation for KNNGIST. 2010-12-03 23:49:06 -05:00
Tom Lane 04910a3ad5 Put back gistgettuple's check for backwards scan request.
On reflection it's a bad idea for the KNNGIST patch to have removed that.
We don't want it silently returning incorrect answers.
2010-12-03 22:43:01 -05:00
Tom Lane 554506871b KNNGIST, otherwise known as order-by-operator support for GIST.
This commit represents a rather heavily editorialized version of
Teodor's builtin_knngist_itself-0.8.2 and builtin_knngist_proc-0.8.1
patches.  I redid the opclass API to add a separate Distance method
instead of turning the Consistent method into an illogical mess,
fixed some bit-rot in the rbtree interfaces, and generally worked over
the code style and comments.

There's still no non-code documentation to speak of, but I'll work on
that separately.  Some contrib-module changes are also yet to come
(right now, point <-> point is the only KNN-ified operator).

Teodor Sigaev and Tom Lane
2010-12-03 20:53:29 -05:00
Robert Haas 5ef6c91383 Remove now-outdated mention of quotes being required in recovery.conf.
Noted by Itagaki Takahiro.
2010-12-03 09:00:18 -05:00
Robert Haas 970a18687f Use GUC lexer for recovery.conf parsing.
This eliminates some crufty, special-purpose code and, as a non-trivial
side benefit, allows recovery.conf parameters to be unquoted.

Dimitri Fontaine, with review and cleanup by Alvaro Herrera, Itagaki
Takahiro, and me.
2010-12-03 08:56:44 -05:00
Heikki Linnakangas 9cea52a5a3 Remove misleading comments. Move _Clone and _DeClone functions before
the "END OF FORMAT CALLBACKS" comment, because they are format callbacks too.
2010-12-03 14:58:24 +02:00
Itagaki Takahiro fd223c7407 Remove unnecessary string null-termination in pg_convert.
We can directly verify the unterminated input with pg_verify_mbstr_len.
2010-12-03 12:00:27 +09:00
Tom Lane d583f10b7e Create core infrastructure for KNNGIST.
This is a heavily revised version of builtin_knngist_core-0.9.  The
ordering operators are no longer mixed in with actual quals, which would
have confused not only humans but significant parts of the planner.
Instead, ordering operators are carried separately throughout planning and
execution.

Since the API for ambeginscan and amrescan functions had to be changed
anyway, this commit takes the opportunity to rationalize that a bit.
RelationGetIndexScan no longer forces a premature index_rescan call;
instead, callers of index_beginscan must call index_rescan too.  Aside from
making the AM-side initialization logic a bit less peculiar, this has the
advantage that we do not make a useless extra am_rescan call when there are
runtime key values.  AMs formerly could not assume that the key values
passed to amrescan were actually valid; now they can.

Teodor Sigaev and Tom Lane
2010-12-02 20:51:37 -05:00
Alvaro Herrera d7e5d151da Move private struct declaration to compress_io.c
Keep only the typedef in the header file.
2010-12-02 17:45:13 -03:00
Alvaro Herrera 0025b76f4f Remove trailing whitespace 2010-12-02 17:45:13 -03:00
Alvaro Herrera d67a39c326 Remove useless struct declaration 2010-12-02 17:45:12 -03:00
Alvaro Herrera 7f4a7af2fd Silence compiler 2010-12-02 17:45:12 -03:00
Heikki Linnakangas bf9aa490db Refactor the pg_dump zlib code from pg_backup_custom.c to a separate file,
to make it easier to reuse that code. There is no user-visible changes.

This is in preparation for the patch to add a new archive format, a directory,
to perform a custom-like dump but with each table being dumped to a separate
file (that in turn is a prerequisite for parallel pg_dump). This also makes it
easier to add new compression methods in the future, and makes the
pg_backup_custom.c code easier to read, when the compression-related code is
factored out.

Joachim Wieland, with heavy editorialization by me.
2010-12-02 21:39:03 +02:00
Tom Lane 225f0aa3df Prevent inlining a SQL function with multiple OUT parameters.
There were corner cases in which the planner would attempt to inline such
a function, which would result in a failure at runtime due to loss of
information about exactly what the result record type is.  Fix by disabling
inlining when the function's recorded result type is RECORD.  There might
be some sub-cases where inlining could still be allowed, but this is a
simple and backpatchable fix, so leave refinements for another day.
Per bug #5777 from Nate Carson.

Back-patch to all supported branches.  8.1 happens to avoid a core-dump
here, but it still does the wrong thing.
2010-12-01 00:53:18 -05:00
Tom Lane c0b5fac701 Simplify and speed up mapping of index opfamilies to pathkeys.
Formerly we looked up the operators associated with each index (caching
them in relcache) and then the planner looked up the btree opfamily
containing such operators in order to build the btree-centric pathkey
representation that describes the index's sort order.  This is quite
pointless for btree indexes: we might as well just use the index's opfamily
information directly.  That saves syscache lookup cycles during planning,
and furthermore allows us to eliminate the relcache's caching of operators
altogether, which may help in reducing backend startup time.

I added code to plancat.c to perform the same type of double lookup
on-the-fly if it's ever faced with a non-btree amcanorder index AM.
If such a thing actually becomes interesting for production, we should
replace that logic with some more-direct method for identifying the
corresponding btree opfamily; but it's not worth spending effort on now.

There is considerably more to do pursuant to my recent proposal to get rid
of sort-operator-based representations of sort orderings, but this patch
grabs some of the low-hanging fruit.  I'll look at the remainder of that
work after the current commitfest.
2010-11-29 12:30:43 -05:00
Simon Riggs ed78384acd Move call to GetTopTransactionId() earlier in LockAcquire(),
removing an infrequently occurring race condition in Hot Standby.
An xid must be assigned before a lock appears in shared memory,
rather than immediately after, else GetRunningTransactionLocks()
may see InvalidTransactionId, causing assertion failures during
lock processing on standby.

Bug report and diagnosis by Fujii Masao, fix by me.
2010-11-29 01:08:02 +00:00
Bruce Momjian 1f48290a9d In libpq/Makefile, use OBJS += as a way to break up long link lines into
something that can be documented.
2010-11-27 11:03:23 -05:00
Tom Lane 49cd8a3f81 On further testing, PQping also needs an explicit check for AUTH_REQ.
The pg_fe_sendauth code might fail if it can't handle the authentication
request message type --- if so, ping should still say the server is up.
2010-11-27 02:11:45 -05:00
Tom Lane db96e1ccfc Rewrite PQping to be more like what we agreed to last week.
Basically, we want to distinguish all cases where the connection was
not made from those where it was.  A convenient proxy for this is to
see if we got a message with a SQLSTATE code back from the postmaster.
This presumes that the postmaster will always send us a SQLSTATE in
a failure message, which is true for 7.4 and later postmasters in
every case except fork failure.  (We could possibly complicate the
postmaster code to do something about that, but it seems not worth
the trouble, especially since pg_ctl's response for that case should
be to keep waiting anyway.)

If we did get a SQLSTATE from the postmaster, there are basically only
two cases, as per last week's discussion: ERRCODE_CANNOT_CONNECT_NOW
and everything else.  Any other error code implies that the postmaster
is in principle willing to accept connections, it just didn't like or
couldn't handle this particular request.  We want to make a special
case for ERRCODE_CANNOT_CONNECT_NOW so that "pg_ctl start -w" knows
it should keep waiting.

In passing, pick names for the enum constants that are a tad less
likely to present collision hazards in future.
2010-11-27 01:30:34 -05:00
Tom Lane be3b666eb8 Clean up IPv4 vs IPv6 bogosity in connectFailureMessage().
Newly added code was supposing that "struct sockaddr_in" applies to IPv6.
2010-11-26 19:16:39 -05:00
Tom Lane 3840bc0847 Fix portability issues in new src/port/inet_net_ntop.c file.
1. Don't #include postgres.h in a frontend build.

2. Don't assume that the backend's symbol PGSQL_AF_INET6 has anything to do
with the constant that will be used by system library functions (because,
in point of fact, it usually doesn't).  Fortunately, PGSQL_AF_INET is equal
to AF_INET, so we can just cater for both sets of values in one case
construct without fear of conflict.
2010-11-26 18:00:26 -05:00
Robert Haas 55109313f9 Add more ALTER <object> .. SET SCHEMA commands.
This adds support for changing the schema of a conversion, operator,
operator class, operator family, text search configuration, text search
dictionary, text search parser, or text search template.

Dimitri Fontaine, with assorted corrections and other kibitzing.
2010-11-26 17:31:54 -05:00
Tom Lane 1d9a0abec1 Remove bogus use of PGDLLIMPORT.
That macro should be attached to extern declarations, not actual
definitions of variables.
2010-11-26 17:05:29 -05:00
Bruce Momjian e6e38b4ac2 Add inet_net_ntop.c as needed by MSVC, per Magnus. 2010-11-26 14:39:13 -05:00
Bruce Momjian f2eba413db Use conn->raddr consistently for non-connect libpq error reporting. 2010-11-26 13:26:13 -05:00
Bruce Momjian bad8277f13 Update comment that says we only report last libpq connection failure,
per Peter.
2010-11-26 11:52:03 -05:00
Bruce Momjian ed51bd4968 Use only addr_cur when reporting connection failures in libpq. 2010-11-26 11:49:35 -05:00
Bruce Momjian 4f6deef2fb Abandon use of Makefile variables in libpq/Makefile because MSVC scrapes
the OBJS lines from that file.

Cleanup where possible.
2010-11-26 11:10:26 -05:00
Bruce Momjian a9b02ec654 In libpq/Makefile, merge PERM_PGPORT and OPT_PGPORT into a single
Makefile variable PGPORT, for clarity.
2010-11-26 10:22:09 -05:00
Bruce Momjian 5f4b3d750b Improve pg_ctl "cannot connect" spacing, per Tom, and wording. 2010-11-26 10:04:18 -05:00
Bruce Momjian 4646e0cef7 Improve pg_ctl "cannot connect" warning, per suggestion from Magnus. 2010-11-25 14:38:20 -05:00
Bruce Momjian 742ac738c3 For libpq/Makefile OPT_PGPORT, remove .o extension after we test
configure's LIBOBJS.  Should fix buildfarm failures.
2010-11-25 13:19:31 -05:00
Bruce Momjian afd7d9adca Add PQping and PQpingParams to libpq to allow detection of the server's
status, including a status where the server is running but refuses a
postgres connection.

Have pg_ctl use this new function.  This fixes the case where pg_ctl
reports that the server is not running (cannot connect) but in fact it
is running.
2010-11-25 13:09:38 -05:00
Bruce Momjian 212a1c7b0b Fix getaddrinfo() in pgport to use proper parameters, as detected by
Win32 buildfarm members.
2010-11-25 12:56:59 -05:00
Bruce Momjian c6978ecd6f Restructure how libpq includes external C files, for clarity. 2010-11-25 12:51:40 -05:00
Robert Haas cc1ed40d57 Object access hook framework, with post-creation hook.
After a SQL object is created, we provide an opportunity for security
or logging plugins to get control; for example, a security label provider
could use this to assign an initial security label to newly created
objects.  The basic infrastructure is (hopefully) reusable for other types
of events that might require similar treatment.

KaiGai Kohei, with minor adjustments.
2010-11-25 11:50:13 -05:00
Robert Haas 2d1e426650 Add inet_net_ntop.c to .gitignore. 2010-11-25 00:12:25 -05:00
Robert Haas c2281ac87c Remove belt-and-suspenders guards against buffer pin leaks.
Forcibly releasing all leftover buffer pins should be unnecessary now
that we have a robust ResourceOwner mechanism, and it significantly
increases the cost of process shutdown.  Instead, in an assert-enabled
build, assert that no pins are held; in a non-assert-enabled build, do
nothing.
2010-11-25 00:06:46 -05:00
Bruce Momjian 58dfb07b5d Properly add new inet_net_ntop file to libpq Makefile. 2010-11-24 21:58:47 -05:00
Bruce Momjian ba11258ccb When reporting the server as not responding, if the hostname was
supplied, also print the IP address.  This allows IPv4 and IPv6 failures
to be distinguished.  Also useful when a hostname resolves to multiple
IP addresses.

Also, remove use of inet_ntoa() and use our own inet_net_ntop() in all
places, including in libpq, because it is thread-safe.
2010-11-24 17:04:19 -05:00
Tom Lane 725d52d0c2 Create the system catalog infrastructure needed for KNNGIST.
This commit adds columns amoppurpose and amopsortfamily to pg_amop, and
column amcanorderbyop to pg_am.  For the moment all the entries in
amcanorderbyop are "false", since the underlying support isn't there yet.

Also, extend the CREATE OPERATOR CLASS/ALTER OPERATOR FAMILY commands with
[ FOR SEARCH | FOR ORDER BY sort_operator_family ] clauses to allow the new
columns of pg_amop to be populated, and create pg_dump support for dumping
that information.

I also added some documentation, although it's perhaps a bit premature
given that the feature doesn't do anything useful yet.

Teodor Sigaev, Robert Haas, Tom Lane
2010-11-24 14:22:17 -05:00
Peter Eisentraut f2a4278330 Propagate ALTER TYPE operations to typed tables
This adds RESTRICT/CASCADE flags to ALTER TYPE ... ADD/DROP/ALTER/
RENAME ATTRIBUTE to control whether to alter typed tables as well.
2010-11-23 22:50:17 +02:00
Peter Eisentraut fc946c39ae Remove useless whitespace at end of lines 2010-11-23 22:34:55 +02:00
Robert Haas 44475e782f Centralize some ALTER <whatever> .. SET SCHEMA checks.
Any flavor of ALTER <whatever> .. SET SCHEMA fails if (1) the object
is already in the new schema, (2) either the old or new schema is
a temp schema, or (3) either the old or new schema is the TOAST schema.

Extraced from a patch by Dimitri Fontaine, with additional hacking by me.
2010-11-22 19:53:34 -05:00
Alvaro Herrera 5272d79875 Remove GucContext parameter from ParseConfigFile 2010-11-22 19:00:31 -03:00
Robert Haas 95dacf8593 Put back accidentally-deleted quote_literal() regression tests. 2010-11-21 20:46:54 -05:00
Robert Haas 506070be34 Bump catversion. Should have done this as part of format(text) patch. 2010-11-21 06:34:42 -05:00
Robert Haas 7504870778 Add new SQL function, format(text).
Currently, three conversion format specifiers are supported: %s for a
string, %L for an SQL literal, and %I for an SQL identifier.  The latter
two are deliberately designed not to overlap with what sprintf() already
supports, in case we want to add more of sprintf()'s functionality here
later.

Patch by Pavel Stehule, heavily revised by me.  Reviewed by Jeff Janes
and, in earlier versions, by Itagaki Takahiro and Tom Lane.
2010-11-20 22:33:27 -05:00
Tom Lane 89a368418c Further cleanup of indxpath logic related to IndexOptInfo.opfamily array.
We no longer need the terminating zero entry in opfamily[], so get rid of
it.  Also replace assorted ad-hoc looping logic with simple for and foreach
constructs.  This code is now noticeably more readable than it was an hour
ago; credit to Robert for seeing that it could be simplified.
2010-11-20 15:07:16 -05:00
Robert Haas 99bc012d51 Minor cleanup of indxpath.c.
Eliminate some superfluous notational complexity around
match_clause_to_indexcol(), and rip out the DoneMatchingIndexKeys
crock.
2010-11-20 13:57:27 -05:00
Tom Lane d1d8462d99 Assorted further cleanup for integer-conversion patch.
Avoid depending on LL notation, which is likely to not work in pre-C99
compilers; don't pointlessly use INT32_MIN/INT64_MIN in code that has
the numerical value hard-wired into it anyway; remove some gratuitous
style inconsistencies between pg_ltoa and pg_lltoa; fix int2 test case
so it actually tests int2.
2010-11-20 12:09:36 -05:00
Robert Haas 4343c0e546 Expose quote_literal_cstr() from core.
This eliminates the need for inefficient implementions of this
functionality in both contrib/dblink and contrib/tablefunc, so remove
them.  The upcoming patch implementing an in-core format() function
will also require this functionality.

In passing, add some regression tests.
2010-11-20 10:04:48 -05:00
Robert Haas e8bf683fbe Update int8-exp-three-digits.out to match new contents of int8.out. 2010-11-20 07:06:52 -05:00
Robert Haas 815810ed31 Attempt to fix breakage caused by signed integer conversion patch.
Use INT_MIN rather than INT32_MIN as we do elsewhere in the code, and
try to work around nonexistence of INT64_MIN if necessary.  Adjust the
new regression tests to something hopefully saner, per observation by
Tom Lane.
2010-11-20 01:09:26 -05:00
Tom Lane b58c25055e Fix leakage of cost_limit when multiple autovacuum workers are active.
When using default autovacuum_vac_cost_limit, autovac_balance_cost relied
on VacuumCostLimit to contain the correct global value ... but after the
first time through in a particular worker process, it didn't, because we'd
trashed it in previous iterations.  Depending on the state of other autovac
workers, this could result in a steady reduction of the effective
cost_limit setting as a particular worker processed more and more tables,
causing it to go slower and slower.  Spotted by Simon Poole (bug #5759).
Fix by saving and restoring the GUC variables in the loop in do_autovacuum.

In passing, improve a few comments.

Back-patch to 8.3 ... the cost rebalancing code has been buggy since it was
put in.
2010-11-19 22:29:44 -05:00
Robert Haas 4fc115b2e9 Speed up conversion of signed integers to C strings.
A hand-coded implementation turns out to be much faster than calling
printf().  In passing, add a few more regresion tests.

Andres Freund, with assorted, mostly cosmetic changes.
2010-11-19 22:13:11 -05:00
Tom Lane 0f61d4dd1b Improve relation width estimation for subqueries.
As per the ancient comment for set_rel_width, it really wasn't much good
for relations that aren't plain tables: it would never find any stats and
would always fall back on datatype-based estimates, which are often pretty
silly.  Fix that by copying up width estimates from the subquery planning
process.

At some point we might want to do this for CTEs too, but that would be a
significantly more invasive patch because the sub-PlannerInfo is no longer
accessible by the time it's needed.  I refrained from doing anything about
that, partly for fear of breaking the unmerged CTE-related patches.

In passing, also generate less bogus width estimates for whole-row Vars.

Per a gripe from Jon Nelson.
2010-11-19 17:31:50 -05:00
Tom Lane fe24d78161 Improve plpgsql's error reporting for no-such-column cases.
Given a column reference foo.bar, where there is a composite plpgsql
variable foo but it doesn't contain a column bar, the pre-9.0 coding would
immediately throw a "record foo has no field bar" error.  In 9.0 the parser
hook instead falls through to let the core parser see if it can resolve the
reference.  If not, you get a complaint about "missing FROM-clause entry
for table foo", which while in some sense correct isn't terribly helpful.
Complicate things a bit so that we can throw the old error message if
neither the core parser nor the hook are able to resolve the column
reference, while not changing the behavior in any other case.
Per bug #5757 from Andrey Galkin.
2010-11-18 17:07:15 -05:00
Alvaro Herrera 6cc2deb86e Add pg_describe_object function
This function is useful to obtain textual descriptions of objects as
stored in pg_depend.
2010-11-18 17:06:19 -03:00
Tom Lane 48c348f86c Dept of second thoughts: don't try to push LIMIT below a SRF.
If we have Limit->Result->Sort, the Result might be projecting a tlist
that contains a set-returning function.  If so, it's possible for the
SRF to sometimes return zero rows, which means we could need to fetch
more than N rows from the Sort in order to satisfy LIMIT N.
So top-N sorting cannot be used in this scenario.
2010-11-18 11:54:34 -05:00
Heikki Linnakangas ecf70b916b Remove unused parameter. Patch by Shigeru Hanada. 2010-11-18 10:05:17 +02:00
Tom Lane 6fbc323c80 Further fallout from the MergeAppend patch.
Fix things so that top-N sorting can be used in child Sort nodes of a
MergeAppend node, when there is a LIMIT and no intervening joins or
grouping.  Actually doing this on the executor side isn't too bad,
but it's a bit messier to get the planner to cost it properly.
Per gripe from Robert Haas.

In passing, fix an oversight in the original top-N-sorting patch:
query_planner should not assume that a LIMIT can be used to make an
explicit sort cheaper when there will be grouping or aggregation in
between.  Possibly this should be back-patched, but I'm not sure the
mistake is serious enough to be a real problem in practice.
2010-11-18 00:30:10 -05:00
Tom Lane 511e902b51 Make TRUNCATE ... RESTART IDENTITY restart sequences transactionally.
In the previous coding, we simply issued ALTER SEQUENCE RESTART commands,
which do not roll back on error.  This meant that an error between
truncating and committing left the sequences out of sync with the table
contents, with potentially bad consequences as were noted in a Warning on
the TRUNCATE man page.

To fix, create a new storage file (relfilenode) for a sequence that is to
be reset due to RESTART IDENTITY.  If the transaction aborts, we'll
automatically revert to the old storage file.  This acts just like a
rewriting ALTER TABLE operation.  A penalty is that we have to take
exclusive lock on the sequence, but since we've already got exclusive lock
on its owning table, that seems unlikely to be much of a problem.

The interaction of this with usual nontransactional behaviors of sequence
operations is a bit weird, but it's hard to see what would be completely
consistent.  Our choice is to discard cached-but-unissued sequence values
both when the RESTART is executed, and at rollback if any; but to not touch
the currval() state either time.

In passing, move the sequence reset operations to happen before not after
any AFTER TRUNCATE triggers are fired.  The previous ordering was not
logically sensible, but was forced by the need to minimize inconsistency
if the triggers caused an error.  Transactional rollback is a much better
solution to that.

Patch by Steve Singer, rather heavily adjusted by me.
2010-11-17 16:42:18 -05:00
Peter Eisentraut cfad144f89 Additional fixes for parallel make
Add some additional dependencies to constrain the build order to prevent
parallel make from failing.  In the case of src/Makefile, this is likely to be
too complicated to be worth maintaining, so just add .NOTPARALLEL to get the
old for-loop-like behavior.

More fine-tuning might be necessary for some platforms or configurations.
2010-11-17 08:08:41 +02:00
Andrew Dunstan b7fcf68e86 Require VALUE keyword when extending an enum type. Based on a patch from Alvaro Herrera. 2010-11-16 22:18:33 -05:00
Magnus Hagander 4acf99b2f3 Send paramHandle to subprocesses as 64-bit on Win64
The handle to the shared memory segment containing startup
parameters was sent as 32-bit even on 64-bit systems. Since
HANDLEs appear to be allocated sequentially this shouldn't
be a problem until we reach 2^32 open handles in the postmaster,
but a 64-bit value should be sent across as 64-bit, and not
zero out the top 32 bits.

Noted by Tom Lane.
2010-11-16 12:40:56 +01:00
Heikki Linnakangas 2edc5cd493 The GiST scan algorithm uses LSNs to detect concurrent pages splits, but
temporary indexes are not WAL-logged. We used a constant LSN for temporary
indexes, on the assumption that we don't need to worry about concurrent page
splits in temporary indexes because they're only visible to the current
session. But that assumption is wrong, it's possible to insert rows and
split pages in the same session, while a scan is in progress. For example,
by opening a cursor and fetching some rows, and INSERTing new rows before
fetching some more.

Fix by generating fake increasing LSNs, used in place of real LSNs in
temporary GiST indexes.
2010-11-16 11:32:21 +02:00
Tom Lane add0ea88e7 Fix aboriginal mistake in plpython's set-returning-function support.
We must stay in the function's SPI context until done calling the iterator
that returns the set result.  Otherwise, any attempt to invoke SPI features
in the python code called by the iterator will malfunction.  Diagnosis and
patch by Jan Urbanski, per bug report from Jean-Baptiste Quenot.

Back-patch to 8.2; there was no support for SRFs in previous versions of
plpython.
2010-11-15 14:26:55 -05:00
Robert Haas 3134d8863e Add new buffers_backend_fsync field to pg_stat_bgwriter.
This new field counts the number of times that a backend which writes a
buffer out to the OS must also fsync() it.  This happens when the
bgwriter fsync request queue is full, and is generally detrimental to
performance, so it's good to know when it's happening.  Along the way,
log a new message at level DEBUG1 whenever we fail to hand off an fsync,
so that the problem can also be seen in examination of log files
(if the logging level is cranked up high enough).

Greg Smith, with minor tweaks by me.
2010-11-15 12:42:59 -05:00
Robert Haas 8d70ed84ba Remove outdated comments from the regression test files.
Since 2004, int2 and int4 operators do detect overflow; this was fixed by
commit 4171bb869f.

Extracted from a larger patch by Andres Freund.
2010-11-15 11:00:20 -05:00
Robert Haas 20cf8ae478 Fix copy-and-pasteo a little more completely.
copydir.c is no longer in src/port
2010-11-15 10:10:58 -05:00
Alvaro Herrera ae4b17edee Fix copy-and-pasteo. 2010-11-15 11:52:56 -03:00
Simon Riggs 52010027ef Avoid spurious Hot Standby conflicts from btree delete records.
Similar conflicts were already avoided for related record types.
Massive over-caution resulted in a usability bug. Clear theoretical
basis for doing this is now confirmed by me.
Request to remove from Heikki (twice), over-caution by me.
2010-11-15 09:30:13 +00:00
Tom Lane 357edc9a99 Adjust comments about what's needed to avoid make 3.80 bug.
... based on further tracing through that code.
2010-11-15 01:00:48 -05:00
Robert Haas 5ccbc3d802 Correct poor grammar in comment. 2010-11-14 23:10:45 -05:00
Robert Haas 5aa446c961 Cleanup various comparisons with the constant "true".
Itagaki Takahiro, with slight modifications.
2010-11-14 21:03:48 -05:00
Tom Lane 3892a2d861 Fix canAcceptConnections() bugs introduced by replication-related patches.
We must not return any "okay to proceed" result code without having checked
for too many children, else we might fail later on when trying to add the
new child to one of the per-child state arrays.  It's not clear whether
this oversight explains Stefan Kaltenbrunner's recent report, but it could
certainly produce a similar symptom.

Back-patch to 8.4; the logic was not broken before that.
2010-11-14 15:57:37 -05:00
Tom Lane 1bd2012149 Work around make 3.80 bug with long expansions of $(eval).
3.80 breaks if the expansion of $(eval) is long enough to require expansion
of its internal variable_buffer.  For the purposes of $(recurse) that means
it'll work so long as no single evaluation of _create_recursive_target
produces more than 195 bytes.  We can manage that by looping over
subdirectories outside the call instead of complicating the generated rule.
This coding is simpler and more readable anyway.

Or at least, this works for me.  We'll see if the buildfarm likes it.
2010-11-14 12:50:06 -05:00
Tom Lane 2138c701a3 Add missing outfuncs.c support for struct InhRelation.
This is needed to support debug_print_parse, per report from Jon Nelson.
Cursory testing via the regression tests suggests we aren't missing
anything else.
2010-11-13 00:36:14 -05:00
Andrew Dunstan 52e2c12288 Attempt to fix MSVC builds broken by parallel make changes. 2010-11-12 22:55:15 -05:00
Robert Haas 11e482c350 Move copydir() prototype into its own header file.
Having this in src/include/port.h makes no sense, now that copydir.c lives
in src/backend/strorage rather than src/port.  Along the way, remove an
obsolete comment from contrib/pg_upgrade that makes reference to the old
location.
2010-11-12 16:39:53 -05:00
Tom Lane d7304244e2 Fix old oversight in const-simplification of COALESCE() expressions.
Once we have found a non-null constant argument, there is no need to
examine additional arguments of the COALESCE.  The previous coding got it
right only if the constant was in the first argument position; otherwise
it tried to simplify following arguments too, leading to unexpected
behavior like this:

regression=# select coalesce(f1, 42, 1/0) from int4_tbl;
ERROR:  division by zero

It's a minor corner case, but a bug is a bug, so back-patch all the way.
2010-11-12 15:17:30 -05:00
Peter Eisentraut 19e231bbda Improved parallel make support
Replace for loops in makefiles with proper dependencies.  Parallel
make can now span across directories.  Also, make -k and make -q work
properly.

GNU make 3.80 or newer is now required.
2010-11-12 22:15:16 +02:00
Heikki Linnakangas e356743f3e Add missing support for removing foreign data wrapper / server privileges
belonging to a user at DROP OWNED BY. Foreign data wrappers and servers
don't do anything useful yet, which is why no-one has noticed, but since we
have them, seems prudent to fix this. Per report from Chetan Suttraway.
Backpatch to 9.0, 8.4 has the same problem but this patch didn't apply
there so I'm not going to bother.
2010-11-12 15:29:23 +02:00
Heikki Linnakangas 542bdb2146 Fix bug introduced by the recent patch to check that the checkpoint redo
location read from backup label file can be found: wasShutdown was set
incorrectly when a backup label file was found.

Jeff Davis, with a little tweaking by me.
2010-11-11 19:32:11 +02:00
Tom Lane b0f2d681bd Fix line_construct_pm() for the case of "infinite" (DBL_MAX) slope.
This code was just plain wrong: what you got was not a line through the
given point but a line almost indistinguishable from the Y-axis, although
not truly vertical.  The only caller that tries to use this function with
m == DBL_MAX is dist_ps_internal for the case where the lseg is horizontal;
it would end up producing the distance from the given point to the place
where the lseg's line crosses the Y-axis.  That function is used by other
operators too, so there are several operators that could compute wrong
distances from a line segment to something else.  Per bug #5745 from
jindiax.

Back-patch to all supported branches.
2010-11-10 16:52:24 -05:00
Robert Haas 7ba6e4f0e0 Add monitoring function pg_last_xact_replay_timestamp.
Fujii Masao, with a little wordsmithing by me.
2010-11-09 22:52:19 -05:00
Itagaki Takahiro 844ed5dc97 Don't use __declspec (dllimport) for PGDLLEXPORT to reduce warnings
by gcc version 4 on mingw and cygwin. We don't use dllexport here
because dllexport and dllwrap don't work well together.
2010-11-10 12:17:43 +09:00
Tom Lane 80fb2c1f40 Repair memory leakage while ANALYZE-ing complex index expressions.
The general design of memory management in Postgres is that intermediate
results computed by an expression are not freed until the end of the tuple
cycle.  For expression indexes, ANALYZE has to re-evaluate each expression
for each of its sample rows, and it wasn't bothering to free intermediate
results until the end of processing of that index.  This could lead to very
substantial leakage if the intermediate results were large, as in a recent
example from Jakub Ouhrabka.  Fix by doing ResetExprContext for each sample
row.  This necessitates adding a datumCopy step to ensure that the final
expression value isn't recycled too.  Some quick testing suggests that this
change adds at worst about 10% to the time needed to analyze a table with
an expression index; which is annoying, but seems a tolerable price to pay
to avoid unexpected out-of-memory problems.

Back-patch to all supported branches.
2010-11-09 11:55:13 -05:00
Heikki Linnakangas 000efc3dfd In rewriteheap.c (used by VACUUM FULL and CLUSTER), calculate the tuple
length stored in the line pointer the same way it's calculated in the normal
heap_insert() codepath. As noted by Jeff Davis, the length stored by
raw_heap_insert() included padding but the one stored by the normal codepath
did not. While the mismatch seems to be harmless, inconsistency isn't good,
and the normal codepath has received a lot more testing over the years.

Backpatch to 8.3 where the heap rewrite code was introduced.
2010-11-09 17:48:14 +02:00
Tom Lane 54428dbe90 Fix error handling in temp-file deletion with log_temp_files active.
The original coding in FileClose() reset the file-is-temp flag before
unlinking the file, so that if control came back through due to an error,
it wouldn't try to unlink the file twice.  This was correct when written,
but when the log_temp_files feature was added, the logging action was put
in between those two steps.  An error occurring during the logging action
--- such as a query cancel --- would result in the unlink not getting done
at all, as in recent report from Michael Glaesemann.

To fix this, make sure that we do both the stat and the unlink before doing
anything that could conceivably CHECK_FOR_INTERRUPTS.  There is a judgment
call here, which is which log message to emit first: if you can see only
one, which should it be?  I chose to log unlink failure at the risk of
losing the log_temp_files log message --- after all, if the unlink does
fail, the temp file is still there for you to see.

Back-patch to all versions that have log_temp_files.  The code was OK
before that.
2010-11-08 22:14:48 -05:00
Alvaro Herrera 854ae8c3a6 Fix permanent memory leak in autovacuum launcher
get_database_list was uselessly allocating its output data, along some
created along the way, in a permanent memory context.  This didn't
matter when autovacuum was a single, short-lived process, but now that
the launcher is permanent, it shows up as a permanent leak.

To fix, make get_database list allocate its output data in the caller's
context, which is in charge of freeing it when appropriate; and the
memory leaked by heap_beginscan et al is allocated in a throwaway
transaction context.
2010-11-08 18:35:42 -03:00
Tom Lane 947d0c862c Use appendrel planning logic for top-level UNION ALL structures.
Formerly, we could convert a UNION ALL structure inside a subquery-in-FROM
into an appendrel, as a side effect of pulling up the subquery into its
parent; but top-level UNION ALL always caused use of plan_set_operations().
That didn't matter too much because you got an Append-based plan either
way.  However, now that the appendrel code can do things with MergeAppend,
it's worthwhile to hack up the top-level case so it also uses appendrels.

This is a bit of a stopgap; but going much further than this will require
a major rewrite of the planner's set-operations support, which I'm not
prepared to undertake now.  For the moment let's grab the low-hanging fruit.
2010-11-08 15:15:02 -05:00
Tom Lane 543d22fc74 Prevent invoking I/O conversion casts via functional/attribute notation.
PG 8.4 added a built-in feature for casting pretty much any data type to
string types (text, varchar, etc).  We allowed this to work in any of the
historically-allowed syntaxes: CAST(x AS text), x::text, text(x), or
x.text.  However, multiple complaints have shown that it's too easy to
invoke such casts unintentionally in the latter two styles, particularly
field selection.  To cure the problem with the narrowest possible change
of behavior, disallow use of I/O conversion casts from composite types to
string types via functional/attribute syntax.  The new functionality is
still available via cast syntax.

In passing, document the equivalence of functional and attribute syntax
in a more visible place.
2010-11-07 13:03:19 -05:00
Tom Lane e43fb604d6 Implement an "S" option for psql's \dn command.
\dn without "S" now hides all pg_XXX schemas as well as information_schema.
Thus, in a bare database you'll only see "public".  ("public" is considered
a user schema, not a system schema, mainly because it's droppable.)
Per discussion back in late September.
2010-11-06 21:41:14 -04:00
Tom Lane d7a2ce4905 Add support for detecting register-stack overrun on IA64.
Per recent investigation, the register stack can grow faster than the
regular stack depending on compiler and choice of options.  To avoid
crashes we must check both stacks in check_stack_depth().

Since this is poorly-tested code, committing only to HEAD for the
moment ... but we might want to consider back-patching later.
2010-11-06 19:36:29 -04:00
Tom Lane dd1c781903 Make get_stack_depth_rlimit() handle RLIM_INFINITY more sanely.
Rather than considering this result as meaning "unknown", report LONG_MAX.
This won't change what superusers can set max_stack_depth to, but it will
cause InitializeGUCOptions() to set the built-in default to 2MB not 100kB.
The latter seems like a fairly unreasonable interpretation of "infinity".
Per my investigation of odd buildfarm results as well as an old complaint
from Heikki.

Since this should persuade all the buildfarm animals to use a reasonable
stack depth setting during "make check", revert previous patch that dumbed
down a recursive regression test to only 5 levels.
2010-11-06 16:50:18 -04:00
Tom Lane 6736916f5f Include the current value of max_stack_depth in stack depth complaints.
I'm mainly interested in finding out what it is on buildfarm machines,
but including the active value in the message seems like good practice
in any case.  Add the info to the HINT, not the ERROR string, so as not
to change the regression tests' expected output.
2010-11-04 17:15:38 -04:00
Tom Lane 09211659d9 Use appendStringInfoString() where appropriate in elog.c.
The nominally equivalent call appendStringInfo(buf, "%s", str) can be
significantly slower when str is large.  In particular, the former usage in
EVALUATE_MESSAGE led to O(N^2) behavior when collecting a large number of
context lines, as I found out while testing recursive functions.  The other
changes are just neatnik-ism and seem unlikely to save anything meaningful,
but a cycle shaved is a cycle earned.
2010-11-04 15:28:35 -04:00
Tom Lane 034967bdcb Reimplement planner's handling of MIN/MAX aggregate optimization.
Per my recent proposal, get rid of all the direct inspection of indexes
and manual generation of paths in planagg.c.  Instead, set up
EquivalenceClasses for the aggregate argument expressions, and let the
regular path generation logic deal with creating paths that can satisfy
those sort orders.  This makes planagg.c a bit more visible to the rest
of the planner than it was originally, but the approach is basically a lot
cleaner than before.  A major advantage of doing it this way is that we get
MIN/MAX optimization on inheritance trees (using MergeAppend of indexscans)
practically for free, whereas in the old way we'd have had to add a whole
lot more duplicative logic.

One small disadvantage of this approach is that MIN/MAX aggregates can no
longer exploit partial indexes having an "x IS NOT NULL" predicate, unless
that restriction or something that implies it is specified in the query.
The previous implementation was able to use the added "x IS NOT NULL"
condition as an extra predicate proof condition, but in this version we
rely entirely on indexes that are considered usable by the main planning
process.  That seems a fair tradeoff for the simplicity and functionality
gained.
2010-11-04 12:01:17 -04:00
Tom Lane 0abc8fdd4d Reduce recursion depth in recently-added regression test.
Some buildfarm members fail the test with the original depth of 10 levels,
apparently because they are running at the minimum max_stack_depth setting
of 100kB and using ~ 10k per recursion level.  While it might be
interesting to try to figure out why they're eating so much stack, it isn't
likely that any fix for that would be back-patchable.  So just change the
test to recurse only 5 levels.  The extra levels don't prove anything
correctness-wise anyway.
2010-11-03 13:41:46 -04:00
Tom Lane 70a0160b07 Use only one hash entry for all instances of a pltcl trigger function.
Like plperl and unlike plpgsql, there isn't any cached state that could
depend on exactly which relation the trigger is being fired for.  So we
can use just one hash entry for all relations, which might save a little
something.

Alex Hunsaker
2010-11-03 12:26:55 -04:00
Tom Lane 61d6dd0c03 Fix adjust_semi_join to be more cautious about clauseless joins.
It was reporting that these were fully indexed (hence cheap), when of
course they're the exact opposite of that.  I'm not certain if the case
would arise in practice, since a clauseless semijoin is hard to produce
in SQL, but if it did happen we'd make some dumb decisions.
2010-11-02 18:45:36 -04:00
Tom Lane 9f376e146b Ensure an index that uses a whole-row Var still depends on its table.
We failed to record any dependency on the underlying table for an index
declared like "create index i on t (foo(t.*))".  This would create trouble
if the table were dropped without previously dropping the index.  To fix,
simplify some overly-cute code in index_create(), accepting the possibility
that sometimes the whole-table dependency will be redundant.  Also document
this hazard in dependency.c.  Per report from Kevin Grittner.

In passing, prevent a core dump in pg_get_indexdef() if the index's table
can't be found.  I came across this while experimenting with Kevin's
example.  Not sure it's a real issue when the catalogs aren't corrupt, but
might as well be cautious.

Back-patch to all supported versions.
2010-11-02 17:15:07 -04:00
Michael Meskes 35d5d962e1 Some cleanup in ecpg code:
Use bool as type for booleans instead of int.
Do not implicitely cast size_t to int.
Make the compiler stop complaining about unused variables by adding an empty statement.
2010-11-02 18:12:01 +01:00
Heikki Linnakangas 8c843fff2d Bootstrap WAL to begin at segment logid=0 logseg=1 (000000010000000000000001)
rather than 0/0, so that we can safely use 0/0 as an invalid value. This is a
more future-proof fix for the corner-case bug in streaming replication that
was fixed yesterday. We had a similar corner-case bug with log/seg 0/0 back in
February as well. Avoiding 0/0 as a valid value should prevent bugs like that
in the future. Per Tom Lane's idea.

Back-patch to 9.0. Since this only affects bootstrapping, it makes no
difference to existing installations. We don't need to worry about the
bug in existing installations, because if you've managed to get past the
initial base backup already, you won't hit the bug in the future either.
2010-11-02 11:39:48 +02:00
Tom Lane 0811ff2063 Avoid using a local FunctionCallInfoData struct in ExecMakeFunctionResult
and related routines.

We already had a redundant FunctionCallInfoData struct in FuncExprState,
but were using that copy only in set-returning-function cases, to avoid
keeping function evaluation state in the expression tree for the benefit
of plpgsql's "simple expression" logic.  But of course that didn't work
anyway.  Given the recent fixes in plpgsql there is no need to have two
separate behaviors here.  Getting rid of the local FunctionCallInfoData
structs should make things a little faster (because we don't need to do
InitFunctionCallInfoData each time), and it also makes for a noticeable
reduction in stack space consumption during recursive calls.
2010-11-01 13:54:21 -04:00
Heikki Linnakangas 931b6db39b Fix corner-case bug in tracking of latest removed WAL segment during
streaming replication. We used log/seg 0/0 to indicate that no WAL segments
have been removed since startup, but 0/0 is a valid value for the very first
WAL segment after initdb. To make that disambiguous, store
(latest removed WAL segment + 1) in the global variable.

Per report from Matt Chesler, also reproduced by Greg Smith.
2010-11-01 10:05:15 +02:00
Tom Lane 76b12e0af7 Revert removal of trigger flag from plperl function hash key.
As noted by Jan Urbanski, this flag is in fact needed to ensure that the
function's input/result conversion functions are set up as expected.

Add a regression test to discourage anyone from making same mistake
in future.
2010-10-31 11:42:51 -04:00
Tom Lane 186cbbda8f Provide hashing support for arrays.
The core of this patch is hash_array() and associated typcache
infrastructure, which works just about exactly like the existing support
for array comparison.

In addition I did some work to ensure that the planner won't think that an
array type is hashable unless its element type is hashable, and similarly
for sorting.  This includes adding a datatype parameter to op_hashjoinable
and op_mergejoinable, and adding an explicit "hashable" flag to
SortGroupClause.  The lack of a cross-check on the element type was a
pre-existing bug in mergejoin support --- but it didn't matter so much
before, because if you couldn't sort the element type there wasn't any good
alternative to failing anyhow.  Now that we have the alternative of hashing
the array type, there are cases where we can avoid a failure by being picky
at the planner stage, so it's time to be picky.

The issue of exactly how to combine the per-element hash values to produce
an array hash is still open for discussion, but the rest of this is pretty
solid, so I'll commit it as-is.
2010-10-30 21:56:11 -04:00
Tom Lane bfd3f37be3 Fix comparisons of pointers with zero to compare with NULL instead.
Per C standard, these are semantically the same thing; but saying NULL
when you mean NULL is good for readability.

Marti Raudsepp, per results of INRIA's Coccinelle.
2010-10-29 15:51:52 -04:00
Tom Lane 48a1fb2390 Oops, missed one fix for EquivalenceClass rearrangement.
Now that we're expecting a mergeclause's left_ec/right_ec to persist from
the initial assignments, we can't just blithely zero these out when
transforming such a clause in adjust_appendrel_attrs.  But really it should
be okay to keep the parent's values, since a child table's derived Var
ought to be equivalent to the parent Var for all EquivalenceClass purposes.
(Indeed, I'm wondering whether we couldn't find a way to dispense with
add_child_rel_equivalences altogether.  But this is wrong in any case.)
2010-10-29 14:44:49 -04:00
Tom Lane 14231a41a9 Avoid creation of useless EquivalenceClasses during planning.
Zoltan Boszormenyi exhibited a test case in which planning time was
dominated by construction of EquivalenceClasses and PathKeys that had no
actual relevance to the query (and in fact got discarded immediately).
This happened because we generated PathKeys describing the sort ordering of
every index on every table in the query, and only after that checked to see
if the sort ordering was relevant.  The EC/PK construction code is O(N^2)
in the number of ECs, which is all right for the intended number of such
objects, but it gets out of hand if there are ECs for lots of irrelevant
indexes.

To fix, twiddle the handling of mergeclauses a little bit to ensure that
every interesting EC is created before we begin path generation.  (This
doesn't cost anything --- in fact I think it's a bit cheaper than before
--- since we always eventually created those ECs anyway.)  Then, if an
index column can't be found in any pre-existing EC, we know that that sort
ordering is irrelevant for the query.  Instead of creating a useless EC,
we can just not build a pathkey for the index column in the first place.
The index will still be considered if it's useful for non-order-related
reasons, but we will think of its output as unsorted.
2010-10-29 11:52:50 -04:00
Heikki Linnakangas f184de351d Give a more specific error message if you try to COMMIT, ROLLBACK or COPY
FROM STDIN in PL/pgSQL. We alread did this for dynamic EXECUTE statements,
ie. "EXECUTE 'COMMIT'", but not otherwise.
2010-10-29 11:44:54 +03:00
Andrew Dunstan 6c3c7b533e Allow generic record arguments to plperl functions 2010-10-28 20:48:12 -04:00
Peter Eisentraut a3d40e9fb5 Add tab completion for psql \dg and \z
Josh Kupershmidt
2010-10-28 23:05:28 +03:00
Peter Eisentraut 299591d1a2 Make \? output of \dg and \du the same
The previous wording might have suggested that \du only showed login roles
and \dg only group roles, but that is no longer the case.

proposed by Josh Kupershmidt
2010-10-28 23:01:45 +03:00
Tom Lane 37e0a01654 Save a few cycles in plpgsql simple-expression initialization.
Instead of using ExecPrepareExpr, call ExecInitExpr.  The net change here
is that we don't apply expression_planner() to the expression tree.  There
is no need to do so, because that tree is extracted from a fully planned
plancache entry, so all the needed work is already done.  This reduces
the setup costs by about a factor of 2 according to some simple tests.
Oversight noted while fooling around with the simple-expression code for
previous fix.
2010-10-28 13:29:13 -04:00
Tom Lane 8ce22dd4c5 Fix plpgsql's handling of "simple" expression evaluation.
In general, expression execution state trees aren't re-entrantly usable,
since functions can store private state information in them.
For efficiency reasons, plpgsql tries to cache and reuse state trees for
"simple" expressions.  It can get away with that most of the time, but it
can fail if the state tree is dirty from a previous failed execution (as
in an example from Alvaro) or is being used recursively (as noted by me).

Fix by tracking whether a state tree is in use, and falling back to the
"non-simple" code path if so.  This results in a pretty considerable speed
hit when the non-simple path is taken, but the available alternatives seem
even more unpleasant because they add overhead in the simple path.  Per
idea from Heikki.

Back-patch to all supported branches.
2010-10-28 13:02:12 -04:00
Tom Lane e6721c6e16 Previous patch had no detectable virtue other than being a one-liner.
Try to make the code look self-consistent again, so it doesn't confuse
future developers.
2010-10-27 15:26:24 -04:00
Heikki Linnakangas 869af50fcf Fix long-standing segfault when accept() or one of the calls made right
after accepting a connection fails, and the server is compiled with GSSAPI
support. Report and patch by Alexander V. Chernikov, bug #5731.
2010-10-27 20:07:13 +03:00
Tom Lane 35d8940152 Fix up some oversights in psql's Unicode-escape support.
Original patch failed to include new exclusive states in a switch that
needed to include them; and also was guilty of very fuzzy thinking
about how to handle error cases.  Per bug #5729 from Alan Choi.
2010-10-26 22:25:19 -04:00
Robert Haas 20709f8136 Add a client authentication hook.
KaiGai Kohei, with minor cleanup of the comments by me.
2010-10-26 21:20:38 -04:00
Robert Haas 1fea0c05eb Minor fixups for psql's process_file() function.
- Avoid closing stdin, since we didn't open it.  Previously multiple
inclusions of stdin would be terminated with a single quit, now a separate
quit is needed for each invocation. Previous behavior also accessed stdin
after it was fclose()d, which is undefined behavior per ANSI C.

- Properly restore pset.inputfile, since the caller expects to be able
to free that memory.

Marti Raudsepp
2010-10-26 19:35:33 -04:00
Robert Haas 3579a94d6a Fix dumb typo in SECURITY LABEL error message.
Report by Peter Eisentraut.
2010-10-26 14:55:18 -04:00
Heikki Linnakangas 0c6293dd03 Before removing backup_label and irrevocably changing pg_control file, check
that WAL file containing the checkpoint redo-location can be found. This
avoids making the cluster irrecoverable if the redo location is in an earlie
WAL file than the checkpoint record.

Report, analysis and patch by Jeff Davis, with small changes by me.
2010-10-26 21:43:52 +03:00
Peter Eisentraut a87d212636 Add missing newlines at end of files 2010-10-26 20:11:43 +03:00