Commit Graph

9587 Commits

Author SHA1 Message Date
Alvaro Herrera d43b085d57 Separate snapshot management code from tuple visibility code, create a
snapmgmt.c file for the former.  The header files have also been reorganized
in three parts: the most basic snapshot definitions are now in a new file
snapshot.h, and the also new snapmgmt.h keeps the definitions for snapmgmt.c.
tqual.h has been reduced to the bare minimum.

This patch is just a first step towards managing live snapshots within a
transaction; there is no functionality change.

Per my proposal to pgsql-patches on 20080318191940.GB27458@alvh.no-ip.org and
subsequent discussion.
2008-03-26 16:20:48 +00:00
Tom Lane 220db7ccd8 Simplify and standardize conversions between TEXT datums and ordinary C
strings.  This patch introduces four support functions cstring_to_text,
cstring_to_text_with_len, text_to_cstring, and text_to_cstring_buffer, and
two macros CStringGetTextDatum and TextDatumGetCString.  A number of
existing macros that provided variants on these themes were removed.

Most of the places that need to make such conversions now require just one
function or macro call, in place of the multiple notational layers that used
to be needed.  There are no longer any direct calls of textout or textin,
and we got most of the places that were using handmade conversions via
memcpy (there may be a few still lurking, though).

This commit doesn't make any serious effort to eliminate transient memory
leaks caused by detoasting toasted text objects before they reach
text_to_cstring.  We changed PG_GETARG_TEXT_P to PG_GETARG_TEXT_PP in a few
places where it was easy, but much more could be done.

Brendan Jurd and Tom Lane
2008-03-25 22:42:46 +00:00
Neil Conway 1d812a98b4 Add a new tuplestore API function, tuplestore_putvalues(). This is
identical to tuplestore_puttuple(), except it operates on arrays of
Datums + nulls rather than a fully-formed HeapTuple. In several places
that use the tuplestore API, this means we can avoid creating a
HeapTuple altogether, saving a copy.
2008-03-25 19:26:54 +00:00
Tom Lane fd791e7b5a When a relation has been proven empty by constraint exclusion, propagate that
knowledge up through any joins it participates in.  We were doing that already
in some special cases but not in the general case.  Also, defend against zero
row estimates for the input relations in cost_mergejoin --- this fix may have
eliminated the only scenario in which that can happen, but be safe.  Per
report from Alex Solovey.
2008-03-24 21:53:04 +00:00
Tom Lane 2a346725ba Use new errdetail_log() mechanism to provide a less klugy way of reporting
large numbers of dependencies on a role that couldn't be dropped.
Per a comment from Alvaro.
2008-03-24 19:47:35 +00:00
Tom Lane 32b58d0220 Fix various infelicities that have snuck into usage of errdetail() and
friends.  Avoid double translation of some messages, ensure other messages
are exposed for translation (and make them follow the style guidelines),
avoid unsafe passing of an unpredictable message text as a format string.
2008-03-24 19:12:49 +00:00
Tom Lane 9b8e1eb375 Adjust the recent patch for reporting of deadlocked queries so that we report
query texts only to the server log.  This eliminates the issue of possible
leaking of security-sensitive data in other sessions' queries.  Since the
log is presumed secure, we can now log the queries of all sessions involved
in the deadlock, whether or not they belong to the same user as the one
reporting the failure.
2008-03-24 18:22:36 +00:00
Tom Lane 05fc744b96 Add a new ereport auxiliary function errdetail_log(), which works the same as
errdetail except the string goes only to the server log, replacing the normal
errdetail there.  This provides a reasonably clean way of dealing with error
details that are too security-sensitive or too bulky to send to the client.

This commit just adds the infrastructure --- actual uses to follow.
2008-03-24 18:08:47 +00:00
Tom Lane 598b97dc9b Avoid a useless tuple copy within nodeMaterial. Neil Conway 2008-03-23 00:54:04 +00:00
Tom Lane 7de81124d5 Create a function quote_nullable(), which works the same as quote_literal()
except that it returns the string 'NULL', rather than a SQL null, when called
with a null argument.  This is often a much more useful behavior for
constructing dynamic queries.  Add more discussion to the documentation
about how to use these functions.

Brendan Jurd
2008-03-23 00:24:20 +00:00
Tom Lane 19595835c3 Refactor to_char/to_date formatting code; primarily, replace DCH_processor
with two new functions DCH_to_char and DCH_from_char that have less confusing
APIs.  Brendan Jurd
2008-03-22 22:32:19 +00:00
Tatsuo Ishii 325c0a39e4 Add server side lo_import(filename, oid) function. 2008-03-22 01:55:14 +00:00
Tom Lane 58a8285542 Remove TypeName struct's timezone flag, which has been write-only storage
for a very long time --- in current usage it's entirely redundant with the
name field.
2008-03-21 22:41:48 +00:00
Tom Lane 20e82a7c0b Give an explicit error for serial[], rather than silently ignoring
the array decoration as the code had been doing.
2008-03-21 22:10:56 +00:00
Tom Lane 4b7ae4afae Report the current queries of all backends involved in a deadlock
(if they'd be visible to the current user in pg_stat_activity).

This might look like it's subject to race conditions, but it's actually
pretty safe because at the time DeadLockReport() is constructing the
report, we haven't yet aborted our transaction and so we can expect that
everyone else involved in the deadlock is still blocked on some lock.
(There are corner cases where that might not be true, such as a statement
timeout triggering in another backend before we finish reporting; but at
worst we'd report a misleading activity string, so it seems acceptable
considering the usefulness of reporting the queries.)

Original patch by Itagaki Takahiro, heavily modified by me.
2008-03-21 21:08:31 +00:00
Bruce Momjian fca9fff41b More README src cleanups. 2008-03-21 13:23:29 +00:00
Tom Lane 2d0583a166 Get rid of a bunch of #ifdef HAVE_INT64_TIMESTAMP conditionals by inventing
a new typedef TimeOffset to represent an intermediate time value.  It's
either int64 or double as appropriate, and in most usages will be measured
in microseconds or seconds the same as Timestamp.  We don't call it
Timestamp, though, since the value doesn't necessarily represent an absolute
time instant.

Warren Turkal
2008-03-21 01:31:43 +00:00
Tom Lane 6b0706ac33 Arrange for an explicit cast applied to an ARRAY[] constructor to be applied
directly to all the member expressions, instead of the previous implementation
where the ARRAY[] constructor would infer a common element type and then we'd
coerce the finished array after the fact.  This has a number of benefits,
one being that we can allow an empty ARRAY[] construct so long as its
element type is specified by such a cast.

Brendan Jurd, minor fixes by me.
2008-03-20 21:42:48 +00:00
Alvaro Herrera 8759b79d0f Add a couple of missing FreeQueryDesc calls. Noticed while testing a
framework to keep track of snapshots in use.
2008-03-20 20:05:56 +00:00
Bruce Momjian 4e228447aa Make source code READMEs more consistent. Add CVS tags to all README files. 2008-03-20 17:55:15 +00:00
Heikki Linnakangas f4b7624eb0 Add the missing cyrillic "Yo" characters ('e' and 'E' with two dots) to the
ISO_8859-5 <-> MULE_INTERNAL conversion tables.

This was discovered when trying to convert a string containing those characters
from ISO_8859-5 to Windows-1251, because we use MULE_INTERNAL/KOI8R as an
intermediate encoding between those two.

While the missing "Yo" was just an omission in the conversion tables, there are
a few other characters like the "Numero" sign ("No" as a single character) that
exists in all the other cyrillic encodings (win1251, ISO_8859-5 and cp866), but
not in KOI8R. Added comments about that.

Patch by Sergey Burladyan. Back-patch to 7.4.
2008-03-20 10:30:04 +00:00
Alvaro Herrera 470c6c12a1 Remove another useless snapshot creation. 2008-03-19 21:14:20 +00:00
Tom Lane 5507b22dfc Support ALTER TYPE RENAME. Petr Jelinek 2008-03-19 18:38:30 +00:00
Alvaro Herrera a9686591d7 We no longer need a snapshot set after opening the finishing transaction: this
is redundant because autovacuum now always analyzes a single table per
transaction.
2008-03-19 14:18:21 +00:00
Tom Lane 965a2a191a Fix regexp substring matching (substring(string from pattern)) for the corner
case where there is a match to the pattern overall but the user has specified
a parenthesized subexpression and that subexpression hasn't got a match.
An example is substring('foo' from 'foo(bar)?').  This should return NULL,
since (bar) isn't matched, but it was mistakenly returning the whole-pattern
match instead (ie, 'foo').  Per bug #4044 from Rui Martins.

This has been broken since the beginning; patch in all supported versions.
The old behavior was sufficiently inconsistent that it's impossible to believe
anyone is depending on it.
2008-03-19 02:40:37 +00:00
Tom Lane 0d49838df6 Arrange to "inline" SQL functions that appear in a query's FROM clause,
are declared to return set, and consist of just a single SELECT.  We
can replace the FROM-item with a sub-SELECT and then optimize much as
if we were dealing with a view.  Patch from Richard Rowell, cleaned up
by me.
2008-03-18 22:04:14 +00:00
Peter Eisentraut 8c87cc370f Catch all errors in for and while loops in makefiles. Don't ignore any
errors in any commands, including in various clean targets that have so far
been handled inconsistently.  make -i is available to ignore all errors in
a consistent and official way.
2008-03-18 16:24:50 +00:00
Alvaro Herrera d54bb24cdd Move elog(DEBUG4) call outside the locked area, per suggestion from Tom Lane. 2008-03-18 12:36:43 +00:00
Tom Lane 8e850b9159 Advance multiple array keys rightmost-first instead of leftmost-first
during a bitmap index scan.  This cannot affect the query results
(since we're just dumping the TIDs into a bitmap) but it might offer
some advantage in locality of access to the index.  Per Greg Stark.
2008-03-18 03:54:52 +00:00
Peter Eisentraut a7b7b07af3 Enable probes to work with Mac OS X Leopard and other OSes that will
support DTrace in the future.

Switch from using DTRACE_PROBEn macros to the dynamically generated macros.
Use "dtrace -h" to create a header file that contains the dynamically
generated macros to be used in the source code instead of the DTRACE_PROBEn
macros.  A dummy header file is generated for builds without DTrace support.

Author: Robert Lor <Robert.Lor@sun.com>
2008-03-17 19:44:41 +00:00
Peter Eisentraut e7115a224a We need to rebuild objfiles.txt when one of the subdirectories' objfiles.txt
changed in case a new file got added.
2008-03-17 18:24:56 +00:00
Magnus Hagander 7cbfa7565e Fix postgres --describe-config for guc enums, breakage noted by Alvaro.
While at it, rename option lookup functions to make names clearer, per
discussion with Tom.
2008-03-17 17:45:09 +00:00
Tom Lane 164899db1c Revert thinko introduced into prefix_selectivity() by my recent patch:
make_greater_string needs the < procedure not the >= one.  Spotted by
Peter.
2008-03-17 17:13:54 +00:00
Alvaro Herrera 23057f51f5 Move ProcState definition into sinvaladt.c from sinvaladt.h, since it's not
needed anywhere after my previous patch.  Noticed by Tom Lane.

Also, remove #include <signal.h> from sinval.c.
2008-03-17 11:50:27 +00:00
Tom Lane 0c5962c054 Grab some low-hanging fruit in the new hash index build code.
oprofile shows that a nontrivial amount of time is being spent in
repeated calls to index_getprocinfo, which really only needs to be
called once.  So do that, and inline _hash_datum2hashkey to make it
work.
2008-03-17 03:45:36 +00:00
Tom Lane 32846f8152 Fix TransactionIdIsCurrentTransactionId() to use binary search instead of
linear search when checking child-transaction XIDs.  This makes for an
important speedup in transactions that have large numbers of children,
as in a recent example from Craig Ringer.  We can also get rid of an
ugly kluge that represented lists of TransactionIds as lists of OIDs.

Heikki Linnakangas
2008-03-17 02:18:55 +00:00
Tom Lane 787eba734b When creating a large hash index, pre-sort the index entries by estimated
bucket number, so as to ensure locality of access to the index during the
insertion step.  Without this, building an index significantly larger than
available RAM takes a very long time because of thrashing.  On the other
hand, sorting is just useless overhead when the index does fit in RAM.
We choose to sort when the initial index size exceeds effective_cache_size.

This is a revised version of work by Tom Raney and Shreya Bhargava.
2008-03-16 23:15:08 +00:00
Alvaro Herrera ec6550c6c0 Modify interactions between sinval.c and sinvaladt.c. The code that actually
deals with the queue, including locking etc, is all in sinvaladt.c.  This means
that the struct definition of the queue, and the queue pointer, are now
internal "implementation details" inside sinvaladt.c.

Per my proposal dated 25-Jun-2007 and followup discussion.
2008-03-16 19:47:34 +00:00
Magnus Hagander a3f66eac01 Some cleanups of enum-guc code, per comments from Tom. 2008-03-16 16:42:44 +00:00
Tom Lane c9a1cc694a Change hash index creation so that rather than always establishing exactly
two buckets at the start, we create a number of buckets appropriate for the
estimated size of the table.  This avoids a lot of expensive bucket-split
actions during initial index build on an already-populated table.

This is one of the two core ideas of Tom Raney and Shreya Bhargava's patch
to reduce hash index build time.  I'm committing it separately to make it
easier for people to test the effects of this separately from the effects
of their other core idea (pre-sorting the index entries by bucket number).
2008-03-15 20:46:31 +00:00
Tom Lane 4873c96ff3 Fix inappropriately-timed memory context switch in autovacuum_do_vac_analyze.
This accidentally failed to fail before 8.3, because the context we were
switching back to was long-lived anyway; but it sure looks risky as can be
now.  Well spotted by Pavan Deolasee.
2008-03-14 23:49:28 +00:00
Alvaro Herrera adc4e1e635 Fix vacuum so that autovacuum is really not cancelled when doing an emergency
job (i.e. to prevent Xid wraparound problems.)  Bug reported by ITAGAKI
Takahiro in 20080314103837.63D3.52131E4D@oss.ntt.co.jp, though I didn't use his
patch.
2008-03-14 17:25:59 +00:00
Tom Lane 5e00913daf Fix varstr_cmp's special case for UTF8 encoding on Windows so that strings
that are reported as "equal" by wcscoll() are checked to see if they really
are bitwise equal, and are sorted per strcmp() if not.  We made this happen
a couple of years ago in the regular code path, but it unaccountably got
left out of the Windows/UTF8 case (probably brain fade on my part at the
time).  As in the prior set of changes, affected users may need to reindex
indexes on textual columns.

Backpatch as far as 8.2, which is the oldest release we are still supporting
on Windows.
2008-03-13 18:31:56 +00:00
Tom Lane 3e701a04fe Fix heap_page_prune's problem with failing to send cache invalidation
messages if the calling transaction aborts later on.  Collapsing out line
pointer redirects is a done deal as soon as we complete the page update,
so syscache *must* be notified even if the VACUUM FULL as a whole doesn't
complete.  To fix, add some functionality to inval.c to allow the pending
inval messages to be sent immediately while heap_page_prune is still
running.  The implementation is a bit chintzy: it will only work in the
context of VACUUM FULL.  But that's all we need now, and it can always be
extended later if needed.  Per my trouble report of a week ago.
2008-03-13 18:00:32 +00:00
Tom Lane da6a707c29 Fix pg_plan_queries() to restore the previous setting of ActiveSnapshot
(probably NULL) before exiting.  Up to now it's just left the variable as it
set it, which means that after we're done processing the current client
message, ActiveSnapshot is probably pointing at garbage (because this function
is typically run in MessageContext which will get reset).  There doesn't seem
to have been any code path in which that mattered before 8.3, but now the
plancache module might try to use the stale value if the next client message
is a Bind for a prepared statement that is in need of replanning.  Per report
from Alex Hunsaker.
2008-03-12 23:58:27 +00:00
Tom Lane 033eb1581b Fix LISTEN/NOTIFY race condition reported by Laurent Birtz, by postponing
pg_listener modifications commanded by LISTEN and UNLISTEN until the end
of the current transaction.  This allows us to hold the ExclusiveLock on
pg_listener until after commit, with no greater risk of deadlock than there
was before.  Aside from fixing the race condition, this gets rid of a
truly ugly kludge that was there before, namely having to ignore
HeapTupleBeingUpdated failures during NOTIFY.  There is a small potential
incompatibility, which is that if a transaction issues LISTEN or UNLISTEN
and then looks into pg_listener before committing, it won't see any resulting
row insertion or deletion, where before it would have.  It seems unlikely
that anyone would be depending on that, though.

This patch also disallows LISTEN and UNLISTEN inside a prepared transaction.
That case had some pretty undesirable properties already, such as possibly
allowing pg_listener entries to be made for PIDs no longer present, so
disallowing it seems like a better idea than trying to maintain the behavior.
2008-03-12 20:11:46 +00:00
Tom Lane 611b4393f2 Make TransactionIdIsInProgress check transam.c's single-item XID status cache
before it goes groveling through the ProcArray.  In situations where the same
recently-committed transaction ID is checked repeatedly by tqual.c, this saves
a lot of shared-memory searches.  And it's cheap enough that it shouldn't
hurt noticeably when it doesn't help.
Concept and patch by Simon, some minor tweaking and comment-cleanup by Tom.
2008-03-11 20:20:35 +00:00
Tom Lane f0828b2fc3 Provide a build-time option to store large relations as single files, rather
than dividing them into 1GB segments as has been our longtime practice.  This
requires working support for large files in the operating system; at least for
the time being, it won't be the default.

Zdenek Kotala
2008-03-10 20:06:27 +00:00
Tom Lane 9d966c2e32 Fix unportable coding of new error message, per Kris Jurka. 2008-03-10 12:57:05 +00:00
Magnus Hagander 52a8d4f8f7 Implement enum type for guc parameters, and convert a couple of existing
variables to it. More need to be converted, but I wanted to get this in
before it conflicts with too much...

Other than just centralising the text-to-int conversion for parameters,
this allows the pg_settings view to contain a list of available options
and allows an error hint to show what values are allowed.
2008-03-10 12:55:13 +00:00