fill factor has been exceeded. We usually run with ffactor == 1, but
the way the test was coded, it wouldn't split a bucket until the actual
fill factor reached 2.0, because of use of integer division. Change
from > to >= so that it will split more aggressively when the table
starts to get full.
few cycles during transaction exit. A typical session probably
wouldn't have as many as half a dozen portals open at once, so the
original value of 64 seems far larger than needed.
no need for it to be nearly as big as the global hash table, and since
it's not in shared memory it can grow if it does need to be bigger.
By reducing the size, we speed up hash_seq_search(), which saves a
significant fraction of subtransaction entry/exit overhead.
to the original List; per report from Sebastian BÎck. I think this is
the last such bug --- I examined every lcons() call in the backend and
the rest seem OK --- but it's nervous-making that we're still finding
'em so many months after the List rewrite went in.
collector until the transaction commits. Per recent discussion, this
should avoid confusing autovacuum when an updating transaction runs for
a long time.
this is to avoid scenarios where incoming backends find no live copies
of a database's row because the only live copy is in an as-yet-unwritten
shared buffer, which they can't see. Also, use FlushRelationBuffers()
for forcing out pg_database, instead of the much more expensive BufferSync().
There's no need to write out pages belonging to other relations.
Rather than using ReadBuffer() to increment the reference count on an
already-pinned buffer, we should use IncrBufferRefCount() as it is
faster and does not require acquiring the BufMgrLock.
even uglier than it was already :-(. Also, on Windows only, use temporary
shared memory segments instead of ordinary files to pass over critical
variable values from postmaster to child processes. Magnus Hagander
more than 65K columns, or when the created table has more than 65K columns
due to adding inherited columns from parent relations. Fix a similar
crash when processing SELECT queries with more than 65K target list
entries. In all three cases we would eventually detect the error and
elog, but the check was being made too late.
We don't really want to start a new SPI connection, just keep using the old
one; otherwise we have memory management problems as illustrated by
John Kennedy's bug report of today. This requires a bit of a hack to
ensure the SPI stack state is properly restored, but then again what we
were doing before was a hack too, strictly speaking. Add a regression
test to cover this case.
plain SUSET instead. Also delay processing of options received in
client connection request until after we know if the user is a superuser,
so that SUSET values can be set that way by legitimate superusers.
Per recent discussion.
buffer is valid, as ReadBuffer() will elog on error. Most of the call
sites of ReadBuffer() got this right, but this patch fixes those call
sites that did not.
> pg specific, like "PostgreSQL.1". I have not done this since a new compile
> would not detect a running old beta. But now would be the time (or never).
Zeugswetter Andreas
shared memory segment ID. If we can't access the existing shmem segment,
it must not be relevant to our data directory. If we can access it,
then attach to it and check for an actual match to the data directory.
This should avoid some cases of failure-to-restart-after-boot without
introducing any significant risk of failing to detect a still-running
old backend.
estimates when combining the estimates for a range query. As pointed out
by Miquel van Smoorenburg, the existing check for an impossible combined
result would quite possibly fail to detect one default and one non-default
input. It seems better to use the default range query estimate in such
cases. To do so, add a check for an estimate of exactly DEFAULT_INEQ_SEL.
This is a bit ugly because it introduces additional coupling between
clauselist_selectivity and scalarltsel/scalargtsel, but it's not like
there wasn't plenty already...
working as intended --- for some reason, FROM a.b.c was getting
parsed as if it were a function name and not a qualified name.
I think there must be a bug in bison, because it should have
complained that the grammar was ambiguous. Anyway, fix it along
the same lines previously used for func_name vs columnref, and get
rid of the right-recursion in attrs that seems to have confused
bison.
type-and-length coercion function, make sure that the coercion function
is told the correct typmod. Fixes Kris Jurka's example of a domain
over bit(N).
clause implicitly whenever one is not given explicitly. Remove concept
of a schema having an associated tablespace, and simplify the rules for
selecting a default tablespace for a table or index. It's now just
(a) explicit TABLESPACE clause; (b) default_tablespace if that's not an
empty string; (c) database's default. This will allow pg_dump to use
SET commands instead of tablespace clauses to determine object locations
(but I didn't actually make it do so). All per recent discussions.
to DAY precision or coarser; leave the timezone alone when precision is
HOUR or less. This avoids surprises for inputs near a DST transition
time, as per example from Matthew Gabeler-Lee. (The only reason we
recalculate at all is so that outputs that are supposed to represent
days will come out as local midnight, and that's not relevant for sub-day
precision.)
use it, as per my proposal of yesterday. This gives us a means of
determining the zone offset to impute to an unlabeled timestamp that
is both efficient and reliable, unlike all our previous tries involving
mktime() and localtime(). The behavior for invalid or ambiguous times
at a DST transition is fixed to be really and truly "assume standard
time", fixing a bug that has come and gone repeatedly but was back
again in 7.4. (There is some ongoing discussion about whether we should
raise an error instead, but for the moment I'll make it do what it was
previously intended to do.)
of HeapTupleSatisfiesItself() to trigger a hint-bit update on the tuple:
if the row was updated or deleted by a subtransaction of my own transaction
that was later rolled back. This cannot occur in pre-8.0 of course, so
the hint-bit patch applied a couple weeks ago is OK for existing releases.
But for 8.0 it seems we had better fix things so that RI_FKey_check can
pass the correct buffer number to HeapTupleSatisfiesItself. Accordingly,
add fields to the TriggerData struct to carry the buffer ID(s) for the
old and new tuple(s). There are other possible solutions but this one
seems cleanest; it will allow other AFTER-trigger functions to safely
do tqual.c calls if they want to. Put new fields at end of struct so
that there is no API breakage.
We can't regurgitate the unconverted string as I first thought, because
the elog.c mechanisms will assume the error message data is in the server
encoding and attempt a reverse conversion. Eventually it might be worth
providing a short-circuit path to support this, but for now the simplest
solution is to abandon trying to report back the line contents after a
conversion failure. Per bug report from Sil Lee, 27-Oct-2004.
'recycled log files' and 'removed log files' messages from DEBUG1 to
DEBUG2, replacing them with a count of files added/removed/recycled in
the checkpoint end message, as per suggestion from Simon Riggs.
files and directories. This ensures that the bgwriter will close any open
file references it is holding for files therein, which is needed for the
rmdir() to succeed. Andrew Dunstan and Tom Lane.
try to display it as a reference to the underlying column instead. This
is a legitimate substitution (it wouldn't be for a named join) and it
fixes some cases where the display would otherwise be ambiguous. Per
example from Sim Zacks.
in all cases when keep_buf = true. This allows ANALYZE's inner loop to
use heap_release_fetch, which saves multiple buffer lookups for the same
page and avoids overestimation of cost by the vacuum cost mechanism.
returning a NULL pointer (some callers remembered to check the return
value, but some did not -- it is safer to just bail out).
Also, cleanup pgstat.c to use elog(ERROR) rather than elog(LOG) followed
by exit().
examinable by non-superusers, and use it to protect the recently-added
GUC variables for data directory and config files. For now I have only
flagged those variables that could be used to deduce something about
the server's filesystem layout, but possibly we should also mark vars
related to logging settings and other admin-only information?
at the top level of the column's old default expression before adding
an implicit coercion to the new column type. This seems to satisfy the
principle of least surprise, as per discussion of bug #1290.
NO ACTION check is deferrable. This seems to be a closer approximation
to what the SQL spec says than what we were doing before, and it prevents
some anomalous behaviors that are possible now that triggers can fire
during the execution of PL functions.
Stephan Szabo.
to make life cushy for the JDBC driver. Centralize the decision-making
that affects this by inventing a get_type_func_class() function, rather
than adding special cases in half a dozen places.
an oversize message, per suggestion from Oliver Jowett. I'm a bit
dubious that this is a real problem, since the client likely doesn't
have any more space available than the server, but it's not hard to
make it behave according to the protocol intention.
This does not disable the bgwriter process: it still has to wake up often
enough to collect fsync requests from backends in a timely fashion. But
it responds to the recent gripe about not being able to prevent the disk
from being spun up constantly.
specifies a new default tablespace and the template database already has
some tables in that tablespace. There isn't any way to solve this fully
without modifying the clone database's pg_class contents, so for now the
best we can do is issue a better error message.
only covered the case of assigning "", and failed to recognize that
actually setlocale(LC_MESSAGES,...) does not work at all on this platform.
Magnus Hagander, some code prettification by Tom Lane.
pins at end of transaction, and reduce AtEOXact_Buffers to an Assert
cross-check that this was done correctly. When not USE_ASSERT_CHECKING,
AtEOXact_Buffers is a complete no-op. This gets rid of an O(NBuffers)
bottleneck during transaction commit/abort, which recent testing has shown
becomes significant above a few tens of thousands of shared buffers.
(if any) currently waited for by LockBufferForCleanup(), which is all
that we were using it for anymore. Saves some space and eliminates
proportional-to-NBuffers slowdown in UnlockBuffers().
http://archives.postgresql.org/pgsql-hackers/2004-10/msg00464.php.
This fix is intended to be permanent: it moves the responsibility for
calling SetBufferCommitInfoNeedsSave() into the tqual.c routines,
eliminating the requirement for callers to test whether t_infomask changed.
Also, tighten validity checking on buffer IDs in bufmgr.c --- several
routines were paranoid about out-of-range shared buffer numbers but not
about out-of-range local ones, which seems a tad pointless.
C:\msys\1.0\home\y-asaba>pg_ctl -D data restart
waiting for postmaster to shut down...LOG: received smart shutdown
request.
LOG: shutting down
LOG: database system is shut down
done
postmaster stopped
postmaster starting
C:\msys\1.0\home\y-asaba>postmaster.exe: invalid argument: "'-D'"
Try "postmaster.exe --help" for more information.
Yoshiyuki Asaba
- remove another senseless "extern" keyword that was applied to a
function definition
- change a foo more function signatures from "some_type foo()" to
"some_type foo(void)"
- rewrite another K&R style function definition
- make the type of the "action" function pointer in the KeyWord struct
in src/backend/utils/adt/formatting.c more precise
for scanning one term of an OR clause if the index's predicate is implied
by that same OR clause term (possibly in conjunction with top-level WHERE
clauses). Per recent example from Dawid Kuroczko,
http://archives.postgresql.org/pgsql-performance/2004-10/msg00095.php
Also, fix a very long-standing bug in index predicate testing, namely the
bizarre ordering of decomposition of predicate and restriction clauses.
AFAICS the correct way is to break down the predicate all the way, and
then for each component term see if you can prove it from the entire
restriction set. The original coding had a purely-implementation-artifact
distinction between ANDing at the top level and ANDing below that, and
proceeded to get the decomposition order wrong everywhere below the top
level, with the result that even slightly complicated AND/OR predicates
could not be proven. For instance, given
create index foop on foo(f2) where f1=42 or f1=1
or (f1 = 11 and f2 = 55);
the old code would fail to match this index to the query
select * from foo where f1 = 11 and f2 = 55;
when it obviously ought to match.
parent table's tablespace, as per gripe from Michael Kleiser. Choose
a more plausible column order for this view and pg_tables. Update
documentation of these views, which was missed in original patch.
- replace some function signatures of the form "some_type foo()" with
"some_type foo(void)"
- replace a few instances of a literal 0 being used as a NULL pointer;
there are more instances of this in the code, but I just fixed a few
- in src/backend/utils/mb/wstrncmp.c, replace K&R style function
declarations with ANSI style, remove use of 'register' keyword
- remove an "extern" modifier that was applied to a function definition
(rather than a declaration)
The vars are renamed to data_directory, config_file, hba_file, and
ident_file, and are guaranteed to be set to accurate absolute paths
during postmaster startup.
This commit does not yet do anything about hiding path values from
non-superusers.
Refactor code into something reasonably understandable, cause
use of the feature to not fail in standalone backends or in
EXEC_BACKEND case, fix sloppy guc.c table entries, make the
documentation minimally usable.
list elements comma-separated instead of barfing. This allows elimination
of half a dozen redundant copies of that behavior, and also makes the
world safe again for pg_get_expr() applied to pg_index.indexprs, per gripe
from Alexander Zhiltsov.
columns. The returned tuple needs to have appropriate NULL columns
inserted so that it actually matches the declared rowtype. It seemed
convenient to use a JunkFilter for this, so I made some cleanups and
simplifications in the JunkFilter code to allow it to support this
additional functionality. (That in turn exposed a latent bug in
nodeAppend.c, which is that it was returning a tuple slot whose
descriptor didn't match its data.) Also, move check_sql_fn_retval
out of pg_proc.c and into functions.c, where it seems to more naturally
belong.
* We are not sure how much precision is in tv_usec, so we
* swap the high and low 16 bits of 'later' and XOR them with
* 'earlier'. On the off chance that the result is 0, we
* loop until it isn't.
Greg Stark
different backends get a reasonably wide set of initial seeds even if
gettimeofday returns tv_usec values with only a few bits of precision.
Per recent discussion.
* Links with -leay32 and -lssleay32 instead of crypto and ssl. On win32,
"crypto and ssl" is only used for static linking.
* Initializes SSL in the backend and not just in the postmaster. We
cannot pass the SSL context from the postmaster through the parameter
file, because it contains function pointers.
* Split one error check in be-secure.c. Previously we could not tell
which of three calls actually failed. The previous code also returned
incorrect error messages if SSL_accept() failed - that function needs to
use SSL_get_error() on the return value, can't just use the error queue.
* Since the win32 implementation uses non-blocking sockets "behind the
scenes" in order to deliver signals correctly, implements a version of
SSL_accept() that can handle this. Also, add a wait function in case
SSL_read or SSL_write() needs more data.
Magnus Hagander
running contains VACUUM or a similar command that will internally start
and commit transactions. In such a case, the original caller values of
CurrentMemoryContext and CurrentResourceOwner will point to objects that
will be destroyed by the internal commit. We must restore these pointers
to point to the newly-manufactured transaction context and resource owner,
rather than possibly pointing to deleted memory.
Also tweak xact.c so that AbortTransaction and AbortSubTransaction
forcibly restore a sane value for CurrentResourceOwner, much as they
have always done for CurrentMemoryContext. I'm not certain this is
necessary but I'm feeling paranoid today.
Responds to Sean Chittenden's bug report of 4-Oct.
bigint variants). Clean up some inconsistencies in error message wording.
Fix scanint8 to allow trailing whitespace in INT64_MIN case. Update
int8-exp-three-digits.out, which seems to have been ignored by the last
couple of people to modify the int8 regression test, and remove
int8-exp-three-digits-win32.out which is thereby exposed as redundant.
from Sebastian Böck. The fix involves being more consistent about
when rangetable entries are copied or modified. Someday we really
need to fix this stuff to not scribble on its input data structures
in the first place...
This seems the cleanest way of fixing its lack of a shutdown callback,
which was preventing it from working correctly in a query that didn't
run it to completion. Per bug report from Szima GÄbor.
must be stale. Tweak example startup scripts to not use pg_ctl but launch
the postmaster directly, thereby ensuring that only the postmaster's direct
parent shell will be a postgres-owned process. In combination these should
fix the longstanding problem of the postmaster sometimes refusing to start
during reboot because it thinks the old lockfile is not stale.
of locking used by REINDEX. REINDEX needs only ShareLock on the parent
table, same as CREATE INDEX, plus an exclusive lock on the specific index
being processed.
to unreserved keyword, use ereport not elog, assign a separate error code
for 'could not obtain lock' so that applications will be able to detect
that case cleanly.
now are supposed to take some kind of lock on an index whenever you
are going to access the index contents, rather than relying only on a
lock on the parent table.
a separate production func_expr. This allows us to accept all these
variants in the backwards-compatible syntax for creating a functional
index; which beats documenting exactly which things work and which don't.
Interestingly, it also seems to make the generated state machine a little
bit smaller.
(1) Replace while loop with the new forboth() construct in
parser/analyze.c
(2) Replace lcons() with lappend() in SearchCatCacheList(). Since these
now have the same performance, there is no reason to prefer lcons() in
this case, and using lappend() leads to cleaner code.
(3) Improve the name of the second parameter to for_each_cell()
and hopefully improve code clarity while at it. One intentional
semantics change: a backslashed space will not be treated as removable
trailing whitespace, as the prior coding would do. ISTM that if it
wouldn't be considered removable leading whitespace, it shouldn't be
stripped at the end either.
setting is valid must ignore that state and permit the assignment anyway
when source is PGC_S_OVERRIDE. Otherwise they may disallow a rollback
at transaction abort, which is The Wrong Thing. Per example from
Michael Fuhr 12-Sep-04.
when a function that returns a single tuple (not a setof tuple) returns
NULL. This seems to be the most consistent behavior. It would have
taken a bit less code to make it return an empty table (zero rows) but
ISTM a non-SETOF function ought always return exactly one row. Per
bug report from Ivan-Sun1.
was large enough to be batched and the tuples fell into a batch where
there were no inner tuples at all. Thanks to Xiaoyu Wang for finding a
test case that exposed this long-standing bug.
creating a new tuple. This is just for debugging sanity, though, since
nothing should be paying any attention to xmax when the HEAP_XMAX_INVALID
bit is set.
pg_subtrans --- what we need is the oldest xmin of any snapshot in use
in the current top transaction. Introduce a new variable TransactionXmin
to play this role. Fixes intermittent regression failure reported by
Neil Conway.
as per recent discussions. Invent SubTransactionIds that are managed like
CommandIds (ie, counter is reset at start of each top transaction), and
use these instead of TransactionIds to keep track of subtransaction status
in those modules that need it. This means that a subtransaction does not
need an XID unless it actually inserts/modifies rows in the database.
Accordingly, don't assign it an XID nor take a lock on the XID until it
tries to do that. This saves a lot of overhead for subtransactions that
are only used for error recovery (eg plpgsql exceptions). Also, arrange
to release a subtransaction's XID lock as soon as the subtransaction
exits, in both the commit and abort cases. This avoids holding many
unique locks after a long series of subtransactions. The price is some
additional overhead in XactLockTableWait, but that seems acceptable.
Finally, restructure the state machine in xact.c to have a more orthogonal
set of states for subtransactions.
mode see a fresh snapshot for each command in the function, rather than
using the latest interactive command's snapshot. Also, suppress fresh
snapshots as well as CommandCounterIncrement inside STABLE and IMMUTABLE
functions, instead using the snapshot taken for the most closely nested
regular query. (This behavior is only sane for read-only functions, so
the patch also enforces that such functions contain only SELECT commands.)
As per my proposal of 6-Sep-2004; I note that I floated essentially the
same proposal on 19-Jun-2002, but that discussion tailed off without any
action. Since 8.0 seems like the right place to be taking possibly
nontrivial backwards compatibility hits, let's get it done now.
((Snapshot) NULL) can no longer be confused with a valid snapshot,
as per my recent suggestion. Define a macro InvalidSnapshot for 0.
Use InvalidSnapshot instead of SnapshotAny as the do-nothing special
case for heap_update and heap_delete crosschecks; this seems a little
cleaner even though the behavior is really the same.
rather than when returning to the idle loop. This makes no particular
difference for interactively-issued queries, but it makes a big difference
for queries issued within functions: trigger execution now occurs before
the calling function is allowed to proceed. This responds to numerous
complaints about nonintuitive behavior of foreign key checking, such as
http://archives.postgresql.org/pgsql-bugs/2004-09/msg00020.php, and
appears to be required by the SQL99 spec.
Also take the opportunity to simplify the data structures used for the
pending-trigger list, rename them for more clarity, and squeeze out a
bit of space.
status. In particular, I see no reason for deferredTriggerCheckState
to make an explicit entry to note that a particular trigger has its
default state --- that just clutters a list that should normally be
empty or very short. I have plans to revise this module much more
heavily, but this is a simple separable improvement.
Asserts would lead to a server core dump if an error occurred while
trying to abort a failed subtransaction (thereby leading to re-execution
of whatever parts of AbortSubTransaction had already run). This of course
does not prevent such an error from creating an infinite loop, but at
least we don't make the situation worse. Responds to an open item on
the subtransactions to-do list.
elog() emulation code always calls errstart with ERROR error level.
This means that a recursive error call triggered by elog would do
MemoryContextReset(ErrorContext), whether or not this was actually
appropriate. I'm surprised we haven't seen this in the field...
Messages of less than ERROR severity should never be promoted (this
fixes Gaetano Mendola's problem with a COMMERROR becoming a PANIC,
and is obvious in hindsight anyway). Do all promotion in errstart
not errfinish, to ensure that output decisions are made correctly;
the former coding could suppress logging of promoted errors, which
doesn't seem like a good idea. Eliminate some redundant code too.
use of already-freed strings, other silliness. Also fix reporting of
config file syntax errors so that it actually works reasonably well
(eg, points at the correct line). Use palloc instead of malloc for
temporary storage to reduce code clutter.
not supposed to (fixes problem with postmaster aborting due to mistaken
postgresql.conf change); don't call superuser() when not inside a
transaction (fixes coredump when, eg, try to set log_statement from
PGOPTIONS); some message style guidelines enforcement.
default tablespace --- they should always go in the database's default
tablespace. Adjust heap_create() API so that it is passed the relkind
to make this easier; should simplify any further tweaking of the same
sort.
to allow DBA to choose the form in which log filenames reflect the
current time. Also allow for truncating instead of appending to
pre-existing files --- this is convenient when the log filename pattern
rewrites the same names cyclically. Per Ed L.
during replay of CREATE DATABASE as well as the first time around.
Else it's possible that the copy operation will copy obsolete blocks.
We are still a long way from guaranteeing anything about using a
recently-written database as a CREATE template, but this seems needed
to ensure the existing behavior holds up during replay.
Fix TablespaceCreateDbspace() to be able to create a dummy directory
in place of a dropped tablespace's symlink. This eliminates the open
problem of a PANIC during WAL replay when a replayed action attempts
to touch a file in a since-deleted tablespace. It also makes for a
significant improvement in the usability of PITR replay.
a more tolerable limit on the number of subtransactions or deleted files
in COMMIT and ABORT records. Buy back the extra space by eliminating the
xl_xact_prev field, which isn't being used for anything and is rather
unlikely ever to be used for anything.
This does not force initdb, but you do need to do pg_resetxlog if you
want to upgrade an existing 8.0 installation without initdb.
Win32 WaitForMultipleObjects:
ret = WaitForMultipleObjects(win32_numChildren, win32_childHNDArray,
FALSE, 0);
Problem is 'win32_numChildren' could be more then 64 ( function supports
), problem basically arise ( kills postgres ) when you create more then
64 connections and terminate some of them sill leaving more then 64.
Claudio Natoli
>>GetLastError will
>>> give much more details than errno.
>>
>>How much more, really? That mapping table gave me the impression that
>>the win32 error codes aren't all that much more detailed than errno...
>
>The mapping table is not complete. My winerror.h from the SDK
>lists 2209
>error codes, whereas errno.h lists 42...
>
>I still don't think we'll get that much more stuff. Right now,
>the Win32
>code paths that actually use the more advanced functions already write
>out the error number in case something happens. We can keep doing that
>for the other paths (ereport the error *number* when the mapping does
>not have a match). The map to errno will catch almost all cases, I
>think. And in the corner cases we can do with just the number, and use
>"net helpmsg" to get the actual message when checking...
Here's an attempt on this. new file goes in backend/port/win32.
Magnus Hagander
so that we close and flush the doomed relation's relcache entry before
we start to delete the underlying catalog rows, rather than afterwards.
For awhile yesterday I thought that an unexpected relcache entry rebuild
partway through this sequence might explain the infrequent parallel
regression failures we were chasing. It doesn't, mainly because there's
no CommandCounterIncrement in the sequence and so the deletions aren't
"really" done yet. But it sure seems like trouble waiting to happen.
relcache entries. Also, change TransactionIdIsCurrentTransactionId()
so that if consulted during transaction abort, it will not say that
the aborted xact is still current. (It would be better to ensure that
it's never called at all during abort, but I'm not sure we can easily
guarantee that.) In combination, these fix a crash we have seen
occasionally during parallel regression tests of 8.0.
from being accepted after the outer right brace. Per report from
Markus Bertheau.
Also add regression test cases for this change, and for previous
recent array literal parser changes.
PROCLOCK structs in shared memory now have only a bitmask for held
locks, rather than counts (making them 40 bytes smaller, which is a
good thing). Multiple locks within a transaction are counted in the
local hash table instead, and we have provision for tracking which
ResourceOwner each count belongs to. Solves recently reported problem
with memory leakage within long transactions.
for every command executed within a transaction. For long transactions
this was a significant memory leak. Instead, we can delete a portal's
or subtransaction's ResourceOwner immediately, if we physically transfer
the information about its locks up to the parent owner. This does not
fully solve the leak problem; we need to do something about counting
multiple acquisitions of the same lock in order to fix it. But it's a
necessary step along the way.
ColLabel instead of just ColId --- that is, any keyword can appear after
a dot and it will be taken as an identifier. Fixes problems with names
that are okay as standalone function names but fail when qualified.
updates are no longer WAL-logged nor even fsync'd; we do not need to,
since after a crash no old pg_subtrans data is needed again. We truncate
pg_subtrans to RecentGlobalXmin at each checkpoint. slru.c's API is
refactored a little bit to separate out the necessary decisions.
RecentXmin (== MyProc->xmin). This ensures that it will be safe to
truncate pg_subtrans at RecentGlobalXmin, which should largely eliminate
any fear of bloat. Along the way, eliminate SubTransXidsHaveCommonAncestor,
which isn't really needed and could not give a trustworthy result anyway
under the lookback restriction.
In an unrelated but nearby change, #ifdef out GetUndoRecPtr, which has
been dead code since 2001 and seems unlikely to ever be resurrected.
>>'127.0.0.1/32' instead of '127.0.0.1 255.255.255.255'.
>>
>>
>
>Yeah, that's probably the path of least resistance. Note that the
>comments and possibly the SGML docs need to be adjusted to match,
>however, so it's not quite a one-liner.
Andrew Dunstan
> why does CVS tip still give me
>
> regression=# select extract(century from now());
> date_part
> -----------
> 20
> (1 row)
> [ ... looks in code ... ]
>
> Apparently it's because you fixed only timestamp_part, and not
> timestamptz_part. I'm not too sure about what timestamp_trunc or
> timestamptz_trunc should do, but they may be wrong as well.
Sigh... as usual, what is not tested does not work:-(
> Could we have a more complete patch?
Please find a submission attached. I hope it really fixes all decade,
century and millenium issues for extract and *_trunc functions on
interval
and other timestamp types. If someone could check that the results
are reasonnable, it would be great.
I indeed overlooked the fact that there were two functions. The patch
fixes the code so that both variants agree.
I added comments to interval extractions, because it relies on the C
division to have a negative remainder: -7/10 = 0 and remains -7.
As for *_trunc functions, I have chosen to put the first year of the
century or millennium: -100, 1, 101... 1001 2001 etc. Indeed, I don't
think it would make sense to put 2000 (last year of the 2nd millennium)
for rounding all years of the third millenium.
I also fixed the code so that all decades last 10 years and decade 199
means the 1990's.
I have added some tests that are relevant to deal with tricky cases. The
formula may be simplified, but all these cases must pass. Please keep
them.
Fabien Coelho
presence of dropped columns. Document the already-presumed fact that
eref aliases in relation RTEs are supposed to have entries for dropped
columns; cause the user alias structs to have such entries too, so that
there's always a one-to-one mapping to the underlying physical attnums.
Adjust expandRTE() and related code to handle the case where a column
that is part of a JOIN has been dropped. Generalize expandRTE()'s API
so that it can be used in a couple of places that formerly rolled their
own implementation of the same logic. Fix ruleutils.c to suppress
display of aliases for columns that were dropped since the rule was made.
value of 'start' could be past the end of the page, if the page was
split by some concurrent inserting process since we visited it. In
this situation the code could look at bogus entries and possibly find
a match (since after all those entries still contain what they had
before the split). This would lead to 'specified item offset is too large'
followed by 'PANIC: failed to add item to the page', as reported by Joe
Conway for scenarios involving heavy concurrent insertion activity.
to the physical layout of the rowtype, ie, there are dummy arguments
corresponding to any dropped columns in the rowtype. We formerly had a
couple of places that did it this way and several others that did not.
Fixes Gaetano Mendola's "cache lookup failed for type 0" bug of 5-Aug.
of XLogInsert had the same sort of checkpoint interlock problem as
RecordTransactionCommit, and indeed I found some. Btree index build
and ALTER TABLE SET TABLESPACE write data outside the friendly confines
of the buffer manager, and therefore they have to take their own
responsibility for checkpoint interlock. The easiest solution seems to
be to force smgrimmedsync at the end of the index build or table copy,
even when the operation is being WAL-logged. This is sufficient since
the new index or table will be of interest to no one if we don't get
as far as committing the current transaction.
therefore starting with GetCurrentTransactionId is wrong. Fixes
miscomputation of RecentGlobalXmin leading to bizarre behavior
reported by Gavin Sherry.
don't hold an open file reference to the original table at the end.
This is a good thing in any case, particularly so on Windows which
cannot drop the table file otherwise.
by the SQL standard. For backwards compatibility, however, continue to
accept the syntax without. Minor editorialization in the reference pages
for these commands, too.
and doesn't process forward slashes in the same way as external
commands. Quoting the first argument to COPY does not convert forward
to backward slashes, but COPY does properly process quoted forward
slashes in the second argument.
Win32 COPY works with quoted forward slashes in the first argument only if the
current directory is the same as the directory of the first argument.
for transaction commits that occurred just before the checkpoint. This is
an EXTREMELY serious bug --- kudos to Satoshi Okada for creating a
reproducible test case to prove its existence.
slashes to backslashes #ifdef WIN32. This is to cope with the fact
that Windows seems exceedingly unfriendly to slashes in shell commands,
as per recent discussion.
CurrentMemoryContext is DLLIMPORT on Win32. Work around that by
creating stubs in the backend for palloc/pstrdup.
Also fix pg_dumpall to do proper quoting on Win32.
was previously allowed in odd places with odd results now causes an ERROR.
Also changed behavior with respect to whitespace -- trailing whitespace is
now ignored as well as leading whitespace (which has always been ignored).
Documentation updated to reflect change in whitespace handling. Also some
refactoring to what I believe is a more sensible order of several paragraphs.
There won't be any, and in fact there won't even be an RTE for NEW,
which was leading to a core dump in CVS tip. 7.4 and earlier manage
not to crash when applying ResolveNew in this scenario, but I think
it was just good fortune that they didn't. Per report from
Bernd Helmle.
This avoids changing the displayed appearance of ACL columns now that
array_out decorates its output with bounds information when the lower
bound isn't one. Per gripe from Gaetano Mendola. Note that I did not
force initdb for this, although any database initdb'd in the last
couple of days is going to have some problems.
recommend that people go get Apache's rotatelogs program. Additional
benefits are that configuration is done through GUC, rather than
externally, and that the postmaster can monitor the log rotator and
restart it after failure (though we certainly hope that won't happen
often).
Andreas Pflug, some rework by Tom Lane.
subarrays of a given dimension have the same number of elements/subarrays.
Also repair a longstanding undocumented (as far as I can see) ability to
explicitly set array bounds in the array literal syntax. It now can
deal properly with negative array indicies. Modify array_out so that
arrays with non-standard lower bounds (i.e. not 1) are output with
the expicit dimension syntax. This fixes a longstanding issue whereby
arrays with non-default lower bounds had them changed to default
after a dump/reload cycle.
Modify regression tests and docs to suit, and add some minimal
documentation regarding the explicit dimension syntax.
dependency was looking at wrong columns and so would always fail.
Was not exposed by regression tests because we are only testing cases
involving built-in (pinned) types and so no actual dependency entry
exists to be removed.
and history files as per recent discussion. While at it, remove
pg_terminate_backend, since we have decided we do not have time during
this release cycle to address the reliability concerns it creates.
Split the 'Miscellaneous Functions' documentation section into
'System Information Functions' and 'System Administration Functions',
which hopefully will draw the eyes of those looking for such things.
mode (per complaint from Kris Jurka) and it was only by chance that it
didn't fail in simple-query mode. A COMMIT or ROLLBACK has to be
executed by a portal, therefore it's wrong to suppose that there aren't
any live portals at CleanupTransaction time.
error code for string-too-long errors. It should be STRING_DATA_RIGHT_TRUNCATION
not STRING_DATA_LENGTH_MISMATCH. The latter probably should only be
applied to cases where a string must be exactly so many bits --- there are
no cases at all where it applies to character strings, only bit strings.
executed. Previously, the DECLARE would succeed but subsequent FETCHes
would fail since the parameter values supplied to DECLARE were not
propagated to the portal created for the cursor.
In support of this, add type Oids to ParamListInfo entries, which seems
like a good idea anyway since code that extracts a value can double-check
that it got the type of value it was expecting.
Oliver Jowett, with minor editorialization by Tom Lane.
to the old owner with the new owner. This is not necessarily right, but
it's sure a lot more likely to be what the user wants than doing nothing.
Christopher Kings-Lynne, some rework by Tom Lane.
number of active subtransaction XIDs in each backend's PGPROC entry,
and use this to avoid expensive probes into pg_subtrans during
TransactionIdIsInProgress. Extend EOXactCallback API to allow add-on
modules to get control at subxact start/end. (This is deliberately
not compatible with the former API, since any uses of that API probably
need manual review anyway.) Add basic reference documentation for
SAVEPOINT and related commands. Minor other cleanups to check off some
of the open issues for subtransactions.
Alvaro Herrera and Tom Lane.
>takes a string to specify the local authentication method:
>
> initdb --auth 'ident'
>
>or whatever the user wants. I think this is more flexible and more
>compact. It would default to 'trust', and the packagers could
>set it to
>whatever they want. If their OS supports local ident, they can use
>that.
>
>Also keep in mind you might want some ident map file:
>
> initdb --auth 'ident mymap'
>
>so you would need to allow multiple words in the string.
Magnus Hagander
Create a shared function to convert a SPI error code into a string
(replacing near-duplicate code in several PLs), and use it anywhere
that a SPI function call error is reported.
There are still some things that need refinement; in particular I fear
that the recognized set of error condition names probably has little in
common with what Oracle recognizes. But it's a start.
possible to trap an error inside a function rather than letting it
propagate out to PostgresMain. You still have to use AbortCurrentTransaction
to clean up, but at least the error handling itself will cooperate.
password/group files. Also allow read-only subtransactions of a read-write
parent, but not vice versa. These are the reasonably noncontroversial
parts of Alvaro's recent mop-up patch, plus further work on large objects
to minimize use of the TopTransactionResourceOwner.
SAVEPOINT/RELEASE/ROLLBACK-TO syntax. (Alvaro)
Cause COMMIT of a failed transaction to report ROLLBACK instead of
COMMIT in its command tag. (Tom)
Fix a few loose ends in the nested-transactions stuff.
live backends, the archiver and stats processes never got sent a
kill signal. They'd eventually exit on their own, but not for awhile,
which is a bit annoying when you are trying to replace the executable
file on a platform that doesn't allow removal of busy executables.
Also, tweak main loop logic so that we will perform the background
tasks after select() returns EINTR.
recovery_target_timeline --- otherwise there is no path from the backup
to the requested timeline. This check was foreseen in the original
discussion but I forgot to implement it.
recovered from archive is not corrupt. It's not much but it will catch
one common problem, viz out-of-disk-space.
Also, force a WAL recovery scan when recovery.conf is present, even if
pg_control shows a clean shutdown. This allows recovery with a tar backup
that was taken with the postmaster shut down, as per complaint from
Mark Kirkwood.
recovery more manageable. Also, undo recent change to add FILE_HEADER
and WASTED_SPACE records to XLOG; instead make the XLOG page header
variable-size with extra fields in the first page of an XLOG file.
This should fix the boundary-case bugs observed by Mark Kirkwood.
initdb forced due to change of XLOG representation.
* Fix help text ordering
* Add back --set-session-authorization to pg_dumpall. Updated the docs
for that. Updated help for that.
* Dump ALTER USER commands for the cluster owner ("pgsql"). These are
dumped AFTER the create user and create database commands in case the
permissions to do these have been revoked.
* Dump ALTER OWNER for public schema (because it's possible to change
it). This was done by adding TOC entries for the public schema, and
filtering them out at archiver time. I also save the owner in the TOC
entry just for the public schema.
* Suppress dumping single quotes around schema_path and DateStyle
options when they are set using ALTER USER or ALTER DATABASE. Added a
comment to the steps in guc.c to remind people to update that list.
* Fix dumping in --clean mode against a pre-7.3 server. It just sets
all drop statements to assume the public schema, allowing it to restore
without error.
* Cleaned up text output. eg. Don't output -- Tablespaces comment if
there are none. Same for groups and users.
* Make the commands to DELETE FROM pg_shadow and DELETE FROM pg_group
only be output when -c mode is enabled. I'm not sure why that hasn't
been done before?!?!
This should be good for application asap, after which I will start on
regression dumping 7.0-7.4 databases.
Christopher Kings-Lynne
force relcache rebuild for the other table as well as the column's
own table. Otherwise, already-cached foreign key triggers will stop
working. Per example from Alexander Pravking.
keep track of portal-related resources separately from transaction-related
resources. This allows cursors to work in a somewhat sane fashion with
nested transactions. For now, cursor behavior is non-subtransactional,
that is a cursor's state does not roll back if you abort a subtransaction
that fetched from the cursor. We might want to change that later.
live in database or schema's default tablespace, as per today's discussion.
Also, remove some unused keywords from the grammar (PATH, PENDANT,
VERSION), and fix ALSO, which was added as a keyword but not added
to the keyword classification lists, thus making it worse-than-reserved.
probably should have been to begin with; this is to cover cases like
needing to recreate the per-db directory during WAL replay.
Also, fix heap_create to force pg_class.reltablespace to be zero instead
of the database's default tablespace; this makes the world safe for
CREATE DATABASE to handle all tables in the default tablespace alike,
as per previous discussion. And force pg_class.reltablespace to zero
when creating a relation without physical storage (eg, a view); this
avoids possibly having dangling references in this column after a
subsequent DROP TABLESPACE.
shift support code into heapam.c accordingly. This is in service of
soon-to-be-committed ALTER TABLE SET TABLESPACE code that will want to
use this same record type for both heaps and indexes.
Theoretically I should have forced initdb for this, but in practice there
is no change in xlog contents because CVS tip will never really emit this
record type anyhow...
than 'datetime BC zone', because the former is accepted by the timestamptz
input converter while the latter may not be depending on spacing. This
is not a loss of compatibility w.r.t. 7.4 and before, because until very
recently there was never a case where we'd output both zone and 'BC'.