begin the shutdown checkpoint; there isn't anything left for them to do,
so we may as well ensure that they shut down sooner rather than later.
Per discussion.
cidr type bit, the same as network_eq does. This is needed for hash joins
and hash aggregation to work correctly on these types. Per bug report
from Michael Fuhr, 2004-04-13.
Also, improve hash function for int8 as suggested by Greg Stark.
not holding the buffer's cntx_lock or io_in_progress_lock. A recent
report from Litao Wu makes me wonder whether it is ever possible for
us to drop a buffer and forget to release its cntx_lock. The Assert
does not fire in the regression tests, but that proves little ...
until Bind is received, so that actual parameter values are visible to the
planner. Make use of the parameter values for estimation purposes (but
don't fold them into the actual plan). This buys back most of the
potential loss of plan quality that ensues from using out-of-line
parameters instead of putting literal values right into the query text.
This patch creates a notion of constant-folding expressions 'for
estimation purposes only', in which case we can be more aggressive than
the normal eval_const_expressions() logic can be. Right now the only
difference in behavior is inserting bound values for Params, but it will
be interesting to look at other possibilities. One that we've seen
come up repeatedly is reducing now() and related functions to current
values, so that queries like ... WHERE timestampcol > now() - '1 day'
have some chance of being planned effectively.
Oliver Jowett, with some kibitzing from Tom Lane.
extensive change then what was suggested. I found the file path.c that
contained a lot of "Unix/Windows" agnostic functions so I added a function
there instead and removed the PATHSEP declaration in exec.c altogether. All
to keep things from scattering all over the code.
I also took the liberty of changing the name of the functions
"first_path_sep" and "last_path_sep". Where I come from (and I'm apparently
not alone given the former macro name PATHSEP), they should be called
"first_dir_sep" and "last_dir_sep". The new function I introduced, that
actually finds path separators, is now the "first_path_sep". The patch
contains changes on all affected places of course.
I also changed the documentation on dynamic_library_path to reflect the
chagnes.
Thomas Hallgren
sequences, as per recent discussion. All these names are now of the
form table_column_type, with digits added if needed to make them unique.
Default constraint names are chosen to be unique across their whole schema,
not just within the parent object, so as to be more SQL-spec-compatible
and make the information schema views more useful.
As a side effect, cause subscripts in INSERT targetlists to do something
more or less sensible; previously we evaluated such subscripts and then
effectively ignored them. Another side effect is that UPDATE-ing an
element or slice of an array value that is NULL now produces a non-null
result, namely an array containing just the assigned-to positions.
in pg_proc for record_in, record_out, etc to reflect that these routines
now make use of the second OID parameter. Remove the ancient SET entry
in pg_type, which is now highly unlikely to ever become used again.
Adjust type_sanity regression test to match.
of a composite type to get that type's OID as their second parameter,
in place of typelem which is useless. The actual changes are mostly
centralized in getTypeInputInfo and siblings, but I had to fix a few
places that were fetching pg_type.typelem for themselves instead of
using the lsyscache.c routines. Also, I renamed all the related variables
from 'typelem' to 'typioparam' to discourage people from assuming that
they necessarily contain array element types.
1. Solve the problem of not having TOAST references hiding inside composite
values by establishing the rule that toasting only goes one level deep:
a tuple can contain toasted fields, but a composite-type datum that is
to be inserted into a tuple cannot. Enforcing this in heap_formtuple
is relatively cheap and it avoids a large increase in the cost of running
the tuptoaster during final storage of a row.
2. Fix some interesting problems in expansion of inherited queries that
reference whole-row variables. We never really did this correctly before,
but it's now relatively painless to solve by expanding the parent's
whole-row Var into a RowExpr() selecting the proper columns from the
child.
If you dike out the preventive check in CheckAttributeType(),
composite-type columns now seem to actually work. However, we surely
cannot ship them like this --- without I/O for composite types, you
can't get pg_dump to dump tables containing them. So a little more
work still to do.
loop over the fields instead of a loop around heap_getattr. This is
considerably faster (O(N) instead of O(N^2)) when there are nulls or
varlena fields, since those prevent use of attcacheoff. Replace loops
over heap_getattr with heap_deformtuple in situations where all or most
of the fields have to be fetched, such as printtup and tuptoaster.
Profiling done more than a year ago shows that this should be a nice
win for situations involving many-column tables.
place of time_t, as per prior discussion. The behavior does not change
on machines without a 64-bit-int type, but on machines with one, which
is most, we are rid of the bizarre boundary behavior at the edges of
the 32-bit-time_t range (1901 and 2038). The system will now treat
times over the full supported timestamp range as being in your local
time zone. It may seem a little bizarre to consider that times in
4000 BC are PST or EST, but this is surely at least as reasonable as
propagating Gregorian calendar rules back that far.
I did not modify the format of the zic timezone database files, which
means that for the moment the system will not know about daylight-savings
periods outside the range 1901-2038. Given the way the files are set up,
it's not a simple decision like 'widen to 64 bits'; we have to actually
think about the range of years that need to be supported. We should
probably inquire what the plans of the upstream zic people are before
making any decisions of our own.
environment variable processing to libpq.
The patch also adds code to our client apps so we set the environment
variable directly based on our binary location, unless it is already
set. This will allow our applications to emit proper locale messages
that are generated in libpq.
locking conflict against concurrent CHECKPOINT that was discussed a few
weeks ago. Also, if not using WAL archiving (which is always true ATM
but won't be if PITR makes it into this release), there's no need to
WAL-log the index build process; it's sufficient to force-fsync the
completed index before commit. This seems to gain about a factor of 2
in my tests, which is consistent with writing half as much data. I did
not try it with WAL on a separate drive though --- probably the gain would
be a lot less in that scenario.
of bug report #1150. Also, arrange that the object owner's irrevocable
grant-option permissions are handled implicitly by the system rather than
being listed in the ACL as self-granted rights (which was wrong anyway).
I did not take the further step of showing these permissions in an
explicit 'granted by _SYSTEM' ACL entry, as that seemed more likely to
bollix up existing clients than to do anything really useful. It's still
a possible future direction, though.
temp tables, and avoid WAL-logging truncations of temp tables. Do issue
fsync on truncated files (not sure this is necessary but it seems like
a good idea).
rather than an error code, and does elog(ERROR) not elog(WARNING)
when it detects a problem. All callers were simply elog(ERROR)'ing on
failure return anyway, and I find it hard to envision a caller that would
not, so we may as well simplify the callers and produce the more useful
error message directly.
explicitly fsync'ing every (non-temp) file we have written since the
last checkpoint. In the vast majority of cases, the burden of the
fsyncs should fall on the bgwriter process not on backends. (To this
end, we assume that an fsync issued by the bgwriter will force out
blocks written to the same file by other processes using other file
descriptors. Anyone have a problem with that?) This makes the world
safe for WIN32, which ain't even got sync(2), and really makes the world
safe for Unixen as well, because sync(2) never had the semantics we need:
it offers no way to wait for the requested I/O to finish.
Along the way, fix a bug I recently introduced in xlog recovery:
file truncation replay failed to clear bufmgr buffers for the dropped
blocks, which could result in 'PANIC: heap_delete_redo: no block'
later on in xlog replay.
than being random pieces of other files. Give bgwriter responsibility
for all checkpoint activity (other than a post-recovery checkpoint);
so this child process absorbs the functionality of the former transient
checkpoint and shutdown subprocesses. While at it, create an actual
include file for postmaster.c, which for some reason never had its own
file before.
about a third, make it work on non-Windows platforms again. (But perhaps
I broke the WIN32 code, since I have no way to test that.) Fold all the
paths that fork postmaster child processes to go through the single
routine SubPostmasterMain, which takes care of resurrecting the state that
would normally be inherited from the postmaster (including GUC variables).
Clean up some places where there's no particularly good reason for the
EXEC and non-EXEC cases to work differently. Take care of one or two
FIXMEs that remained in the code.
of ThisStartUpID and RedoRecPtr into new backends. It's a lot easier just
to make them all grab the values out of shared memory during startup.
This helps to decouple the postmaster from checkpoint execution, which I
need since I'm intending to let the bgwriter do it instead, and it also
fixes a bug in the Win32 port: ThisStartUpID wasn't getting propagated at
all AFAICS. (Doesn't give me a lot of faith in the amount of testing that
port has gotten.)
the four functions.
> Also, please justify the temp-related changes. I was not aware that we
> had any breakage there.
patch-tmp-schema.txt contains the following bits:
*) Changes pg_namespace_aclmask() so that the superuser is always able
to create objects in the temp namespace.
*) Changes pg_namespace_aclmask() so that if this is a temp namespace,
objects are only allowed to be created in the temp namespace if the
user has TEMP privs on the database. This encompasses all object
creation, not just TEMP tables.
*) InitTempTableNamespace() checks to see if the current user, not the
session user, has access to create a temp namespace.
The first two changes are necessary to support the third change. Now
it's possible to revoke all temp table privs from non-super users and
limiting all creation of temp tables/schemas via a function that's
executed with elevated privs (security definer). Before this change,
it was not possible to have a setuid function to create a temp
table/schema if the session user had no TEMP privs.
patch-area-path.txt contains:
*) Can now determine the area of a closed path.
patch-dfmgr.txt contains:
*) Small tweak to add the library path that's being expanded.
I was using $lib/foo.so and couldn't easily figure out what the error
message, "invalid macro name in dynamic library path" meant without
looking through the source code. With the path in there, at least I
know where to start looking in my config file.
Sean Chittenden
(1) boolean-and and boolean-or aggregates named bool_and and bool_or.
they (SHOULD;-) correspond to standard sql every and some/any aggregates.
they do not have the right name as there is a problem with
the standard and the parser for some/any. Tom also think that
the standard name is misleading because NULL are ignored.
Also add 'every' aggregate.
(2) bitwise integer aggregates named bit_and and bit_or for
int2, int4, int8 and bit types. They are not standard, but I find
them useful. I needed them once.
The patches adds:
- 2 new very short strict functions for boolean aggregates in
src/backed/utils/adt/bool.c,
src/include/utils/builtins.h and src/include/catalog/pg_proc.h
- the new aggregates declared in src/include/catalog/pg_proc.h and
src/include/catalog/pg_aggregate.h
- some documentation and validation about these new aggregates.
Fabien COELHO
extend the GUC variable set".
Plugin modules like the pl<lang> modules needs a way to declare
configuration parameters. The postmaster has no knowledge of such
modules when it reads the postgresql.conf file. Rather than allowing
totally unknown configuration parameters, the concept of a variable
"class" is introduced. Variables that belongs to a declared classes will
create a placeholder value of string type and will not generate an
error. When a module is loaded, it will declare variables for such a
class and make those variables "consume" any placeholders that has been
defined. Finally, the module will generate warnings for unrecognized
placeholders defined for its class.
More detail:
The design is outlined after the suggestions made by Tom Lane and Joe
Conway in this thread:
http://archives.postgresql.org/pgsql-hackers/2004-02/msg00229.php
A new string variable 'custom_variable_classes' is introduced. This
variable is a comma separated string of identifiers. Each identifier
denots a 'class' that will allow its members to be added without error.
This variable must be defined in postmaster.conf.
The lexer (guc_file.l) is changed so that it can accept a qualified name
in the form <ID>.<ID> as the name of a variable. I also changed so that
the 'custom_variable_classes', if found, is added first of all variables
in order to remove the order of declaration issue.
The guc_variables table is made more dynamic. It is originally created
with 20% slack and can grow dynamically. A capacity is introduced to
avoid resizing every time a new variable is added. guc_variables and
num_guc_variables becomes static (hidden).
The GucInfoMain now uses the new function get_guc_variables() and
GetNumConfigOptions instead or using the guc_variables directly.
The find_option() function, when passed a missing name, will check if
the name is qualified. If the name is qualified and if the qualifier
denotes a class included in the 'custom_variable_classes', a placeholder
variable will be created. Such a placeholder will not participate in a
list operation but will otherwise function as a normal string variable.
Define<type>GucVariable() functions will be added, one for each variable
type. They are inteded to be used by add-on modules like the pl<lang>
mappings. Example:
extern void DefineCustomBoolVariable(
const char* name,
const char* short_desc,
const char* long_desc,
bool* valueAddr,
GucContext context,
GucBoolAssignHook assign_hook,
GucShowHook show_hook);
(I created typedefs for the assign-hook and show-hook functions). A call
to these functions will define a new GUC-variable. If a placeholder
exists it will be replaced but it's value will be used in place of the
default value. The valueAddr is assumed ot point at a default value when
the define function is called. The only constraint that is imposed on a
Custom variable is that its name is qualified.
Finally, a function:
void EmittWarningsOnPlacholders(const char* className)
was added. This function should be called when a module has completed
its variable definitions. At that time, no placeholders should remain
for the class that the module uses. If they do, elog(INFO, ...) messages
will be issued to inform the user that unrecognized variables are
present.
Thomas Hallgren
It was necessary to touch in grammar and create a new node to make home
to the new syntax. The command is also supported in E
CPG. Doc updates are attached too. Only superusers can change the owner
of the database. New owners don't need any aditional
privileges.
Euler Taveira de Oliveira
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.
The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled-by-default once the rest of
the tree has been updated to use the new API names.
(SIGUSR1, which we have not been using recently) instead of piggybacking
on SIGUSR2-driven NOTIFY processing. This has several good results:
the processing needed to drain the sinval queue is a lot less than the
processing needed to answer a NOTIFY; there's less contention since we
don't have a bunch of backends all trying to acquire exclusive lock on
pg_listener; backends that are sitting inside a transaction block can
still drain the queue, whereas NOTIFY processing can't run if there's
an open transaction block. (This last is a fairly serious issue that
I don't think we ever recognized before --- with clients like JDBC that
tend to sit with open transaction blocks, the sinval queue draining
mechanism never really worked as intended, probably resulting in a lot
of useless cache-reset overhead.) This is the last of several proposed
changes in response to Philip Warner's recent report of sinval-induced
performance problems.
functions. This allows these functions to work correctly with Unicode and
other multibyte encodings. Per prior discussion.
Also, revert my earlier change to move installation path mashing from
Makefile.global to configure. Turns out not to work well because configure
script is working with unexpanded variables, and so fails to match in
cases where it should match.
and should do now that we control our own destiny for timezone handling,
but this commit gets the bulk of the picayune diffs in place.
Magnus Hagander and Tom Lane.
a variant of the function for the 'numeric' datatype; it would be possible
to add additional variants for other datatypes, but I haven't done so yet.
This commit includes regression tests and minimal documentation; if we
want developers to actually use this function in applications, we'll
probably need to document what it does more fully.
find_my_exec/find_other_exec(). Remove passing of progname to these
functions as they can find that out from argv[0], which they already
have.
Make get_progname return const char *, and update all progname variables
to be const char *.
all the code that looks for other binaries. I move FindExec into
port/exec.c (and renamed it to find_my_binary()). I also added
find_other_binary that looks for another binary in the same directory as
the calling program, and checks the version string.
The only behavior change was that initdb and pg_dump would look in the
hard-coded bindir directory if it can't find the requested binary in the
same directory as the caller. The new code throws an error. The old
behavior seemed too error prone for version mismatches.
permissions tests in about the same amount of code as before. Exactly what
the GRANT/REVOKE code ought to be doing is still up for debate, but this
should be helpful in any case, and it already solves an efficiency problem
in executor startup.
rather than allowing them only in a few special cases as before. In
particular you can now pass a ROW() construct to a function that accepts
a rowtype parameter. Internal generation of RowExprs fixes a number of
corner cases that used to not work very well, such as referencing the
whole-row result of a JOIN or subquery. This represents a further step in
the work I started a month or so back to make rowtype values into
first-class citizens.
This simplifies and speeds up the reader by letting it get the representation
right the first time, rather than correcting it after-the-fact. Also,
after int and OID lists become separate node types per Neil's pending
patch, this will let us treat these lists as just plain Nodes instead
of requiring separate read/write macros the way we have now.
costing us lots more to maintain than it was worth. On shared tables
it was of exactly zero benefit because we couldn't trust it to be
up to date. On temp tables it sometimes saved an lseek, but not often
enough to be worth getting excited about. And the real problem was that
we forced an lseek on every relcache flush in order to update the field.
So all in all it seems best to lose the complexity.
in favor of using the REINDEX TABLE apparatus, which does the same thing
simpler and faster. Also, make TRUNCATE not use cluster.c at all, but
just assign a new relfilenode and REINDEX. This partially addresses
Hartmut Raschick's complaint from last December that 7.4's TRUNCATE is
an order of magnitude slower than prior releases. By getting rid of
a lot of unnecessary catalog updates, these changes buy back about a
factor of two (on my system). The remaining overhead seems associated
with creating and deleting storage files, which we may not be able to
do much about without abandoning transaction safety for TRUNCATE.
conversion of basic ASCII letters. Remove all uses of strcasecmp and
strncasecmp in favor of new functions pg_strcasecmp and pg_strncasecmp;
remove most but not all direct uses of toupper and tolower in favor of
pg_toupper and pg_tolower. These functions use the same notions of
case folding already developed for identifier case conversion. I left
the straight locale-based folding in place for situations where we are
just manipulating user data and not trying to match it to built-in
strings --- for example, the SQL upper() function is still locale
dependent. Perhaps this will prove not to be what's wanted, but at
the moment we can initdb and pass regression tests in Turkish locale.
modify. Also fix a passel of problems with ALTER TABLE CLUSTER ON:
failure to check that the index is safe to cluster on (or even belongs
to the indicated rel, or even exists), and failure to broadcast a relcache
flush event when changing an index's state.
* ALTER ... ADD COLUMN with defaults and NOT NULL constraints works per SQL
spec. A default is implemented by rewriting the table with the new value
stored in each row.
* ALTER COLUMN TYPE. You can change a column's datatype to anything you
want, so long as you can specify how to convert the old value. Rewrites
the table. (Possible future improvement: optimize no-op conversions such
as varchar(N) to varchar(N+1).)
* Multiple ALTER actions in a single ALTER TABLE command. You can perform
any number of column additions, type changes, and constraint additions with
only one pass over the table contents.
Basic documentation provided in ALTER TABLE ref page, but some more docs
work is needed.
Original patch from Rod Taylor, additional work from Tom Lane.
> Please find a attached a small patch that adds accessor functions
> for "aclitem" so that it is not an opaque datatype.
>
> I needed these functions to browse aclitems from user land. I can load
> them when necessary, but it seems to me that these accessors for a
> backend type belong to the backend, so I submit them.
>
> Fabien Coelho
for "aclitem" so that it is not an opaque datatype.
I needed these functions to browse aclitems from user land. I can load
them when necessary, but it seems to me that these accessors for a
backend type belong to the backend, so I submit them.
Fabien Coelho
Regression tests and documentation have both been updated.
SQL2003 requires that both ceiling() and ceil() be present, so I have
documented both spellings. SQL2003 doesn't mention pow() as far as I
can see, so I decided to replace pow() with power() in the documentation:
there is little reason to encourage the continued usage of a function
that isn't compliant with the standard, given a standard-compliant
alternative.
RELEASE NOTES: should state that pow() is considered deprecated
(although I don't see the need to ever remove it.)
Allow additional thread flags to be added via port templates.
Change thread flag names to PTHREAD_CFLAGS and PTHREAD_LIBS to match new
configure script.
the next are handled by ReleaseAndReadBuffer rather than separate
ReleaseBuffer and ReadBuffer calls. This cuts the number of acquisitions
of the BufMgrLock by a factor of 2 (possibly more, if an indexscan happens
to pull successive rows from the same heap page). Unfortunately this
doesn't seem enough to get us out of the recently discussed context-switch
storm problem, but it's surely worth doing anyway.
of whether we have successfully read data into a buffer; this makes the
error behavior a bit more transparent (IMHO anyway), and also makes it
work correctly for local buffers which don't use Start/TerminateBufferIO.
Collapse three separate functions for writing a shared buffer into one.
This overlaps a bit with cleanups that Neil proposed awhile back, but
seems not to have committed yet.
of VACUUM cases so that VACUUM requests don't affect the ARC state at all,
avoid corner case where BufferSync would uselessly rewrite a buffer that
no longer contains the page that was to be flushed. Make some minor
other cleanups in and around the bufmgr as well, such as moving PinBuffer
and UnpinBuffer into bufmgr.c where they really belong.
* removed a few redundant defines
* get_user_name safe under win32
* rationalized pipe read EOF for win32 (UPDATED PATCH USED)
* changed all backend instances of sleep() to pg_usleep
- except for the SLEEP_ON_ASSERT in assert.c, as it would exceed a
32-bit long [Note to patcher: If a SLEEP_ON_ASSERT of 2000 seconds is
acceptable, please replace with pg_usleep(2000000000L)]
I added a comment to that part of the code:
/*
* It would be nice to use pg_usleep() here, but only does 2000 sec
* or 33 minutes, which seems too short.
*/
sleep(1000000);
Claudio Natoli
are sought first as local FROM columns, then as local SELECT-list aliases,
and finally as outer FROM columns; the former behavior made outer FROM
columns take precedence over aliases. This does not change spec
conformance because SQL99 allows only the first case anyway, and it seems
more useful and self-consistent. Per gripe from Dennis Bjorklund 2004-04-05.
It works on the principle of turning sockets into non-blocking, and then
emulate blocking behaviour on top of that, while allowing signals to
run. Signals are now implemented using an event instead of APCs, thus
getting rid of the issue of APCs not being compatible with "old style"
sockets functions.
It also moves the win32 specific code away from pqsignal.h/c into
port/win32, and also removes the "thread style workaround" of the APC
issue previously in place.
In order to make things work, a few things are also changed in pgstat.c:
1) There is now a separate pipe to the collector and the bufferer. This
is required because the pipe will otherwise only be signalled in one of
the processes when the postmaster goes down. The MS winsock code for
select() must have some kind of workaround for this behaviour, but I
have found no stable way of doing that. You really are not supposed to
use the same socket from more than one process (unless you use
WSADuplicateSocket(), in which case the docs specifically say that only
one will be flagged).
2) The check for "postmaster death" is moved into a separate select()
call after the main loop. The previous behaviour select():ed on the
postmaster pipe, while later explicitly saying "we do NOT check for
postmaster exit inside the loop".
The issue was that the code relies on the same select() call seeing both
the postmaster pipe *and* the pgstat pipe go away. This does not always
happen, and it appears that useing WSAEventSelect() makes it even more
common that it does not.
Since it's only called when the process exits, I don't think using a
separate select() call will have any significant impact on how the stats
collector works.
Magnus Hagander
by the set operation, so that redundant sorts at higher levels can be
avoided. This was foreseen a good while back, but not done. Per request
from Karel Zak.
> >>with allowed values of "all, mod, ddl, none" with default "none".
OK, here is a patch that implements #1. Here is sample output:
test=> set client_min_messages = 'log';
SET
test=> set log_statement = 'mod';
SET
test=> select 1;
?column?
----------
1
(1 row)
test=> update test set x=1;
LOG: statement: update test set x=1;
ERROR: relation "test" does not exist
test=> update test set x=1;
LOG: statement: update test set x=1;
ERROR: relation "test" does not exist
test=> copy test from '/tmp/x';
LOG: statement: copy test from '/tmp/x';
ERROR: relation "test" does not exist
test=> copy test to '/tmp/x';
ERROR: relation "test" does not exist
test=> prepare xx as select 1;
PREPARE
test=> prepare xx as update x set y=1;
LOG: statement: prepare xx as update x set y=1;
ERROR: relation "x" does not exist
test=> explain analyze select 1;;
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.006..0.007 rows=1 loops=1)
Total runtime: 0.046 ms
(2 rows)
test=> explain analyze update test set x=1;
LOG: statement: explain analyze update test set x=1;
ERROR: relation "test" does not exist
test=> explain update test set x=1;
ERROR: relation "test" does not exist
It checks PREPARE and EXECUTE ANALYZE too. The log_statement values are
'none', 'mod', 'ddl', and 'all'. For 'all', it prints before the query
is parsed, and for ddl/mod, it does it right after parsing using the
node tag (or command tag for CREATE/ALTER/DROP), so any non-parse errors
will print after the log line.
'SELECT foo()' in a SQL function returning a rowtype, to simply pass
back the results of another function returning the same rowtype.
However, that hasn't actually worked in many years. Now it works again.
results with tuples as ordinary varlena Datums. This commit does not
in itself do much for us, except eliminate the horrid memory leak
associated with evaluation of whole-row variables. However, it lays the
groundwork for allowing composite types as table columns, and perhaps
some other useful features as well. Per my proposal of a few days ago.
boxes. Change interface to user-defined GiST support methods union and
picksplit. Now instead of bytea struct it used special GistEntryVector
structure.
is measured in kilobytes and checked against actual physical execution
stack depth, as per my proposal of 30-Dec. This gives us a fairly
bulletproof defense against crashing due to runaway recursive functions.
remove separate implementation of ALTER TABLE SET WITHOUT OIDS in favor
of doing a regular DROP. Also, cause CREATE TABLE to account completely
correctly for the inheritance status of the OID column. This fixes
problems with dropping OID columns that have dependencies, as noted by
Christopher Kings-Lynne, as well as making sure that you can't drop an
OID column that was inherited from a parent.
listen_addresses parameter, as per recent discussion. The default behavior
is now to listen on localhost, which eliminates the need for the -i
postmaster switch in many scenarios.
Andrew Dunstan
is done at creation time for plpgsql functions. Improve createlang and
droplang to support adding/dropping validators for PLs. Initial steps
towards producing a syntax error position from plpgsql syntax errors
(this part is a work in progress, and will change depending on outcome
of current discussions).
of fighting it, avoid hard-wired (and wrong) assumption about max length
of prefix, cause %l to actually work as documented, don't compute data
we may not need.
so that the 'val' is computed only once, per recent discussion. The
speedup is not much when 'val' is just a simple variable, but could be
significant for larger expressions. More importantly this avoids issues
with multiple evaluations of a volatile 'val', and it allows the CASE
expression to be reverse-listed in its original form by ruleutils.c.
directly to the appropriate per-node execution function, using a function
pointer stored by ExecInitExpr. This speeds things up by eliminating one
level of function call. The function-pointer technique also enables further
small improvements such as only making one-time tests once (and then
changing the function pointer). Overall this seems to gain about 10%
on evaluation of simple expressions, which isn't earthshaking but seems
a worthwhile gain for a relatively small hack. Per recent discussion
on pghackers.
implemented casts to varchar and bpchar using a cast-to-text function.
This is a holdover from before we had pg_cast; it now makes more sense
to just list these casts in pg_cast. While at it, add pg_cast entries
for the other direction (casts from varchar/bpchar) where feasible.
types. Update the regression tests and the documentation to reflect
this. Remove the UNSAFE_FLOATS #ifdef.
This is only half the story: we still unconditionally reject
floating point operations that result in +/- infinity. See
recent thread on -hackers for more information.
initialization of stats process under EXEC_BACKEND.
[A cleaner, rationalized approach to stat/backend/SSDataBase child
processes under EXEC_BACKEND is on my TODO list. However this patch
takes care of immediate concerns (ie. stats test now passes under
win32)]
Claudio Natoli
#log_line_prefix = '' # e.g. '<%u%%%d> '
# %u=user name %d=database name
# %r=remote host and port
# %p=PID %t=timestamp %i=command tag
# %c=session id %l=session line number
# %s=session start timestamp
# %x=stop here in non-session processes
# %%='%'
Andrew Dunstan
+extern Oid SPI_getargtypeid(void *plan, int argIndex);
+extern int SPI_getargcount(void *plan);
+extern bool SPI_is_cursor_plan(void *plan);
Thomas Hallgren
float8 types. This begins the deprecation of this feature: in 7.6,
this input will be rejected.
Also added a new error code for warnings about deprecated features,
and updated the regression tests.
* Changes incorrect CYGWIN defines to __CYGWIN__
* Some localtime returns NULL checks (when unchecked cause SEGVs under
Win32
regression tests)
* Rationalized CreateSharedMemoryAndSemaphores and
AttachSharedMemoryAndSemaphores (Bruce, I finally remembered to do it);
requires attention.
Claudio Natoli
and FreeDir routines modeled on the existing AllocateFile/FreeFile.
Like the latter, these routines will avoid failing on EMFILE/ENFILE
conditions whenever possible, and will prevent leakage of directory
descriptors if an elog() occurs while one is open.
Also, reduce PANIC to ERROR in MoveOfflineLogs() --- this is not
critical code and there is no reason to force a DB restart on failure.
All per recent trouble report from Olivier Hubaut.
number of openable files and the number already opened. This eliminates
depending on sysconf(_SC_OPEN_MAX), and allows much saner behavior on
platforms where open-file slots are used up by semaphores.
logically belongs. Arrange to update the _NSGetArgv() copy of the argv
pointer on Darwin. (It seems likely that other NeXT-derived platforms
also have an _NSGetArgv() problem, but until we have some reports I'll
just make this #ifdef __darwin__.)
problem, per previous discussion. Make some additional changes to
centralize the knowledge of just how identifier downcasing is done,
in hopes of simplifying any future tweaking in this area.
applied, deadlock detection and statement_timeout now works.
The file timer.c goes into src/backend/port/win32/.
The patch also removes two lines of "printf debugging" accidentally left
in pqsignal.h, in the console control handler.
Magnus Hagander
corner cases that could stand improvement, but it does all the basic
stuff. A byproduct is that the selectivity routines are no longer
constrained to working on simple Vars; we might in future be able to
improve the behavior for subexpressions that don't match indexes.
This commit teaches ANALYZE to store such stats in pg_statistic, but
nothing is done yet about teaching the planner to use 'em.
Also, repair longstanding oversight in separate ANALYZE command: it
updated the pg_class.relpages and reltuples counts for the table proper,
but not for indexes.
vs. timestamptz. This allows use of indexes for expressions like
datecol >= date 'today' - interval '1 month'
which were formerly not indexable without casting the righthand side
down from timestamp to date.
indexes, it seems like we ought to put another layer of indirection
between the compute_stats functions and the actual data storage. This
would allow us to compute the values on-the-fly, for example.
for already empty buffers because their buffer tag was not cleard out
when the buffers have been invalidated before.
Also removed the misnamed BM_FREE bufhdr flag and replaced the checks,
which effectively ask if the buffer is unpinned, with checks against the
refcount field.
Jan
wit: Add a header record to each WAL segment file so that it can be reliably
identified. Avoid splitting WAL records across segment files (this is not
strictly necessary, but makes it simpler to incorporate the header records).
Make WAL entries for file creation, deletion, and truncation (as foreseen but
never implemented by Vadim). Also, add support for making XLOG_SEG_SIZE
configurable at compile time, similarly to BLCKSZ. Fix a couple bugs I
introduced in WAL replay during recent smgr API changes. initdb is forced
due to changes in pg_control contents.
variable.
Remove thread locking for non-thread-safe functions, instead throw a
compile error.
Platforms will have to re-run tools/thread to record their thread
safety.
subroutine in src/port/pgsleep.c. Remove platform dependencies from
miscadmin.h and put them in port.h where they belong. Extend recent
vacuum cost-based-delay patch to apply to VACUUM FULL, ANALYZE, and
non-btree index vacuuming.
By the way, where is the documentation for the cost-based-delay patch?
the relcache, and so the notion of 'blind write' is gone. This should
improve efficiency in bgwriter and background checkpoint processes.
Internal restructuring in md.c to remove the not-very-useful array of
MdfdVec objects --- might as well just use pointers.
Also remove the long-dead 'persistent main memory' storage manager (mm.c),
since it seems quite unlikely to ever get resurrected.
Natoli and Bruce Momjian (and some cosmetic fixes from Neil Conway).
Changes:
- remove duplicate signal definitions from pqsignal.h
- replace pqkill() with kill() and redefine kill() in Win32
- use ereport() in place of fprintf() in some error handling in
pqsignal.c
- export pg_queue_signal() and make use of it where necessary
- add a console control handler for Ctrl-C and similar handling
on Win32
- do WaitForSingleObjectEx() in CHECK_FOR_INTERRUPTS() on Win32;
query cancelling should now work on Win32
- various other fixes and cleanups
Make btree index creation and initial validation of foreign-key constraints
use maintenance_work_mem rather than work_mem as their memory limit.
Add some code to guc.c to allow these variables to be referenced by their
old names in SHOW and SET commands, for backwards compatibility.
a series of numbers, optionally using an explicit step size other
than the default value (one). Use function in the information_schema
to replace hard-wired knowledge of INDEX_MAX_KEYS. initdb forced due
to pg_proc change. Documentation update still needed -- will be
committed separately.
valgrind: a buffer passed to strncmp() had to be NUL-terminated. Original
report and patch from Dennis Bjorkland, some cleanup by Andrew Dunstan,
and finally some editorializing from Neil Conway.
* configure + Makefile changes
* shared memory attaching in EXEC_BACKEND case (+ minor fix for apparent
cygwin bug under cygwin/EXEC_BACKEND case only)
* PATH env var separator differences
* missing win32 rand functions added
* placeholder replacements for sync etc under port.h
To those who are really interested, and there are a few of you: the attached
patch + file will allow the source base to be compiled (and, for some
definition, "run") under MingW, with the following caveats (I wanted to
first properly fix all but the last of these, but y'all won't quit asking
for a patch :-):
* child death: SIGCHLD not yet sent, so as a minimum, you'll need to
put in some sort of delay after StartupDatabase, and handle setting
StartupPID to 0 etc (ie. the stuff the reaper() signal function is supposed
to do)
* dirmod.c: comment out the elog calls
* dfmgr.c: some hackage required to substitute_libpath_macro
* slru/xact.c: comment out the errno checking after the readdir
(fixed by next version of MingW)
Again, this is only if you *really* want to see postgres compile and start,
and is a nice leg-up for working on the other Win32 TODO list items. Just
don't expect too much else from it at this point...
Claudio Natoli
against the latest shapshot. It also includes the replacement of kill()
with pqkill() and sigsetmask() with pqsigsetmask().
Passes all tests fine on my linux machine once applied. Still doesn't
link completely on Win32 - there are a few things still required. But
much closer than before.
At Bruce's request, I'm goint to write up a README file about the method
of signals delivery chosen and why the others were rejected (basically a
summary of the mailinglist discussions). I'll finish that up once/if the
patch is accepted.
Magnus Hagander
PostmasterPid variable, which gets set (early) in PostmasterMain
getppid would not be the postmaster?
[fork/exec] Implements processCancelRequest by keeping an array of
pid/cancel_key structs in shared mem
[fork/exec] Moves AttachSharedMemoryAndSemaphores call for backends into
SubPostmasterMain
[win32] Implements reaper/waitpid by keeping an arrays of children
pids,handles in postmaster local mem
- this item is largely untested, for reasons which should be
obvious, but appears sound
[win32/all] Added extern for pgpipe in Win32 case, and changed the second
pipe call (which seems to have been missed earlier) to pgpipe
[win32] #define'd ftruncate to chsize in the Win32 case
[win32] PG_USLEEP for Win32 has a misplaced paren. Fixed.
[win32] DLLIMPORT handling for MingW case
Claudio Natoli
done by the background writer between writing dirty blocks and
napping.
none (default) no action
sync bgwriter calls smgrsync() causing a sync(2)
A global sync() is only good on dedicated database servers, so
more flush methods should be added in the future.
Jan
that it's good to join where there are join clauses rather than where there
are not. Also enable it to generate bushy plans at need, so that it doesn't
fail in the presence of multiple IN clauses containing sub-joins. These
changes appear to improve the behavior enough that we can substantially reduce
the default pool size and generations count, thereby decreasing the runtime,
and yet get as good or better plans as we were getting in 7.4. Consequently,
adjust the default GEQO parameters. I also modified the way geqo_effort is
used so that it affects both population size and number of generations;
it's now useful as a single control to adjust the GEQO runtime-vs-plan-quality
tradeoff. Bump geqo_threshold to 12, since even with these changes GEQO
seems to be slower than the regular planner at 11 relations.
patch: a 3-value enum was mistakenly assigned directly to a 'bool'
in transformCreateStmt(). Along the way, change makeObjectName()
to be static, as it isn't used outside analyze.c
when scanning a table that we need all the columns from. In case of
SELECT INTO, we have to check that the hasoids flag matches the desired
output type, too. Per report from Mike Mascari.
default value for geqo_effort is supposed to be 40, not 1. The actual
'genetic' component of the GEQO algorithm has been practically disabled
since 7.1 because of this mistake. Improve documentation while at it.
should not be too eager to reject paths involving unknown schemas, since
it can't really tell whether the schemas exist in the target database.
(Also, when reading pg_dumpall output, it could be that the schemas
don't exist yet, but eventually will.) ALTER USER SET has a similar issue.
So, reduce the normal ERROR to a NOTICE when checking search_path values
for these commands. Supporting this requires changing the API for GUC
assign_hook functions, which causes the patch to touch a lot of places,
but the changes are conceptually trivial.
dynamically loaded C functions). Some limited testing suggests that
this puts the lookup speed for external functions just about on par
with built-in functions. Per discussion with Eric Ridge.
check instead of hardwiring assumptions that only certain plan node types
can appear at the places where we are testing. This was always a pretty
fragile assumption, and it turns out to be broken in 7.4 for certain cases
involving IN-subselect tests that need type coercion.
Also, modify code that builds finished Plan tree so that node types that
don't do projection always copy their input node's targetlist, rather than
having the tlist passed in from the caller. The old method makes it too
easy to write broken code that thinks it can modify the tlist when it
cannot.
tuptoaster.c --- fields that are compressed in-line are not a reason
to invoke the toaster. Along the way, add a couple more htup.h macros
to eliminate confusing negated tests, and get rid of the already
vestigial TUPLE_TOASTER_ACTIVE symbol.
for sure...). Rather than relying on the query context of a rangetable
entry to identify what permissions it wants checked, store a full AclMode
mask in each RTE, and check exactly those bits. This allows an RTE
specifying, say, INSERT privilege on a view to be copied into a derived
UPDATE query without changing meaning. Per recent discussion thread.
initdb forced due to change of stored rule representation.
intended to allow application authors to insulate themselves from
changes to the default value of 'default_with_oids' in future releases
of PostgreSQL.
This patch also fixes a bug in the earlier implementation of the
'default_with_oids' GUC variable: code in gram.y should not examine
the value of GUC variables directly due to synchronization issues.
parameters to be declared with names. pg_proc has a column to store
names, and CREATE FUNCTION can insert data into it, but that's all as
yet. I need to do more work on the pg_dump and plpgsql portions of the
patch before committing those, but I thought I'd get the bulky changes
in before the tree drifts under me.
initdb forced due to pg_proc change.
BackendFork/SSDataBase/pgstat) startup, to allow fork/exec calls to
closely mimic (the soon to be provided) Win32 CreateProcess equivalent
calls.
Claudio Natoli
- Update comment in IsReservedName() to the present day
- Improve some variable & function names in commands/vacuum.c. I
was planning to rewrite this to avoid lappend(), but since I
still intend to do the list rewrite, there's no need for that.
- Update some smgr comments which seemed to imply that we still
forced all dirty pages to disk at commit-time.
- Replace some #ifdef DIAGNOSTIC code with assertions.
- Make the distinction between OS-level file descriptors and
virtual file descriptors a little clearer in a few comments
- Other minor comment improvements in the smgr code
regular qpqual ('filter condition'), add special-purpose code to
nodeIndexscan.c to recheck them. This ends being almost no net addition
of code, because the removal of planner code balances out the extra
executor code, but it is significantly more efficient when a lossy
operator is involved in an OR indexscan. The old implementation had
to recheck the entire indexqual in such cases.
with index qual clauses in the Path representation. This saves a little
work during createplan and (probably more importantly) allows reuse of
cached selectivity estimates during indexscan planning. Also fix latent
bug: wrong plan would have been generated for a 'special operator' used
in a nestloop-inner-indexscan join qual, because the special operator
would not have gotten into the list of quals to recheck. This bug is
only latent because at present the special-operator code could never
trigger on a join qual, but sooner or later someone will want to do it.
join conditions in which each OR subclause includes a constraint on
the same relation. This implements the other useful side-effect of
conversion to CNF format, without its unpleasant side-effects. As
per pghackers discussion of a few weeks ago.
run the data through cpp, and we know of at least one platform where
unusual cpp behavior breaks the process. So remove the cpp step,
and make consequent simplifications.
teaching the latter to accept either RestrictInfo nodes or bare
clause expressions; and cache the selectivity result in the RestrictInfo
node when possible. This extends the caching behavior of approx_selectivity
to many more contexts, and should reduce duplicate selectivity
calculations.
first time generate an OR indexscan for a two-column index when the WHERE
condition is like 'col1 = foo AND (col2 = bar OR col2 = baz)' --- before,
the OR had to be on the first column of the index or we'd not notice the
possibility of using it. Some progress towards extracting OR indexscans
from subclauses of an OR that references multiple relations, too, although
this code is #ifdef'd out because it needs more work.
fields: now they are valid whenever the clause is a binary opclause,
not only when it is a potential join clause (there is a new boolean
field canjoin to signal the latter condition). This lets us avoid
recomputing the relid sets over and over while examining indexes.
Still more work to do to make this as useful as it could be, because
there are places that could use the info but don't have access to the
RestrictInfo node.
about whether it is applied before or after eval_const_expressions().
I believe there were some corner cases where the system would fail to
recognize that a partial index is applicable because of the previous
inconsistency. Store normal rather than 'implicit AND' representations
of constraints and index predicates in the catalogs.
initdb forced due to representation change of constraints/predicates.
instruction in the s_lock() wait loop, and use test before test-and-set
in TAS() macro to avoid unnecessary bus traffic. Patch from Manfred
Spraul, reworked a bit by Tom.
> > needed, and other people in the past asked about it too.
>
> It is in Oracle, but you aren't exactly on the spot. It should be
>
> IYYY - 4 digits ('2003')
> IYY - 3 digits ('003')
> IY - 2 digits ('03')
> I - 1 digit ('3')
Here is an updated patch that does that.
Kurt Roeckx
that were broken, try to make layout of s_lock.h entries consistent,
use HAVE_SPINLOCKS in preference to HAS_TEST_AND_SET everywhere outside
s_lock.h itself.
> Attached is a patch that addressed all the discussed issues
> that did not break backward compatability, including the
> ability to output ISO-8601 compliant intervals by setting
> datestyle to iso8601basic.
to step more than one entry after descending the search tree to arrive at
the correct place to start the scan. This can improve the behavior
substantially when there are many entries equal to the chosen boundary
value. Per suggestion from Dmitry Tkach, 14-Jul-03.
a) ones that are 100% backward (such as the comment about
outputting this format)
and
b) ones that aren't (such as deprecating the current
postgresql shorthand of
'1Y1M'::interval = 1 year 1 minute
in favor of the ISO-8601
'P1Y1M'::interval = 1 year 1 month.
Attached is a patch that addressed all the discussed issues that
did not break backward compatability, including the ability to
output ISO-8601 compliant intervals by setting datestyle to
iso8601basic.
Interval values can now be written as ISO 8601 time intervals, using
the "Format with time-unit designators". This format always starts with
the character 'P', followed by a string of values followed
by single character time-unit designators. A 'T' separates the date and
time parts of the interval.
Ron Mayer
shut down cleanly if the plan node is ReScanned before the SRFs are run
to completion. This fixes the problem for SQL-language functions, but
still need work on functions using the SRF_XXX() macros.
some concurrent changes Jan was making to the bufmgr. Here's an
updated version of the patch -- it should apply cleanly to CVS
HEAD and passes the regression tests.
This patch makes the following changes:
- remove the UnlockAndReleaseBuffer() and UnlockAndWriteBuffer()
macros, and replace uses of them with calls to the appropriate
functions.
- remove a bunch of #ifdef BMTRACE code: it is ugly & broken
(i.e. it doesn't compile)
- make BufferReplace() return a bool, not an int
- cleanup some logic in bufmgr.c; should be functionality
equivalent to the previous code, just cleaner now
- remove the BM_PRIVATE flag as it is unused
- improve a few comments, etc.
to certain compile-time options (FUNC_MAX_ARGS, INDEX_MAX_KEYS,
NAMEDATALEN, BLCKSZ, HAVE_INT64_TIMESTAMP). Also added "category",
"short_desc", and "extra_desc" to the pg_settings view. Per recent
discussion here:
http://archives.postgresql.org/pgsql-patches/2003-11/msg00363.php
and hash bucket-size estimation. Issue has been there awhile but is more
critical in 7.4 because it affects varchar columns. Per report from
Greg Stark.
on 64-bit Solaris. Use a non-system-dependent datatype for UsedShmemSegID,
namely unsigned long (which we were already assuming could hold a shmem
key anyway, cf RecordSharedMemoryInLockFile).
proposal for eventually deprecating OIDs on user tables that I posted
earlier to pgsql-hackers. pg_dump now always specifies WITH OIDS or
WITHOUT OIDS when dumping a table. The documentation has been updated.
Neil Conway
method control structure, or a table of control structures.
. Use type LOCKMASK where an int is not a counter.
. Get rid of INVALID_TABLEID, use INVALID_LOCKMETHOD instead.
. Use INVALID_LOCKMETHOD instead of (LOCKMETHOD) NULL, because
LOCKMETHOD is not a pointer.
. Define and use macro LockMethodIsValid.
. Rename LOCKMETHOD to LOCKMETHODID.
. Remove global variable LongTermTableId in lmgr.c, because it is
never used.
. Make LockTableId static in lmgr.c, because it is used nowhere else.
Why not remove it and use DEFAULT_LOCKMETHOD?
. Rename the lock method control structure from LOCKMETHODTABLE to
LockMethodData. Introduce a pointer type named LockMethod.
. Remove elog(FATAL) after InitLockTable() call in
CreateSharedMemoryAndSemaphores(), because if something goes wrong,
there is elog(FATAL) in LockMethodTableInit(), and if this doesn't
help, an elog(ERROR) in InitLockTable() is promoted to FATAL.
. Make InitLockTable() void, because its only caller does not use its
return value any more.
. Rename variables in lock.c to avoid statements like
LockMethodTable[NumLockMethods] = lockMethodTable;
lockMethodTable = LockMethodTable[lockmethod];
. Change LOCKMETHODID type to uint16 to fit into struct LOCKTAG.
. Remove static variables BITS_OFF and BITS_ON from lock.c, because
I agree to this doubt:
* XXX is a fetch from a static array really faster than a shift?
. Define and use macros LOCKBIT_ON/OFF.
Manfred Koizar
to note:
1) arttype is numeric. I thought this was the best way of allowing
arbitarily large factorials, even though factorial(2^63) is a large
number. Happy to change to integers if this is overkill.
2) since we're accepting numeric arguments, the patch tests for floats.
If a numeric is passed with non-zero decimal portion, an error is raised
since (from memory) they are undefined.
Gavin Sherry
the hashclauses field of the parent HashJoin. This avoids problems with
duplicated links to SubPlans in hash clauses, as per report from
Andrew Holm-Hansen.
large objects. Dump all these in pg_dump; also add code to pg_dump
user-defined conversions. Make psql's large object code rely on
the backend for inserting/deleting LOB comments, instead of trying to
hack pg_description directly. Documentation and regression tests added.
Christopher Kings-Lynne, code reviewed by Tom
This first part of the background writer does no syncing at all.
It's only purpose is to keep the LRU heads clean so that regular
backends seldom to never have to call write().
Jan
pghackers proposal of 8-Nov. All the existing cross-type comparison
operators (int2/int4/int8 and float4/float8) have appropriate support.
The original proposal of storing the right-hand-side datatype as part of
the primary key for pg_amop and pg_amproc got modified a bit in the event;
it is easier to store zero as the 'default' case and only store a nonzero
when the operator is actually cross-type. Along the way, remove the
long-since-defunct bigbox_ops operator class.
Remove the 'strategy map' code, which was a large amount of mechanism
that no longer had any use except reverse-mapping from procedure OID to
strategy number. Passing the strategy number to the index AM in the
first place is simpler and faster.
This is a preliminary step in planned support for cross-datatype index
operations. I'm committing it now since the ScanKeyEntryInitialize()
API change touches quite a lot of files, and I want to commit those
changes before the tree drifts under me.
ACL array, and force languages to be treated as owned by the bootstrap
user ID. (pg_language should have a lanowner column, but until it does
this will have to do as a workaround.)
of the entries used to be zero, which I think I had deliberately done in
the name of saving cycles during ANALYZE, but it was really a rather
foolish decision. Some of the more complex views in information_schema
were getting really bad plans for lack of statistics on the columns they
were joining over.
I'm not forcing an initdb for this, but I think there will be one soon
anyway to repair some bugs in the information_schema views.
protocol, per report from Igor Shevchenko. NOTIFY thought it could
do its thing if transaction blockState is TBLOCK_DEFAULT, but in
reality it had better check the low-level transaction state is
TRANS_DEFAULT as well. Formerly it was not possible to wait for the
client in a state where the first is true and the second is not ...
but now we can have such a state. Minor cleanup in StartTransaction()
as well.
a single LEFT JOIN query instead of firing the check trigger for each
row individually. Stephan Szabo, with some kibitzing from Tom Lane and
Jan Wieck.
with required outer parentheses. Breakage seems to be leftover from
domain-constraint patches. This could be smarter about suppressing
extra parens, but at this stage of the release cycle I want certainty
not cuteness.
discussion on pgsql-hackers: in READ COMMITTED mode we just have to force
a QuerySnapshot update in the trigger, but in SERIALIZABLE mode we have
to run the scan under a current snapshot and then complain if any rows
would be updated/deleted that are not visible in the transaction snapshot.
invalid (has the wrong magic number) until the build is entirely
complete. This turns out to cost no additional writes in the normal
case, since we were rewriting the metapage at the end of the process
anyway. In normal scenarios there's no real gain in security, because
a failed index build would roll back the transaction leaving an unused
index file, but for rebuilding shared system indexes this seems to add
some useful protection.
post-abort cleanup hooks. I'm surprised that we have not needed this
already, but I need it now to fix a plpgsql problem, and the usefulness
for other dynamically loaded modules seems obvious.
to allow es_snapshot to be set to SnapshotNow rather than a query snapshot.
This solves a bug reported by Wade Klaver, wherein triggers fired as a
result of RI cascade updates could misbehave.
now able to cope with assigning new relfilenode values to nailed-in-cache
indexes, so they can be reindexed using the fully crash-safe method. This
leaves only shared system indexes as special cases. Remove the 'index
deactivation' code, since it provides no useful protection in the shared-
index case. Require reindexing of shared indexes to be done in standalone
mode, but remove other restrictions on REINDEX. -P (IgnoreSystemIndexes)
now prevents using indexes for lookups, but does not disable index updates.
It is therefore safe to allow from PGOPTIONS. Upshot: reindexing system catalogs
can be done without a standalone backend for all cases except
shared catalogs.
not just MAXALIGN boundaries. This makes a noticeable difference in
the speed of transfers to and from kernel space, at least on recent
Pentiums, and might help other CPUs too. We should look at making
this happen for local buffers and buffile.c too. Patch from Manfred Spraul.
Per recent discussion, this does not work because other backends can't
reliably see tuples in a temp table and so cannot run the RI checks
correctly. Seems better to disallow this case than go back to accessing
temp tables through shared buffers. Also, disallow FK references to
ON COMMIT DELETE ROWS tables. We already caught this problem for normal
TRUNCATE, but the path used by ON COMMIT didn't check.
really general fix might be difficult, I believe the only case where
AtCommit_Notify could see an uncommitted tuple is where the other guy
has just unlistened and not yet committed. The best solution seems to
be to just skip updating that tuple, on the assumption that the other
guy does not want to hear about the notification anyway. This is not
perfect --- if the other guy rolls back his unlisten instead of committing,
then he really should have gotten this notify. But to do that, we'd have
to wait to see if he commits or not, or make UNLISTEN hold exclusive lock
on pg_listener until commit. Either of these answers is deadlock-prone,
not to mention horrible for interactive performance. Do it this way
for now. (What happened to that project to do LISTEN/NOTIFY in memory
with no table, anyway?)
* use non-*_r function names if they are all thread-safe
* (NEED_REENTRANT_FUNCS=no)
* use *_r functions if they exist (configure test)
* do our own locking and copying of non-threadsafe functions
New to this patch is the last option.
sequence every time it's called is bogus --- it interferes with user
control over the seed, and actually decreases randomness overall
(because a seed based on time(NULL) is pretty predictable). If you really
want a reproducible result from geqo, do 'set seed = 0' before planning
a query.
o allow configure to see include/port/win32 include files
o add matching Win32 accept() prototype
o allow pg_id to compile with native Win32 API
o fix invalide mbvalidate() function calls (existing bug)
o allow /scripts to compile with native Win32 API
o add win32.c to Win32 compiles (already in *.mak files)
pghackers. This fixes the problem recently reported by Markus KrÌutner
(hash bucket split corrupts the state of scans being done concurrently),
and I believe it also fixes all the known problems with deadlocks in
hash index operations. Hash indexes are still not really ready for prime
time (since they aren't WAL-logged), but this is a step forward.
layout; therefore, this change forces REINDEX of hash indexes (though
not a full initdb). Widen hashm_ntuples to double so that hash space
management doesn't get confused by more than 4G entries; enlarge the
allowed number of free-space-bitmap pages; replace the useless bshift
field with a useful bmshift field; eliminate 4 bytes of wasted space
in the per-page special area.
scheme. A pleasant side effect is that it is *much* faster when deleting
a large fraction of the indexed tuples, because of elimination of
redundant hash_step activity induced by hash_adjscans. Various other
continuing code cleanup.
yet). Fix a couple of bugs that would only appear if multiple bitmap pages
are used, including a buffer reference leak and incorrect computation of bit
indexes. Get rid of 'overflow address' concept, which accomplished nothing
except obfuscating the code and creating a risk of failure due to limited
range of offset field. Rename some misleadingly-named fields and routines,
and improve documentation.
SQLSTATE error codes required by SQL99 (invalid format, datetime field
overflow, interval field overflow, invalid time zone displacement value).
Also emit a HINT about DateStyle in cases where it seems appropriate.
Per recent gripes.