ago. This should give significantly better results when the density of
live tuples is not uniform throughout a table. Manfred Koizar, with
minor kibitzing from Tom Lane.
(SIGUSR1, which we have not been using recently) instead of piggybacking
on SIGUSR2-driven NOTIFY processing. This has several good results:
the processing needed to drain the sinval queue is a lot less than the
processing needed to answer a NOTIFY; there's less contention since we
don't have a bunch of backends all trying to acquire exclusive lock on
pg_listener; backends that are sitting inside a transaction block can
still drain the queue, whereas NOTIFY processing can't run if there's
an open transaction block. (This last is a fairly serious issue that
I don't think we ever recognized before --- with clients like JDBC that
tend to sit with open transaction blocks, the sinval queue draining
mechanism never really worked as intended, probably resulting in a lot
of useless cache-reset overhead.) This is the last of several proposed
changes in response to Philip Warner's recent report of sinval-induced
performance problems.
to ExclusiveLock. This still serializes the operations of this module,
but doesn't conflict with concurrent ANALYZE operations. Per trouble
report from Philip Warner a few weeks ago.
functions. This allows these functions to work correctly with Unicode and
other multibyte encodings. Per prior discussion.
Also, revert my earlier change to move installation path mashing from
Makefile.global to configure. Turns out not to work well because configure
script is working with unexpanded variables, and so fails to match in
cases where it should match.
several different module Makefiles with it. Also, do any adjustment
of installation paths during configure, rather than every time Makefile.global
is read.
and should do now that we control our own destiny for timezone handling,
but this commit gets the bulk of the picayune diffs in place.
Magnus Hagander and Tom Lane.
timezone code and other places.
Remove elog() calls from find_my_exec; do fprintf(stderr) instead. We
can then remove the exec.c handling in the makefile because it doesn't
have to be built to suppress elog calls.
a variant of the function for the 'numeric' datatype; it would be possible
to add additional variants for other datatypes, but I haven't done so yet.
This commit includes regression tests and minimal documentation; if we
want developers to actually use this function in applications, we'll
probably need to document what it does more fully.
by Ken Ashcraft's report. I think there is no actual bug here since if
the int32 value does wrap a little bit, palloc will still reject it.
Still it's better that the code be obviously correct.
find_my_exec/find_other_exec(). Remove passing of progname to these
functions as they can find that out from argv[0], which they already
have.
Make get_progname return const char *, and update all progname variables
to be const char *.
all the code that looks for other binaries. I move FindExec into
port/exec.c (and renamed it to find_my_binary()). I also added
find_other_binary that looks for another binary in the same directory as
the calling program, and checks the version string.
The only behavior change was that initdb and pg_dump would look in the
hard-coded bindir directory if it can't find the requested binary in the
same directory as the caller. The new code throws an error. The old
behavior seemed too error prone for version mismatches.
permissions tests in about the same amount of code as before. Exactly what
the GRANT/REVOKE code ought to be doing is still up for debate, but this
should be helpful in any case, and it already solves an efficiency problem
in executor startup.
Didier Moens. Bug is new in 7.4, and was caused by not updating everyplace
I should've when replacing locParam markers by allParam.
Add a regression test to catch related errors in future.
rather than allowing them only in a few special cases as before. In
particular you can now pass a ROW() construct to a function that accepts
a rowtype parameter. Internal generation of RowExprs fixes a number of
corner cases that used to not work very well, such as referencing the
whole-row result of a JOIN or subquery. This represents a further step in
the work I started a month or so back to make rowtype values into
first-class citizens.
This simplifies and speeds up the reader by letting it get the representation
right the first time, rather than correcting it after-the-fact. Also,
after int and OID lists become separate node types per Neil's pending
patch, this will let us treat these lists as just plain Nodes instead
of requiring separate read/write macros the way we have now.
costing us lots more to maintain than it was worth. On shared tables
it was of exactly zero benefit because we couldn't trust it to be
up to date. On temp tables it sometimes saved an lseek, but not often
enough to be worth getting excited about. And the real problem was that
we forced an lseek on every relcache flush in order to update the field.
So all in all it seems best to lose the complexity.
in favor of using the REINDEX TABLE apparatus, which does the same thing
simpler and faster. Also, make TRUNCATE not use cluster.c at all, but
just assign a new relfilenode and REINDEX. This partially addresses
Hartmut Raschick's complaint from last December that 7.4's TRUNCATE is
an order of magnitude slower than prior releases. By getting rid of
a lot of unnecessary catalog updates, these changes buy back about a
factor of two (on my system). The remaining overhead seems associated
with creating and deleting storage files, which we may not be able to
do much about without abandoning transaction safety for TRUNCATE.
conversion of basic ASCII letters. Remove all uses of strcasecmp and
strncasecmp in favor of new functions pg_strcasecmp and pg_strncasecmp;
remove most but not all direct uses of toupper and tolower in favor of
pg_toupper and pg_tolower. These functions use the same notions of
case folding already developed for identifier case conversion. I left
the straight locale-based folding in place for situations where we are
just manipulating user data and not trying to match it to built-in
strings --- for example, the SQL upper() function is still locale
dependent. Perhaps this will prove not to be what's wanted, but at
the moment we can initdb and pass regression tests in Turkish locale.
modify. Also fix a passel of problems with ALTER TABLE CLUSTER ON:
failure to check that the index is safe to cluster on (or even belongs
to the indicated rel, or even exists), and failure to broadcast a relcache
flush event when changing an index's state.
* ALTER ... ADD COLUMN with defaults and NOT NULL constraints works per SQL
spec. A default is implemented by rewriting the table with the new value
stored in each row.
* ALTER COLUMN TYPE. You can change a column's datatype to anything you
want, so long as you can specify how to convert the old value. Rewrites
the table. (Possible future improvement: optimize no-op conversions such
as varchar(N) to varchar(N+1).)
* Multiple ALTER actions in a single ALTER TABLE command. You can perform
any number of column additions, type changes, and constraint additions with
only one pass over the table contents.
Basic documentation provided in ALTER TABLE ref page, but some more docs
work is needed.
Original patch from Rod Taylor, additional work from Tom Lane.
> Please find a attached a small patch that adds accessor functions
> for "aclitem" so that it is not an opaque datatype.
>
> I needed these functions to browse aclitems from user land. I can load
> them when necessary, but it seems to me that these accessors for a
> backend type belong to the backend, so I submit them.
>
> Fabien Coelho
for "aclitem" so that it is not an opaque datatype.
I needed these functions to browse aclitems from user land. I can load
them when necessary, but it seems to me that these accessors for a
backend type belong to the backend, so I submit them.
Fabien Coelho
the next are handled by ReleaseAndReadBuffer rather than separate
ReleaseBuffer and ReadBuffer calls. This cuts the number of acquisitions
of the BufMgrLock by a factor of 2 (possibly more, if an indexscan happens
to pull successive rows from the same heap page). Unfortunately this
doesn't seem enough to get us out of the recently discussed context-switch
storm problem, but it's surely worth doing anyway.
of whether we have successfully read data into a buffer; this makes the
error behavior a bit more transparent (IMHO anyway), and also makes it
work correctly for local buffers which don't use Start/TerminateBufferIO.
Collapse three separate functions for writing a shared buffer into one.
This overlaps a bit with cleanups that Neil proposed awhile back, but
seems not to have committed yet.
of VACUUM cases so that VACUUM requests don't affect the ARC state at all,
avoid corner case where BufferSync would uselessly rewrite a buffer that
no longer contains the page that was to be flushed. Make some minor
other cleanups in and around the bufmgr as well, such as moving PinBuffer
and UnpinBuffer into bufmgr.c where they really belong.
* removed a few redundant defines
* get_user_name safe under win32
* rationalized pipe read EOF for win32 (UPDATED PATCH USED)
* changed all backend instances of sleep() to pg_usleep
- except for the SLEEP_ON_ASSERT in assert.c, as it would exceed a
32-bit long [Note to patcher: If a SLEEP_ON_ASSERT of 2000 seconds is
acceptable, please replace with pg_usleep(2000000000L)]
I added a comment to that part of the code:
/*
* It would be nice to use pg_usleep() here, but only does 2000 sec
* or 33 minutes, which seems too short.
*/
sleep(1000000);
Claudio Natoli
o -Allow dump/load of CSV format
This adds new keywords to COPY and \copy:
CSV - enable CSV mode (comma separated variable)
QUOTE - specify quote character
ESCAPE - specify escape character
FORCE - force quoting of specified column
LITERAL - suppress null comparison for columns
Doc changes included. Regression updates coming from Andrew.
are sought first as local FROM columns, then as local SELECT-list aliases,
and finally as outer FROM columns; the former behavior made outer FROM
columns take precedence over aliases. This does not change spec
conformance because SQL99 allows only the first case anyway, and it seems
more useful and self-consistent. Per gripe from Dennis Bjorklund 2004-04-05.
It works on the principle of turning sockets into non-blocking, and then
emulate blocking behaviour on top of that, while allowing signals to
run. Signals are now implemented using an event instead of APCs, thus
getting rid of the issue of APCs not being compatible with "old style"
sockets functions.
It also moves the win32 specific code away from pqsignal.h/c into
port/win32, and also removes the "thread style workaround" of the APC
issue previously in place.
In order to make things work, a few things are also changed in pgstat.c:
1) There is now a separate pipe to the collector and the bufferer. This
is required because the pipe will otherwise only be signalled in one of
the processes when the postmaster goes down. The MS winsock code for
select() must have some kind of workaround for this behaviour, but I
have found no stable way of doing that. You really are not supposed to
use the same socket from more than one process (unless you use
WSADuplicateSocket(), in which case the docs specifically say that only
one will be flagged).
2) The check for "postmaster death" is moved into a separate select()
call after the main loop. The previous behaviour select():ed on the
postmaster pipe, while later explicitly saying "we do NOT check for
postmaster exit inside the loop".
The issue was that the code relies on the same select() call seeing both
the postmaster pipe *and* the pgstat pipe go away. This does not always
happen, and it appears that useing WSAEventSelect() makes it even more
common that it does not.
Since it's only called when the process exits, I don't think using a
separate select() call will have any significant impact on how the stats
collector works.
Magnus Hagander
"millennium" date part implementation in postgresql, both in the code
and the documentation, so that it conforms to the official definition.
If you do not agree with the official definition, please send your
complaint to "pope@vatican.org". I'm not responsible for them;-)
With the previous version, the centuries and millenniums had a wrong
number and started the wrong year. Moreover century number 0, which does
not exist in reality, lasted 200 years. Also, millennium number 0 lasted
2000 years.
If you want postgresql to have it's own definition of "century" and
"millennium" that does not conform to the one of the society, just give
them another name. I would suggest "pgCENTURY" and "pgMILLENNIUM";-)
IMO, if someone may use the options, it means that postgresql is used for
historical data, so it make sense to have an historical definition. Also,
I just want to divide the year by 100 or 1000, I can do that quite easily.
BACKWARD INCOMPATIBLE CHANGE
Fabien Coelho - coelho@cri.ensmp.fr
by the set operation, so that redundant sorts at higher levels can be
avoided. This was foreseen a good while back, but not done. Per request
from Karel Zak.
> >>with allowed values of "all, mod, ddl, none" with default "none".
OK, here is a patch that implements #1. Here is sample output:
test=> set client_min_messages = 'log';
SET
test=> set log_statement = 'mod';
SET
test=> select 1;
?column?
----------
1
(1 row)
test=> update test set x=1;
LOG: statement: update test set x=1;
ERROR: relation "test" does not exist
test=> update test set x=1;
LOG: statement: update test set x=1;
ERROR: relation "test" does not exist
test=> copy test from '/tmp/x';
LOG: statement: copy test from '/tmp/x';
ERROR: relation "test" does not exist
test=> copy test to '/tmp/x';
ERROR: relation "test" does not exist
test=> prepare xx as select 1;
PREPARE
test=> prepare xx as update x set y=1;
LOG: statement: prepare xx as update x set y=1;
ERROR: relation "x" does not exist
test=> explain analyze select 1;;
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.006..0.007 rows=1 loops=1)
Total runtime: 0.046 ms
(2 rows)
test=> explain analyze update test set x=1;
LOG: statement: explain analyze update test set x=1;
ERROR: relation "test" does not exist
test=> explain update test set x=1;
ERROR: relation "test" does not exist
It checks PREPARE and EXECUTE ANALYZE too. The log_statement values are
'none', 'mod', 'ddl', and 'all'. For 'all', it prints before the query
is parsed, and for ddl/mod, it does it right after parsing using the
node tag (or command tag for CREATE/ALTER/DROP), so any non-parse errors
will print after the log line.
That particular corner case is not exactly compelling, but given 7.4's
ability to discard redundant join clauses, it is possible for the situation
to arise from queries that are not so obviously silly. Per bug report
of 6-Apr-04.
the COPY NULL string:
test=> copy pg_language to '/tmp/x' with delimiter '|';
COPY
test=> copy pg_language to '/tmp/x' with delimiter '|' null '|x';
ERROR: COPY delimiter must not appear in the NULL specification
test=> copy pg_language from '/tmp/x' with delimiter '|' null '|x';
ERROR: COPY delimiter must not appear in the NULL specification
It also throws an error if it conflicts with the default NULL string:
test=> copy pg_language to '/tmp/x' with delimiter '\\';
ERROR: COPY delimiter must not appear in the NULL specification
test=> copy pg_language to '/tmp/x' with delimiter '\\' NULL 'x';
COPY
'SELECT foo()' in a SQL function returning a rowtype, to simply pass
back the results of another function returning the same rowtype.
However, that hasn't actually worked in many years. Now it works again.