Cases such as similarity('', '') produced a NaN result due to computing
0/0. Per discussion, make it return zero instead.
This appears to be the basic cause of bug #7867 from Michele Baravalle,
although it remains unclear why her installation doesn't think Cyrillic
letters are letters.
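A minimal sketch of the idea behind the fix (illustrative only; the real
computation lives in pg_trgm's similarity code):

    /*
     * Two empty inputs produce no trigrams at all, so the old formula
     * computed 0/0 and yielded NaN.  Guard the division and return 0 for
     * that case instead.
     */
    static float
    trgm_similarity(int shared, int len1, int len2)
    {
        if (len1 + len2 == 0)       /* e.g. similarity('', '') */
            return 0.0f;
        return ((float) shared) / ((float) (len1 + len2 - shared));
    }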
Back-patch to all active branches.
libpgcommon is a new static library to allow sharing code among the
various frontend programs and backend; this lets us eliminate duplicate
implementations of common routines. We avoid libpgport, because that's
intended as a place for porting issues; per discussion, it seems better
to keep them separate.
The first use case, and the only one implemented by this patch, is pg_malloc
and friends, which many frontend programs were already using.
At the same time, we can use this to provide palloc emulation functions
for the frontend; this way, some palloc-using files in the backend can
also be used by the frontend cleanly. To do this, we change palloc() in
the backend to be a function instead of a macro on top of
MemoryContextAlloc(). This was previously believed to cause loss of
performance, but this implementation has been tweaked by Tom and Andres
so that on modern compilers it provides a slight improvement over the
previous one.
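A rough sketch of the shape of the change (simplified; the real
declarations live in the backend and libpgcommon headers):

    /*
     * In the backend, palloc() is now a plain function over the current
     * memory context; in frontend programs, which have no memory contexts,
     * the emulation simply maps it onto pg_malloc()'s "malloc or die"
     * behavior.
     */
    #ifndef FRONTEND
    void *
    palloc(Size size)
    {
        return MemoryContextAlloc(CurrentMemoryContext, size);
    }
    #else
    void *
    palloc(Size size)
    {
        return pg_malloc(size);
    }
    #endif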
This lets us clean up some places that previously relied on localized hacks.
Most of the pg_malloc/palloc changes in this patch were authored by
Andres Freund. Zoltán Böszörményi also independently provided a form of
that. libpgcommon infrastructure was authored by Álvaro.
The previous coding supposed that the first differing bytes in two varlena
datums must have the same sign difference as their overall comparison
result. This is obviously bogus for text strings in non-C locales, and
probably wrong for numeric, and even for bytea I think it was wrong on
machines where char is signed. When the assumption failed, the function
could deliver a zero or negative penalty in situations where such a result
is quite ridiculous, leading the core GiST code to make very bad page-split
decisions.
To fix, take the absolute values of the byte-level differences. Also,
switch the code to using unsigned char not just char, so that the behavior
will be consistent whether char is signed or not.
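A simplified sketch of the corrected computation (illustrative, not the
exact btree_gist code):

    #include <stdlib.h>

    /*
     * Accumulate the absolute byte-level differences using unsigned chars,
     * so the penalty is non-negative regardless of the data and of whether
     * plain char is signed on this machine.
     */
    static int
    byte_penalty(const unsigned char *a, const unsigned char *b, int len)
    {
        int     result = 0;
        int     i;

        for (i = 0; i < len; i++)
            result += abs((int) a[i] - (int) b[i]);
        return result;
    }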
Per investigation of a trouble report from Tomas Vondra. Back-patch to all
supported branches.
gbt_var_bin_union() failed to do the right thing when the existing range
needed to be widened at both ends rather than just one end. This could
result in an invalid index in which keys that are present would not be
found by searches, because the searches would not think they need to
descend to the relevant leaf pages. This error affected all the varlena
datatypes supported by btree_gist (text, bytea, bit, numeric).
Per investigation of a trouble report from Tomas Vondra. (There is also
an issue in gbt_var_penalty(), but that should only result in inefficiency
not wrong answers. I'm committing this separately so that we have a git
state in which it can be tested that bad penalty results don't produce
invalid indexes.) Back-patch to all supported branches.
The new option specifies the length of the aggregation interval (in
seconds). It may be used only together with -l. With this option, the log
contains per-interval summary (number of transactions, min/max latency
and two additional fields useful for variance estimation).
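Assuming the two extra fields are the per-interval sum of latencies and sum
of squared latencies (an assumption here, not stated above), the variance
can be recovered like this:

    /* Illustrative only: per-interval mean and variance of latency. */
    static void
    interval_stats(long count, double lat_sum, double lat_sum_sq,
                   double *mean, double *variance)
    {
        *mean = lat_sum / count;
        *variance = lat_sum_sq / count - (*mean) * (*mean);
    }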
Patch contributed by Tomas Vondra, reviewed by Pavel Stehule. Slight
change by Tatsuo Ishii, suggested by Robert Haas to emit an error
message indicating that the option is not currently supported on
Windows.
Beyond a scale of 21474, the number of accounts exceeds the range of int4. Change the
initialization code to use bigint for account id columns when scale is large
enough, and switch to using int64s for the variables in pgbench code. The
threshold where we switch to bigints is set at 20000, because that's easier
to remember and document than 21474, and ensures that there is some headroom
when int4s are used.
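The arithmetic behind those numbers, as a sketch (macro and function names
hypothetical):

    /*
     * Each scale unit adds 100,000 rows to pgbench_accounts, and the int4
     * maximum is 2,147,483,647, so account ids overflow once the scale
     * exceeds 2147483647 / 100000 = 21474 (approximately).  Switching at
     * 20000 keeps some headroom.
     */
    #define SCALE_32BIT_THRESHOLD 20000

    static int
    account_ids_need_bigint(int scale)
    {
        return scale >= SCALE_32BIT_THRESHOLD;
    }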
Greg Smith, with various changes by Euler Taveira de Oliveira, Gurjeet
Singh and Satoshi Nagayasu.
This makes 9.3 -> 9.3 upgrades work when they cross the commit that
added persistent multixacts; early 9.3 pg_controldata did not have the
required oldestMultiXact line, and so would fail to upgrade.
per Bruce Momjian
When pg_upgrade can't find required pg_controldata information, report
_which_ cluster is failing, with this message:
The %s cluster lacks some required control information:
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE". UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.
Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.
The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid. Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates. This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed. pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.
Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header. This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.
Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This also causes additional WAL-logging: there was previously a single
WAL record for a locked tuple; now there are as many as there are updated
copies of the tuple.
With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.
As a bonus, we have also fixed the old behavior in which a subtransaction
that grabbed a stronger tuple lock than its parent (sub)transaction held on
a given tuple, and later aborted, caused the weaker lock to be lost.
Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane. There's probably room for several more tests.
There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time on it. Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.
This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
1290721684-sup-3951@alvh.no-ip.org
1294953201-sup-2099@alvh.no-ip.org
1320343602-sup-2290@alvh.no-ip.org
1339690386-sup-8927@alvh.no-ip.org
4FE5FF020200002500048A3D@gw.wicourts.gov
4FEAB90A0200002500048B7D@gw.wicourts.gov
With AtEOXact applied, --single-transaction makes pg_restore slower, and
has the potential to require lock table configuration, so remove the
argument.
Per suggestion from Tom.
In commit 71450d7fd6, we added code to inform
suitably-intelligent compilers that ereport() doesn't return if the elevel
is ERROR or higher. This patch extends that to elog(), and also fixes a
double-evaluation hazard that the previous commit created in ereport(),
as well as reducing the emitted code size.
The elog() improvement requires the compiler to support __VA_ARGS__, which
should be available in just about anything nowadays since it's required by
C99. But our minimum language baseline is still C89, so add a configure
test for that.
The previous commit assumed that ereport's elevel could be evaluated twice,
which isn't terribly safe --- there are already counterexamples in xlog.c.
On compilers that have __builtin_constant_p, we can use that to protect the
second test, since there's no possible optimization gain if the compiler
doesn't know the value of elevel. Otherwise, use a local variable inside
the macros to prevent double evaluation. The local-variable solution is
inferior because (a) it leads to useless code being emitted when elevel
isn't constant, and (b) it increases the optimization level needed for the
compiler to recognize that subsequent code is unreachable. But it seems
better than not teaching non-gcc compilers about unreachability at all.
Lastly, if the compiler has __builtin_unreachable(), we can use that
instead of abort(), resulting in a noticeable code savings since no
function call is actually emitted. However, it seems wise to do this only
in non-assert builds. In an assert build, continue to use abort(), so that
the behavior will be predictable and debuggable if the "impossible"
happens.
These changes involve making the ereport and elog macros emit do-while
statement blocks not just expressions, which forces small changes in
a few call sites.
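A simplified sketch of the resulting macro shape (not the exact definition
in elog.h):

    /*
     * The macro is a do-while block; __VA_ARGS__ forwards the format
     * arguments; the unreachability hint is guarded with
     * __builtin_constant_p so a non-constant elevel is never evaluated
     * twice; and pg_unreachable() expands to __builtin_unreachable() where
     * available (in non-assert builds) or abort() otherwise.
     */
    #define elog(elevel, ...) \
        do { \
            elog_start(__FILE__, __LINE__, PG_FUNCNAME_MACRO); \
            elog_finish(elevel, __VA_ARGS__); \
            if (__builtin_constant_p(elevel) && (elevel) >= ERROR) \
                pg_unreachable(); \
        } while (0)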
Andres Freund, Tom Lane, Heikki Linnakangas
This is now used by ecpg tests, and not clobbered by pg_upgrade
tests. This change won't affect anything that doesn't set this
environment variable, but will enable the buildfarm to control
exactly what port regression test installs will be running on,
and thus to detect possible rogue postmasters more easily.
Backpatch to release 9.2 where EXTRA_REGRESS_OPTS was first used.
(-i), producing only one progress message per 5 seconds along with
elapsed time and estimated remaining time. Also add elapsed time and
estimated remaining time to the default logging (which prints one message
every 100000 rows).
Patch contributed by Tomas Vondra, reviewed by Jeevan Chalke and
Tatsuo Ishii.
On non-Windows machines, we use the Unix socket for connections to test
postmasters, so there is no need to create a TCP socket. Furthermore,
doing so causes failures due to port conflicts if two builds are carried
out concurrently on one machine. (If the builds are done in different
chroots, which is standard practice at least in Red Hat distros, there
is no risk of conflict on the Unix socket.) Suppressing the TCP socket
by setting listen_addresses to empty has long been standard practice
for pg_regress, and pg_upgrade knows about this too ... but pg_upgrade's
test.sh didn't get the memo.
Back-patch to 9.2, and also sync the 9.2 version of the script with HEAD
as much as practical.
Because the client encoding might not match the server encoding,
pg_upgrade can't allocate NAMEDATALEN bytes for storage of database,
relation, and namespace identifiers. Instead pg_strdup() the memory and
free it.
Also add C comment in initdb.c about safe NAMEDATALEN usage.
All versions of pg_upgrade upgraded invalid indexes caused by CREATE
INDEX CONCURRENTLY failures and marked them as valid. The patch adds a
check to all pg_upgrade versions and throws an error during upgrade or
--check.
Backpatch to 9.2, 9.1, 9.0. Patch slightly adjusted.
Normally each module is tested in a database named contrib_regression,
which is dropped and recreated at the beginning of each pg_regress run.
This new mode, enabled by adding USE_MODULE_DB=1 to the make command
line, runs most modules in a database with the module name embedded in
it.
This will make testing pg_upgrade on clusters with the contrib modules
a lot easier.
Second attempt at this, this time accommodating make versions older
than 3.82.
Still to be done: adapt to the MSVC build system.
Backpatch to 9.0, which is the earliest version it is reasonably
possible to test upgrading from.
Pg_upgrade displays file names during copy and database names during
dump/restore. Andrew Dunstan identified three bugs:
* long file names were being truncated to 60 _leading_ characters, which
often do not change for long file names
* file names were truncated to 60 characters in log files
* carriage returns were being output to log files
This commit fixes these --- it prints 60 _trailing_ characters to the
status display, and full path names without carriage returns to log
files. It also suppresses status output to the log file unless verbose
mode is used.
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means that even if a large number of them are waiting to
be started, the postmaster can still quickly service external connection
requests. Also, the shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just as if it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be set to never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
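A rough sketch of what registration from _PG_init() looks like, modeled
loosely on the supplied sample module (field names follow this version of
the API and may differ later; the worker body is hypothetical):

    #include "postgres.h"
    #include "postmaster/bgworker.h"

    static void
    my_worker_main(void *main_arg)
    {
        for (;;)
        {
            /* do one unit of work, sleep, respond to termination signals */
        }
    }

    void
    _PG_init(void)
    {
        BackgroundWorker worker;

        worker.bgw_name = "sample background worker";   /* hypothetical */
        worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
        worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
        worker.bgw_restart_time = 60;   /* seconds; BGW_NEVER_RESTART to disable */
        worker.bgw_main = my_worker_main;
        worker.bgw_main_arg = NULL;
        RegisterBackgroundWorker(&worker);
    }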
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
storage.
Have pg_upgrade use it, and enable server options fsync=off and
full_page_writes=off.
Document that users turning fsync from off to on should run initdb
--sync-only.
[ Previous commit was incorrectly applied as a git merge. ]
Normally each module is tested in a database named contrib_regression,
which is dropped and recreated at the beginning of each pg_regress run.
This mode, enabled by adding USE_MODULE_DB=1 to the make command line,
runs most modules in a database with the module name embedded in it.
This will make testing pg_upgrade on clusters with the contrib modules
a lot easier.
Still to be done: adapt to the MSVC build system.
Backpatch to 9.0, which is the earliest version it is reasonably possible
to test upgrading from.
--single-transaction to restore each database schema. This yields
performance improvements for databases with many tables. Also, remove
split_old_dump() as it is no longer needed.
Commit 8cb53654db, which introduced DROP
INDEX CONCURRENTLY, managed to break CREATE INDEX CONCURRENTLY via a poor
choice of catalog state representation. The pg_index state for an index
that's reached the final pre-drop stage was the same as the state for an
index just created by CREATE INDEX CONCURRENTLY. This meant that the
(necessary) change to make RelationGetIndexList ignore about-to-die indexes
also made it ignore freshly-created indexes; which is catastrophic because
the latter do need to be considered in HOT-safety decisions. Failure to
do so leads to incorrect index entries and subsequently wrong results from
queries depending on the concurrently-created index.
To fix, add an additional boolean column "indislive" to pg_index, so that
the freshly-created and about-to-die states can be distinguished. (This
change obviously is only possible in HEAD. This patch will need to be
back-patched, but in 9.2 we'll use a kluge consisting of overloading the
formerly-impossible state of indisvalid = true and indisready = false.)
In addition, change CREATE/DROP INDEX CONCURRENTLY so that the pg_index
flag changes they make without exclusive lock on the index are made via
heap_inplace_update() rather than a normal transactional update. The
latter is not very safe because moving the pg_index tuple could result in
concurrent SnapshotNow scans finding it twice or not at all, thus possibly
resulting in index corruption. This is a pre-existing bug in CREATE INDEX
CONCURRENTLY, which was copied into the DROP code.
In addition, fix various places in the code that ought to check to make
sure that the indexes they are manipulating are valid and/or ready as
appropriate. These represent bugs that have existed since 8.2, since
a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid
index behind, and we ought not try to do anything that might fail with
such an index.
Also fix RelationReloadIndexInfo to ensure it copies all the pg_index
columns that are allowed to change after initial creation. Previously we
could have been left with stale values of some fields in an index relcache
entry. It's not clear whether this actually had any user-visible
consequences, but it's at least a bug waiting to happen.
In addition, do some code and docs review for DROP INDEX CONCURRENTLY;
some cosmetic code cleanup but mostly addition and revision of comments.
This will need to be back-patched, but in a noticeably different form,
so I'm committing it to HEAD before working on the back-patch.
Problem reported by Amit Kapila, diagnosis by Pavan Deolasee,
fix by Tom Lane and Andres Freund.
The previous definitions of these GUC variables allowed them to range
up to INT_MAX, but in point of fact the underlying code would suffer
overflows or other errors with large values. Reduce the maximum values
to something that won't misbehave. There's no apparent value in working
harder than this, since very large delays aren't sensible for any of
these. (Note: the risk with archive_timeout is that if we're late
checking the state, the timestamp difference it's being compared to
might overflow. So we need some amount of slop; the choice of INT_MAX/2
is arbitrary.)
Per followup investigation of bug #7670. Although this isn't a very
significant fix, might as well back-patch.
existence via open(), rather than collecting a directory listing and
looking up matching relfilenode files with sequential scans of the
array. This speeds up pg_upgrade by 2x for a large number of tables,
e.g. 16k.
Per observation by Ants Aasma.
... and have sepgsql use it to determine whether to check permissions
during certain operations. Indexes that are being created as a result
of REINDEX, for instance, do not need to have their permissions checked;
they were already checked when the index was created.
Author: KaiGai Kohei, slightly revised by me
Numerous flex and bison make rules have appeared in the source tree
over time, and they are all virtually identical, so we can replace
them by pattern rules with some variables for customization.
Users of pgxs will also be able to benefit from this.
The HINTs generated for these error cases vary across builds. We
could try to work around that, but the test cases aren't really useful
enough to justify taking any trouble.
Per buildfarm.
dblink now has its own validator function dblink_fdw_validator(), which is
better than the core function postgresql_fdw_validator() because it gets
the list of legal options from libpq instead of having a hard-wired list.
Make the dblink extension module provide a standard foreign data wrapper
dblink_fdw that encapsulates use of this validator, and recommend use of
that wrapper instead of making up wrappers on the fly.
Unfortunately, because ad-hoc wrappers *were* recommended practice
previously, it's not clear when we can get rid of postgresql_fdw_validator
without causing upgrade problems. But this is a step in the right
direction.
Shigeru Hanada, reviewed by KaiGai Kohei
This allows logging only some fraction of transactions, greatly reducing
the amount of log generated.
Tomas Vondra, reviewed by Robert Haas and Jeff Janes.
On some platforms these functions return NULL, rather than the more common
practice of returning a pointer to a zero-sized block of memory. Hack our
various wrapper functions to hide the difference by substituting a size
request of 1. This is probably not so important for the callers, who
should never touch the block anyway if they asked for size 0 --- but it's
important for the wrapper functions themselves, which mistakenly treated
the NULL result as an out-of-memory failure. This broke at least pg_dump
for the case of no user-defined aggregates, as per report from
Matthew Carrington.
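A hedged sketch of the workaround in the wrapper (the real ones also cover
realloc and strdup):

    #include <stdio.h>
    #include <stdlib.h>

    /*
     * Request at least 1 byte, so a platform whose malloc(0) returns NULL
     * cannot be mistaken for an out-of-memory condition.
     */
    void *
    pg_malloc(size_t size)
    {
        void   *tmp;

        if (size == 0)
            size = 1;
        tmp = malloc(size);
        if (tmp == NULL)
        {
            fprintf(stderr, "out of memory\n");
            exit(EXIT_FAILURE);
        }
        return tmp;
    }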
Back-patch to 9.2 to fix the pg_dump issue. Given the lack of previous
complaints, it seems likely that there is no live bug in previous releases,
even though some of these functions were in place before that.
We had a number of variants on the theme of "malloc or die", with the
majority named like "pg_malloc", but by no means all. Standardize on the
names pg_malloc, pg_malloc0, pg_realloc, pg_strdup. Get rid of pg_calloc
entirely in favor of using pg_malloc0.
This is an essentially cosmetic change, so no back-patch. (I did find
a couple of places where psql and pg_dump were using plain malloc or
strdup instead of the pg_ versions, but they don't look significant
enough to bother back-patching.)
entries are not dumped. This fixes an error caused by
dropping/recreating the information_schema, but other failures were also
possible.
Backpatch to 9.2.
On reflection (especially after noticing how many buildfarm critters have
__builtin_types_compatible_p but not _Static_assert), it seems like we
ought to try a bit harder to make these macros do something everywhere.
The initial cut at it would have been no help to code that is compiled only
on platforms without _Static_assert, for instance; and in any case not all
our contributors do their initial coding on the latest gcc version.
Some googling about static assertions turns up quite a bit of prior art
for making it work in compilers that lack _Static_assert. The method
that seems closest to our needs involves defining a struct with a bit-field
that has negative width if the assertion condition fails. There seems no
reliable way to get the error message string to be output, but throwing a
compile error with a confusing message is better than missing the problem
altogether.
In the same spirit, if we don't have __builtin_types_compatible_p we can at
least insist that the variable have the same width as the type. This won't
catch errors such as "wrong pointer type", but it's far better than
nothing.
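A sketch of the fallback definitions described (simplified):

    /*
     * A bit-field whose declared width goes negative when the condition is
     * false forces a compile-time error even without _Static_assert.  The
     * error message text can't be surfaced this way, but the build still
     * fails at the offending spot.
     */
    #define StaticAssertStmt(condition, errmessage) \
        ((void) sizeof(struct { int static_assert_failure : (condition) ? 1 : -1; }))

    /*
     * Without __builtin_types_compatible_p, at least insist that the
     * variable has the same width as the expected type.
     */
    #define AssertVariableIsOfType(varname, typename) \
        StaticAssertStmt(sizeof(varname) == sizeof(typename), \
                         "wrong type for " #varname)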
In addition to changing the macro definitions, adjust a
compile-time-constant Assert in contrib/hstore to use StaticAssertStmt,
so we can get some buildfarm coverage on whether that macro behaves sanely
or not. There's surely more places that could be converted, but this is
the first one I came across.
If we call pg_ctl stop, the server might continue running for a short time
after it has deleted its pid file (which is when pg_ctl will exit), and
thus still hold a log file open, so a subsequent attempt to open the log
file might fail.
We therefore try to open it a few times, sleeping one second between
tries, to give the server time to exit.
This corrects an error that was observed on the buildfarm.
Backpatched to 9.2.
Call pg_dumpall using -f switch instead of redirection, to avoid
writing the output in text mode and generating spurious carriage
returns. Remove the carriage-return-ignoring hack introduced by
commit e442b0f0c6.
Backpatch to 9.2.
pg_upgrade opened the output from pg_dumpall in text mode and
wrote the split files in text mode. This caused unwanted eating
of intended carriage returns on input and production of spurious
carriage returns on output. To avoid this, open all these files
in binary mode. On non-Windows platforms, this change has no
effect.
Backpatch to 9.0. On 9.0 and 9.1, we also switch from redirecting
pg_dumpall's output to using pg_dumpall's -f switch, for the same
reason.
socket location. Also, prevent putting the socket in the current
directory for pre-9.1 servers in live check and non-live check mode,
because pre-9.1 pg_ctl -w can't handle it.
Backpatch to 9.2.
pg_upgrade produces a platform-specific script to remove the old
directory, but on Windows it has not been making sure that the
paths it writes as arguments for rmdir and del use the backslash
path separator, which will cause these scripts to fail.
The fix is backpatched to Release 9.0.
When starting either an old or new postmaster, force it to place its Unix
socket in the current directory. This makes it even harder for accidental
connections to occur during pg_upgrade, and also works around some
scenarios where the default socket location isn't usable. (For example,
if the default location is something other than "/tmp", it might not exist
during "make check".)
When checking an already-running old postmaster, find out its actual socket
directory location from postmaster.pid, if possible. This dodges problems
with an old postmaster having a configured location different from the
default built into pg_upgrade's libpq. We can't find that out if the old
postmaster is pre-9.1, so also document how to cope with such scenarios
manually.
In support of this, centralize handling of the connection-related command
line options passed to pg_upgrade's subsidiary programs, such as pg_dump.
This should make future changes easier.
Bruce Momjian and Tom Lane
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.
I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h. In
itself this doesn't have much effect on indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
The previous signature made it very easy to pass something other than
the printf-format specifier in the corresponding position, without any
warning from the compiler.
While at it, move some of the escaping, redirecting and quoting
responsibilities from the callers into exec_prog() itself. This makes
the callsites cleaner.
Extraction of trigrams did not process LIKE escape sequences properly,
leading to possible misidentification of trigrams near escapes, resulting
in incorrect index search results.
Fujii Masao
libxslt offers the ability to read and write both files and URLs through
stylesheet commands, thus allowing unprivileged database users to both read
and write data with the privileges of the database server. Disable that
through proper use of libxslt's security options.
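A sketch along those lines using libxslt's security-preferences API
(illustrative; error handling and cleanup omitted):

    #include <libxslt/security.h>
    #include <libxslt/transform.h>

    /*
     * Forbid all file and network access from within stylesheet execution,
     * so an unprivileged caller cannot read or write with the server's
     * privileges.
     */
    static void
    lock_down_xslt(xsltTransformContextPtr ctxt)
    {
        xsltSecurityPrefsPtr sec = xsltNewSecurityPrefs();

        xsltSetSecurityPrefs(sec, XSLT_SECPREF_READ_FILE, xsltSecurityForbid);
        xsltSetSecurityPrefs(sec, XSLT_SECPREF_WRITE_FILE, xsltSecurityForbid);
        xsltSetSecurityPrefs(sec, XSLT_SECPREF_CREATE_DIRECTORY, xsltSecurityForbid);
        xsltSetSecurityPrefs(sec, XSLT_SECPREF_READ_NETWORK, xsltSecurityForbid);
        xsltSetSecurityPrefs(sec, XSLT_SECPREF_WRITE_NETWORK, xsltSecurityForbid);
        xsltSetCtxtSecurityPrefs(sec, ctxt);
    }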
Also, remove xslt_process()'s ability to fetch documents and stylesheets
from external files/URLs. While this was a documented "feature", it was
long regarded as a terrible idea. The fix for CVE-2012-3489 broke that
capability, and rather than expend effort on trying to fix it, we're just
going to summarily remove it.
While the ability to write as well as read makes this security hole
considerably worse than CVE-2012-3489, the problem is mitigated by the fact
that xslt_process() is not available unless contrib/xml2 is installed,
and the longstanding warnings about security risks from that should have
discouraged prudent DBAs from installing it in security-exposed databases.
Reported and fixed by Peter Eisentraut.
Security: CVE-2012-3488
4741e9afb9. This was done by adding an
optional second log file parameter to exec_prog(), and closing and
reopening the log file between system() calls.
Backpatch to 9.2.
After taking awhile to digest the row-processor feature that was added to
libpq in commit 92785dac2e, we've concluded
it is over-complicated and too hard to use. Leave the core infrastructure
changes in place (that is, there's still a row processor function inside
libpq), but remove the exposed API pieces, and instead provide a "single
row" mode switch that causes PQgetResult to return one row at a time in
separate PGresult objects.
This approach incurs more overhead than proper use of a row processor
callback would, since construction of a PGresult per row adds extra cycles.
However, it is far easier to use and harder to break. The single-row mode
still affords applications the primary benefit that the row processor API
was meant to provide, namely not having to accumulate large result sets in
memory before processing them. Preliminary testing suggests that we can
probably buy back most of the extra cycles by micro-optimizing construction
of the extra results, but that task will be left for another day.
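A usage sketch of the new mode (the row and error handlers are
hypothetical):

    #include "libpq-fe.h"

    extern void handle_row(PGresult *res);      /* hypothetical */
    extern void handle_error(PGresult *res);    /* hypothetical */

    /*
     * After dispatching a query, switch the connection to single-row mode;
     * PQgetResult then returns one PGRES_SINGLE_TUPLE result per row,
     * followed by a final PGRES_TUPLES_OK (or error) result.
     */
    static void
    fetch_row_by_row(PGconn *conn, const char *query)
    {
        PGresult   *res;

        if (!PQsendQuery(conn, query))
            return;
        PQsetSingleRowMode(conn);

        while ((res = PQgetResult(conn)) != NULL)
        {
            if (PQresultStatus(res) == PGRES_SINGLE_TUPLE)
                handle_row(res);
            else if (PQresultStatus(res) != PGRES_TUPLES_OK)
                handle_error(res);
            PQclear(res);
        }
    }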
Marko Kreen
This is apparently faster than doing things the other way around when
the scale factor is large.
Along the way, adjust -n to suppress vacuuming during initialization
as well as during test runs.
Jeff Janes, with some small changes by me.
Commit 3855968f32 added syntax, pg_dump,
psql support, and documentation, but the triggers didn't actually fire.
With this commit, they now do. This is still a pretty basic facility
overall because event triggers do not get a whole lot of information
about what the user is trying to do unless you write them in C; and
there's still no option to fire them anywhere except at the very
beginning of the execution sequence, but it's better than nothing,
and a good building block for future work.
Along the way, add a regression test for ALTER LARGE OBJECT, since
testing of event triggers reveals that we haven't got one.
Dimitri Fontaine and Robert Haas
Since the scandir() emulation was taken out of pg_upgrade, there's
no longer any need for scandir_file_pattern to exist as a global
variable. Replace it with a local in the one remaining function
that was making use of it.
Error out on out-of-memory, rather than returning -1, which the sole
existing caller wasn't checking for anyway. There doesn't seem to be
any use-case for making the caller check for failure here.
Detect failure return from readdir().
Use a less platform-dependent method of calculating the entrysize.
It's possible, but not yet confirmed, that this explains bug #6733,
in which Mike Wilson reports a pg_upgrade crash that did not occur
in 9.1. (Note that load_directory is effectively new code in 9.2,
at least on platforms that have scandir().)
Fix up comments, avoid uselessly using two counters, reduce the number
of realloc calls to something sane.
The Solaris Studio compiler warns about these instances, unlike more
mainstream compilers such as gcc. But manual inspection showed that
the code is clearly not reachable, and we hope no worthy compiler will
complain about removing this code.
When reading from a text- or CSV-format file in file_fdw, the datatype
input routines can consume a significant fraction of the runtime.
Often, the query does not need all the columns, so we can get a useful
speed boost by skipping I/O conversion for unnecessary columns.
To support this, add a "convert_selectively" option to the core COPY code.
This is undocumented and not accessible from SQL (for now, anyway).
Etsuro Fujita, reviewed by KaiGai Kohei
Currently only pg_clog is copied, but some other directories could need
the same treatment as well, so create a subroutine to do it.
Extracted from my (somewhat larger) FOR KEY SHARE patch.
Now the log file not only contains the output from commands executed by
system(), but also what command it was in the first place. This
arrangement makes debugging a lot simpler.
snprintf counts trailing NUL towards the char limit. Failing to account
for that was causing an invalid value to be passed to pg_resetxlog -l,
aborting the upgrade process.
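A tiny illustration of the pitfall:

    #include <stdio.h>

    int
    main(void)
    {
        char    buf[8];

        /*
         * The size argument includes the terminating NUL, so a buffer sized
         * to the visible characters silently drops the last one: buf ends up
         * holding "0/12345", not "0/123456".
         */
        snprintf(buf, sizeof(buf), "%s", "0/123456");
        printf("%s\n", buf);
        return 0;
    }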
The xlogid + segno representation of a particular WAL segment doesn't make
much sense in pg_resetxlog anymore, now that we don't use that anywhere
else. Use the WAL filename instead, since that's a convenient way to name a
particular WAL segment.
I did this partially for pg_resetxlog in the original xlogid/segno -> uint64
patch, but I neglected pg_upgrade and the docs. This should now be more
complete.
The latter was already the dominant use, and it's preferable because
in C the convention is that intXX means XX bits. Therefore, allowing
mixed use of int2, int4, int8, int16, int32 is obviously confusing.
Remove the typedefs for int2 and int4 for now. They don't seem to be
widely used outside of the PostgreSQL source tree, and the few uses
can probably be cleaned up by the time this ships.
This simplifies code that needs to do arithmetic on XLogRecPtrs.
To avoid changing on-disk format of data pages, the LSN on data pages is
still stored in the old format. That should keep pg_upgrade happy. However,
we have XLogRecPtrs embedded in the control file, and in the structs that
are sent over the replication protocol, so this change breaks compatibility
between pg_basebackup and the server. I didn't do anything about this in
this patch; per discussion on -hackers, the right thing to do would be to
change the
replication protocol to be architecture-independent, so that you could use
a newer version of pg_receivexlog, for example, against an older server
version.
These days, even a wimpy system can insert 10000 tuples in the blink of
an eye, so there's no real need for this much verbosity.
Per complaint from Tatsuo Ishii.
The option --foreign-keys, used at initialization time, will create foreign
key constraints for the columns that represent references to other tables'
primary keys. This can help in benchmarking FK performance.
Jeff Janes
Before, some places didn't document the short options (-? and -V),
some documented both, some documented nothing, and they were listed in
various orders. Now this is hopefully more consistent and complete.
The simplest way to handle this is just to copy-and-paste the relevant
code block in fork_process.c, so that's what I did. (It's possible that
something more complicated would be useful to packagers who want to work
with either the old or the new API; but at this point the number of such
people is rapidly approaching zero, so let's just get the minimal thing
done.) Update relevant documentation as well.
It failed to check for error return from xsltApplyStylesheet(), as reported
by Peter Gagarinov. (So far as I can tell, libxslt provides no convenient
way to get a useful error message in failure cases. There might be some
inconvenient way, but considering that this code is deprecated it's hard to
get enthusiastic about putting lots of work into it. So I just made it say
"failed to apply stylesheet", in line with the existing error checks.)
While looking at the code I also noticed that the string returned by
xsltSaveResultToString was never freed, resulting in a session-lifespan
memory leak.
Back-patch to all supported versions.
Overly tight coding caused the password transformation loop to stop
examining input once it had processed a byte equal to 0x80. Thus, if the
given password string contained such a byte (which is possible though not
highly likely in UTF8, and perhaps also in other non-ASCII encodings), all
subsequent characters would not contribute to the hash, making the password
much weaker than it appears on the surface.
This would only affect cases where applications used DES crypt() to encode
passwords before storing them in the database. If a weak password has been
created in this fashion, the hash will stop matching after this update has
been applied, so it will be easy to tell if any passwords were unexpectedly
weak. Changing to a different password would be a good idea in such a case.
(Since DES has been considered inadequately secure for some time, changing
to a different encryption algorithm can also be recommended.)
This code, and the bug, are shared with at least PHP, FreeBSD, and OpenBSD.
Since the other projects have already published their fixes, there is no
point in trying to keep this commit private.
This bug has been assigned CVE-2012-2143, and credit for its discovery goes
to Rubin Xu and Joseph Bonneau.
Write the file to a temporary name and then rename() it into the
permanent name, to ensure it can't end up half-written and corrupt
in case of a crash during shutdown.
Unlink the file after it has been read so it's removed from the data
directory and not included in base backups going to replication slaves.
When the column name is an unqualified name, rather than table.column,
the error message complains about too many dotted names, which is
wrong. Report by Peter Eisentraut based on examination of the
sepgsql regression test output, but the problem also affects COMMENT.
New wording as suggested by Tom Lane.
We previously recognized that citext wouldn't get marked as collatable
during pg_upgrade from a pre-9.1 installation, and hacked its
create-from-unpackaged script to manually perform the necessary catalog
adjustments. However, we overlooked the fact that domains over citext,
as well as the citext[] array type, need the same adjustments. Extend
the script to handle those cases.
Also, the documentation suggested that this was only an issue in pg_upgrade
scenarios, which is quite wrong; loading any dump containing citext from a
pre-9.1 server will also result in the type being wrongly marked.
I approached the documentation problem by changing the 9.1.2 release note
paragraphs about this issue, which is historically inaccurate. But it
seems better than having the information scattered in multiple places, and
leaving incorrect info in the 9.1.2 notes would be bad anyway. We'll still
need to mention the issue again in the 9.1.4 notes, but perhaps they can
just reference 9.1.2 for fix instructions.
Per report from Evan Carroll. Back-patch into 9.1.
Display total time and I/O timings in milliseconds, for consistency with
the units used for timings in the core statistics views. The columns
remain of float8 type, so that sub-msec precision is available. (At some
point we will probably want to convert the core views to use float8 type
for the same reason, but this patch does not touch that issue.)
This is a release-note-requiring change in the meaning of the total_time
column. The I/O timing columns are new as of 9.2, so there is no
compatibility impact from redefining them.
Do some minor copy-editing in the documentation, too.
This patch adjusts the treatment of parameterized paths so that all paths
with the same parameterization (same set of required outer rels) for the
same relation will have the same rowcount estimate. We cache the rowcount
estimates to ensure that property, and hopefully save a few cycles too.
Doing this makes it practical for add_path_precheck to operate without
a rowcount estimate: it need only assume that paths with different
parameterizations never dominate each other, which is close enough to
true anyway for coarse filtering, because normally a more-parameterized
path should yield fewer rows thanks to having more join clauses to apply.
In add_path, we do the full nine yards of comparing rowcount estimates
along with everything else, so that we can discard parameterized paths that
don't actually have an advantage. This fixes some issues I'd found with
add_path rejecting parameterized paths on the grounds that they were more
expensive than not-parameterized ones, even though they yielded many fewer
rows and hence would be cheaper once subsequent joining was considered.
To make the same-rowcounts assumption valid, we have to require that any
parameterized path enforce *all* join clauses that could be obtained from
the particular set of outer rels, even if not all of them are useful for
indexing. This is required at both base scans and joins. It's a good
thing anyway since the net impact is that join quals are checked at the
lowest practical level in the join tree. Hence, discard the original
rather ad-hoc mechanism for choosing parameterization joinquals, and build
a better one that has a more principled rule for when clauses can be moved.
The original rule was actually buggy anyway for lack of knowledge about
which relations are part of an outer join's outer side; getting this right
requires adding an outer_relids field to RestrictInfo.
Remove lots of outdated information that is duplicated by the
better-maintained SGML documentation. In particular, remove the
outdated listing of contrib modules. Update the installation
instructions to mention CREATE EXTENSION, but don't go into too much
detail.
default tablespace, but part of a database that is in a user-defined
tablespace. Caused "file not found" error during upgrade.
Per bug report from Ants Aasma.
Backpatch to 9.1 and 9.0.
There's no need to sit there and increment the stats when we know all the
increments would be zero anyway. The actual additions might not be very
expensive, but skipping acquisition of the spinlock seems like a good
thing. Pushing the logic about initialization of the usage count down into
entry_alloc() allows us to do that while making the code actually simpler,
not more complex. Expansion on a suggestion by Peter Geoghegan.
This patch addresses a deficiency in the previous pg_stat_statements patch.
We want to give sticky entries an initial "usage" factor high enough that
they probably will stick around until their query is completed. However,
if the query never completes (eg it gets an error during execution), the
entry shouldn't persist indefinitely. Manage this by starting out with
a usage setting equal to the (approximate) median usage value within the
whole hashtable, but decaying the value much more aggressively than we
do for normal entries.
Peter Geoghegan
If we make the initially-called function return the table physical-size
estimate, acquire_inherited_sample_rows will be able to use that to
allocate numbers of samples among child tables, when the day comes that
we want to support foreign tables in inheritance trees.
ANALYZE now accepts foreign tables and allows the table's FDW to control
how the sample rows are collected. (But only manual ANALYZEs will touch
foreign tables, for the moment, since among other things it's not very
clear how to handle remote permissions checks in an auto-analyze.)
contrib/file_fdw is extended to support this.
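A sketch of the callback shape this implies for an FDW (names prefixed
"my" are hypothetical; signatures follow this era's fdwapi.h):

    #include "postgres.h"
    #include "foreign/fdwapi.h"

    /* Hypothetical sampling function matching AcquireSampleRowsFunc. */
    static int
    my_acquire_sample_rows(Relation relation, int elevel,
                           HeapTuple *rows, int targrows,
                           double *totalrows, double *totaldeadrows)
    {
        *totalrows = 0;
        *totaldeadrows = 0;
        return 0;               /* stub: no sample rows collected */
    }

    /*
     * The initially-called function reports a physical-size estimate (in
     * pages) and hands back the function that will collect the sample rows.
     */
    static bool
    myAnalyzeForeignTable(Relation relation,
                          AcquireSampleRowsFunc *func,
                          BlockNumber *totalpages)
    {
        *totalpages = 1;        /* replace with a real size estimate */
        *func = my_acquire_sample_rows;
        return true;
    }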
Etsuro Fujita, reviewed by Shigeru Hanada, some further tweaking by me.
This patch provides a test case for libpq's row processor API.
contrib/dblink can deal with very large result sets by dumping them into
a tuplestore (which can spill to disk) --- but until now, the intermediate
storage of the query result in a PGresult meant memory bloat for any large
result. Now we use a row processor to convert the data to tuple form and
dump it directly into the tuplestore.
A limitation is that this only works for plain dblink() queries, not
dblink_send_query() followed by dblink_get_result(). In the latter
case we don't know the desired tuple rowtype soon enough. While hack
solutions to that are possible, a different user-level API would
probably be a better answer.
Kyotaro Horiguchi, reviewed by Marko Kreen and Tom Lane
dblink_exec leaked temporary database connections if any error occurred
after connection setup, for example
SELECT dblink_exec('...connect string...', 'select 1/0');
Add a PG_TRY block to ensure PQfinish gets done when it is needed.
(dblink_record_internal is on the hairy edge of needing similar treatment,
but seems not to be actively broken at the moment.)
Also, in 9.0 and up, only one of the three functions using tuplestore
return mode was properly checking that the query context would allow
a tuplestore result.
Noted while reviewing dblink patch. Back-patch to all supported branches.
The DBLINK_GET_CONN and DBLINK_GET_NAMED_CONN macros did not set the
surrounding function's conname variable, causing errors to be incorrectly
reported as having occurred on the "unnamed" connection in some cases.
This bug was actually visible in two cases in the regression tests,
but apparently whoever added those cases wasn't paying attention.
Noted by Kyotaro Horiguchi, though this is different from his proposed
patch.
Back-patch to 8.4; 8.3 does not have the same type of error reporting
so the patch is not relevant.
It's actually more useful for the module to ignore these. Ignoring
EXECUTE (and not incrementing the nesting level) allows the executor
hooks to charge the time to the underlying prepared query, which
shows up as a stats entry with the original PREPARE as query string
(possibly modified by suppression of constants, which might not be
terribly useful here but it's not worth avoiding). This is much more
useful than cluttering the stats table with a distinct entry for each
textually distinct EXECUTE.
Experimentation with this idea shows that it's also preferable to ignore
PREPARE. If we don't, we get two stats table entries, one with the query
string hash and one with the jumble-derived hash, but with the same visible
query string (modulo those constants). This is confusing and not very
helpful, since the first entry will only receive costs associated with
initial planning of the query, which is not something counted at all
normally by pg_stat_statements. (And if we do start tracking planning
costs, we'd want them blamed on the other hash table entry anyway.)
When tracking nested statements, contrib/pg_stat_statements formerly
double-counted the execution costs of utility statements that directly
contain an executable statement, such as EXPLAIN and DECLARE CURSOR.
This was not obvious since the ProcessUtility and Executor hooks
would each add their measured costs to the same stats table entry.
However, with the new implementation that hashes utility and plannable
statements differently, this showed up as seemingly-duplicate stats
entries. Fix that by disabling the Executor hooks when the query has a
queryId of zero, which was the case already for such statements but is now
more clearly specified in the code. (The zero queryId was causing problems
anyway because all such statements would add to a single bogus entry.)
The PREPARE/EXECUTE case still results in counting the same execution
in two different stats table entries, but it should be much less surprising
to users that there are two entries in such cases.
In passing, include a CommonTableExpr's ctename in the query hash.
I had left it out originally on the grounds that we wanted to omit all
inessential aliases, but since RTE_CTE RTEs are hashing their referenced
names, we'd better hash the CTE names too to make sure we don't hash
semantically different queries the same.
pg_stat_statements now hashes selected fields of the analyzed parse tree
to assign a "fingerprint" to each query, and groups all queries with the
same fingerprint into a single entry in the pg_stat_statements view.
In practice it is expected that queries with the same fingerprint will be
equivalent except for values of literal constants. To make the display
more useful, such constants are replaced by "?" in the displayed query
strings.
This mechanism currently supports only optimizable queries (SELECT,
INSERT, UPDATE, DELETE). Utility commands are still matched on the
basis of their literal query strings.
There remain some open questions about how to deal with utility statements
that contain optimizable queries (such as EXPLAIN and SELECT INTO) and how
to deal with expiring speculative hashtable entries that are made to save
the normalized form of a query string. However, fixing these issues should
require only localized changes, and since there are other open patches
involving contrib/pg_stat_statements, it seems best to go ahead and commit
what we've got.
Peter Geoghegan, reviewed by Daniel Farina
Instead of just stopping after removing an arbitrary subset of orphaned
large objects, commit and start a new transaction after each batch of -l objects.
This is just as effective as the original patch at limiting the number of
locks used, and it doesn't require doing the OID collection process
repeatedly to get everything. Since the option no longer changes the
fundamental behavior of vacuumlo, and it avoids a known server-side
limitation, enable it by default (with a default limit of 1000 LOs per
transaction).
In passing, be more careful about properly quoting the names of tables
and fields, and do some other cosmetic cleanup.
the non-development install. Instead, use the LOAD mechanism to check
for the pg_upgrade_support shared object, like we do for other shared
object checks.
Backpatch to 9.1.
Report from Álvaro
This is intended as infrastructure to allow sepgsql to cooperate with
connection pooling software, by allowing the effective security label
to be set for each new connection.
KaiGai Kohei, reviewed by Yeb Havinga.
Extracted from a larger patch by Jaime Casanova, reviewed by Noah Misch.
I think this error message could use some more extensive revision, but
this at least makes the handling of spgist consistent with what we do for
other types of indexes that this code doesn't know how to handle.
add ability to control permissions of created files
have psql echo its queries for easier debugging
output four separate log files, and delete them on success
add -r/--retain option to keep log files after success
make log files append-only
remove -g/-G/-l logging options
suggest tailing the appropriate log file on failure
enhance -v/--verbose behavior
Further reflection shows that a single callback isn't very workable if we
desire to let FDWs generate multiple Paths, because that forces the FDW to
do all work necessary to generate a valid Plan node for each Path. Instead
split the former PlanForeignScan API into three steps: GetForeignRelSize,
GetForeignPaths, GetForeignPlan. We had already bit the bullet of breaking
the 9.1 FDW API for 9.2, so this shouldn't cause very much additional pain,
and it's substantially more flexible for complex FDWs.
Add an fdw_private field to RelOptInfo so that the new functions can save
state there rather than possibly having to recalculate information two or
three times.
In addition, we'd not thought through what would be needed to allow an FDW
to set up subexpressions of its choice for runtime execution. We could
treat ForeignScan.fdw_private as an executable expression but that seems
likely to break existing FDWs unnecessarily (in particular, it would
restrict the set of node types allowable in fdw_private to those supported
by expression_tree_walker). Instead, invent a separate field fdw_exprs
which will receive the postprocessing appropriate for expression trees.
(One field is enough since it can be a list of expressions; also, we assume
the corresponding expression state tree(s) will be held within fdw_state,
so we don't need to add anything to ForeignScanState.)
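The resulting callback shapes, sketched as prototypes (names prefixed "my"
are hypothetical; treat the signatures as illustrative of this era's
fdwapi.h):

    #include "postgres.h"
    #include "foreign/fdwapi.h"

    /*
     * The planner calls these in order: estimate the relation size, generate
     * one or more candidate paths (stashing private state in
     * baserel->fdw_private as needed), then convert the chosen path into a
     * ForeignScan plan node (whose fdw_exprs list gets expression
     * postprocessing from the core planner).
     */
    void        myGetForeignRelSize(PlannerInfo *root, RelOptInfo *baserel,
                                    Oid foreigntableid);
    void        myGetForeignPaths(PlannerInfo *root, RelOptInfo *baserel,
                                  Oid foreigntableid);
    ForeignScan *myGetForeignPlan(PlannerInfo *root, RelOptInfo *baserel,
                                  Oid foreigntableid, ForeignPath *best_path,
                                  List *tlist, List *scan_clauses);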
Per review of Hanada Shigeru's pgsql_fdw patch. We may need to tweak this
further as we continue to work on that patch, but to me it feels a lot
closer to being right now.
GetForeignColumnOptions provides some abstraction for accessing
column-specific FDW options, on a par with the access functions that were
already provided here for other FDW-related information.
Adjust file_fdw.c to use GetForeignColumnOptions instead of equivalent
hand-rolled code.
In addition, add some SGML documentation for the functions exported by
foreign.c that are meant for use by FDW authors.
(This is the fdw_helper portion of the proposed pgsql_fdw patch.)
Hanada Shigeru, reviewed by KaiGai Kohei
The original API specification only allowed an FDW to create a single
access path, which doesn't seem like a terribly good idea in hindsight.
Instead, move the responsibility for building the Path node and calling
add_path() into the FDW's PlanForeignScan function. Now, it can do that
more than once if appropriate. There is no longer any need for the
transient FdwPlan struct, so get rid of that.
Etsuro Fujita, Shigeru Hanada, Tom Lane
Since the current version is 1.1, the 1.0 file isn't really needed. We do
need the 1.0--1.1 upgrade file, so people on 1.0 can upgrade.
Per recent discussion on pgsql-hackers.
The array intersection code would give wrong results if the first entry of
the correct output array would be "1". (I think only this value could be
at risk, since the previous word would always be a lower-bound entry with
that fixed value.)
Problem spotted by Julien Rouhaud, initial patch by Guillaume Lelarge,
cosmetic improvements by me.
This is some preliminary refactoring related to a pending patch
to allow sepgsql-enabled sessions to make dynamic label transitions.
But this commit doesn't involve any functional change: it just puts
some bits of code in more logical places.
KaiGai Kohei
The hstore and json datatypes both have record-conversion functions that
pay attention to column names in the composite values they're handed.
We used to not worry about inserting correct field names into tuple
descriptors generated at runtime, but given these examples it seems
useful to do so. Observe the nicer-looking results in the regression
tests whose results changed.
catversion bump because there is a subtle change in requirements for stored
rule parsetrees: RowExprs from ROW() constructs now have to include field
names.
Andrew Dunstan and Tom Lane
Sometimes it may be useful to get actual row counts out of EXPLAIN
(ANALYZE) without paying the cost of timing every node entry/exit.
With this patch, you can say EXPLAIN (ANALYZE, TIMING OFF) to get that.
Tomas Vondra, reviewed by Eric Theise, with minor doc changes by me.
In dry-run mode, just the name of the file to be removed is printed to
stdout; this is so the user can easily plug it into another program
through a pipe. If debug mode is also specified, a more verbose message
is printed to stderr.
Author: Gabriele Bartolini
Reviewer: Josh Kupershmidt
Due to oversights, the encrypt_iv() and decrypt_iv() functions failed to
report certain types of invalid-input errors, and would instead return
random garbage values.
Marko Kreen, per report from Stefan Kaltenbrunner
have pg_upgrade allocate a maximum fixed size buffer for testing the
library file name, rather than base the allocation on the library name.
Backpatch to 9.1.
The original coding examined the next character before verifying that
there *is* a next character. In the worst case with the input buffer
right up against the end of memory, this would result in a segfault.
Problem spotted by Paul Guyot; this commit extends his patch to fix an
additional case. In addition, make the code a tad more readable by not
overloading the usage of *tlen.
All supported platforms support the C89 standard function atexit()
(SunOS 4 probably being the last one not to), and supporting both
makes the code clumsy.
Instead, add a function pg_tablespace_location(oid) used to return
the same information, and do this by reading the symbolic link.
Doing it this way makes it possible to relocate a tablespace when the
database is down by simply changing the symbolic link.
It runs the regression tests, runs pg_upgrade on the populated
database, and compares the before and after dumps. While not actually
a cross-version upgrade, this does detect omissions and bugs in the
involved tools from time to time. It's also possible to do a
cross-version upgrade by manually supplying parameters.
If the existing citext type has not merely been created, but used in any
tables, then the upgrade script wasn't doing enough. We have to update
attcollation for each citext table column, and indcollation for each citext
index column, as well. Per report from Rudolf van der Leeden.
Since PostgreSQL 9.0, we've emitted a warning message when an operator
named => is created, because the SQL standard now reserves that token
for another use. But we've also shipped such an operator with hstore.
Use of the function hstore(text, text) has been recommended in
preference to =>(text, text). Per discussion, it's now time to take
the next step and stop shipping the operator. This will allow us to
prohibit the use of => as an operator name in a future release if and
when we wish to support the SQL standard use of this token.
The release notes should mention this incompatibility.
Patch by me, reviewed by David Wheeler, Dimitri Fontaine and Tom Lane.
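For example, code that previously relied on the operator should call the
function instead (the key/value strings here are illustrative):
    -- formerly: SELECT 'key' => 'value';
    SELECT hstore('key', 'value');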
Make it use t_isspace() to identify whitespace, rather than relying on
sscanf which is known to get it wrong on some platform/locale combinations.
Get rid of fixed-size buffers. Make it actually continue to parse the file
after ignoring a line with untranslatable characters, as was obviously
intended.
The first of these issues is per gripe from J Smith, though not exactly
either of his proposed patches.
Both dict_int and dict_xsyn were blithely assuming that whatever memory
palloc gives back will be pre-zeroed. This would typically work for
just about long enough to run their regression tests, and no longer :-(.
The pre-9.0 code in dict_xsyn was even lamer than that, as it would
happily give back a pointer to the result of palloc(0), encouraging
its caller to access off the end of memory. Again, this would just
barely fail to fail as long as memory contained nothing but zeroes.
Per a report from Rodrigo Hjort that code based on these examples
didn't work reliably.
We have seen one too many reports of people trying to use 9.1 extension
files in the old-fashioned way of sourcing them in psql. Not only does
that usually not work (due to failure to substitute for MODULE_PATHNAME
and/or @extschema@), but if it did work they'd get a collection of loose
objects not an extension. To prevent this, insert an \echo ... \quit
line that prints a suitable error message into each extension script file,
and teach commands/extension.c to ignore lines starting with \echo.
That should not only prevent any adverse consequences of loading a script
file the wrong way, but make it crystal clear to users that they need to
do it differently now.
Tom Lane, following an idea of Andrew Dunstan's. Back-patch into 9.1
... there is not going to be much value in this if we wait till 9.2.
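As an illustration, the guard line looks roughly like this (the exact
message text varies per extension):
    \echo Use "CREATE EXTENSION hstore" to load this file. \quit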
A similar problem for pgstattuple() was fixed in April of 2010 by commit
33065ef8bc, but pgstatindex() seems to have
been overlooked.
Back-patch all the way, as with that commit, though not to 7.4 through
8.1, since those are now EOL.
Arrange for any problems with pre-existing settings to be reported as
WARNING not ERROR, so that we don't undesirably abort the loading of the
incoming add-on module. The bad setting is just discarded, as though it
had never been applied at all. (This requires a change in the API of
set_config_option. After some thought I decided the most potentially
useful addition was to allow callers to just pass in a desired elevel.)
Arrange to restore the complete stacked state of the variable, rather than
cheesily reinstalling only the active value. This ensures that custom GUCs
will behave unsurprisingly even when the module loading operation occurs
within nested subtransactions that have changed the active value. Since a
module load could occur as a result of, eg, a PL function call, this is not
an unlikely scenario.
Since gtrgm_penalty() is usually called many times in a row with the same
"newval" (to determine which item on an index page newval fits into best),
the makesign() calculation is repetitious. It's expensive enough to make
it worth caching the result, so do so. On my machine this is good for
more than a 40% savings in the time needed to build a trigram index on
/usr/share/dict/words. This is all per a suggestion of Heikki's.
In passing, make some mostly-cosmetic improvements in the caching logic in
the other functions in this file that rely on caching info in fn_extra.
Because these tests require root privileges, not to mention invasive
changes to the security configuration of the host system, it's not
reasonable for them to be invoked by a regular "make check" or "make
installcheck". Instead, dike out the Makefile's knowledge of the tests,
and change chkselinuxenv (now renamed "test_sepgsql") into a script that
verifies the environment is workable and then runs the tests. It's
expected that test_sepgsql will only be run manually.
While at it, do some cleanup in the error checking in the script, and
do some wordsmithing in the documentation.
This is implemented as a per-column boolean option, rather than trying
to match COPY's convention of a single option listing the column names.
Shigeru Hanada, reviewed by KaiGai Kohei
Rewrite plancache.c so that a "cached plan" (which is rather a misnomer
at this point) can support generation of custom, parameter-value-dependent
plans, and can make an intelligent choice between using custom plans and
the traditional generic-plan approach. The specific choice algorithm
implemented here can probably be improved in future, but this commit is
all about getting the mechanism in place, not the policy.
In addition, restructure the API to greatly reduce the amount of extraneous
data copying needed. The main compromise needed to make that possible was
to split the initial creation of a CachedPlanSource into two steps. It's
worth noting in particular that SPI_saveplan is now deprecated in favor of
SPI_keepplan, which accomplishes the same end result with zero data
copying, and no need to then spend even more cycles throwing away the
original SPIPlan. The risk of long-term memory leaks while manipulating
SPIPlans has also been greatly reduced. Most of this improvement is based
on use of the recently-added MemoryContextSetParent primitive.
This addresses only those cases that are easy to fix by adding or
moving a const qualifier or removing an unnecessary cast. There are
many more complicated cases remaining.
Add __attribute__ decorations for printf format checking to the places that
were missing them. Fix the resulting warnings. Add
-Wmissing-format-attribute to the standard set of warnings for GCC, so these
don't happen again.
The warning fixes here are relatively harmless. The one serious problem
discovered by this was already committed earlier in
cf15fb5cab.
As per my recent proposal, this refactors things so that these typedefs and
macros are available in a header that can be included in frontend-ish code.
I also changed various headers that were undesirably including
utils/timestamp.h to include datatype/timestamp.h instead. Unsurprisingly,
this showed that half the system was getting utils/timestamp.h by way of
xlog.h.
No actual code changes here, just header refactoring.
walsender.h should depend on xlog.h, not vice versa. (Actually, the
inclusion was circular until a couple hours ago, which was even sillier;
but Bruce broke it in the expedient rather than logically correct
direction.) Because of that poor decision, plus blind application of
pgrminclude, we had a situation where half the system was depending on
xlog.h to include such unrelated stuff as array.h and guc.h. Clean up
the header inclusion, and manually revert a lot of what pgrminclude had
done so things build again.
This episode reinforces my feeling that pgrminclude should not be run
without adult supervision. Inclusion changes in header files in particular
need to be reviewed with great care. More generally, it'd be good if we
had a clearer notion of module layering to dictate which headers can sanely
include which others ... but that's a big task for another day.
Don't test whether the number of labels is numerically equal to zero;
count(*) isn't going to return zero anyway, and the current coding blows
up if it returns an empty string or an error.
There's no reason for this test to use the undocumented pg_prepared_xact()
function, when it can use the stable API pg_prepared_xacts instead.
Fixes breakage against 8.3, as reported by Justin Arnold.
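A minimal sketch of the portable form the test can use instead:
    SELECT gid FROM pg_prepared_xacts;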
For an empty index, the pgstatindex() function would compute 0.0/0.0 for
its avg_leaf_density and leaf_fragmentation outputs. On machines that
follow the IEEE float arithmetic standard with any care, that results in
a NaN. However, per report from Rushabh Lathia, Microsoft couldn't
manage to get this right, so you'd get a bizarre error on Windows.
Fix by forcing the results to be NaN explicitly, rather than relying on
the division operator to give that or the snprintf function to print it
correctly. I have some doubts that this is really the most useful
definition, but it seems better to remain backward-compatible with
those platforms for which the behavior wasn't completely broken.
Back-patch to 8.2, since the code is like that in all current releases.
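For reference, the affected output can be observed with something like
the following (the index name is hypothetical):
    SELECT avg_leaf_density, leaf_fragmentation
    FROM pgstatindex('empty_btree_idx');   -- both come back as NaN for an empty index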
The previous coding resulted in contrib modules unintentionally overriding
the use of CONTRIB_TESTDB. There seems no particularly good reason to
allow that (after all, the makefile can set CONTRIB_TESTDB if that's really
what it intends).
In passing, document REGRESS_OPTS where the other pgxs.mk options are
documented.
Back-patch to 9.1 --- in prior versions, there were no cases of contrib
modules setting REGRESS_OPTS without including the --dbname switch, so
while the coding was fragile there was no actual bug.
We'll have to settle for just listing the extensions' data types,
since function arguments seem to sort differently in different locales.
Per buildfarm results.
When we implemented extensions, we made findDependentObjects() treat
EXTENSION dependency links similarly to INTERNAL links. However, that
logic contained an implicit assumption that an object could have at most
one INTERNAL dependency, so it did not work correctly for objects having
both INTERNAL and DEPENDENCY links. This led to failure to drop some
extension member objects when dropping the extension. Furthermore, we'd
never actually exercised the case of recursing to an internally-referenced
(owning) object from anything other than a NORMAL dependency, and it turns
out that passing the incoming dependency's flags to the owning object is
the Wrong Thing. This led to sometimes dropping a whole extension silently
when we should have rejected the drop command for lack of CASCADE.
Since we obviously were under-testing extension drop scenarios, add some
regression test cases. Unfortunately, such test cases require some
extensions (duh), so we can't test for problems in the core regression
tests. I chose to add them to the earthdistance contrib module, which is
a good test case because it has a dependency on the cube contrib module.
Back-patch to 9.1. Arguably these are pre-existing bugs in INTERNAL
dependency handling, but since it appears that the cases can never arise
pre-9.1, I'll refrain from back-patching the logic changes further than
that.
Eliminate dependencies on "which", as we don't really need that to be
installed for proper testing. Don't number the tests, as that increases
the footprint of every patch that wants to add or remove tests. Make
the test output more informative, so that it's a bit easier to see what
went right (or wrong). Spelling and grammar improvements.
contrib/xml2 can get by without libxslt; the relevant features just
won't work. But if it doesn't have libxml2, or if sepgsql doesn't have
libselinux, the link succeeds but the module then fails to work at load
time. To avoid that, link the required libraries unconditionally, so
that it will be clear at link-time that there is a problem.
Per discussion with Tom Lane and KaiGai Kohei.
Also, handle failure better: don't just blindly keep trying to delete
stuff after the transaction has already failed.
Tim Lewis, reviewed by Josh Kupershmidt, with further hacking by me.
The old check against MAX_RANDOM_VALUE is clearly irrelevant since
getrand() no longer calls random(). Instead, check whether min and max
are close enough together to avoid an overflow inside getrand(), as
suggested by Tom Lane. This is still somewhat silly, because we're
using atoi(), which doesn't check for overflow anyway and (at least on
my system) will cheerfully return 0 when given "4294967296". But that's
a problem for another commit.
glibc renders random() thread-safe by wrapping a futex lock around it;
testing reveals that this limits the performance of pgbench on machines
with many CPU cores. Rather than switching to random_r(), which is
only available on GNU systems and crashes unless you use undocumented
alchemy to initialize the random state properly, switch to our built-in
implementation of erand48(), which is both thread-safe and concurrent.
Since the list of reasons not to use the operating system's erand48()
is getting rather long, rename ours to pg_erand48() (and similarly
for our implementations of lrand48() and srand48()) and just always
use those. We were already doing this on Cygwin anyway, and the
glibc implementation is not quite thread-safe, so pgbench wouldn't
be able to use that either.
Per discussion with Tom Lane.
libxml reports some errors (like invalid xmlns attributes) via the error
handler hook, but still returns a success indicator to the library caller.
This causes us to miss some errors that are important to report. Since the
"generic" error handler hook doesn't know whether the message it's getting
is for an error, warning, or notice, stop using that and instead start
using the "structured" error handler hook, which gets enough information
to be useful.
While at it, arrange to save and restore the error handler hook setting in
each libxml-using function, rather than assuming we can set and forget the
hook. This should improve the odds of working nicely with third-party
libraries that also use libxml.
In passing, volatile-ize some local variables that get modified within
PG_TRY blocks. I noticed this while testing with an older gcc version
than I'd previously tried to compile xml.c with.
Florian Pflug and Tom Lane, with extensive review/testing by Noah Misch
There may be some other places where we should use errdetail_internal,
but they'll have to be evaluated case-by-case. This commit just hits
a bunch of places where invoking gettext is obviously a waste of cycles.
Add errno-based output to error messages where appropriate, reformat
blocks to about 72 characters per line, use spaces instead of tabs for
indentation, and other style adjustments.
This was already a runtime failure condition, but it's better to check
at validation time if possible. Lightly modified version of a patch
by Shigeru Hanada.
Certain subdirectories do not get built if corresponding options are not
selected at configure time. However, "make distprep" should visit such
directories anyway, so that constructing derived files to be included in
the tarball happens without requiring all configure options to be given
in the tarball build script. Likewise, it's better if cleanup actions
unconditionally visit all directories (for example, this ensures proper
cleanup if someone has done a manual make in such a subdirectory).
To handle this, set up a convention that subdirectories that are
conditionally included in SUBDIRS should be added to ALWAYS_SUBDIRS
instead when they are excluded.
Back-patch to 9.1, so that plpython's spiexceptions.h will get provided
in 9.1 tarballs. There don't appear to be any instances where distprep
actions got missed in previous releases, and anyway this fix requires
gmake 3.80 so we don't want to apply it before 9.1.
A password containing a character with the high bit set was misprocessed
on machines where char is signed (which is most). This could cause the
preceding one to three characters to fail to affect the hashed result,
thus weakening the password. The result was also unportable, and failed
to match some other blowfish implementations such as OpenBSD's.
Since the fix changes the output for such passwords, upstream chose
to provide a compatibility hack: password salts beginning with $2x$
(instead of the usual $2a$ for blowfish) are intentionally processed
"wrong" to give the same hash as before. Stored password hashes can
thus be modified if necessary to still match, though it'd be better
to change any affected passwords.
In passing, sync a couple other upstream changes that marginally improve
performance and/or tighten error checking.
Back-patch to all supported branches. Since this issue is already
public, no reason not to commit the fix ASAP.
This is an ugly hack to get around the fact that significant parts of the
core backend assume they don't need to worry about passing collation to
equality and hashing functions. That's true for the core string datatypes,
but citext should ideally have equality behavior that depends on the
specified collation's LC_CTYPE. However, there's no chance of fixing the
core before 9.2, so we'll have to live with this compromise arrangement for
now. Per bug #6053 from Regina Obe.
The code changes in this commit should be reverted in full once the core
code is up to speed, but be careful about reverting the docs changes:
I fixed a number of obsolete statements while at it.
For consistency, have all non-ASCII characters from contributors'
names in the source be in UTF-8. But remove some other more
gratuitous uses of non-ASCII characters.
For consistency with other tools, put the options before further usage
information.
In pg_standby, remove the supposedly deprecated -l option from the
given example invocation.
the connection; also restructure the libpq connection code.
This patch also removes the unused variable postmasterPID and fixes a
libpq structure leak that was in the testing loop.
Added a new option --extra-install to pg_regress to arrange installing
the respective contrib directory into the temporary installation.
This is currently not yet supported for Windows MSVC builds.
Updated the .gitignore files for contrib modules to ignore the
leftovers of a temp-install check run.
Changed the exit status of "make check" in a pgxs build (which still
does nothing) to 0 from 1.
Added "make check" in contrib to top-level "make check-world".
This option turns off autovacuum, prevents non-super-user connections,
and enables oid setting hooks in the backend. The code continues to use
the old autovacuum disable settings for servers with earlier catalog
versions.
This includes a catalog version bump to identify servers that support
the -b option.
Make use of the collation attached to the index column, instead of
hard-wiring DEFAULT_COLLATION_OID. (Note: in theory this could require
reindexing btree_gist indexes on textual columns, but I rather doubt anyone
has one with a non-default declared collation as yet.)
Using DEFAULT_COLLATION_OID in the comparePartial functions was not only
a lame hack, but outright wrong, because the compare functions for
collation-aware types were already responding to the declared index
collation. So comparePartial would have the wrong expectation about
the index's sort order, possibly leading to missing matches for prefix
searches.
If someone removes the 'postgres' database from the old cluster and the
new cluster has a 'postgres' database, the number of databases will not
match. We actually could upgrade such a setup, but it would violate the
1-to-1 mapping of database counts, so we throw an error instead.
Previously they got an error during the upgrade, and not at the check
stage; PG 9.0.4 does the same.
... for some value of "properly". Instead of overriding REGRESS_OPTS,
set the variables ENCODING and NO_LOCALE, which is more expressive and
allows overriding by the user. Fix vcregress.pl to handle that.
Since collation is effectively an argument, not a property of the function,
FmgrInfo is really the wrong place for it; and this becomes critical in
cases where a cached FmgrInfo is used for varying purposes that might need
different collation settings. Fix by passing it in FunctionCallInfoData
instead. In particular this allows a clean fix for bug #5970 (record_cmp
not working). This requires touching a bit more code than the original
method, but nobody ever thought that collations would not be an invasive
patch...
This warning is new in gcc 4.6 and part of -Wall. This patch cleans
up most of the noise, but there are still some warnings that are
trickier to remove.
The previous functions of assign hooks are now split between check hooks
and assign hooks, where the former can fail but the latter shouldn't.
Aside from being conceptually clearer, this approach exposes the
"canonicalized" form of the variable value to guc.c without having to do
an actual assignment. And that lets us fix the problem recently noted by
Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus
log messages about "parameter "wal_buffers" cannot be changed without
restarting the server". There may be some speed advantage too, because
this design lets hook functions avoid re-parsing variable values when
restoring a previous state after a rollback (they can store a pre-parsed
representation of the value instead). This patch also resolves a
longstanding annoyance about custom error messages from variable assign
hooks: they should modify, not appear separately from, guc.c's own message
about "invalid parameter value".
Although there remains some debate about how CREATE TYPE should represent
the collation property, this doesn't really affect what we need to do in
citext's script, so go ahead and fix that.
The originally committed patch for modifying CTEs didn't interact well
with EXPLAIN, as noted by myself, and also had corner-case problems with
triggers, as noted by Dean Rasheed. Those problems show it is really not
practical for ExecutorEnd to call any user-defined code; so split the
cleanup duties out into a new function ExecutorFinish, which must be called
between the last ExecutorRun call and ExecutorEnd. Some Asserts have been
added to these functions to help verify correct usage.
It is no longer necessary for callers of the executor to call
AfterTriggerBeginQuery/AfterTriggerEndQuery for themselves, as this is now
done by ExecutorStart/ExecutorFinish respectively. If you really need to
suppress that and do it for yourself, pass EXEC_FLAG_SKIP_TRIGGERS to
ExecutorStart.
Also, refactor portal commit processing to allow for the possibility that
PortalDrop will invoke user-defined code. I think this is not actually
necessary just yet, since the portal-execution-strategy logic forces any
non-pure-SELECT query to be run to completion before we will consider
committing. But it seems like good future-proofing.
HeapTupleHeader's t_infomask and t_infomask2 are defined as 16-bit
unsigned integers, so when the 16th bit was set, heap_page_item was
returning them as negative values, which was ugly.
The change to pageinspect--unpackaged--1.0.sql allows a module upgraded
from 9.0 to be cleanly updated from the previous definition.
File encodings can be specified separately from client encoding.
If not specified, client encoding is used for backward compatibility.
Cases when the encoding doesn't match client encoding are slower
than matched cases because we don't have conversion procs for other
encodings. Performance improvement is left as future work.
Original patch by Hitoshi Harada, and modified by me.
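A sketch of the intended usage, with a hypothetical table and file path:
    COPY mytable FROM '/path/to/data.csv' WITH (FORMAT csv, ENCODING 'LATIN1');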
This is both very useful in its own right, and an important test case
for the core FDW support.
This commit includes a small refactoring of copy.c to expose its option
checking code as a separately callable function. The original patch
submission duplicated hundreds of lines of that code, which seemed pretty
unmaintainable.
Shigeru Hanada, reviewed by Itagaki Takahiro and Tom Lane
intarray and tsearch2 both reference core support functions in their GIN
opclasses, and the signatures of those functions changed for 9.1. We added
backwards-compatible pg_proc entries for the functions in order to allow
9.0 dump files to be restored at all, but that hack leaves the opclasses
pointing at pg_proc entries different from what they'd point to if the
contrib modules were installed fresh in 9.1. To forestall any possibility
of future problems, fix the opclasses to match fresh installs via the
expedient of direct UPDATEs on pg_amproc in the update-from-unpackaged
scripts. (Yech ... but the alternatives are worse, or require far more
effort than seems justified right now.)
Note: updating pg_amproc is sufficient because there will be no pg_depend
entries corresponding to these dependencies, since the referenced functions
are all pinned.
The initial version of the update-from-unpackaged script neglected to
include the <> operators that were added to the opclasses during 9.1.
To make this script produce the same final state as the regular install
script, use the same ALTER OPERATOR FAMILY trick as in pg_trgm.
Take care of some loose ends in the update-from-unpackaged script, and
apply some ugly hacks to ensure that it produces the same catalog state
as the fresh-install script. Per discussion, this seems like a safer
plan than having two different catalog states that both call themselves
"pg_trgm 1.0", even if it's not immediately clear that the subtle
differences would ever matter.
Also, fix the stub function gin_extract_trgm() so that it works instead
of just bleating. Needed because this function will get called during a
regular dump and reload, if there are any indexes using its opclass.
The user won't have an opportunity to update the extension till later,
so telling him to do so is unhelpful.
These are needed to support reloading dumps of 9.0 installations containing
contrib/intarray or contrib/tsearch2. Since not only regular dump/reload
but binary upgrade would fail, it seems worth the trouble to carry these
stubs for awhile. Note that the contrib opclasses referencing these
functions will still work fine, since GIN doesn't actually pay any
attention to the declared signature of a support function.
Initially it was called int_aggregate after the old SQL file, but since
the documentation just says "intagg" and that's also the directory name,
let's conform to that instead.
From first pass of testing. Notably, there seems to be no need for
adminpack--unpackaged--1.0.sql because none of the objects that the
old module creates would ever be dumped by pg_dump anyway (they are
all in pg_catalog).
It was never terribly consistent to use OR REPLACE (because of the lack of
comparable functionality for data types, operators, etc), and
experimentation shows that it's now positively pernicious in the extension
world. We really want a failure to occur if there are any conflicts, else
it's unclear what the extension-ownership state of the conflicted object
ought to be. Most of the time, CREATE EXTENSION will fail anyway because
of conflicts on other object types, but an extension defining only
functions can succeed, with bad results.
This isn't fully tested as yet, in particular I'm not sure that the
"foo--unpackaged--1.0.sql" scripts are OK. But it's time to get some
buildfarm cycles on it.
sepgsql is not converted to an extension, mainly because it seems to
require a very nonstandard installation process.
Dimitri Fontaine and Tom Lane
relative, by creating a function path_is_relative_and_below_cwd() to
check for specific requirements. It is unclear if this fixes a security
problem or not but the new code is more robust.
This follows recent discussions, so it's quite a bit different from
Dimitri's original. There will probably be more changes once we get a bit
of experience with it, but let's get it in and start playing with it.
This is still just core code. I'll start converting contrib modules
shortly.
Dimitri Fontaine and Tom Lane
This follows my proposal of yesterday, namely that we try to recreate the
previous state of the extension exactly, instead of allowing CREATE
EXTENSION to run a SQL script that might create some entirely-incompatible
on-disk state. In --binary-upgrade mode, pg_dump won't issue CREATE
EXTENSION at all, but instead uses a kluge function provided by
pg_upgrade_support to recreate the pg_extension row (and extension-level
pg_depend entries) without creating any member objects. The member objects
are then restored in the same way as if they weren't members, in particular
using pg_upgrade's normal hacks to preserve OIDs that need to be preserved.
Then, for each member object, ALTER EXTENSION ADD is issued to recreate the
pg_depend entry that marks it as an extension member.
In passing, fix breakage in pg_upgrade's enum-type support: somebody didn't
fix it when the noise word VALUE got added to ALTER TYPE ADD. Also,
rationalize parsetree representation of COMMENT ON DOMAIN and fix
get_object_address() to allow OBJECT_DOMAIN.
This adds collation support for columns and domains, a COLLATE clause
to override it per expression, and B-tree index support.
Peter Eisentraut
reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
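For illustration (the table name is hypothetical, and the collation names
are platform-dependent examples):
    CREATE TABLE messages (body text COLLATE "de_DE");
    SELECT body FROM messages ORDER BY body COLLATE "C";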
Unlike Btree-based LIKE optimization, this works for non-left-anchored
search patterns. The effectiveness of the search depends on how many
trigrams can be extracted from the pattern. (The worst case, with no
trigrams, degrades to a full-table scan, so this isn't a panacea. But
it can be very useful.)
Alexander Korotkov, reviewed by Jan Urbanski
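A sketch of the kind of query this enables, assuming a trigram GIN index
on the column (table and index names are examples):
    CREATE INDEX words_trgm_idx ON words USING gin (word gin_trgm_ops);
    SELECT word FROM words WHERE word LIKE '%ogra%';   -- index-assisted despite no left anchor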
contrib/intarray's gettoken() uses a fixed-size buffer to collect an
integer's digits, and did not guard against overrunning the buffer.
This is at least a backend crash risk, and in principle might allow
arbitrary code execution. The code didn't check for overflow of the
integer value either, which while not presenting a crash risk was still
bad.
Thanks to Apple Inc's security team for reporting this issue and supplying
the fix.
Security: CVE-2010-4015
This is still pretty rough - among other things, the documentation
needs work, and the messages need a visit from the style police -
but this gets the basic framework in place.
KaiGai Kohei
Reduce #includes to minimum actually needed; in particular include
postgres_fe.h not postgres.h, so as to stop build failures on some
platforms.
Use get_progname() instead of hardwired program name; improve error
checking for command line syntax; bring error messages into line with
style guidelines; include strerror result in die() cases.
Un-break Windows build (I hope) by making the HAVE_FSYNC_WRITETHROUGH
code match the backend. Fix incorrect program help message. static-ize
all functions.
Actually rename the program, rather than just claiming we did. Hook it
into the build system. Get rid of useless dependency on libpq. Clean up
#include list and messy whitespace.
In particular, make hstore @> '' succeed for all hstores, likewise
hstore ?& '{}'. Previously the results were inconsistent and could
depend on whether you were using a GiST index, GIN index, or seqscan.
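For example, after this change both of the following return true no
matter which access method is used:
    SELECT 'a=>1, b=>2'::hstore @> ''::hstore;
    SELECT 'a=>1, b=>2'::hstore ?& '{}'::text[];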
This applies the fix for bug #5784 to remaining places where we wish
to reject nulls in user-supplied arrays. In all these places, there's
no reason not to allow a null bitmap to be present, so long as none of
the current elements are actually null.
I did not change some other places where we are looking at system catalog
entries or aggregate transition values, as the presence of a null bitmap
in such an array would be suspicious.
The array containment operators now behave per mathematical expectation
for empty arrays (ie, an empty array is contained in anything).
Both these operators and the query_int operators now work as expected in
GiST and GIN index searches, rather than having corner cases where the
index searches gave different answers.
Also, fix unexpected failures where the operators would claim that an array
contained nulls, when in fact there was no longer any null present (similar
to bug #5784). The restriction to not have nulls is still there, as
removing it would take a lot of added code complexity and probably slow
things down significantly.
Also, remove the arbitrary restriction to 1-D arrays; unlike the other
restriction, this was buying us nothing performance-wise.
Assorted cosmetic improvements and marginal performance improvements, too.
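For instance, the empty-array cases now behave per expectation (values
are illustrative):
    SELECT '{}'::int[] <@ '{1,2,3}'::int[];   -- true: the empty array is contained in anything
    SELECT '{1,2,3}'::int[] @> '{}'::int[];   -- true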
up relations, but rather order old/new relations and use the same array
index value for both. This should speed up pg_upgrade for databases
with many relations.
which is stored in pg_largeobject_metadata.
No backpatch to 9.0 because you can't migrate from 9.0 to 9.0 with the
same catversion (because of tablespace conflict), and a pre-9.0
migration to 9.0 has no large object permissions to migrate.
Toast tables have identical pg_class.oid and pg_class.relfilenode, but
for clarity it is good to preserve the pg_class.oid.
Update comments regarding what is preserved, and do some
variable/function renaming for clarity.
Per my recent proposal(s). Null key datums can now be returned by
extractValue and extractQuery functions, and will be stored in the index.
Also, placeholder entries are made for indexable items that are NULL or
contain no keys according to extractValue. This means that the index is
now always complete, having at least one entry for every indexed heap TID,
and so we can get rid of the prohibition on full-index scans. A full-index
scan is implemented much the same way as partial-match scans were already:
we build a bitmap representing all the TIDs found in the index, and then
drive the results off that.
Also, introduce a concept of a "search mode" that can be requested by
extractQuery when the operator requires matching to empty items (this is
just as cheap as matching to a single key) or requires a full index scan
(which is not so cheap, but it sure beats failing or giving wrong answers).
The behavior remains backward compatible for opclasses that don't return
any null keys or request a non-default search mode.
Using these features, we can now make the GIN index opclass for anyarray
behave in a way that matches the actual anyarray operators for &&, <@, @>,
and = ... which it failed to do before in assorted corner cases.
This commit fixes the core GIN code and ginarrayprocs.c, updates the
documentation, and adds some simple regression test cases for the new
behaviors using the array operators. The tsearch and contrib GIN opclass
support functions still need to be looked over and probably fixed.
Another thing I intend to fix separately is that this is pretty inefficient
for cases where more than one scan condition needs a full-index search:
we'll run duplicate GinScanEntrys, each one of which builds a large bitmap.
There is some existing logic to merge duplicate GinScanEntrys but it needs
refactoring to make it work for entries belonging to different scan keys.
Note that most of gin.h has been split out into a new file gin_private.h,
so that gin.h doesn't export anything that's not supposed to be used by GIN
opclasses or the rest of the backend. I did quite a bit of other code
beautification work as well, mostly fixing comments and choosing more
appropriate names for things.
Foreign tables are a core component of SQL/MED. This commit does
not provide a working SQL/MED infrastructure, because foreign tables
cannot yet be queried. Support for foreign table scans will need to
be added in a future patch. However, this patch creates the necessary
system catalog structure, syntax support, and support for ancillary
operations such as COMMENT and SECURITY LABEL.
Shigeru Hanada, heavily revised by Robert Haas
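A minimal sketch of the new syntax (the wrapper, server, and table names
are hypothetical, and such a table cannot yet be scanned):
    CREATE FOREIGN DATA WRAPPER dummy_fdw;
    CREATE SERVER dummy_server FOREIGN DATA WRAPPER dummy_fdw;
    CREATE FOREIGN TABLE remote_accounts (id integer, name text) SERVER dummy_server;
    COMMENT ON FOREIGN TABLE remote_accounts IS 'example foreign table';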
pointers, which simplifies the code. This was not possible in 9.0 because
everything was in a single nested struct, but is possible now.
Per suggestion from Tom.
Don't insist on pg_dumpall and psql being present in the old cluster,
since they are not needed. Do insist on pg_resetxlog being present
(in both old and new), since we need it. Also check for pg_config,
but only in the new cluster. Remove the useless attempt to call
pg_config in the old cluster; we don't need to know the old value of
--pkglibdir. (In the case of a stripped-down migration installation
there might be nothing there to look at anyway, so any future change
that might reintroduce that need would have to be considered carefully.)
Per my attempts to build a minimal previous-version installation to support
pg_upgrade.
It appears that this will be faster for all but the shortest strings;
at least on some platforms, memcmp() can use word-at-a-time comparisons.
Noah Misch, somewhat pared down.
After parsing a parenthesized subexpression, we must pop all pending
ANDs and NOTs off the stack, just like the case for a simple operand.
Per bug #5793.
Also fix clones of this routine in contrib/intarray and contrib/ltree,
where input of types query_int and ltxtquery had the same problem.
Back-patch to all supported versions.
This patch replaces Guttman's generalized split method with a simple
sort-by-center-points algorithm. Since the data is only one-dimensional
we don't really need the slow and none-too-stable Guttman method.
This is in part a bug fix, since seg has the same size_alpha versus
size_beta typo that was recently fixed in contrib/cube. It seems
prudent to apply this rather aggressive fix only in HEAD, though.
Back branches will just get the typo fix.
Alexander Korotkov, reviewed by Yeb Havinga
1. Don't reimplement S_ISDIR() and S_ISREG() badly.
2. Don't reimplement access() badly.
This code appears to have been copied from ancient versions of the
corresponding backend routines, and not patched to incorporate subsequent
fixes (see my commits of 2008-03-31 and 2010-01-14 respectively).
It might be a good idea to change it to just *call* those routines,
but for now I'll just transpose these fixes over.
Most of the functions that execute XPath queries leaked the data structures
created by libxml2. This memory would not be recovered until end of
session, so it mounts up pretty quickly in any serious use of the feature.
Per report from Pavel Stehule, though this isn't his patch.
Back-patch to all supported branches.
This eliminates the need for inefficient implementations of this
functionality in both contrib/dblink and contrib/tablefunc, so remove
them. The upcoming patch implementing an in-core format() function
will also require this functionality.
In passing, add some regression tests.
Having this in src/include/port.h makes no sense, now that copydir.c lives
in src/backend/storage rather than src/port. Along the way, remove an
obsolete comment from contrib/pg_upgrade that makes reference to the old
location.
Replace for loops in makefiles with proper dependencies. Parallel
make can now span across directories. Also, make -k and make -q work
properly.
GNU make 3.80 or newer is now required.
After much expenditure of effort, we've got this to the point where the
performance penalty is pretty minimal in typical cases.
Andrew Dunstan, reviewed by Brendan Jurd, Dean Rasheed, and Tom Lane
Various places were testing TRIGGER_FIRED_BEFORE() where what they really
meant was !TRIGGER_FIRED_AFTER(), or vice versa. This needs to be cleaned
up because there are about to be more than two possible states.
We might want to note this in the 9.1 release notes as something for
trigger authors to double-check.
For consistency's sake I also changed some places that assumed that
TRIGGER_FIRED_FOR_ROW and TRIGGER_FIRED_FOR_STATEMENT are necessarily
mutually exclusive; that's not in immediate danger of breaking, but
it's still sloppier than it should be.
Extracted from Dean Rasheed's patch for triggers on views. I'm committing
this separately since it's an identifiable separate issue, and is the
only reason for the patch to touch most of these particular files.
This is intended as infrastructure to support integration with label-based
mandatory access control systems such as SE-Linux. Further changes (mostly
hooks) will be needed, but this is a big chunk of it.
KaiGai Kohei and Robert Haas
There was an incorrect Assert in hstoreValidOldFormat(), which would cause
immediate core dumps when attempting to work with pre-9.0 hstore data,
but of course only in an assert-enabled build.
Also, ghstore_decompress() incorrectly applied DatumGetHStoreP() to a datum
that wasn't actually an hstore, but rather a ghstore (ie, a gist signature
bitstring). That used to be harmless, but could now result in misbehavior
if the hstore format conversion code happened to trigger. In reality,
since ghstore is not marked toastable (and doesn't need to be), this
function is useless anyway; we can lobotomize it down to returning the
passed-in pointer.
Both bugs found by Andrew Gierth, though this isn't exactly his proposed
patch.
functions to the core XML code. Per discussion, the former depends on
XMLOPTION while the others do not. These supersede a version previously
offered by contrib/xml2.
Mike Fowler, reviewed by Pavel Stehule
out immediately on any out-of-memory condition. It's rather pointless to
imagine that pgbench will be able to continue usefully after a malloc
failure, and in any case there were a number of unchecked mallocs.
pairs that can be handled by xslt_process().
There is much else to do here, but this patch seems useful in its own right
for as long as this code survives.
Pavel Stehule, reviewed by Mike Fowler
- Rename TSParserGetPrsid to get_ts_parser_oid.
- Rename TSDictionaryGetDictid to get_ts_dict_oid.
- Rename TSTemplateGetTmplid to get_ts_template_oid.
- Rename TSConfigGetCfgid to get_ts_config_oid.
- Rename FindConversionByName to get_conversion_oid.
- Rename GetConstraintName to get_constraint_oid.
- Add new functions get_opclass_oid, get_opfamily_oid, get_rewrite_oid,
get_rewrite_oid_without_relid, get_trigger_oid, and get_cast_oid.
The name of each function matches the corresponding catalog.
Thanks to KaiGai Kohei for the review.
Operating directly on the underlying varlena saves palloc and memcpy
overhead, which testing shows to be significant.
Extracted from a larger patch by Alexander Korotkov.