existence via open(), rather than collecting a directory listing and
looking up matching relfilenode files with sequential scans of the
array. This speeds up pg_upgrade by 2x for a large number of tables,
e.g. 16k.
Per observation by Ants Aasma.
... and have sepgsql use it to determine whether to check permissions
during certain operations. Indexes that are being created as a result
of REINDEX, for instance, do not need to have their permissions checked;
they were already checked when the index was created.
Author: KaiGai Kohei, slightly revised by me
Numerous flex and bison make rules have appeared in the source tree
over time, and they are all virtually identical, so we can replace
them by pattern rules with some variables for customization.
Users of pgxs will also be able to benefit from this.
The HINTs generated for these error cases vary across builds. We
could try to work around that, but the test cases aren't really useful
enough to justify taking any trouble.
Per buildfarm.
dblink now has its own validator function dblink_fdw_validator(), which is
better than the core function postgresql_fdw_validator() because it gets
the list of legal options from libpq instead of having a hard-wired list.
Make the dblink extension module provide a standard foreign data wrapper
dblink_fdw that encapsulates use of this validator, and recommend use of
that wrapper instead of making up wrappers on the fly.
Unfortunately, because ad-hoc wrappers *were* recommended practice
previously, it's not clear when we can get rid of postgresql_fdw_validator
without causing upgrade problems. But this is a step in the right
direction.
Shigeru Hanada, reviewed by KaiGai Kohei
This allows logging only some fraction of transactions, greatly reducing
the amount of log generated.
Tomas Vondra, reviewed by Robert Haas and Jeff Janes.
On some platforms these functions return NULL, rather than the more common
practice of returning a pointer to a zero-sized block of memory. Hack our
various wrapper functions to hide the difference by substituting a size
request of 1. This is probably not so important for the callers, who
should never touch the block anyway if they asked for size 0 --- but it's
important for the wrapper functions themselves, which mistakenly treated
the NULL result as an out-of-memory failure. This broke at least pg_dump
for the case of no user-defined aggregates, as per report from
Matthew Carrington.
Back-patch to 9.2 to fix the pg_dump issue. Given the lack of previous
complaints, it seems likely that there is no live bug in previous releases,
even though some of these functions were in place before that.
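A minimal sketch of the size-substitution idea, for illustration only. The
name demo_malloc is hypothetical, and exiting with a message on failure is
an assumption standing in for whatever error reporting each real wrapper
uses.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical "malloc or die" wrapper.  A request for 0 bytes is bumped
 * to 1, so that a NULL result can only mean out-of-memory.
 */
static void *
demo_malloc(size_t size)
{
	void	   *p;

	if (size == 0)
		size = 1;		/* some platforms return NULL for malloc(0) */
	p = malloc(size);
	if (p == NULL)
	{
		fprintf(stderr, "out of memory\n");
		exit(EXIT_FAILURE);
	}
	return p;
}
```

With this shape, a caller that asks for zero bytes always gets a pointer it
can pass to free(), and never trips the out-of-memory path by accident.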
We had a number of variants on the theme of "malloc or die", with the
majority named like "pg_malloc", but by no means all. Standardize on the
names pg_malloc, pg_malloc0, pg_realloc, pg_strdup. Get rid of pg_calloc
entirely in favor of using pg_malloc0.
This is an essentially cosmetic change, so no back-patch. (I did find
a couple of places where psql and pg_dump were using plain malloc or
strdup instead of the pg_ versions, but they don't look significant
enough to bother back-patching.)
entries are not dumped. This fixes an error caused by
dropping/recreating the information_schema, but other failures were also
possible.
Backpatch to 9.2.
On reflection (especially after noticing how many buildfarm critters have
__builtin_types_compatible_p but not _Static_assert), it seems like we
ought to try a bit harder to make these macros do something everywhere.
The initial cut at it would have been no help to code that is compiled only
on platforms without _Static_assert, for instance; and in any case not all
our contributors do their initial coding on the latest gcc version.
Some googling about static assertions turns up quite a bit of prior art
for making it work in compilers that lack _Static_assert. The method
that seems closest to our needs involves defining a struct with a bit-field
that has negative width if the assertion condition fails. There seems no
reliable way to get the error message string to be output, but throwing a
compile error with a confusing message is better than missing the problem
altogether.
In the same spirit, if we don't have __builtin_types_compatible_p we can at
least insist that the variable have the same width as the type. This won't
catch errors such as "wrong pointer type", but it's far better than
nothing.
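The two fallbacks described above can be sketched roughly as follows. The
macro names here are hypothetical, chosen for illustration, and the sketch
covers only the case where neither _Static_assert nor
__builtin_types_compatible_p is available.

```c
#include <stdint.h>

/*
 * A bit-field of negative width is a constraint violation, so a false
 * condition fails to compile.  The message argument is accepted for
 * documentation purposes but cannot be printed by this technique.
 */
#define DEMO_STATIC_ASSERT(condition, errmessage) \
	((void) sizeof(struct { int demo_static_assert_failure : (condition) ? 1 : -1; }))

/*
 * Weak substitute for a type-compatibility check: only the width is
 * compared, so a wrong pointer type of the same size goes unnoticed.
 */
#define DEMO_ASSERT_VAR_WIDTH(varname, typename) \
	DEMO_STATIC_ASSERT(sizeof(varname) == sizeof(typename), "width mismatch")

static int
demo_check_widths(void)
{
	int32_t		counter = 0;

	DEMO_STATIC_ASSERT(sizeof(int32_t) == 4, "int32_t must be 4 bytes");
	DEMO_ASSERT_VAR_WIDTH(counter, int32_t);
	/* DEMO_ASSERT_VAR_WIDTH(counter, int64_t) would fail to compile */
	return counter == 0;
}
```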
In addition to changing the macro definitions, adjust a
compile-time-constant Assert in contrib/hstore to use StaticAssertStmt,
so we can get some buildfarm coverage on whether that macro behaves sanely
or not. There are surely more places that could be converted, but this is
the first one I came across.
If we call pg_ctl stop, the server might keep running, and thus keep
holding a log file open, for a short time after it has deleted its pid
file (which is when pg_ctl will exit), and so a subsequent attempt to
open the log file might fail.
We therefore try to open it a few times, sleeping one second between
tries, to give the server time to exit.
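A rough sketch of such a retry loop, assuming a POSIX sleep(). The helper
name and the max_tries parameter are hypothetical and not taken from the
patch.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: try fopen() up to max_tries times, sleeping one
 * second between failed attempts.  Returns NULL if every attempt fails.
 */
static FILE *
demo_fopen_retry(const char *path, const char *mode, int max_tries)
{
	int			i;

	for (i = 0; i < max_tries; i++)
	{
		FILE	   *fp = fopen(path, mode);

		if (fp != NULL)
			return fp;
		if (i < max_tries - 1)
			sleep(1);	/* give the exiting server time to let go */
	}
	return NULL;
}
```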
This corrects an error that was observed on the buildfarm.
Backpatched to 9.2.
Call pg_dumpall using -f switch instead of redirection, to avoid
writing the output in text mode and generating spurious carriage
returns. Remove the carriage-return-ignoring hack introduced by
commit e442b0f0c6.
Backpatch to 9.2.
pg_upgrade opened the output from pg_dumpall in text mode and
wrote the split files in text mode. This caused unwanted eating
of intended carriage returns on input and production of spurious
carriage returns on output. To avoid this, open all these files
in binary mode. On non-Windows platforms, this change has no
effect.
Backpatch to 9.0. On 9.0 and 9.1, we also switch from redirecting
pg_dumpall's output to using pg_dumpall's -f switch, for the same
reason.
socket location. Also, prevent putting the socket in the current
directory for pre-9.1 servers in live check and non-live check mode,
because pre-9.1 pg_ctl -w can't handle it.
Backpatch to 9.2.
pg_upgrade produces a platform-specific script to remove the old
directory, but on Windows it has not been making sure that the
paths it writes as arguments for rmdir and del use the backslash
path separator, which will cause these scripts to fail.
The fix is backpatched to Release 9.0.
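For illustration, a separator fix of this kind could look like the sketch
below. The function name is hypothetical, and the real code would apply
such a conversion only when building for Windows.

```c
/*
 * Hypothetical helper: rewrite '/' path separators as '\' in place, so
 * the path is acceptable as an argument to rmdir and del.
 */
static void
demo_fix_path_separator(char *path)
{
	char	   *c;

	for (c = path; *c != '\0'; c++)
	{
		if (*c == '/')
			*c = '\\';
	}
}
```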
When starting either an old or new postmaster, force it to place its Unix
socket in the current directory. This makes it even harder for accidental
connections to occur during pg_upgrade, and also works around some
scenarios where the default socket location isn't usable. (For example,
if the default location is something other than "/tmp", it might not exist
during "make check".)
When checking an already-running old postmaster, find out its actual socket
directory location from postmaster.pid, if possible. This dodges problems
with an old postmaster having a configured location different from the
default built into pg_upgrade's libpq. We can't find that out if the old
postmaster is pre-9.1, so also document how to cope with such scenarios
manually.
In support of this, centralize handling of the connection-related command
line options passed to pg_upgrade's subsidiary programs, such as pg_dump.
This should make future changes easier.
Bruce Momjian and Tom Lane
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.
I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h. In
itself this doesn't have much effect on indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
The previous signature made it very easy to pass something other than
the printf-format specifier in the corresponding position, without any
warning from the compiler.
While at it, move some of the escaping, redirecting and quoting
responsibilities from the callers into exec_prog() itself. This makes
the callsites cleaner.
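One way to make such mistakes visible to the compiler is a fixed position
for the format string plus a printf-style format attribute. The sketch
below is an assumption about the general technique, not the actual
exec_prog() signature. The names demo_build_cmd and DEMO_PRINTF_ATTR are
hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

#if defined(__GNUC__)
#define DEMO_PRINTF_ATTR(f, a) __attribute__((format(printf, f, a)))
#else
#define DEMO_PRINTF_ATTR(f, a)
#endif

/* The format string is always argument 3; varargs start at argument 4. */
static int	demo_build_cmd(char *buf, size_t buflen, const char *fmt,...)
			DEMO_PRINTF_ATTR(3, 4);

/* Format a command line into buf; returns what vsnprintf() returns. */
static int
demo_build_cmd(char *buf, size_t buflen, const char *fmt,...)
{
	va_list		args;
	int			n;

	va_start(args, fmt);
	n = vsnprintf(buf, buflen, fmt, args);
	va_end(args);
	return n;
}
```

With the attribute in place, gcc can warn when the arguments do not match
the conversion specifiers in the format string.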
Extraction of trigrams did not process LIKE escape sequences properly,
leading to possible misidentification of trigrams near escapes, resulting
in incorrect index search results.
Fujii Masao
libxslt offers the ability to read and write both files and URLs through
stylesheet commands, thus allowing unprivileged database users to both read
and write data with the privileges of the database server. Disable that
through proper use of libxslt's security options.
Also, remove xslt_process()'s ability to fetch documents and stylesheets
from external files/URLs. While this was a documented "feature", it was
long regarded as a terrible idea. The fix for CVE-2012-3489 broke that
capability, and rather than expend effort on trying to fix it, we're just
going to summarily remove it.
While the ability to write as well as read makes this security hole
considerably worse than CVE-2012-3489, the problem is mitigated by the fact
that xslt_process() is not available unless contrib/xml2 is installed,
and the longstanding warnings about security risks from that should have
discouraged prudent DBAs from installing it in security-exposed databases.
Reported and fixed by Peter Eisentraut.
Security: CVE-2012-3488
4741e9afb9. This was done by adding an
optional second log file parameter to exec_prog(), and closing and
reopening the log file between system() calls.
Backpatch to 9.2.
After taking a while to digest the row-processor feature that was added to
libpq in commit 92785dac2e, we've concluded
it is over-complicated and too hard to use. Leave the core infrastructure
changes in place (that is, there's still a row processor function inside
libpq), but remove the exposed API pieces, and instead provide a "single
row" mode switch that causes PQgetResult to return one row at a time in
separate PGresult objects.
This approach incurs more overhead than proper use of a row processor
callback would, since construction of a PGresult per row adds extra cycles.
However, it is far easier to use and harder to break. The single-row mode
still affords applications the primary benefit that the row processor API
was meant to provide, namely not having to accumulate large result sets in
memory before processing them. Preliminary testing suggests that we can
probably buy back most of the extra cycles by micro-optimizing construction
of the extra results, but that task will be left for another day.
Marko Kreen
This is apparently faster than doing things the other way around when
the scale factor is large.
Along the way, adjust -n to suppress vacuuming during initialization
as well as during test runs.
Jeff Janes, with some small changes by me.
Commit 3855968f32 added syntax, pg_dump,
psql support, and documentation, but the triggers didn't actually fire.
With this commit, they now do. This is still a pretty basic facility
overall because event triggers do not get a whole lot of information
about what the user is trying to do unless you write them in C; and
there's still no option to fire them anywhere except at the very
beginning of the execution sequence, but it's better than nothing,
and a good building block for future work.
Along the way, add a regression test for ALTER LARGE OBJECT, since
testing of event triggers reveals that we haven't got one.
Dimitri Fontaine and Robert Haas
Since the scandir() emulation was taken out of pg_upgrade, there's
no longer any need for scandir_file_pattern to exist as a global
variable. Replace it with a local in the one remaining function
that was making use of it.
Error out on out-of-memory, rather than returning -1, which the sole
existing caller wasn't checking for anyway. There doesn't seem to be
any use-case for making the caller check for failure here.
Detect failure return from readdir().
Use a less platform-dependent method of calculating the entrysize.
It's possible, but not yet confirmed, that this explains bug #6733,
in which Mike Wilson reports a pg_upgrade crash that did not occur
in 9.1. (Note that load_directory is effectively new code in 9.2,
at least on platforms that have scandir().)
Fix up comments, avoid uselessly using two counters, reduce the number
of realloc calls to something sane.
The Solaris Studio compiler warns about these instances, unlike more
mainstream compilers such as gcc. But manual inspection showed that
the code is clearly not reachable, and we hope no worthy compiler will
complain about removing this code.
When reading from a text- or CSV-format file in file_fdw, the datatype
input routines can consume a significant fraction of the runtime.
Often, the query does not need all the columns, so we can get a useful
speed boost by skipping I/O conversion for unnecessary columns.
To support this, add a "convert_selectively" option to the core COPY code.
This is undocumented and not accessible from SQL (for now, anyway).
Etsuro Fujita, reviewed by KaiGai Kohei
Currently only pg_clog is copied, but some other directories could need
the same treatment as well, so create a subroutine to do it.
Extracted from my (somewhat larger) FOR KEY SHARE patch.
Now the log file not only contains the output from commands executed by
system(), but also what command it was in the first place. This
arrangement makes debugging a lot simpler.
snprintf counts the trailing NUL towards the char limit. Failing to account
for that was causing an invalid value to be passed to pg_resetxlog -l,
aborting the upgrade process.
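The off-by-one is easy to reproduce. In the sketch below, which uses a
hypothetical helper and an arbitrary 8-digit hex value, a size argument of
8 yields only 7 characters of output, because the eighth byte is reserved
for the terminating NUL.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format an 8-hex-digit value into dst.  To get all
 * eight digits, dstsize must be at least 9, since snprintf() counts the
 * terminating NUL toward the limit.
 */
static void
demo_format_hex8(char *dst, size_t dstsize, unsigned int value)
{
	snprintf(dst, dstsize, "%08X", value);
}
```

Calling it with a limit of 9 produces the full string, while a limit of 8
silently truncates the last digit.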
The xlogid + segno representation of a particular WAL segment doesn't make
much sense in pg_resetxlog anymore, now that we don't use that anywhere
else. Use the WAL filename instead, since that's a convenient way to name a
particular WAL segment.
I did this partially for pg_resetxlog in the original xlogid/segno -> uint64
patch, but I neglected pg_upgrade and the docs. This should now be more
complete.
The latter was already the dominant use, and it's preferable because
in C the convention is that intXX means XX bits. Therefore, allowing
mixed use of int2, int4, int8, int16, int32 is obviously confusing.
Remove the typedefs for int2 and int4 for now. They don't seem to be
widely used outside of the PostgreSQL source tree, and the few uses
can probably be cleaned up by the time this ships.
This simplifies code that needs to do arithmetic on XLogRecPtrs.
To avoid changing on-disk format of data pages, the LSN on data pages is
still stored in the old format. That should keep pg_upgrade happy. However,
we have XLogRecPtrs embedded in the control file, and in the structs that
are sent over the replication protocol, so this change breaks compatibility
of pg_basebackup and server. I didn't do anything about this in this patch;
per discussion on -hackers, the right thing to do would be to change the
replication protocol to be architecture-independent, so that you could use
a newer version of pg_receivexlog, for example, against an older server
version.
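To illustrate the idea, here is a sketch of a 64-bit position alongside a
two-field layout like the one kept on data pages. All type and function
names are hypothetical, and the field names are assumptions made for the
example.

```c
#include <stdint.h>

typedef uint64_t DemoRecPtr;	/* a WAL position as a single integer */

typedef struct DemoPageLSN		/* two-field layout, as kept on pages */
{
	uint32_t	xlogid;			/* high 32 bits */
	uint32_t	xrecoff;		/* low 32 bits */
} DemoPageLSN;

/* Combine the two on-page halves into a 64-bit position. */
static DemoRecPtr
demo_page_lsn_get(DemoPageLSN v)
{
	return ((uint64_t) v.xlogid << 32) | v.xrecoff;
}

/* Split a 64-bit position back into the two on-page halves. */
static DemoPageLSN
demo_page_lsn_set(DemoRecPtr p)
{
	DemoPageLSN v;

	v.xlogid = (uint32_t) (p >> 32);
	v.xrecoff = (uint32_t) p;
	return v;
}

/* With a 64-bit type, distance is plain subtraction. */
static uint64_t
demo_bytes_between(DemoRecPtr from, DemoRecPtr to)
{
	return to - from;
}
```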
These days, even a wimpy system can insert 10000 tuples in the blink of
an eye, so there's no real need for this much verbosity.
Per complaint from Tatsuo Ishii.
The option --foreign-keys, used at initialization time, will create foreign
key constraints for the columns that represent references to other tables'
primary keys. This can help in benchmarking FK performance.
Jeff Janes
Before, some places didn't document the short options (-? and -V),
some documented both, some documented nothing, and they were listed in
various orders. Now this is hopefully more consistent and complete.
The simplest way to handle this is just to copy-and-paste the relevant
code block in fork_process.c, so that's what I did. (It's possible that
something more complicated would be useful to packagers who want to work
with either the old or the new API; but at this point the number of such
people is rapidly approaching zero, so let's just get the minimal thing
done.) Update relevant documentation as well.
It failed to check for error return from xsltApplyStylesheet(), as reported
by Peter Gagarinov. (So far as I can tell, libxslt provides no convenient
way to get a useful error message in failure cases. There might be some
inconvenient way, but considering that this code is deprecated it's hard to
get enthusiastic about putting lots of work into it. So I just made it say
"failed to apply stylesheet", in line with the existing error checks.)
While looking at the code I also noticed that the string returned by
xsltSaveResultToString was never freed, resulting in a session-lifespan
memory leak.
Back-patch to all supported versions.
Overly tight coding caused the password transformation loop to stop
examining input once it had processed a byte equal to 0x80. Thus, if the
given password string contained such a byte (which is possible though not
highly likely in UTF8, and perhaps also in other non-ASCII encodings), all
subsequent characters would not contribute to the hash, making the password
much weaker than it appears on the surface.
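The following sketch reproduces that class of bug for illustration. It is
modeled on the description above rather than copied from the affected
code, and all names are hypothetical. The key-setup loop shifts each
password byte left by one bit; the buggy variant advances through the
input only when the shifted result is nonzero, so a 0x80 byte (which
shifts to zero in 8 bits) is mistaken for the terminator.

```c
#include <stdint.h>

#define DEMO_KEYLEN 8

/* Buggy variant: a byte that shifts to zero stops input consumption. */
static void
demo_key_setup_buggy(const char *key, uint8_t *keybuf)
{
	int			i;

	for (i = 0; i < DEMO_KEYLEN; i++)
	{
		keybuf[i] = (uint8_t) ((unsigned char) *key << 1);
		if (keybuf[i] != 0)
			key++;
	}
}

/* Fixed variant: stop advancing only at the real NUL terminator. */
static void
demo_key_setup_fixed(const char *key, uint8_t *keybuf)
{
	int			i;

	for (i = 0; i < DEMO_KEYLEN; i++)
	{
		keybuf[i] = (uint8_t) ((unsigned char) *key << 1);
		if (*key != '\0')
			key++;
	}
}
```

For a password whose third byte is 0x80, the buggy variant leaves every
key byte after the second one zero, whereas the fixed variant still mixes
in the remaining characters.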
This would only affect cases where applications used DES crypt() to encode
passwords before storing them in the database. If a weak password has been
created in this fashion, the hash will stop matching after this update has
been applied, so it will be easy to tell if any passwords were unexpectedly
weak. Changing to a different password would be a good idea in such a case.
(Since DES has been considered inadequately secure for some time, changing
to a different encryption algorithm can also be recommended.)
This code, and the bug, are shared with at least PHP, FreeBSD, and OpenBSD.
Since the other projects have already published their fixes, there is no
point in trying to keep this commit private.
This bug has been assigned CVE-2012-2143, and credit for its discovery goes
to Rubin Xu and Joseph Bonneau.
Write the file to a temporary name and then rename() it into the
permanent name, to ensure it can't end up half-written and corrupt
in case of a crash during shutdown.
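A minimal sketch of the write-then-rename pattern, using only stdio. The
function and its parameters are hypothetical, and the sketch deliberately
omits any fsync of the file or its directory, which a real implementation
would need to consider.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write data to tmppath, then rename() it to path.
 * Readers see either the old file or the complete new one, never a
 * half-written file.  Returns 0 on success, -1 on failure.
 */
static int
demo_write_atomically(const char *path, const char *tmppath,
					  const char *data)
{
	FILE	   *fp = fopen(tmppath, "w");

	if (fp == NULL)
		return -1;
	if (fputs(data, fp) == EOF)
	{
		fclose(fp);
		remove(tmppath);
		return -1;
	}
	if (fclose(fp) != 0)
	{
		remove(tmppath);
		return -1;
	}
	if (rename(tmppath, path) != 0)
	{
		remove(tmppath);
		return -1;
	}
	return 0;
}
```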
Unlink the file after it has been read so it's removed from the data
directory and not included in base backups going to replication slaves.
When the column name is an unqualified name, rather than table.column,
the error message complains about too many dotted names, which is
wrong. Report by Peter Eisentraut based on examination of the
sepgsql regression test output, but the problem also affects COMMENT.
New wording as suggested by Tom Lane.
We previously recognized that citext wouldn't get marked as collatable
during pg_upgrade from a pre-9.1 installation, and hacked its
create-from-unpackaged script to manually perform the necessary catalog
adjustments. However, we overlooked the fact that domains over citext,
as well as the citext[] array type, need the same adjustments. Extend
the script to handle those cases.
Also, the documentation suggested that this was only an issue in pg_upgrade
scenarios, which is quite wrong; loading any dump containing citext from a
pre-9.1 server will also result in the type being wrongly marked.
I approached the documentation problem by changing the 9.1.2 release note
paragraphs about this issue, which is historically inaccurate. But it
seems better than having the information scattered in multiple places, and
leaving incorrect info in the 9.1.2 notes would be bad anyway. We'll still
need to mention the issue again in the 9.1.4 notes, but perhaps they can
just reference 9.1.2 for fix instructions.
Per report from Evan Carroll. Back-patch into 9.1.