The last version in which these options were documented is now EOL, so
it's time to get rid of them for real. We now use GNU-style long
options instead.
Including collation in the behavior of that function promotes a world view
we do not want. Moreover, it was producing the wrong behavior for pg_dump
anyway: what we want is to dump a COLLATE clause on attributes whose
attcollation is different from the underlying type, and likewise for
domains, and the function cannot do that for us. Doing it the hard way
in pg_dump is a bit more tedious but produces more correct output.
In passing, fix initdb so that the initial entry in pg_collation is
properly pinned. It was droppable before :-(
This mostly just involves creating control, install, and
update-from-unpackaged scripts for them. However, I had to adjust plperl
and plpython to not share the same support functions between variants,
because we can't put the same function into multiple extensions.
catversion bump forced due to new contents of pg_pltemplate, and because
initdb now installs plpgsql as an extension not a bare language.
Add support for regression testing these as extensions not bare
languages.
Fix a couple of other issues that popped up while testing this: my initial
hack at pg_dump binary-upgrade support didn't work right, and we don't want
an extra schema permissions test after all.
Documentation changes still to come, but I'm committing now to see
whether the MSVC build scripts need work (likely they do).
Remove the unconditional superuser permissions check in CREATE EXTENSION,
and instead define a "superuser" extension property, which when false
(not the default) skips the superuser permissions check. In this case
the calling user only needs enough permissions to execute the commands
in the extension's installation script. The superuser property is also
enforced in the same way for ALTER EXTENSION UPDATE cases.
In other ALTER EXTENSION cases and DROP EXTENSION, test ownership of
the extension rather than superuserness. ALTER EXTENSION ADD/DROP needs
to insist on ownership of the target object as well; to do that without
duplicating code, refactor comment.c's big switch for permissions checks
into a separate function in objectaddress.c.
I also removed the superuserness checks in pg_available_extensions and
related functions; there's no strong reason why everybody shouldn't
be able to see that info.
Also invent an IF NOT EXISTS variant of CREATE EXTENSION, and use that
in pg_dump, so that dumps won't fail for installed-by-default extensions.
We don't have any of those yet, but we will soon.
This is all per discussion of wrapping the standard procedural languages
into extensions. I'll make those changes in a separate commit; this is
just putting the core infrastructure in place.
Add a fdwhandler column to pg_foreign_data_wrapper, plus HANDLER options
in the CREATE FOREIGN DATA WRAPPER and ALTER FOREIGN DATA WRAPPER commands,
plus pg_dump support for same. Also invent a new pseudotype fdw_handler
with properties similar to language_handler.
This is split out of the "FDW API" patch for ease of review; it's all stuff
we will certainly need, regardless of any other details of the FDW API.
FDW handler functions will not actually get called yet.
In passing, fix some omissions and infelicities in foreigncmds.c.
Shigeru Hanada, Jan Urbanski, Heikki Linnakangas
The previous coding would try to process all SECTION_NONE items in the
initial sequential-restore pass, which failed if they were dependencies of
not-yet-restored items. Fix by postponing such items into the parallel
processing pass once we have skipped any non-PRE_DATA item.
Back-patch into 9.0; the original parallel-restore coding in 8.4 did not
have this bug, so no need to change it.
Report and diagnosis by Arnd Hannemann.
Normally, pg_dump summarily excludes functions in pg_catalog from
consideration. However, some extensions may create functions in pg_catalog
(adminpack already does that, and extensions for procedural languages will
likely do it too). In binary-upgrade mode, we have to dump such functions,
or the extension will be incomplete after upgrading. Per experimentation
with adminpack.
- collowner field
- CREATE COLLATION
- ALTER COLLATION
- DROP COLLATION
- COMMENT ON COLLATION
- integration with extensions
- pg_dump support for the above
- dependency management
- psql tab completion
- psql \dO command
This follows recent discussions, so it's quite a bit different from
Dimitri's original. There will probably be more changes once we get a bit
of experience with it, but let's get it in and start playing with it.
This is still just core code. I'll start converting contrib modules
shortly.
Dimitri Fontaine and Tom Lane
This follows my proposal of yesterday, namely that we try to recreate the
previous state of the extension exactly, instead of allowing CREATE
EXTENSION to run a SQL script that might create some entirely-incompatible
on-disk state. In --binary-upgrade mode, pg_dump won't issue CREATE
EXTENSION at all, but instead uses a kluge function provided by
pg_upgrade_support to recreate the pg_extension row (and extension-level
pg_depend entries) without creating any member objects. The member objects
are then restored in the same way as if they weren't members, in particular
using pg_upgrade's normal hacks to preserve OIDs that need to be preserved.
Then, for each member object, ALTER EXTENSION ADD is issued to recreate the
pg_depend entry that marks it as an extension member.
In passing, fix breakage in pg_upgrade's enum-type support: somebody didn't
fix it when the noise word VALUE got added to ALTER TYPE ADD. Also,
rationalize parsetree representation of COMMENT ON DOMAIN and fix
get_object_address() to allow OBJECT_DOMAIN.
My original idea of doing extension member identification during
getDependencies() didn't work correctly: we have to mark member tables as
not-to-be-dumped rather earlier than that, else their subsidiary objects
like indexes get dumped anyway. Rearrange code to mark them early enough.
This patch adds the server infrastructure to support extensions.
There is still one significant loose end, namely how to make it play nice
with pg_upgrade, so I am not yet committing the changes that would make
all the contrib modules depend on this feature.
In passing, fix a disturbingly large amount of breakage in
AlterObjectNamespace() and callers.
Dimitri Fontaine, reviewed by Anssi Kääriäinen,
Itagaki Takahiro, Tom Lane, and numerous others
This adds collation support for columns and domains, a COLLATE clause
to override it per expression, and B-tree index support.
Peter Eisentraut
reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, ie. it sometimes aborts transactions even
though there is no anomaly.
To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.
A predicate lock doesn't conflict with any regular locks or with another
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead for
for other transactions.
Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with it have completed. That means that
we need to remember an unbounded amount of predicate locks, so we apply a
lossy but conservative method of tracking locks for committed transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.
We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.
Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.
Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
It appears that gcc 4.5 can issue such warnings for whole structs, not
just scalar variables as in the past. Refactor some pg_dump code slightly
so that the OutputContext local variables are always initialized, even
if they won't be used. It's cheap enough to not be worth worrying about.
which is stored in pg_largeobject_metadata.
No backpatch to 9.0 because you can't migrate from 9.0 to 9.0 with the
same catversion (because of tablespace conflict), and a pre-9.0
migration to 9.0 has not large object permissions to migrate.
Toast tables have identical pg_class.oid and pg_class.relfilenode, but
for clarity it is good to preserve the pg_class.oid.
Update comments regarding what is preserved, and do some
variable/function renaming for clarity.
Foreign tables are a core component of SQL/MED. This commit does
not provide a working SQL/MED infrastructure, because foreign tables
cannot yet be queried. Support for foreign table scans will need to
be added in a future patch. However, this patch creates the necessary
system catalog structure, syntax support, and support for ancillary
operations such as COMMENT and SECURITY LABEL.
Shigeru Hanada, heavily revised by Robert Haas
The contents of an unlogged table are WAL-logged; thus, they are not
available on standby servers and are truncated whenever the database
system enters recovery. Indexes on unlogged tables are also unlogged.
Unlogged GiST indexes are not currently supported.
This privilege is required to do Streaming Replication, instead of
superuser, making it possible to set up a SR slave that doesn't
have write permissions on the master.
Superuser privileges do NOT override this check, so in order to
use the default superuser account for replication it must be
explicitly granted the REPLICATION permissions. This is backwards
incompatible change, in the interest of higher default security.
With hundreds of thousands of TOC entries, the repeated searches in
reduce_dependencies() become the dominant cost. Get rid of that searching
by constructing reverse-dependency lists, which we can do in O(N) time
during the fix_dependencies() preprocessing. I chose to store the reverse
dependencies as DumpId arrays for consistency with the forward-dependency
representation, and keep the previously-transient tocsByDumpId[] array
around to locate actual TOC entry structs quickly from dump IDs.
While this fixes the slow case reported by Vlad Arkhipov, there is still
a potential for O(N^2) behavior with sufficiently many tables:
fix_dependencies itself, as well as mark_create_done and
inhibit_data_for_failed_table, are doing repeated searches to deal with
table-to-table-data dependencies. Possibly this work could be extended
to deal with that, although the latter two functions are also used in
non-parallel restore where we currently don't run fix_dependencies.
Another TODO is that we fail to parallelize restore of multiple blobs
at all. This appears to require changes in the archive format to fix.
Back-patch to 9.0 where the problem was reported. 8.4 has potential issues
as well; but since it doesn't create a separate TOC entry for each blob,
it's at much less risk of having enough TOC entries to cause real problems.
to make it easier to reuse that code. There is no user-visible changes.
This is in preparation for the patch to add a new archive format, a directory,
to perform a custom-like dump but with each table being dumped to a separate
file (that in turn is a prerequisite for parallel pg_dump). This also makes it
easier to add new compression methods in the future, and makes the
pg_backup_custom.c code easier to read, when the compression-related code is
factored out.
Joachim Wieland, with heavy editorialization by me.
This commit adds columns amoppurpose and amopsortfamily to pg_amop, and
column amcanorderbyop to pg_am. For the moment all the entries in
amcanorderbyop are "false", since the underlying support isn't there yet.
Also, extend the CREATE OPERATOR CLASS/ALTER OPERATOR FAMILY commands with
[ FOR SEARCH | FOR ORDER BY sort_operator_family ] clauses to allow the new
columns of pg_amop to be populated, and create pg_dump support for dumping
that information.
I also added some documentation, although it's perhaps a bit premature
given that the feature doesn't do anything useful yet.
Teodor Sigaev, Robert Haas, Tom Lane
Replace for loops in makefiles with proper dependencies. Parallel
make can now span across directories. Also, make -k and make -q work
properly.
GNU make 3.80 or newer is now required.
Per C standard, these are semantically the same thing; but saying NULL
when you mean NULL is good for readability.
Marti Raudsepp, per results of INRIA's Coccinelle.
After much expenditure of effort, we've got this to the point where the
performance penalty is pretty minimal in typical cases.
Andrew Dunstan, reviewed by Brendan Jurd, Dean Rasheed, and Tom Lane
This patch adds the SQL-standard concept of an INSTEAD OF trigger, which
is fired instead of performing a physical insert/update/delete. The
trigger function is passed the entire old and/or new rows of the view,
and must figure out what to do to the underlying tables to implement
the update. So this feature can be used to implement updatable views
using trigger programming style rather than rule hacking.
In passing, this patch corrects the names of some columns in the
information_schema.triggers view. It seems the SQL committee renamed
them somewhere between SQL:99 and SQL:2003.
Dean Rasheed, reviewed by Bernd Helmle; some additional hacking by me.
This is intended as infrastructure to support integration with label-based
mandatory access control systems such as SE-Linux. Further changes (mostly
hooks) will be needed, but this is a big chunk of it.
KaiGai Kohei and Robert Haas
servers. AFAICT it's harmless at the moment because nothing can depend on
either, but as soon as we introduce an object type with such dependencies,
tableoid needs to be set or pg_dump will fail to interpret the dependencies
correctly. In theory, I guess the uninitialized garbage in tableoid could
cause the object to be mistaken for some other object with same OID as well.
The original coding tended to break down in the face of modified restore
orders, as shown in bug #5626 from Albert Ullrich, because it would flip over
into parallel-restore operation too soon. That causes problems because we
don't have sufficient dependency information in dump archives to allow safe
parallel processing of SECTION_PRE_DATA items. Even if we did, it's probably
undesirable to allow that to override the commanded restore order.
To fix the problem of omitted items causing unexpected changes in restore
order, tweak SortTocFromFile so that omitted items end up at the head of
the list not the tail. This ensures that they'll be examined and their
dependencies will be marked satisfied before we get to any interesting
items.
In HEAD and 9.0, we can easily change restore_toc_entries_parallel so that
all SECTION_PRE_DATA items are guaranteed to be processed in the initial
serial-restore loop, and hence in commanded order. Only DATA and POST_DATA
items are candidates for parallel processing. For them there might be
variations from the commanded order because of parallelism, but we should
do it in a safe order thanks to dependencies.
In 8.4 it's much harder to make such a guarantee. I settled for not
letting the initial loop break out into parallel processing mode if
it sees a DATA/POST_DATA item that's not to be restored; this at least
prevents a non-restorable item from causing premature exit from the loop.
This means that 8.4 will be more likely to fail given a badly-ordered -L
list than 9.x, but we don't really promise any such thing will work anyway.
and input file name, per bug #5617 from Leo Shklovskii. Rearrange the
corresponding code in pg_dump and pg_dumpall so that all three programs
handle this in a consistent, straightforward fashion.
Back-patch to 9.0, but no further. Although this is certainly a bug, it's
possible that people have scripts that will be broken by the added error
check, so it seems better not to change the behavior in stable branches.
I've added a quote_all_identifiers GUC which affects the behavior
of the backend, and a --quote-all-identifiers argument to pg_dump
and pg_dumpall which sets the GUC and also affects the quoting done
internally by those applications.
Design by Tom Lane; review by Alex Hunsaker; in response to bug #5488
filed by Hartmut Goebel.
to dump a PUBLIC user mapping correctly, as per bug #5560 from Shigeru Hanada.
Use the pg_user_mappings view rather than trying to access pg_user_mapping
directly, so that the code doesn't fail when run by a non-superuser. And
clean up some minor carelessness such as unsafe usage of fmtId().
Back-patch to 8.4 where this code was added.
linking both executables and shared libraries, and we add on LDFLAGS_EX when
linking executables or LDFLAGS_SL when linking shared libraries. This
provides a significantly cleaner way of dealing with link-time switches than
the former behavior. Also, make sure that the various platform-specific
%.so: %.o rules incorporate LDFLAGS and LDFLAGS_SL; most of them missed that
before. (I did not add these variables for the platforms that invoke $(LD)
directly, however. It's not clear if we can do that safely, since for the
most part we assume these variables use CC command-line syntax.)
Per gripe from Aaron Swenson and subsequent investigation.
as well as fseeko, and to not assume that fseeko(fp, 0, SEEK_CUR) proves
anything. Also improve some related comments. Per my observation that
the SEEK_CUR test didn't actually work on some platforms, and subsequent
discussion with Robert Haas.
Back-patch to 8.4. In earlier releases it's not that important whether
we get the hasSeek test right, but with parallel restore it matters.
contain data offsets (which it won't, if pg_dump thought its output wasn't
seekable). To do that, remove an unnecessarily aggressive error check, and
instead fail if we get to the end of the archive without finding the desired
data item. Also improve the error message to be more specific about the
cause of the problem. Per discussion of recent report from Igor Neyman.
Back-patch to 8.4 where parallel restore was introduced.
is specified. Per bug report from Russell Smith and ensuing discussion.
Since this is a corner case behavioral change, I'm going to be conservative
and not back-patch it.
In passing, also rename the RestoreOptions field for the -C switch to
something less generic than "create".
by joining to pg_constraint.conindid, instead of the former technique of
joining indirectly through pg_depend. This is much more straightforward
and probably faster as well. I had originally desisted from changing these
queries when conindid was added because I was worried about losing
performance, but if we join on conrelid as well as conindid then the index
on conrelid can be used when pg_constraint is large.
This was evidently broken by the CREATE TABLE OF TYPE patch. It would have
been noticed if anyone had bothered to try dumping and restoring the
regression database ...
"dumping data out of order is not supported" to "restoring data out of order
is not supported", because you get that error during pg_restore not pg_dump.
Also fix some comments that didn't look so good after being pgindented as
perhaps they did originally.
a separate archive entry for each BLOB, and use pg_dump's standard methods
for dealing with its ownership, ACL if any, and comment if any. This means
that switches like --no-owner and --no-privileges do what they're supposed
to. Preliminary testing says that performance is still reasonable even
with many blobs, though we'll have to see how that shakes out in the field.
KaiGai Kohei, revised by me
of shared or nailed system catalogs. This has two key benefits:
* The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs.
* We no longer have to use an unsafe reindex-in-place approach for reindexing
shared catalogs.
CLUSTER on nailed catalogs now works too, although I left it disabled on
shared catalogs because the resulting pg_index.indisclustered update would
only be visible in one database.
Since reindexing shared system catalogs is now fully transactional and
crash-safe, the former special cases in REINDEX behavior have been removed;
shared catalogs are treated the same as non-shared.
This commit does not do anything about the recently-discussed problem of
deadlocks between VACUUM FULL/CLUSTER on a system catalog and other
concurrent queries; will address that in a separate patch. As a stopgap,
parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid
such failures during the regression tests.
If expand_dbname is non-zero and dbname contains an = sign, it is taken as
a conninfo string in exactly the same way as if it had been passed to
PQconnectdb. This is equivalent to the way PQsetdbLogin() works, allowing
PQconnectdbParams() to be a complete alternative.
Also improve the way the new function is called from psql and replace a
previously missed call to PQsetdbLogin() in psql. Additionally use
PQconnectdbParams() for pg_dump and friends, and the bin/scripts
command line utilities such as vacuumdb, createdb, etc.
Finally, update the documentation for the new parameter, as well as the
nuances of precedence in cases where key words are repeated or duplicated
in the conninfo string.
Attributes can now have options, just as relations and tablespaces do, and
the reloptions code is used to parse, validate, and store them. For
simplicity and because these options are not performance critical, we store
them in a separate cache rather than the main relcache.
Thanks to Alex Hunsaker for the review.
dump IDs, because the array we're using is sized according to the highest
dump ID actually defined in the archive file. In a partial dump there could
be references to higher dump IDs that weren't dumped. Treat these the same
as references to in-range IDs that weren't dumped. (The whole thing is a
bit scary because the missing objects might have been part of dependency
chains, which we won't know about. Not much we can do though --- throwing
an error is probably overreaction.)
Also, reject parallel restore with pre-1.8 archive version (made by pre-8.0
pg_dump). In these old versions the dependency entries are OIDs, not dump
IDs, and we don't have enough information to interpret them.
Per bug #5288 from Jon Erdman.
pg_constraint before searching pg_trigger. This allows saner handling of
corner cases; in particular we now say "constraint is not deferrable"
rather than "constraint does not exist" when the command is applied to
a constraint that's inherently non-deferrable. Per a gripe several months
ago from hubert depesz lubaczewski.
To make this work without breaking user-defined constraint triggers,
we have to add entries for them to pg_constraint. However, in return
we can remove the pgconstrname column from pg_constraint, which represents
a fairly sizable space savings. I also replaced the tgisconstraint column
with tgisinternal; the old meaning of tgisconstraint can now be had by
testing for nonzero tgconstraint, while there is no other way to get
the old meaning of nonzero tgconstraint, namely that the trigger was
internally generated rather than being user-created.
In passing, fix an old misstatement in the docs and comments, namely that
pg_trigger.tgdeferrable is exactly redundant with pg_constraint.condeferrable.
Actually, we mark RI action triggers as nondeferrable even when they belong to
a nominally deferrable FK constraint. The SET CONSTRAINTS code now relies on
that instead of hard-coding a list of exception OIDs.
we're not going to support that anymore.
I did keep the 64-bit-CRC-with-32-bit-arithmetic code, since it has a
performance excuse to live. It's a bit moot since that's all ifdef'd
out, of course.
This patch only supports seq_page_cost and random_page_cost as parameters,
but it provides the infrastructure to scalably support many more.
In particular, we may want to add support for effective_io_concurrency,
but I'm leaving that as future work for now.
Thanks to Tom Lane for design help and Alvaro Herrera for the review.
Modify pg_dump --binary-upgrade and add backend support routines to
support the preservation of pg_type oids when doing a binary upgrade.
This allows user-defined composite types and arrays to be binary
upgraded.
support any indexable commutative operator, not just equality. Two rows
violate the exclusion constraint if "row1.col OP row2.col" is TRUE for
each of the columns in the constraint.
Jeff Davis, reviewed by Robert Haas
checked to determine whether the trigger should be fired.
For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER
triggers it can provide a noticeable performance improvement, since queuing of
a deferred trigger event and re-fetching of the row(s) at end of statement can
be short-circuited if the trigger does not need to be fired.
Takahiro Itagaki, reviewed by KaiGai Kohei.
are named in the UPDATE's SET list.
Note: the schema of pg_trigger has not actually changed; we've just started
to use a column that was there all along. catversion bumped anyway so that
this commit is included in the history of potentially interesting changes
to system catalog contents.
Itagaki Takahiro
Add a variant of pg_get_triggerdef with a second argument "pretty" that
causes the output to be formatted in the way pg_dump used to do. Use this
variant in pg_dump with server versions >= 8.5.
This insulates pg_dump from most future trigger feature additions, such as
the upcoming column triggers patch.
Author: Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp>
Create a new catalog pg_db_role_setting where they are now stored, and better
encapsulate the code that deals with settings into its realm. The old
datconfig and rolconfig columns are removed.
psql has gained a \drds command to display the settings.
Backwards compatibility warning: while the backwards-compatible system views
still have the config columns, they no longer completely represent the
configuration for a user or database.
Catalog version bumped.
the privileges that will be applied to subsequently-created objects.
Such adjustments are always per owning role, and can be restricted to objects
created in particular schemas too. A notable benefit is that users can
override the traditional default privilege settings, eg, the PUBLIC EXECUTE
privilege traditionally granted by default for functions.
Petr Jelinek
to create a function for it.
Procedural languages now have an additional entry point, namely a function
to execute an inline code block. This seemed a better design than trying
to hide the transient-ness of the code from the PL. As of this patch, only
plpgsql has an inline handler, but probably people will soon write handlers
for the other standard PLs.
In passing, remove the long-dead LANCOMPILER option of CREATE LANGUAGE.
Petr Jelinek
use that value when the backend is new enough to allow it. This responds
to bug report from Keh-Cheng Chu pointing out that although 2 extra digits
should be sufficient to dump and restore float8 exactly, it is possible to
need 3 extra digits for float4 values.
Update install-sh to that from Autoconf 2.63, plus our Darwin-specific
changes (which I simplified a bit). install-sh is now able to install
multiple files in one run, so we could simplify our makefiles sometime.
install-sh also now has a -d option to create directories, so we don't need
mkinstalldirs anymore.
Use AC_PROG_MKDIR_P in configure.in, so we can use mkdir -p when available
instead of install-sh -d. For consistency with the rest of the world,
the corresponding make variable has been renamed from $(mkinstalldirs) to
$(MKDIR_P).
two new lists, rather than repeatedly rescanning the main TOC list.
This avoids a potential O(N^2) slowdown, although you'd need a *lot*
of tables to make that really significant; and it might simplify future
improvements in the scheduling algorithm by making the set of ready
items more easily inspectable. The original thought that it would
in itself result in a more efficient job dispatch order doesn't seem
to have been borne out in testing, but it seems worth doing anyway.
The previous implementation got it right in most cases but failed in one:
if you pg_dump into an archive with standard_conforming_strings enabled, then
pg_restore to a script file (not directly to a database), the script will set
standard_conforming_strings = on but then emit large object data as
nonstandardly-escaped strings.
At the moment the code is made to emit hex-format bytea strings when dumping
to a script file. We might want to change to old-style escaping for backwards
compatibility, but that would be slower and bulkier. If we do, it's just a
matter of reimplementing appendByteaLiteral().
This has been broken for a long time, but given the lack of field complaints
I'm not going to worry about back-patching.
Both hex format and the traditional "escape" format are automatically
handled on input. The output format is selected by the new GUC variable
bytea_output.
As committed, bytea_output defaults to HEX, which is an *incompatible
change*. We will keep it this way for awhile for testing purposes, but
should consider whether to switch to the more backwards-compatible
default of ESCAPE before 8.5 is released.
Peter Eisentraut
The current implementation fires an AFTER ROW trigger for each tuple that
looks like it might be non-unique according to the index contents at the
time of insertion. This works well as long as there aren't many conflicts,
but won't scale to massive unique-key reassignments. Improving that case
is a TODO item.
Dean Rasheed
Changes:
Pass in the keyword lookup array instead of having it be hardwired.
(This incidentally allows elimination of some duplicate coding in ecpg.)
Re-order the token declarations in gram.y so that non-keyword tokens have
numbers that won't change when keywords are added or removed.
Add ".." and ":=" to the set of tokens recognized by scan.l. (Since these
combinations are nowhere legal in core SQL, this does not change anything
except the precise wording of the error you get when you write this.)
throwing an error as 8.4 had been doing. The error interfered with porting
old database definitions (particularly for pg_migrator) without really buying
any safety. Per bug #4817 and subsequent discussion.
an expression that's not supposed to contain variables. Per discussion
with Gevik Babakhani, this eliminates the need for an ugly kluge (namely,
specifying some unrelated relation name). Remove one such kluge from
pg_dump.
in its CREATE DATABASE commands only for databases that have settings
different from the installation defaults. This is a low-tech method of
avoiding unnecessary platform dependencies in dump files. Eventually we ought
to have a platform-independent way of specifying LC_COLLATE and LC_CTYPE, but
that's not going to happen for 8.4, and this patch at least avoids the issue
for people who aren't setting up per-database locales. ENCODING doesn't have
the platform dependency problem, but it seems consistent to make it act the
same as the locale settings.
If a currently running item needs an exclusive lock on any item that the candidate items needs
any sort of lock on, or vice versa, then the candidate item is not allowed to run now, and
must wait till later.
tablespaces in an order that has some chance of working.
Per a complaint from Kevin Bailey.
This is a pre-existing bug, but given the lack of prior complaints I'm
not sure it's worth back-patching. In most cases failure of the DROP
commands wouldn't be that important anyway.
In passing, fix syntax errors in dumpCreateDB()'s queries for old servers;
these were apparently introduced in recent binary_upgrade patch.
are using our own ports of getopt or getopt_long, those will define
the variable for themselves; and if not, we don't need these, because
we never touch the variable anyway.
In the backend, I changed only a handful of exemplary or important-looking
instances to make use of the plural support; there is probably more work
there. For the rest of the source, this should cover all relevant cases.
is still available, but you must now write the long equivalent --inserts
or --column-inserts. This change is made to eliminate confusion with the
use of -d to specify a database name in most other Postgres client programs.
Original patch by Greg Mullane, modified per subsequent discussion.
running pg_restore, which might run in parallel).
Only reopen archive file when we really need to read from it, in parallel code. Otherwise,
close it immediately in a worker, if possible.
kwlist.h, to avoid having to link the backend object file into other programs
like pg_dump. We can now simply symlink a single source file from the backend
(kwlookup.c, containing the shared routine ScanKeywordLookup) and compile it
locally, which is a lot cleaner.
wrappers (similar to procedural languages). This way we don't need to retain
the nearly empty libraries, and we are more free in how to implement the
wrapper API in the future.
post-data step is run in a separate worker child (a thread on Windows, a child
process elsewhere) up to the concurrent number specified by the new pg_restore
command-line --multi-thread | -m switch.
Andrew Dunstan, with some editing by Tom Lane.
qualifier, and add support for this in pg_dump.
This allows TOAST tables to have user-defined fillfactor, and will also
enable us to move the autovacuum parameters to reloptions without taking
away the possibility of setting values for TOAST tables.
array types for composite types. Although pg_dump understood it wasn't
supposed to dump these array types as separate objects, it must include
them in the dependency ordering analysis, and it was improperly assigning them
the same relatively-high sort priority as regular types. This resulted in
effectively moving composite types and tables up to that same high priority,
which broke any ordering requirements that weren't explicitly enforced by
dependencies. In particular user-defined operator classes, which should come
out before tables, failed to do so. Per report from Brendan Jurd.
In passing, also fix an ill-considered decision to give text search objects
the same sort priority as functions and operators --- the sort result looks
a lot nicer if different object types are kept separate. The recent
foreign-data patch had copied that decision, making the sort ordering even
messier :-(
It's not possible to do CREATE DATABASE inside a transaction, so previously
we just got a server error instead.
Backpatch to 8.2, which is where the -1 feature appeared.
performing dumps and restores in accordance with a security policy that
forbids logging in directly as superuser, but instead specifies that you
should log into an admin account and then SET ROLE to the superuser.
In passing, clean up some ugly and mostly-broken code for quoting shell
arguments in pg_dumpall.
Benedek László, with some help from Tom Lane
so that user-defined window functions are possible. For the moment you'll
have to write them in C, for lack of any interface to the WindowObject API
in the available PLs, but it's better than no support at all.
There was some debate about the best syntax for this. I ended up choosing
the "it's an attribute" position --- the other approach will inevitably be
more work, and the likely market for user-defined window functions is
probably too small to justify it.
This doesn't do any remote or external things yet, but it gives modules
like plproxy and dblink a standardized and future-proof system for
managing their connection information.
Martin Pihlak and Peter Eisentraut
("there might be triggers") rather than an exact count. This is necessary
catalog infrastructure for the upcoming patch to reduce the strength of
locking needed for trigger addition/removal. Split out and committed
separately for ease of reviewing/testing.
In passing, also get rid of the unused pg_class columns relukeys, relfkeys,
and relrefs, which haven't been maintained in many years and now have no
chance of ever being maintained (because of wishing to avoid locking).
Simon Riggs
from DateStyle, and create a new interval style that produces output matching
the SQL standard (at least for interval values that fall within the standard's
restrictions). IntervalStyle is also used to resolve the conflict between the
standard and traditional Postgres rules for interpreting negative interval
input.
Ron Mayer
ctype are now more like encoding, stored in new datcollate and datctype
columns in pg_database.
This is a stripped-down version of Radek Strnad's patch, with further
changes by me.
referenced tables are dumped before the referencing tables. This avoids
failures when the data is loaded with the FK constraints already active.
If no such ordering is possible because of circular or self-referential
constraints, print a NOTICE to warn the user about it.
returns NULL instead of a PGresult. The former coding would fail, which
is OK, but it neglected to give you the PQerrorMessage that might tell
you why. In the oldest branches, there was another problem: it'd sometimes
report PQerrorMessage from the wrong connection.
only type categories in which the previous coding made *every* type
preferred; so there is no change in effective behavior, because the function
resolution rules only do something different when faced with a choice
between preferred and non-preferred types in the same category. It just
seems safer and less surprising to have CREATE TYPE default to non-preferred
status ...
with system catalog lookups, as was foreseen to be necessary almost since
their creation. Instead put the information into two new pg_type columns,
typcategory and typispreferred. Add support for setting these when
creating a user-defined base type.
The category column is just a "char" (i.e. a poor man's enum), allowing
a crude form of user extensibility of the category list: just use an
otherwise-unused character. This seems sufficient for foreseen uses,
but we could upgrade to having an actual category catalog someday, if
there proves to be a huge demand for custom type categories.
In this patch I have attempted to hew exactly to the behavior of the
previous hardwired logic, except for introducing new type categories for
arrays, composites, and enums. In particular the default preferred state
for user-defined types remains TRUE. That seems worth revisiting, but it
should be done as a separate patch from introducing the infrastructure.
Likewise, any adjustment of the standard set of categories should be done
separately.
need to deconstruct proargmodes for each pg_proc entry inspected by
FuncnameGetCandidates(). Fixes function lookup performance regression
caused by yesterday's variadic-functions patch.
In passing, make pg_proc.probin be NULL, rather than a dummy value '-',
in cases where it is not actually used for the particular type of function.
This should buy back some of the space cost of the extra column.
so long as all the trailing arguments are of the same (non-array) type.
The function receives them as a single array argument (which is why they
have to all be the same type).
It might be useful to extend this facility to aggregates, but this patch
doesn't do that.
This patch imposes a noticeable slowdown on function lookup --- a follow-on
patch will fix that by adding a redundant column to pg_proc.
Pavel Stehule
output for CREATE FUNCTION. This makes it easier to read especially if the
function body is long.
Original idea and patch by Greg Sabino Mullane, though this is a stripped
down version of that.
sequence to be reset to its original starting value. This requires adding the
original start value to the set of parameters (columns) of a sequence object,
which is a user-visible change with potential compatibility implications;
it also forces initdb.
Also add hopefully-SQL-compatible RESTART/CONTINUE IDENTITY options to
TRUNCATE TABLE. RESTART IDENTITY executes ALTER SEQUENCE RESTART for all
sequences "owned by" any of the truncated relations. CONTINUE IDENTITY is
a no-op option.
Zoltan Boszormenyi
unnecessary #include lines in it. Also, move some tuple routine prototypes and
macros to htup.h, which allows removal of heapam.h inclusion from some .c
files.
For this to work, a new header file access/sysattr.h needed to be created,
initially containing attribute numbers of system columns, for pg_dump usage.
While at it, make contrib ltree, intarray and hstore header files more
consistent with our header style.
as those for inherited columns; that is, it's no longer allowed for a child
table to not have a check constraint matching one that exists on a parent.
This satisfies the principle of least surprise (rows selected from the parent
will always appear to meet its check constraints) and eliminates some
longstanding bogosity in pg_dump, which formerly had to guess about whether
check constraints were really inherited or not.
The implementation involves adding conislocal and coninhcount columns to
pg_constraint (paralleling attislocal and attinhcount in pg_attribute)
and refactoring various ALTER TABLE actions to be more like those for
columns.
Alex Hunsaker, Nikhil Sontakke, Tom Lane
"consistent" functions, and remove pg_amop.opreqcheck, as per recent
discussion. The main immediate benefit of this is that we no longer need
8.3's ugly hack of requiring @@@ rather than @@ to test weight-using tsquery
searches on GIN indexes. In future it should be possible to optimize some
other queries better than is done now, by detecting at runtime whether the
index match is exact or not.
Tom Lane, after an idea of Heikki's, and with some help from Teodor.
the server version check is now always enforced. Relax the version check to
allow a server that is of pg_dump's own major version but a later minor
version; this is the only case that -i was at all safe to use in.
pg_restore already enforced only a very weak version check, so this is
really just a documentation change for it.
Per discussion.
inclusions in src/include/catalog/*.h files. The main idea here is to push
function declarations for src/backend/catalog/*.c files into separate headers,
rather than sticking them into the corresponding catalog definition file as
has been done in the past. This commit only carries out that idea fully for
pg_proc, pg_type and pg_conversion, but that's enough for the moment ---
if pg_list.h ever becomes unsafe for frontend code to include, we'll need
to work a bit more.
Zdenek Kotala
dumps can be loaded into databases without the same tablespaces that the
source had. The option acts by suppressing all "SET default_tablespace"
commands, and also CREATE TABLESPACE commands in pg_dumpall's case.
Gavin Roy, with documentation and minor fixes by me.
This is to avoid uselessly requiring superuser permissions to restore
the dump without errors. Pretty grotty, but no better alternative seems
available, at least not in the near term.
useful and confuses people who think it is the same as -U. (Eventually
we might want to re-introduce it as being an alias for -U, but that should
not happen until the switch has actually not been there for a few releases.)
Likewise in pg_dump and pg_restore. Per gripe from Robert Treat and
subsequent discussion.
PQconnectionNeedsPassword function that tells the right thing for whether to
prompt for a password, and improve PQconnectionUsedPassword so that it checks
whether the password used by the connection was actually supplied as a
connection argument, instead of coming from environment or a password file.
Per bug report from Mark Cave-Ayland and subsequent discussion.
that have default expressions different from their parent. First, if the
parent table's default expression has to be split out as a separate
ALTER TABLE command, we need a dependency constraint to ensure that the
child's command is given second. This is because the ALTER TABLE on the
parent will propagate to the child. (We can't prevent that by using ONLY on
the parent's command, since it's possible that other children exist that
should receive the inherited default.) Second, if the child has a NULL
default where the parent does not, we have to explicitly say DEFAULT NULL on
the child in order for this state to be preserved after reload. (The latter
actually doesn't work right because of a backend bug, but that is a separate
issue.)
Backpatch as far as 8.0. 7.x pg_dump has enough issues with altered tables
(due to lack of dependency analysis) that trying to fix this one doesn't seem
very productive.
renumbering of encoding IDs done between 8.2 and 8.3 turns out to break 8.2
initdb and psql if they are run with an 8.3beta1 libpq.so. For the moment
we can rearrange the order of enum pg_enc to keep the same number for
everything except PG_JOHAB, which isn't a problem since there are no direct
references to it in the 8.2 programs anyway. (This does force initdb
unfortunately.)
Going forward, we want to fix things so that encoding IDs can be changed
without an ABI break, and this commit includes the changes needed to allow
libpq's encoding IDs to be treated as fully independent of the backend's.
The main issue is that libpq clients should not include pg_wchar.h or
otherwise assume they know the specific values of libpq's encoding IDs,
since they might encounter version skew between pg_wchar.h and the libpq.so
they are using. To fix, have libpq officially export functions needed for
encoding name<=>ID conversion and validity checking; it was doing this
anyway unofficially.
It's still the case that we can't renumber backend encoding IDs until the
next bump in libpq's major version number, since doing so will break the
8.2-era client programs. However the code is now prepared to avoid this
type of problem in future.
Note that initdb is no longer a libpq client: we just pull in the two
source files we need directly. The patch also fixes a few places that
were being sloppy about checking for an unrecognized encoding name.
duplicative -DFRONTEND flags from many Makefiles. We still need Makefile
control of the symbol in a few places that compile frontend-or-backend
src/port/ files, but it's a lot cleaner than before.
Hiroshi Saito
There are still some loose ends: I didn't do anything about the SET FROM
CURRENT idea yet, and it's not real clear whether we are happy with the
interaction of SET LOCAL with function-local settings. The documentation
is a bit spartan, too.
read from the temp file didn't match the file length reported by ftello(),
the wrong variable's value was printed, and so the message made no sense.
Clean up a couple other coding infelicities while at it.
init options of the template as top-level options in the syntax. This also
makes ALTER a bit easier to use, since options can be replaced individually.
I also made these statements verify that the tmplinit method will accept
the new settings before they get stored; in the original coding you didn't
find out about mistakes until the dictionary got invoked.
Under the hood, init methods now get options as a List of DefElem instead
of a raw text string --- that lets tsearch use existing options-pushing code
instead of duplicating functionality.
Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing,
so anything that's broken is probably my fault.
Documentation is nonexistent as yet, but let's land the patch so we can
get some portability testing done.
literally, whether quoted or not. Since we allow $ as a character within
identifiers, this behavior is useful, whereas the previous behavior of
treating it as the regexp ending anchor was nearly useless given that the
pattern is automatically anchored anyway. This affects the arguments of
psql's \d commands as well as pg_dump's -n and -t switches. Per discussion.
error message, by using PQconnectionUsedPassword() instead. Someday
we might be able to localize that error message, but not until this
coding technique has disappeared everywhere.
unreserved according to the grammar. The list of unreserved words has gotten
extensive enough that the unnecessary quoting is becoming a bit of an eyesore.
To do this, add knowledge of the keyword category to keywords.c's table.
(Someday we might be able to generate keywords.c's table and the keyword lists
in gram.y from a common source.) For the moment, lie about WITH's status in
the table so it will still get quoted --- this is because of the expectation
that WITH will become reserved when the SQL recursive-queries patch gets done.
I didn't force initdb because this affects nothing on-disk; but note that a
few regression tests have changed expected output.
and views (but not system catalogs, nor sequences or toast tables). Get rid
of the hardwired convention that a type's array type is named exactly "_type",
instead using a new column pg_type.typarray to provide the linkage. (It still
will be named "_type", though, except in odd corner cases such as
maximum-length type names.)
Along the way, make tracking of owner and schema dependencies for types more
uniform: a type directly created by the user has these dependencies, while a
table rowtype or auto-generated array type does not have them, but depends on
its parent object instead.
David Fetter, Andrew Dunstan, Tom Lane
sequence for dumping without also selecting its owning table. Make it not try
to emit ALTER SEQUENCE OWNED BY in this situation.
Per report from Michael Nolan.
A DBA is allowed to create a language in his database if it's marked
"tmpldbacreate" in pg_pltemplate. The factory default is that this is set
for all standard trusted languages, but of course a superuser may adjust
the settings. In service of this, add the long-foreseen owner column to
pg_language; renaming, dropping, and altering owner of a PL now follow
normal ownership rules instead of being superuser-only.
Jeremy Drake, with some editorialization by Tom Lane.
rules to be defined with different, per session controllable, behaviors
for replication purposes.
This will allow replication systems like Slony-I and, as has been stated
on pgsql-hackers, other products to control the firing mechanism of
triggers and rewrite rules without modifying the system catalog directly.
The firing mechanisms are controlled by a new superuser-only GUC
variable, session_replication_role, together with a change to
pg_trigger.tgenabled and a new column pg_rewrite.ev_enabled. Both
columns are a single char data type now (tgenabled was a bool before).
The possible values in these attributes are:
'O' - Trigger/Rule fires when session_replication_role is "origin"
(default) or "local". This is the default behavior.
'D' - Trigger/Rule is disabled and fires never
'A' - Trigger/Rule fires always regardless of the setting of
session_replication_role
'R' - Trigger/Rule fires when session_replication_role is "replica"
The GUC variable can only be changed as long as the system does not have
any cached query plans. This will prevent changing the session role and
accidentally executing stored procedures or functions that have plans
cached that expand to the wrong query set due to differences in the rule
firing semantics.
The SQL syntax for changing a triggers/rules firing semantics is
ALTER TABLE <tabname> <when> TRIGGER|RULE <name>;
<when> ::= ENABLE | ENABLE ALWAYS | ENABLE REPLICA | DISABLE
psql's \d command as well as pg_dump are extended in a backward
compatible fashion.
Jan
equality checks it applies, instead of a random dependence on whatever
operators might be named "=". The equality operators will now be selected
from the opfamily of the unique index that the FK constraint depends on to
enforce uniqueness of the referenced columns; therefore they are certain to be
consistent with that index's notion of equality. Among other things this
should fix the problem noted awhile back that pg_dump may fail for foreign-key
constraints on user-defined types when the required operators aren't in the
search path. This also means that the former warning condition about "foreign
key constraint will require costly sequential scans" is gone: if the
comparison condition isn't indexable then we'll reject the constraint
entirely. All per past discussions.
Along the way, make the RI triggers look into pg_constraint for their
information, instead of using pg_trigger.tgargs; and get rid of the always
error-prone fixed-size string buffers in ri_triggers.c in favor of building up
the RI queries in StringInfo buffers.
initdb forced due to columns added to pg_constraint and pg_trigger.
where possible, and fix some sites that apparently thought that fgets()
will overwrite the buffer by one byte.
Also add some strlcpy() to eliminate some weird memory handling.
Standard English uses "may", "can", and "might" in different ways:
may - permission, "You may borrow my rake."
can - ability, "I can lift that log."
might - possibility, "It might rain today."
Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice. Similarly, "It may crash" is better stated, "It might crash".
columns procost and prorows, to allow simple user adjustment of the estimated
cost of a function call, as well as control of the estimated number of rows
returned by a set-returning function. We might eventually wish to extend this
to allow function-specific estimation routines, but there seems to be
consensus that we should try a simple constant estimate first. In particular
this provides a relatively simple way to control the order in which different
WHERE clauses are applied in a plan node, which is a Good Thing in view of the
fact that the recent EquivalenceClass planner rewrite made that much less
predictable than before.
database privileges from a pre-8.2 server. This ensures that the reloaded
database will maintain the same behavior it had in the previous installation,
ie, everybody has connect privilege. Per gripe from L Bayuk.
cases. Operator classes now exist within "operator families". While most
families are equivalent to a single class, related classes can be grouped
into one family to represent the fact that they are semantically compatible.
Cross-type operators are now naturally adjunct parts of a family, without
having to wedge them into a particular opclass as we had done originally.
This commit restructures the catalogs and cleans up enough of the fallout so
that everything still works at least as well as before, but most of the work
needed to actually improve the planner's behavior will come later. Also,
there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way
to create a new family right now is to allow CREATE OPERATOR CLASS to make
one by default. I owe some more documentation work, too. But that can all
be done in smaller pieces once this infrastructure is in place.
because on that platform strftime produces localized zone names in varying
encodings. Even though it's only in a comment, this can cause encoding
errors when reloading the dump script. Per suggestion from Andreas
Seltenreich. Also, suppress %Z on Windows in the %s escape of
log_line_prefix ... not sure why this one is different from the other two,
but it shouldn't be.
(blobs) with comments, per bug #2727 from Konstantin Pelepelin.
Mea culpa for not having tested this case.
Back-patch to 8.1; prior branches don't dump blob comments at all.
one of the program's core data structures, make use of the existing
ability to selectively exclude TOC items by ID. Slightly more code but
much less likely to create future maintenance problems.
to process all inclusion switches then all exclusion switches, so that the
behavior is independent of switch ordering.
Use of -T does not cause non-table objects to be suppressed. And
the patterns are now interpreted the same way psql's \d commands do it,
rather than as pure regex commands; this allows for example -t schema.tab
to do what it should have been doing all along. Re-enable the --blobs
switch to do something useful, ie, add back blobs into a dump they were
otherwise suppressed from.
pg_dump as well as psql. Since psql already uses dumputils.c, while there's
not any code sharing in the other direction, this seems the easiest way.
Also, fix misinterpretation of patterns using regex | by adding parentheses
(same bug found previously in similar_escape()). This should be backpatched.
portable long options. But we have had portable long options for a long
time now, so this is obsolete. Now people have added options which *only*
work with -X but not as regular long option, so I'm putting a stop to this:
-X is deprecated; it still works, but it has been removed from the
documentation, and please don't add more of them.
return true for exactly the characters treated as whitespace by their flex
scanners. Per report from Victor Snezhko and subsequent investigation.
Also fix a passel of unsafe usages of <ctype.h> functions, that is, ye olde
char-vs-unsigned-char issue. I won't miss <ctype.h> when we are finally
able to stop using it.
by abandoning the idea that it should say SERIAL in the dump. Instead,
dump serial sequences and column defaults just like regular ones.
Add a new backend command ALTER SEQUENCE OWNED BY to let pg_dump recreate
the sequence-to-column dependency that was formerly created "behind the
scenes" by SERIAL. This restores SERIAL to being truly "just a macro"
consisting of component operations that can be stated explicitly in SQL.
Furthermore, the new command allows sequence ownership to be reassigned,
so that old mistakes can be cleaned up.
Also, downgrade the OWNED-BY dependency from INTERNAL to AUTO, since there
is no longer any very compelling argument why the sequence couldn't be
dropped while keeping the column. (This forces initdb, to be sure the
right kinds of dependencies are in there.)
Along the way, add checks to prevent ALTER OWNER or SET SCHEMA on an
owned sequence; you can now only do this indirectly by changing the
owning table's owner or schema. This is an oversight in previous
releases, but probably not worth back-patching.
the opportunity to treat COUNT(*) as a zero-argument aggregate instead
of the old hack that equated it to COUNT(1); this is materially cleaner
(no more weird ANYOID cases) and ought to be at least a tiny bit faster.
Original patch by Sergey Koposov; review, documentation, simple regression
tests, pg_dump and psql support by moi.
I take out patch for this as a promise. This is client-build support of
MS-VC6+.
Fix for different getaddrinfo structure ordering on Win32 for IPv6.
Hiroshi Saito
was invoking obj_description() for each large object chunk, instead of once
per large object. This code is new as of 8.1, which may explain why the
problem hadn't been noticed already.
o remove many WIN32_CLIENT_ONLY defines
o add WIN32_ONLY_COMPILER define
o add 3rd argument to open() for portability
o add include/port/win32_msvc directory for
system includes
Magnus Hagander
and there's only one place that's a kluge, ie, appendStringLiteralConn.
Note that pg_dump itself doesn't use appendStringLiteralConn, so its
behavior is not affected; only the other utility programs care.
o turns off escape_string_warning in pg_dumpall.c
o optionally use E'' for \password (undocumented option?)
o honor standard_conforming-strings for \copy (but not
support literal E'' strings)
o optionally use E'' for \d commands
o turn off escape_string_warning for createdb, createuser,
droplang
and standard_conforming_strings; likewise for the other client programs
that need it. As per previous discussion, a pg_dump dump now conforms
to the standard_conforming_strings setting of the source database.
We don't use E'' syntax in the dump, thereby improving portability of
the SQL. I added a SET escape_strings_warning = off command to keep
the dumps from getting a lot of back-chatter from that.
Per Coverity bug #304. Thanks to Martijn van Oosterhout for reporting it.
Zero out the pointer fields of PGresult so that these mistakes are more
easily catched, per discussion.
'off'. This allows pg_dump output with standard_conforming_strings =
'on' to generate proper strings that can be loaded into other databases
without the backslash doubling we typically do. I have added the
dumping of the standard_conforming_strings value to pg_dump.
I also added standard backslash handling for plpgsql.
identically named user and group: we merge these into a single entity
with LOGIN permission. Also, add ORDER BY commands to ensure consistent
dump ordering, for ease of comparing outputs from different installations.
make use of the recently added ability to create a shell type explicitly.
I also put in place some infrastructure to allow dump/no dump decisions
to be made separately for each database object, rather than the former
hardwired 'dump if in a dumpable schema' policy. This was needed anyway
for shell types so now seemed a convenient time to do it. The flexibility
isn't exposed to the user yet, but is ready for future extensions.
by decompiling the typdefaultbin expression, not just printing the typdefault
text which may be out-of-date or assume the wrong schema search path. (It's
the same hazard as for adbin vs adsrc in column defaults.) The catalogs.sgml
spec for pg_type implies that the correct procedure is to look to
typdefaultbin first and consider typdefault only if typdefaultbin is NULL.
I made dumping of both domains and base types do that, even though in the
current backend code typdefaultbin is always correct for domains and
typdefault for base types --- might as well try to future-proof it a little.
Per bug report from Alexander Galler.
comments on cluster global objects like databases, tablespaces, and
roles.
It touches a lot of places, but not much in the way of big changes. The
only design decision I made was to duplicate the query and manipulation
functions rather than to try and have them handle both shared and local
comments. I believe this is simpler for the code and not an issue for
callers because they know what type of object they are dealing with.
This has resulted in a shobj_description function analagous to
obj_description and backend functions [Create/Delete]SharedComments
mirroring the existing [Create/Delete]Comments functions.
pg_shdescription.h goes into src/include/catalog/
Kris Jurka
not print the owner name in the object comment.
eg:
--
-- Name: actor; Type: TABLE; Schema: public; Owner: chriskl; Tablespace:
--
Becomes:
--
-- Name: actor; Type: TABLE; Schema: public; Owner: -; Tablespace:
--
This makes it far easier to do 'user independent' dumps. Especially for
distribution to third parties.
Christopher Kings-Lynne
Continue to support GRANT ON [TABLE] for sequences for backward
compatibility; issue warning for invalid sequence permissions.
[Backward compatibility warning message.]
Add USAGE permission for sequences that allows only currval() and
nextval(), not setval().
Mention object name in grant/revoke warnings because of possible
multi-object operations.
operator names. This is needed when dumping operator definitions that have
COMMUTATOR (or similar) links to operators in other schemas.
Apparently Daniel Whitter is the first person ever to try this :-(
than owned by nobody. This results in cleaner display of language ACLs,
since the backend's aclchk.c uses the same convention. AFAICS there is
no practical difference but it's nice to avoid emitting SET SESSION
AUTHORIZATION; also this will make it easier to transition pg_dump to
some future version in which we may include an explicit ownership column
in pg_language. Per gripe from David Begley.
comment line where output as too long, and update typedefs for /lib
directory. Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.
emit when given the --clean option, in favor of individual DROP ROLE
commands. The old technique could not possibly work in 8.1, and was
never a very good idea anyway IMHO. The DROP ROLE approach has the
defect that the DROPs will fail for roles that own objects or have
privileges, but perhaps we can improve that later.
> found in a pg_dump archive. It had problems with dollar-quote tags
broken across bufferload boundaries (this may explain bug report from
Rod Taylor), also with dollar-quote literals of the form $a$a$...,
and was also confused about the rules for backslash in double quoted
identifiers (ie, they're not special). Also put in placeholder support
for E'...' literals --- this will need more work later.
really the source or destination of the archive. I think this will
resolve recent complaints that password prompting is broken in pg_restore
on Windows. Note that password prompting and reading from stdin is an
unworkable combination on Windows ... but that was true anyway.
as per my recent proposal. For now the template data is hard-wired in
proclang.c --- this should be replaced later by a new shared system
catalog, but we don't want to force initdb during 8.1 beta. This change
lets us cleanly load existing dump files even if they contain outright
wrong information about a PL's support functions, such as a wrong path
to the shared library or a missing validator function. Also, we can
revert the recent kluges to make pg_dump dump PL support functions that
are stored in pg_catalog.
While at it, I removed the code in pg_regress that replaced $libdir
with a hardcoded path for temporary installations. This is no longer
needed given our support for relocatable installations.
use these instead of its previous hack of changing pg_class.reltriggers.
Documentation is lacking, will add that later.
Patch by Satoshi Nagayasu, review and some extra work by Tom Lane.
erroring out as it has done for the last couple weeks. Document that this
form is now ignored because indexes can't usefully have different owners
from their parent tables. Fix pg_dump to not generate ALTER OWNER commands
for indexes.
whenever we generate a new OID. This prevents occasional duplicate-OID
errors that can otherwise occur once the OID counter has wrapped around.
Duplicate relfilenode values are also checked for when creating new
physical files. Per my recent proposal.
pg_strcasecmp and pg_strncasecmp ... but I see some of the former have
crept back in.
Eternal vigilance is the price of locale independence, apparently.
The Problem: Occassionally a DBA needs to dump a database to a new
encoding. In instances where the current encoding, (or lack of an
encoding, like SQL_ASCII) is poorly supported on the target database
server, it can be useful to dump into a particular encoding. But,
currently the only way to set the encoding of a pg_dump file is to
change client_encoding in postgresql.conf and restart postmaster.
This is more than a little awkward for production systems.
Magnus Hagander
into pg_catalog rather than public, and supports dumping languages whose
handlers are found there. This will make it easier to drop the public
schema if desired.
Unlike the previous patch, the comments have been updated and I have
reformatted some code to meet Alvarro's request to stick to 80 cols. (I
actually aghree with this - it makes printing the code much nicer).
I think I did the right thing w.r.t versions earlier than 7.3, but I
have no real way of checking, so that should be checked by someone with
more/older knowledge than me ;-)
Andrew Dunstan
inspection of shared catalogs. This allows pg_dumpall to continue to
work with pre-8.1 servers that likely won't have a database named postgres.
Also, suppress output of SYSID options for users and groups, since server
no longer does anything with these except emit a rude message.
There is much more to be done to update pg_dumpall for the roles feature,
but this at least makes it usable again. Per gripe from Chris K-L.
name matches the name of any parent-table constraint, without looking
at the constraint text. This is a not-very-bulletproof workaround for
the problem exhibited by Berend Tober last month. We really ought to
record constraint inheritance status in pg_constraint, but it's looking
like that may not get done for 8.1 --- and even if it does, we will
need this kluge for dumping from older servers.
literally.
Add GUC variables:
"escape_string_warning" - warn about backslashes in non-E strings
"escape_string_syntax" - supports E'' syntax?
"standard_compliant_strings" - treats backslashes literally in ''
Update code to use E'' when escapes are used.
(1) The code doesn't initialize `sum', so the initial "does the checksum
match?" test is wrong.
(2) The loop that is intended to check for a "null block" just checks
the first byte of the tar block 512 times, rather than each of the
512 bytes one time (!), which I'm guessing was the intent.
It was only through sheer luck that this worked in the first place.
Per Coverity static analysis performed by EnterpriseDB.
using the recently added lo_create() function. The restore logic in
pg_restore is greatly simplified as well, since there's no need anymore
to try to adjust database references to match a new set of blob OIDs.
unlike template0 and template1 does not have any special status in
terms of backend functionality. However, all external utilities such
as createuser and createdb now connect to "postgres" instead of
template1, and the documentation is changed to encourage people to use
"postgres" instead of template1 as a play area. This should fix some
longstanding gotchas involving unexpected propagation of database
objects by createdb (when you used template1 without understanding
the implications), as well as ameliorating the problem that CREATE
DATABASE is unhappy if anyone else is connected to template1.
Patch by Dave Page, minor editing by Tom Lane. All per recent
pghackers discussions.
pg_restore. It restores the given schemaname only. It can be used in
conjunction with the -t and other switches to make the selection very
fine grained.
Richard van den Bergg, CISSP
a warning when a variable is used as a format string for printf()
and similar functions (if the variable is derived from untrusted
data, it could include unexpected formatting sequences). This
emits too many warnings to be enabled by default, but it does
flag a few dubious constructs in the Postgres tree. This patch
fixes up the obvious variants: functions that are passed a variable
format string but no additional arguments.
Most of these are harmless (e.g. the ruleutils stuff), but there
is at least one actual bug here: if you create a trigger named
"%sfoo", pg_dump will read uninitialized memory and fail to dump
the trigger correctly.
which induced bug #1597 in addition to having several other misbehaviors
(like labeling the dump with a completion time having nothing to do with
reality). Instead just print out the desired strings where RestoreArchive
was already emitting the 'PostgreSQL database dump' and
'PostgreSQL database dump complete' strings.
be supported for all datatypes. Add CREATE AGGREGATE and pg_dump support
too. Add specialized min/max aggregates for bpchar, instead of depending
on text's min/max, because otherwise the possible use of bpchar indexes
cannot be recognized.
initdb forced because of catalog changes.
column values in -d mode. Per report from Marty Scholes. This doesn't
completely solve the issue, because we still need multiple copies of the
field value, but at least one copy can be got rid of painlessly ...
pre-7.3 pg_dump archive files: namespace isn't there, and in some cases
te->tag may already be quotified. Per report from Alan Pevec and
followup testing.
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
thought there couldn't be any, but the folly of this was exposed by an
example from andrew@supernews.com 5-Dec-2004. The patch applies the
identical logic already used for table constraints and defaults to ON
SELECT rules, so I have reasonable confidence in it even though it might
look like complicated logic.
be emitted too soon. The previous code got this right in the case where
the CHECK was emitted as a separate ALTER TABLE command, but not in the
case where the CHECK is emitted right in CREATE TABLE. Per report from
Slawomir Sudnik.
Note: this code is pretty ugly; it'd perhaps be better to treat comments
as independently sortable dump objects. That'd be much too invasive a
change for RC time though.
/*
* Some compilers with throw a warning knowing this test can never be
* true because off_t can't exceed the compared maximum.
*/
if (th->fileLen > MAX_TAR_MEMBER_FILELEN)
die_horribly(AH, modulename, "archive member too large for tar format\n");
clause implicitly whenever one is not given explicitly. Remove concept
of a schema having an associated tablespace, and simplify the rules for
selecting a default tablespace for a table or index. It's now just
(a) explicit TABLESPACE clause; (b) default_tablespace if that's not an
empty string; (c) database's default. This will allow pg_dump to use
SET commands instead of tablespace clauses to determine object locations
(but I didn't actually make it do so). All per recent discussions.
multiline command or to rerun the command easily later.
Whereas displaying the failed SQL command is a matter of fixing the
error
messages.
The latter is complicated by failed COPY commands which, with
die-on-errors
off, results in the data being processed as a command, so dumping the
command will dump all of the data.
In the case of long commands, should the whole command be dumped? eg.
(eg.
several pages of function definition).
In the case of the COPY command, I'm not sure what to do. Obviously, it
would be best to avoid sending the data, but the data and command are
combined (from memory). Also, the 'data' may be in the form of INSERT
statements.
Attached patch produces the first 125 chars of the command:
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC Entry 26; 1255 16449270
FUNCTION
plpgsql_call_handler() pjw
pg_restore: [archiver (db)] could not execute query: ERROR: function
"plpgsql_call_handler" already exists with same argument types
Command was: CREATE FUNCTION plpgsql_call_handler() RETURNS
language_handler
AS '/var/lib/pgsql-8.0b1/lib/plpgsql', 'plpgsql_call_han...
pg_restore: [archiver (db)] Error from TOC Entry 27; 1255 16449271
FUNCTION
plpgsql_validator(oid) pjw
pg_restore: [archiver (db)] could not execute query: ERROR: function
"plpgsql_validator" already exists with same argument types
Command was: CREATE FUNCTION plpgsql_validator(oid) RETURNS void
AS '/var/lib/pgsql-8.0b1/lib/plpgsql', 'plpgsql_validator'
LANGU...
Philip Warner
will treat any unquoted string that starts with a $ and has no preceding
identifier chars as a potential $-quote tag, it then makes sure that the
tag chars are valid. If so, it processes the $-quote.
Philip Warner
> pg_restore, as it seems that some people have scripts that rely on the
> previous "abort on error" default behavior when restoring data with a
> direct connection.
>
> Fabien Coelho
CurrentMemoryContext is DLLIMPORT on Win32. Work around that by
creating stubs in the backend for palloc/pstrdup.
Also fix pg_dumpall to do proper quoting on Win32.
Instead of putting all the OWNER TO commands at the end, it dumps then
after each object. This is WAY more readable and nice. ACLs are still
at the end.
Christopher Kings-Lynne
* Fix help text ordering
* Add back --set-session-authorization to pg_dumpall. Updated the docs
for that. Updated help for that.
* Dump ALTER USER commands for the cluster owner ("pgsql"). These are
dumped AFTER the create user and create database commands in case the
permissions to do these have been revoked.
* Dump ALTER OWNER for public schema (because it's possible to change
it). This was done by adding TOC entries for the public schema, and
filtering them out at archiver time. I also save the owner in the TOC
entry just for the public schema.
* Suppress dumping single quotes around schema_path and DateStyle
options when they are set using ALTER USER or ALTER DATABASE. Added a
comment to the steps in guc.c to remind people to update that list.
* Fix dumping in --clean mode against a pre-7.3 server. It just sets
all drop statements to assume the public schema, allowing it to restore
without error.
* Cleaned up text output. eg. Don't output -- Tablespaces comment if
there are none. Same for groups and users.
* Make the commands to DELETE FROM pg_shadow and DELETE FROM pg_group
only be output when -c mode is enabled. I'm not sure why that hasn't
been done before?!?!
This should be good for application asap, after which I will start on
regression dumping 7.0-7.4 databases.
Christopher Kings-Lynne
AUTHORIZATION commands by default. Move all GRANT and REVOKE commands
to the end of the dump to avoid restore failures in several situations.
Bring back --use-set-session-authorization option to get previous SET
behaviour
Christopher Kings-Lyne
live in database or schema's default tablespace, as per today's discussion.
Also, remove some unused keywords from the grammar (PATH, PENDANT,
VERSION), and fix ALSO, which was added as a keyword but not added
to the keyword classification lists, thus making it worse-than-reserved.
This eliminates the assumption that a serial column's sequence will
have the same name on reload that it was given in the original database.
Christopher Kings-Lynne
There are various things left to do: contrib dbsize and oid2name modules
need work, and so does the documentation. Also someone should think about
COMMENT ON TABLESPACE and maybe RENAME TABLESPACE. Also initlocation is
dead, it just doesn't know it yet.
Gavin Sherry and Tom Lane.
environment variable processing to libpq.
The patch also adds code to our client apps so we set the environment
variable directly based on our binary location, unless it is already
set. This will allow our applications to emit proper locale messages
that are generated in libpq.
timezone code and other places.
Remove elog() calls from find_my_exec; do fprintf(stderr) instead. We
can then remove the exec.c handling in the makefile because it doesn't
have to be built to suppress elog calls.
find_my_exec/find_other_exec(). Remove passing of progname to these
functions as they can find that out from argv[0], which they already
have.
Make get_progname return const char *, and update all progname variables
to be const char *.
all the code that looks for other binaries. I move FindExec into
port/exec.c (and renamed it to find_my_binary()). I also added
find_other_binary that looks for another binary in the same directory as
the calling program, and checks the version string.
The only behavior change was that initdb and pg_dump would look in the
hard-coded bindir directory if it can't find the requested binary in the
same directory as the caller. The new code throws an error. The old
behavior seemed too error prone for version mismatches.
conversion of basic ASCII letters. Remove all uses of strcasecmp and
strncasecmp in favor of new functions pg_strcasecmp and pg_strncasecmp;
remove most but not all direct uses of toupper and tolower in favor of
pg_toupper and pg_tolower. These functions use the same notions of
case folding already developed for identifier case conversion. I left
the straight locale-based folding in place for situations where we are
just manipulating user data and not trying to match it to built-in
strings --- for example, the SQL upper() function is still locale
dependent. Perhaps this will prove not to be what's wanted, but at
the moment we can initdb and pass regression tests in Turkish locale.
errors. This is the second submission, which integrates Tom comments about
localisation and exit code. I also added some comments about one sql
command which is not ignored.
Fabien COELHO
WITH/WITHOUT OIDS in dump files. This makes dump files more portable.
I have updated the pg_dump version so old binary dumps will load fine.
Pre-7.5 dumps use WITHOUT OIDS in SQL were needed, so they should be
fine.
in one query, rather than making a separate query for each object that
could have a comment. This costs relatively little space (a few tens of
K typically) and saves substantial time in databases with many objects.
I find it reduces the runtime of 'pg_dump -s regression' by about a
third.
object types, rather than by OID. This should help ensure consistent
dump output from databases that are logically the same but have different
histories, per recent discussion about 'diffing' databases. The patch
is bulky because of renaming of fields, but not very complicated.
Also, do some tweaking to cause BLOB restoration to be done in a better
order, and clean up pg_restore's textual output to exactly match pg_dump.
any restore operation, thereby ensuring that dumped data is interpreted
the same way it was dumped even if the target database has a different
encoding. Per suggestions from Pavel Stehule and others. Also,
simplify scheme for handling check_function_bodies ... we may as well
just set that at the head of the script.
reduce the number of times TopoSort() has to be executed by trying to
extract multiple dependency loops from each pass, instead of only one.
This saves about another factor of ten on the regression database.
This could be considered as another exercise in grokking Fred Brooks'
maxim: Representation *is* the essence of programming.
one (use a priority heap to keep track of items ready to output, instead
of searching the input array each time). This brings the runtime of
pg_dump back to about what it was in 7.4.
pg_depend to determine a safe dump order. Defaults and check constraints
can be emitted either as part of a table or domain definition, or
separately if that's needed to break a dependency loop. Lots of old
half-baked code for controlling dump order removed.
proposal for eventually deprecating OIDs on user tables that I posted
earlier to pgsql-hackers. pg_dump now always specifies WITH OIDS or
WITHOUT OIDS when dumping a table. The documentation has been updated.
Neil Conway
large objects. Dump all these in pg_dump; also add code to pg_dump
user-defined conversions. Make psql's large object code rely on
the backend for inserting/deleting LOB comments, instead of trying to
hack pg_description directly. Documentation and regression tests added.
Christopher Kings-Lynne, code reviewed by Tom
on pgsql-hackers.
A cast is included in the dump output if any of the objects does
not belong to a system namespace and all of the non-system namespace
objects belong to dumped namespaces. System namespace is defined
as nspname begins with "pg_".
Jan
AUTHORIZATION clause to specify the desired owner. This allows a
superuser to restore schemas owned by users without CREATE-SCHEMA
permissions (ie, schemas originally created by a superuser using
AUTHORIZATION). --no-owner can be specified to suppress the
AUTHORIZATION clause if need be.
to control object ownership. The use-set-session-authorization and
no-reconnect switches are obsolete (still accepted on the command line,
but they don't do anything). This is a precursor to fixing handling
of CREATE SCHEMA, which will be a separate commit.
getopt_long(). This is more or less the same problem as we saw earlier
with getaddrinfo() and struct addrinfo, and for the same reason: random
user-added libraries might contain the subroutine, but there's no
guarantee we will find the matching header files.