Commit Graph

27 Commits

Author SHA1 Message Date
Alvaro Herrera afc7e0ad55
Raise error on concurrent drop of partitioned index
We were already raising an error for DROP INDEX CONCURRENTLY on a
partitioned table, albeit a different and confusing one:
  ERROR:  DROP INDEX CONCURRENTLY must be first action in transaction

Change that to throw a more comprehensible error:
  ERROR:  cannot drop partitioned index \"%s\" concurrently

Michael Paquier authored the test case for indexes on temporary
partitioned tables.

Backpatch to 11, where indexes on partitioned tables were added.

Reported-by: Jan Mussler <jan.mussler@zalando.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/16594-d2956ca909585067@postgresql.org
2020-09-01 13:40:43 -04:00
Michael Paquier 0b48f1335d Fix assertion failure with ALTER TABLE ATTACH PARTITION and indexes
Using ALTER TABLE ATTACH PARTITION causes an assertion failure when
attempting to work on a partitioned index, because partitioned indexes
cannot have partition bounds.

The grammar of ALTER TABLE ATTACH PARTITION requires partition bounds,
but not ALTER INDEX, so mixing ALTER TABLE with partitioned indexes is
confusing.  Hence, on HEAD, prevent ALTER TABLE to attach a partition if
the relation involved is a partitioned index.  On back-branches, as
applications may rely on the existing behavior, just remove the
culprit assertion.

Reported-by: Alexander Lakhin
Author: Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/16276-5cd1dcc8fb8be7b5@postgresql.org
Backpatch-through: 11
2020-03-03 13:55:41 +09:00
Michael Paquier 1df5875d39 Fix dependency handling of column drop with partitioned tables
When dropping a column on a partitioned table which has one or more
partitioned indexes, the operation was failing as dependencies with
partitioned indexes using the column dropped were not getting removed in
a way consistent with the columns involved across all the relations part
of an inheritance tree.

This commit refactors the code executing column drop so as all the
columns from an inheritance tree to remove are gathered first, and
dropped all at the end.  This way, we let the dependency machinery sort
out by itself the deletion of all the columns with the partitioned
indexes across a partition tree.

This issue has been introduced by 1d92a0c, so backpatch down to
REL_12_STABLE.

Author: Amit Langote, Michael Paquier
Reviewed-by: Álvaro Herrera, Ashutosh Sharma
Discussion: https://postgr.es/m/CA+HiwqE9kuBsZ3b5pob2-cvE8ofzPWs-og+g8bKKGnu6b4-yTQ@mail.gmail.com
Backpatch-through: 12
2019-10-13 17:51:55 +09:00
Tom Lane f4a3fdfbdc Avoid order-of-execution problems with ALTER TABLE ADD PRIMARY KEY.
Up to now, DefineIndex() was responsible for adding attnotnull constraints
to the columns of a primary key, in any case where it hadn't been
convenient for transformIndexConstraint() to mark those columns as
is_not_null.  It (or rather its minion index_check_primary_key) did this
by executing an ALTER TABLE SET NOT NULL command for the target table.

The trouble with this solution is that if we're creating the index due
to ALTER TABLE ADD PRIMARY KEY, and the outer ALTER TABLE has additional
sub-commands, the inner ALTER TABLE's operations executed at the wrong
time with respect to the outer ALTER TABLE's operations.  In particular,
the inner ALTER would perform a validation scan at a point where the
table's storage might be inconsistent with its catalog entries.  (This is
on the hairy edge of being a security problem, but AFAICS it isn't one
because the inner scan would only be interested in the tuples' null
bitmaps.)  This can result in unexpected failures, such as the one seen
in bug #15580 from Allison Kaptur.

To fix, let's remove the attempt to do SET NOT NULL from DefineIndex(),
reducing index_check_primary_key's role to verifying that the columns are
already not null.  (It shouldn't ever see such a case, but it seems wise
to keep the check for safety.)  Instead, make transformIndexConstraint()
generate ALTER TABLE SET NOT NULL subcommands to be executed ahead of
the ADD PRIMARY KEY operation in every case where it can't force the
column to be created already-not-null.  This requires only minor surgery
in parse_utilcmd.c, and it makes for a much more satisfying spec for
transformIndexConstraint(): it's no longer having to take it on faith
that someone else will handle addition of NOT NULL constraints.

To make that work, we have to move the execution of AT_SetNotNull into
an ALTER pass that executes ahead of AT_PASS_ADD_INDEX.  I moved it to
AT_PASS_COL_ATTRS, and put that after AT_PASS_ADD_COL to avoid failure
when the column is being added in the same command.  This incidentally
fixes a bug in the only previous usage of AT_PASS_COL_ATTRS, for
AT_SetIdentity: it didn't work either for a newly-added column.

Playing around with this exposed a separate bug in ALTER TABLE ONLY ...
ADD PRIMARY KEY for partitioned tables.  The intent of the ONLY modifier
in that context is to prevent doing anything that would require holding
lock for a long time --- but the implied SET NOT NULL would recurse to
the child partitions, and do an expensive validation scan for any child
where the column(s) were not already NOT NULL.  To fix that, invent a
new ALTER subcommand AT_CheckNotNull that just insists that a child
column be already NOT NULL, and apply that, not AT_SetNotNull, when
recursing to children in this scenario.  This results in a slightly laxer
definition of ALTER TABLE ONLY ... SET NOT NULL for partitioned tables,
too: that command will now work as long as all children are already NOT
NULL, whereas before it just threw up its hands if there were any
partitions.

In passing, clean up the API of generateClonedIndexStmt(): remove a
useless argument, ensure that the output argument is not left undefined,
update the header comment.

A small side effect of this change is that no-such-column errors in ALTER
TABLE ADD PRIMARY KEY now produce a different message that includes the
table name, because they are now detected by the SET NOT NULL step which
has historically worded its error that way.  That seems fine to me, so
I didn't make any effort to avoid the wording change.

The basic bug #15580 is of very long standing, and these other bugs
aren't new in v12 either.  However, this is a pretty significant change
in the way ALTER TABLE ADD PRIMARY KEY works.  On balance it seems best
not to back-patch, at least not till we get some more confidence that
this patch has no new bugs.

Patch by me, but thanks to Jie Zhang for a preliminary version.

Discussion: https://postgr.es/m/15580-d1a6de5a3d65da51@postgresql.org
Discussion: https://postgr.es/m/1396E95157071C4EBBA51892C5368521017F2E6E63@G08CNEXMBPEKD02.g08.fujitsu.local
2019-04-23 12:25:27 -04:00
Tom Lane f72d9a5e7d Remove redundant and ineffective test for btree insertion fast path.
indexing.sql's test for this feature was added along with the
feature in commit 2b2727343.  However, shortly later that test was
rendered ineffective by commit 074251db6, which limited when the
optimization would be applied, so that the test didn't test it.
Since then, commit dd299df81 added new tests (in btree_index.sql)
that actually do test the feature.  Code coverage comparisons
confirm that this test sequence adds no meaningful coverage, and
it's rather expensive, accounting for nearly half of the runtime
of indexing.sql according to my measurements.  So let's remove it.

Per advice from Peter Geoghegan.

Discussion: https://postgr.es/m/735.1554935715@sss.pgh.pa.us
2019-04-11 13:15:59 -04:00
Alvaro Herrera 126d631222 Fix partitioned index creation bug with dropped columns
ALTER INDEX .. ATTACH PARTITION fails if the partitioned table where the
index is defined contains more dropped columns than its partition, with
this message:
  ERROR:  incorrect attribute map
The cause was that one caller of CompareIndexInfo was passing the number
of attributes of the partition rather than the parent, which confused
the length check.  Repair.

This can cause pg_upgrade to fail when used on such a database.  Leave
some more objects around after regression tests, so that the case is
detected by pg_upgrade test suite.

Remove some spurious empty lines noticed while looking for other cases
of the same problem.

Discussion: https://postgr.es/m/20190326213924.GA2322@alvherre.pgsql
2019-03-26 20:19:28 -03:00
Alvaro Herrera 7e7c57bbb2 Fix dependency recording bug for partitioned PKs
When DefineIndex recurses to create constraints on partitions, it needs
to use the value returned by index_constraint_create to set up partition
dependencies.  However, in the course of fixing the DEPENDENCY_INTERNAL_AUTO
mess, commit 1d92a0c9f7 introduced some code to that function that
clobbered the return value, causing the recorded OID to be of the wrong
object.  Close examination of pg_depend after creating the tables leads
to indescribable objects :-( My sin (in commit bdc3d7fa23, while
preparing for DDL deparsing in event triggers) was to use a variable
name for the return value that's typically used for throwaway objects in
dependency-setting calls ("referenced").  Fix by changing the variable
names to match extended practice (the return value is "myself" rather
than "referenced".)

The pg_upgrade test notices the problem (in an indirect way: the pg_dump
outputs are in different order), but only if you create the objects in a
specific way that wasn't being used in the existing tests.  Add a stanza
to leave some objects around that shows the bug.

Catversion bump because preexisting databases might have bogus pg_depend
entries.

Discussion: https://postgr.es/m/20190318204235.GA30360@alvherre.pgsql
2019-03-21 18:34:29 -03:00
Tom Lane 1d92a0c9f7 Redesign the partition dependency mechanism.
The original setup for dependencies of partitioned objects had
serious problems:

1. It did not verify that a drop cascading to a partition-child object
also cascaded to at least one of the object's partition parents.  Now,
normally a child object would share all its dependencies with one or
another parent (e.g. a child index's opclass dependencies would be shared
with the parent index), so that this oversight is usually harmless.
But if some dependency failed to fit this pattern, the child could be
dropped while all its parents remain, creating a logically broken
situation.  (It's easy to construct artificial cases that break it,
such as attaching an unrelated extension dependency to the child object
and then dropping the extension.  I'm not sure if any less-artificial
cases exist.)

2. Management of partition dependencies during ATTACH/DETACH PARTITION
was complicated and buggy; for example, after detaching a partition
table it was possible to create cases where a formerly-child index
should be dropped and was not, because the correct set of dependencies
had not been reconstructed.

Less seriously, because multiple partition relationships were
represented identically in pg_depend, there was an order-of-traversal
dependency on which partition parent was cited in error messages.
We also had some pre-existing order-of-traversal hazards for error
messages related to internal and extension dependencies.  This is
cosmetic to users but causes testing problems.

To fix #1, add a check at the end of the partition tree traversal
to ensure that at least one partition parent got deleted.  To fix #2,
establish a new policy that partition dependencies are in addition to,
not instead of, a child object's usual dependencies; in this way
ATTACH/DETACH PARTITION need not cope with adding or removing the
usual dependencies.

To fix the cosmetic problem, distinguish between primary and secondary
partition dependency entries in pg_depend, by giving them different
deptypes.  (They behave identically except for having different
priorities for being cited in error messages.)  This means that the
former 'I' dependency type is replaced with new 'P' and 'S' types.

This also fixes a longstanding bug that after handling an internal
dependency by recursing to the owning object, findDependentObjects
did not verify that the current target was now scheduled for deletion,
and did not apply the current recursion level's objflags to it.
Perhaps that should be back-patched; but in the back branches it
would only matter if some concurrent transaction had removed the
internal-linkage pg_depend entry before the recursive call found it,
or the recursive call somehow failed to find it, both of which seem
unlikely.

Catversion bump because the contents of pg_depend change for
partitioning relationships.

Patch HEAD only.  It's annoying that we're not fixing #2 in v11,
but there seems no practical way to do so given that the problem
is exactly a poor choice of what entries to put in pg_depend.
We can't really fix that while staying compatible with what's
in pg_depend in existing v11 installations.

Discussion: https://postgr.es/m/CAH2-Wzkypv1R+teZrr71U23J578NnTBt2X8+Y=Odr4pOdW1rXg@mail.gmail.com
2019-02-11 14:41:17 -05:00
Alvaro Herrera ae366aa577 Detach constraints when partitions are detached
I (Álvaro) forgot to do this in eb7ed3f306, leading to undroppable
constraints after partitions are detached.  Repair.

Reported-by: Amit Langote
Author: Amit Langote
Discussion: https://postgr.es/m/c1c9b688-b886-84f7-4048-1e4ebe9b1d06@lab.ntt.co.jp
2019-01-24 00:01:32 -03:00
Alvaro Herrera 0ad41cf537 Fix unique INCLUDE indexes on partitioned tables
We were considering the INCLUDE columns as part of the key, allowing
unicity-violating rows to be inserted in different partitions.

Concurrent development conflict in eb7ed3f306 and 8224de4f42.

Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20190109065109.GA4285@telsasoft.com
2019-01-14 19:28:10 -03:00
Michael Paquier f94cec6447 Include partitioned indexes to system view pg_indexes
pg_tables already includes partitioned tables, so for consistency
pg_indexes should show partitioned indexes.

Author: Suraj Kharage
Reviewed-by: Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/CAF1DzPVrYo4XNTEnc=PqVp6aLJc7LFYpYR4rX=_5pV=wJ2KdZg@mail.gmail.com
2018-12-18 16:37:51 +09:00
Michael Paquier 17f206fbc8 Set pg_class.relhassubclass for partitioned indexes
Like for relations, switching this parameter is optimistic by turning it
on each time a partitioned index gains a partition.  So seeing this
parameter set to true means that the partitioned index has or has had
partitions.  The flag cannot be reset yet for partitioned indexes, which
is something not obvious anyway as partitioned relations exist to have
partitions.

This allows to track more conveniently partition trees for indexes,
which will come in use with an upcoming patch helping in listing
partition trees with an SQL-callable function.

Author: Amit Langote
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/80306490-b5fc-ea34-4427-f29c52156052@lab.ntt.co.jp
2018-10-22 11:04:48 +09:00
Alvaro Herrera fb9e93a2c5 Fix duplicate primary keys in partitions
When using the CREATE TABLE .. PARTITION OF syntax, it's possible to
cause a partition to get two primary keys if the parent already has one.
Tighten the check to disallow that.

Reported-by: Rajkumar Raghuwanshi
Author: Amul Sul
Discussion: https://postgr.es/m/CAKcux6=OnSV3-qd8Gb6W=KPPwcCz6Fe_O_MQYjTa24__Xn8XxA@mail.gmail.com
2018-10-04 11:40:36 -03:00
Tom Lane aaf10f32a3 Fix assorted bugs in pg_get_partition_constraintdef().
It failed if passed a nonexistent relation OID, or one that was a non-heap
relation, because of blindly applying heap_open to a user-supplied OID.
This is not OK behavior for a SQL-exposed function; we have a project
policy that we should return NULL in such cases.  Moreover, since
pg_get_partition_constraintdef ought now to work on indexes, restricting
it to heaps is flat wrong anyway.

The underlying function generate_partition_qual() wasn't on board with
indexes having partition quals either, nor for that matter with rels
having relispartition set but yet null relpartbound.  (One wonders
whether the person who wrote the function comment blocks claiming that
these functions allow a missing relpartbound had ever tested it.)

Fix by testing relispartition before opening the rel, and by using
relation_open not heap_open.  (If any other relkinds ever grow the
ability to have relispartition set, the code will work with them
automatically.)  Also, don't reject null relpartbound in
generate_partition_qual.

Back-patch to v11, and all but the null-relpartbound change to v10.
(It's not really necessary to change generate_partition_qual at all
in v10, but I thought s/heap_open/relation_open/ would be a good
idea anyway just to keep the code in sync with later branches.)

Per report from Justin Pryzby.

Discussion: https://postgr.es/m/20180927200020.GJ776@telsasoft.com
2018-09-27 18:15:17 -04:00
Alvaro Herrera 41372071df Fix crash when ALTER TABLE recreates indexes on partitions
The skip_build flag was not being passed correctly when recursing to
indexes on partitions, leading to attempts to rebuild indexes when they
were not yet ready to be rebuilt.

Reported-by: Rajkumar Raghuwanshi
Discussion: https://postgr.es/m/CAKcux6mxNCGsgATwf5CGMF8g4WSupCXicCVMeKUTuWbyxHOMsQ@mail.gmail.com
2018-06-29 11:49:30 -04:00
Alvaro Herrera 475be5e790 When index recurses to a partition, map columns numbers
Two out of three code paths were mapping column numbers correctly if a
partition had different column numbers than parent table, but the most
commonly used one (recursing in CREATE INDEX to a new index on a
partition) failed to map attribute numbers in expressions.  Oddly
enough, attnums in WHERE clauses are already handled correctly
everywhere.

Reported-by: Amit Langote
Author: Amit Langote
Discussion: https://postgr.es/m/dce1fda4-e0f0-94c9-6abb-f5956a98c057@lab.ntt.co.jp
Reviewed-by: Álvaro Herrera
2018-06-22 16:45:48 -04:00
Teodor Sigaev 524054598f Fix interference between covering indexes and partitioned tables
The bug is caused due to the original IndexStmt that DefineIndex receives
being overwritten when processing the INCLUDE columns. Use separate list of
index params to propagate to child tables. Add tests covering this case.

Amit Langote and Alexander Korotkov.

Re-commit 5c6110c6a9 because it discovered a bug
fixed in c266ed31a8

Discussion: https://www.postgresql.org/message-id/CAJGNTeO%3DBguEyG8wxMpU_Vgvg3nGGzy71zUQ0RpzEn_mb0bSWA%40mail.gmail.com
2018-04-12 17:25:13 +03:00
Teodor Sigaev 92899992e1 Temporary revert 5c6110c6a9
It discovers one more bug in CompareIndexInfo(), should be fixed first.
2018-04-11 19:32:19 +03:00
Teodor Sigaev 5c6110c6a9 Fix interference between cavering indexes and partitioned tables
The bug is caused due to the original IndexStmt that DefineIndex receives
being overwritten when processing the INCLUDE columns. Use separate list of
index params to propagate to child tables. Add tests covering this case.

Amit Langote and Alexander Korotkov.
2018-04-11 16:44:26 +03:00
Andrew Dunstan 1d494b622f Remove two tests inadvertently added in 2b27273435 2018-03-26 22:53:02 +10:30
Andrew Dunstan 2b27273435 Optimize btree insertions for common case of increasing values
Remember the last page of an index insert if it's the rightmost leaf
page. If the next entry belongs on and can fit in the remembered page,
insert the new entry there as long as we can get a lock on the page.
Otherwise, fall back on the more expensive method of searching for
the right place to insert the entry.

This provides a performance improvement for the common case where an
index entry is for monotonically increasing or nearly monotonically
increasing value such as an identity field or a current timestamp.

Pavan Deolasee
Reviewed by Claudio Freire, Simon Riggs and Peter Geoghegan

Discussion: https://postgr.es/m/CABOikdM9DrupjyKZZFM5k8-0RCDs1wk6JzEkg7UgSW6QzOwMZw@mail.gmail.com
2018-03-26 22:39:24 +10:30
Alvaro Herrera 1f8a3327a9 Avoid having two PKs in a partition
If a table containing a primary key is attach as partition to a
partitioned table which has a primary key with a different definition,
we would happily create a second one in the new partition.  Oops.  It
turns out that this is because an error check in DefineIndex is executed
only if you tell it that it's being run by ALTER TABLE, and the original
code here wasn't.  Change it so that it does.

Added a couple of test cases for this, also.  A previously working test
started to fail in a different way than before patch because the new
check is called earlier; change the PK to plain UNIQUE so that the new
behavior isn't invoked, so that the test continues to verify what we
want it to verify.

Reported by: Noriyoshi Shinoda
Discussion: https://postgr.es/m/DF4PR8401MB102060EC2615EC9227CC73F7EEDF0@DF4PR8401MB1020.NAMPRD84.PROD.OUTLOOK.COM
2018-03-12 19:42:32 -03:00
Alvaro Herrera 63cbee6a78 doc: Reword restriction on partition keys in unique indexes
New wording from David G. Johnston, who noticed the unreadable original
also.  Include his suggested test case as well.

Fix a typo I noticed elsewhere while doing this.

Discussion: https://postgr.es/m/CAKFQuwY4Ld7ecxL_KAmaxwt0FUu5VcPPN2L4dh+3BeYbrdBa5g@mail.gmail.com
2018-03-12 13:32:28 -03:00
Alvaro Herrera eb7ed3f306 Allow UNIQUE indexes on partitioned tables
If we restrict unique constraints on partitioned tables so that they
must always include the partition key, then our standard approach to
unique indexes already works --- each unique key is forced to exist
within a single partition, so enforcing the unique restriction in each
index individually is enough to have it enforced globally.  Therefore we
can implement unique indexes on partitions by simply removing a few
restrictions (and adding others.)

Discussion: https://postgr.es/m/20171222212921.hi6hg6pem2w2t36z@alvherre.pgsql
Discussion: https://postgr.es/m/20171229230607.3iib6b62fn3uaf47@alvherre.pgsql
Reviewed-by: Simon Riggs, Jesper Pedersen, Peter Eisentraut, Jaime
	Casanova, Amit Langote
2018-02-19 17:40:00 -03:00
Alvaro Herrera 05fb5d6619 Ignore partitioned indexes where appropriate
get_relation_info() was too optimistic about opening indexes in
partitioned tables, which would raise errors when any queries were
planned on such tables.  Fix by ignoring any indexes of the partitioned
kind.

CLUSTER (and ALTER TABLE CLUSTER ON) had a similar problem.  Fix by
disallowing these commands in partitioned tables.

Fallout from 8b08f7d482.
2018-01-25 16:12:15 -03:00
Alvaro Herrera 189d0ff588 Fix regression tests for better stability
Per buildfarm
2018-01-19 12:31:34 -03:00
Alvaro Herrera 8b08f7d482 Local partitioned indexes
When CREATE INDEX is run on a partitioned table, create catalog entries
for an index on the partitioned table (which is just a placeholder since
the table proper has no data of its own), and recurse to create actual
indexes on the existing partitions; create them in future partitions
also.

As a convenience gadget, if the new index definition matches some
existing index in partitions, these are picked up and used instead of
creating new ones.  Whichever way these indexes come about, they become
attached to the index on the parent table and are dropped alongside it,
and cannot be dropped on isolation unless they are detached first.

To support pg_dump'ing these indexes, add commands
    CREATE INDEX ON ONLY <table>
(which creates the index on the parent partitioned table, without
recursing) and
    ALTER INDEX ATTACH PARTITION
(which is used after the indexes have been created individually on each
partition, to attach them to the parent index).  These reconstruct prior
database state exactly.

Reviewed-by: (in alphabetical order) Peter Eisentraut, Robert Haas, Amit
	Langote, Jesper Pedersen, Simon Riggs, David Rowley
Discussion: https://postgr.es/m/20171113170646.gzweigyrgg6pwsg4@alvherre.pgsql
2018-01-19 11:49:22 -03:00