Avoid assuming that index-only scan data matches the index's rowtype.

In general the data returned by an index-only scan should have the
datatypes originally computed by FormIndexDatum.  If the index opclasses
use "storage" datatypes different from their input datatypes, the scan
tuple will not have the same rowtype attributed to the index; but we had
a hard-wired assumption that that was true in nodeIndexonlyscan.c.  We'd
already hacked around the issue for the one case where the types are
different in btree indexes (btree name_ops), but this would definitely
come back to bite us if we ever implement index-only scans in GiST.

To fix, require the index AM to explicitly provide the tupdesc for the
tuple it is returning.  btree can just pass back the index's tupdesc, but
GiST will have to work harder when and if it supports index-only scans.

I had previously proposed fixing this by allowing the index AM to fill the
scan tuple slot directly; but on reflection that seemed like a module
layering violation, since TupleTableSlots are creatures of the executor.
At least in the btree case, it would also be less efficient, since the
tuple deconstruction work would occur even for rows later found to be
invisible to the scan's snapshot.
This commit is contained in:
Tom Lane 2011-10-16 19:15:04 -04:00
parent e661c3dfd3
commit 336c1d7a51
5 changed files with 17 additions and 10 deletions

View File

@ -396,7 +396,8 @@ amgettuple (IndexScanDesc scan,
row), then on success it must also check
<literal>scan-&gt;xs_want_itup</>, and if that is true it must return
the original indexed data for the index entry, in the form of an
<structname>IndexTuple</> pointer stored at <literal>scan-&gt;xs_itup</>.
<structname>IndexTuple</> pointer stored at <literal>scan-&gt;xs_itup</>,
with tuple descriptor <literal>scan-&gt;xs_itupdesc</>.
(Management of the data referenced by the pointer is the access method's
responsibility. The data must remain good at least until the next
<function>amgettuple</>, <function>amrescan</>, or <function>amendscan</>

View File

@ -112,6 +112,7 @@ RelationGetIndexScan(Relation indexRelation, int nkeys, int norderbys)
scan->opaque = NULL;
scan->xs_itup = NULL;
scan->xs_itupdesc = NULL;
ItemPointerSetInvalid(&scan->xs_ctup.t_self);
scan->xs_ctup.t_data = NULL;

View File

@ -433,12 +433,15 @@ btbeginscan(PG_FUNCTION_ARGS)
/*
* We don't know yet whether the scan will be index-only, so we do not
* allocate the tuple workspace arrays until btrescan.
* allocate the tuple workspace arrays until btrescan. However, we set up
* scan->xs_itupdesc whether we'll need it or not, since that's so cheap.
*/
so->currTuples = so->markTuples = NULL;
so->currPos.nextTupleOffset = 0;
so->markPos.nextTupleOffset = 0;
scan->xs_itupdesc = RelationGetDescr(rel);
scan->opaque = so;
PG_RETURN_POINTER(scan);

View File

@ -36,7 +36,7 @@
static TupleTableSlot *IndexOnlyNext(IndexOnlyScanState *node);
static void StoreIndexTuple(TupleTableSlot *slot, IndexTuple itup,
Relation indexRel);
TupleDesc itupdesc);
/* ----------------------------------------------------------------
@ -114,7 +114,7 @@ IndexOnlyNext(IndexOnlyScanState *node)
/*
* Fill the scan tuple slot with data from the index.
*/
StoreIndexTuple(slot, scandesc->xs_itup, scandesc->indexRelation);
StoreIndexTuple(slot, scandesc->xs_itup, scandesc->xs_itupdesc);
/*
* If the index was lossy, we have to recheck the index quals.
@ -151,25 +151,25 @@ IndexOnlyNext(IndexOnlyScanState *node)
* right now we don't need it elsewhere.
*/
static void
StoreIndexTuple(TupleTableSlot *slot, IndexTuple itup, Relation indexRel)
StoreIndexTuple(TupleTableSlot *slot, IndexTuple itup, TupleDesc itupdesc)
{
TupleDesc indexDesc = RelationGetDescr(indexRel);
int nindexatts = indexDesc->natts;
int nindexatts = itupdesc->natts;
Datum *values = slot->tts_values;
bool *isnull = slot->tts_isnull;
int i;
/*
* Note: we must use the index relation's tupdesc in index_getattr, not
* Note: we must use the tupdesc supplied by the AM in index_getattr, not
* the slot's tupdesc, in case the latter has different datatypes (this
* happens for btree name_ops in particular). They'd better have the same
* number of columns though.
* number of columns though, as well as being datatype-compatible which
* is something we can't so easily check.
*/
Assert(slot->tts_tupleDescriptor->natts == nindexatts);
ExecClearTuple(slot);
for (i = 0; i < nindexatts; i++)
values[i] = index_getattr(itup, i + 1, indexDesc, &isnull[i]);
values[i] = index_getattr(itup, i + 1, itupdesc, &isnull[i]);
ExecStoreVirtualTuple(slot);
}

View File

@ -17,6 +17,7 @@
#include "access/genam.h"
#include "access/heapam.h"
#include "access/itup.h"
#include "access/tupdesc.h"
typedef struct HeapScanDescData
@ -80,6 +81,7 @@ typedef struct IndexScanDescData
/* in an index-only scan, this is valid after a successful amgettuple */
IndexTuple xs_itup; /* index tuple returned by AM */
TupleDesc xs_itupdesc; /* rowtype descriptor of xs_itup */
/* xs_ctup/xs_cbuf/xs_recheck are valid after a successful index_getnext */
HeapTupleData xs_ctup; /* current heap tuple, if any */