postgresql/src/include/access/tsmapi.h

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

83 lines
2.6 KiB
C
Raw Normal View History

Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.
2015-07-25 20:39:00 +02:00
/*-------------------------------------------------------------------------
*
* tsmapi.h
* API for tablesample methods
*
* Copyright (c) 2015-2022, PostgreSQL Global Development Group
Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.
2015-07-25 20:39:00 +02:00
*
* src/include/access/tsmapi.h
*
*-------------------------------------------------------------------------
*/
#ifndef TSMAPI_H
#define TSMAPI_H
#include "nodes/execnodes.h"
#include "nodes/pathnodes.h"
Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.
2015-07-25 20:39:00 +02:00
/*
* Callback function signatures --- see tablesample-method.sgml for more info.
*/
typedef void (*SampleScanGetSampleSize_function) (PlannerInfo *root,
RelOptInfo *baserel,
List *paramexprs,
BlockNumber *pages,
double *tuples);
typedef void (*InitSampleScan_function) (SampleScanState *node,
int eflags);
typedef void (*BeginSampleScan_function) (SampleScanState *node,
Datum *params,
int nparams,
uint32 seed);
typedef BlockNumber (*NextSampleBlock_function) (SampleScanState *node,
BlockNumber nblocks);
Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.
2015-07-25 20:39:00 +02:00
typedef OffsetNumber (*NextSampleTuple_function) (SampleScanState *node,
BlockNumber blockno,
OffsetNumber maxoffset);
typedef void (*EndSampleScan_function) (SampleScanState *node);
/*
* TsmRoutine is the struct returned by a tablesample method's handler
* function. It provides pointers to the callback functions needed by the
* planner and executor, as well as additional information about the method.
*
* More function pointers are likely to be added in the future.
* Therefore it's recommended that the handler initialize the struct with
* makeNode(TsmRoutine) so that all fields are set to NULL. This will
* ensure that no fields are accidentally left undefined.
*/
typedef struct TsmRoutine
{
NodeTag type;
/* List of datatype OIDs for the arguments of the TABLESAMPLE clause */
List *parameterTypes;
/* Can method produce repeatable samples across, or even within, queries? */
bool repeatable_across_queries;
bool repeatable_across_scans;
/* Functions for planning a SampleScan on a physical table */
SampleScanGetSampleSize_function SampleScanGetSampleSize;
/* Functions for executing a SampleScan on a physical table */
InitSampleScan_function InitSampleScan; /* can be NULL */
BeginSampleScan_function BeginSampleScan;
NextSampleBlock_function NextSampleBlock; /* can be NULL */
NextSampleTuple_function NextSampleTuple;
EndSampleScan_function EndSampleScan; /* can be NULL */
} TsmRoutine;
/* Functions in access/tablesample/tablesample.c */
extern TsmRoutine *GetTsmRoutine(Oid tsmhandler);
#endif /* TSMAPI_H */