Commit Graph

8609 Commits

Author SHA1 Message Date
Tom Lane
5f5c014590 Allow RECORD and RECORD[] to be specified in function coldeflists.
We can't allow these pseudo-types to be used as table column types,
because storing an anonymous record value in a table would result
in data that couldn't be understood by other sessions.  However,
it seems like there's no harm in allowing the case in a column
definition list that's specifying what a function-returning-record
returns.  The data involved is all local to the current session,
so we should be just as able to resolve its actual tuple type as
we are for the function-returning-record's top-level tuple output.

Elvis Pranskevichus, with cosmetic changes by me

Discussion: https://postgr.es/m/11038447.kQ5A9Uj5xi@hammer.magicstack.net
2019-01-30 19:25:33 -05:00
Tom Lane
fa2cf164aa Rename nodes/relation.h to nodes/pathnodes.h.
The old name of this file was never a very good indication of what it
was for.  Now that there's also access/relation.h, we have a potential
confusion hazard as well, so let's rename it to something more apropos.
Per discussion, "pathnodes.h" is reasonable, since a good fraction of
the file is Path node definitions.

While at it, tweak a couple of other headers that were gratuitously
importing relation.h into modules that don't need it.

Discussion: https://postgr.es/m/7719.1548688728@sss.pgh.pa.us
2019-01-29 16:49:25 -05:00
Tom Lane
f09346a9c6 Refactor planner's header files.
Create a new header optimizer/optimizer.h, which exposes just the
planner functions that can be used "at arm's length", without need
to access Paths or the other planner-internal data structures defined
in nodes/relation.h.  This is intended to provide the whole planner
API seen by most of the rest of the system; although FDWs still need
to use additional stuff, and more thought is also needed about just
what selfuncs.c should rely on.

The main point of doing this now is to limit the amount of new
#include baggage that will be needed by "planner support functions",
which I expect to introduce later, and which will be in relevant
datatype modules rather than anywhere near the planner.

This commit just moves relevant declarations into optimizer.h from
other header files (a couple of which go away because everything
got moved), and adjusts #include lists to match.  There's further
cleanup that could be done if we want to decide that some stuff
being exposed by optimizer.h doesn't belong in the planner at all,
but I'll leave that for another day.

Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
2019-01-29 15:48:51 -05:00
Tom Lane
a1b8c41e99 Make some small planner API cleanups.
Move a few very simple node-creation and node-type-testing functions
from the planner's clauses.c to nodes/makefuncs and nodes/nodeFuncs.
There's nothing planner-specific about them, as evidenced by the
number of other places that were using them.

While at it, rename and_clause() etc to is_andclause() etc, to clarify
that they are node-type-testing functions not node-creation functions.
And use "static inline" implementations for the shortest ones.

Also, modify flatten_join_alias_vars() and some subsidiary functions
to take a Query not a PlannerInfo to define the join structure that
Vars should be translated according to.  They were only using the
"parse" field of the PlannerInfo anyway, so this just requires removing
one level of indirection.  The advantage is that now parse_agg.c can
use flatten_join_alias_vars() without the horrid kluge of creating an
incomplete PlannerInfo, which will allow that file to be decoupled from
relation.h in a subsequent patch.

Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
2019-01-29 15:26:44 -05:00
Andres Freund
da05eb51de Fix LLVM related headers to compile standalone (to fix cpluspluscheck).
Previously llvmjit.h #error'ed when USE_LLVM was not defined, to
prevent it from being included from code not having #ifdef USE_LLVM
guards - but that's not actually that useful after, during the
development of JIT support, LLVM related code was moved into a
separately compiled .so.  Having that #error means cpluspluscheck
doesn't work when llvm support isn't enabled, which isn't great.

Similarly add USE_LLVM guards to llvmjit_emit.h, and additionally make
sure it compiles standalone.

Per complaint from Tom Lane.

Author: Andres Freund
Discussion: https://postgr.es/m/19808.1548692361@sss.pgh.pa.us
Backpatch: 11, where JIT support was added
2019-01-28 18:05:52 -08:00
Tom Lane
4be058fe9e In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner.  It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it.  prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer.  We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about.  Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.

For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall.  However testing says that the
penalty is very small, close to the noise level.  In more complex queries,
this is able to find optimizations that we could not find before.

The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before).  To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)

Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.

Patch by me, reviewed by David Rowley and Mark Dilger

Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 17:54:23 -05:00
Andres Freund
5c11867512 Install JIT related headers.
There's no reason not to install these, and jit.h can be useful for
users of e.g. planner hooks.

Author: Donald Dong
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/296D405F-7F95-49F1-B565-389D6AA78505@csumb.edu
Backpatch: 11-, where JIT compilation was introduced
2019-01-28 13:51:12 -08:00
Amit Kapila
a23676503b Revert "Avoid creation of the free space map for small heap relations."
This reverts commit ac88d2962a.
2019-01-28 11:31:44 +05:30
Amit Kapila
ac88d2962a Avoid creation of the free space map for small heap relations.
Previously, all heaps had FSMs. For very small tables, this means that the
FSM took up more space than the heap did. This is wasteful, so now we
refrain from creating the FSM for heaps with 4 pages or fewer. If the last
known target block has insufficient space, we still try to insert into some
other page before giving up and extending the relation, since doing
otherwise leads to table bloat. Testing showed that trying every page
penalized performance slightly, so we compromise and try every other page.
This way, we visit at most two pages. Any pages with wasted free space
become visible at next relation extension, so we still control table bloat.
As a bonus, directly attempting one or two pages can even be faster than
consulting the FSM would have been.

Once the FSM is created for a heap we don't remove it even if somebody
deletes all the rows from the corresponding relation.  We don't think it is
a useful optimization as it is quite likely that relation will again grow
to the same size.

Author: John Naylor with design inputs and some code contribution by Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Mithun C Y
Discussion: https://www.postgresql.org/message-id/CAJVSVGWvB13PzpbLEecFuGFc5V2fsO736BsdTakPiPAcdMM5tQ@mail.gmail.com
2019-01-28 08:14:06 +05:30
Andres Freund
a9c35cf85c Change function call information to be variable length.
Before this change FunctionCallInfoData, the struct arguments etc for
V1 function calls are stored in, always had space for
FUNC_MAX_ARGS/100 arguments, storing datums and their nullness in two
arrays.  For nearly every function call 100 arguments is far more than
needed, therefore wasting memory. Arg and argnull being two separate
arrays also guarantees that to access a single argument, two
cachelines have to be touched.

Change the layout so there's a single variable-length array with pairs
of value / isnull. That drastically reduces memory consumption for
most function calls (on x86-64 a two argument function now uses
64bytes, previously 936 bytes), and makes it very likely that argument
value and its nullness are on the same cacheline.

Arguments are stored in a new NullableDatum struct, which, due to
padding, needs more memory per argument than before. But as usually
far fewer arguments are stored, and individual arguments are cheaper
to access, that's still a clear win.  It's likely that there's other
places where conversion to NullableDatum arrays would make sense,
e.g. TupleTableSlots, but that's for another commit.

Because the function call information is now variable-length
allocations have to take the number of arguments into account. For
heap allocations that can be done with SizeForFunctionCallInfoData(),
for on-stack allocations there's a new LOCAL_FCINFO(name, nargs) macro
that helps to allocate an appropriately sized and aligned variable.

Some places with stack allocation function call information don't know
the number of arguments at compile time, and currently variably sized
stack allocations aren't allowed in postgres. Therefore allow for
FUNC_MAX_ARGS space in these cases. They're not that common, so for
now that seems acceptable.

Because of the need to allocate FunctionCallInfo of the appropriate
size, older extensions may need to update their code. To avoid subtle
breakages, the FunctionCallInfoData struct has been renamed to
FunctionCallInfoBaseData. Most code only references FunctionCallInfo,
so that shouldn't cause much collateral damage.

This change is also a prerequisite for more efficient expression JIT
compilation (by allocating the function call information on the stack,
allowing LLVM to optimize it away); previously the size of the call
information caused problems inside LLVM's optimizer.

Author: Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/20180605172952.x34m5uz6ju6enaem@alap3.anarazel.de
2019-01-26 14:17:52 -08:00
Michael Paquier
c9b75c5838 Simplify restriction handling of two-phase commit for temporary objects
There were two flags used to track the access to temporary tables and
to the temporary namespace of a session which are used to restrict
PREPARE TRANSACTION, however the first control flag is a concept
included in the second.  This removes the flag for temporary table
tracking, keeping around only the one at namespace level.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/20190118053126.GH1883@paquier.xyz
2019-01-26 10:45:23 +09:00
Tom Lane
18c0da88a5 Split QTW_EXAMINE_RTES flag into QTW_EXAMINE_RTES_BEFORE/_AFTER.
This change allows callers of query_tree_walker() to choose whether
to visit an RTE before or after visiting the contents of the RTE
(i.e., prefix or postfix tree order).  All existing users of
QTW_EXAMINE_RTES want the QTW_EXAMINE_RTES_BEFORE behavior, but
an upcoming patch will want QTW_EXAMINE_RTES_AFTER, and it seems
like a potentially useful change on its own.

Andreas Karlsson (extracted from CTE inlining patch)

Discussion: https://postgr.es/m/8810.1542402910@sss.pgh.pa.us
2019-01-25 17:09:45 -05:00
Peter Eisentraut
7c079d7417 Allow generalized expression syntax for partition bounds
Previously, only literals were allowed.  This change allows general
expressions, including functions calls, which are evaluated at the
time the DDL command is executed.

Besides offering some more functionality, it simplifies the parser
structures and removes some inconsistencies in how the literals were
handled.

Author: Kyotaro Horiguchi, Tom Lane, Amit Langote
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/9f88b5e0-6da2-5227-20d0-0d7012beaa1c@lab.ntt.co.jp/
2019-01-25 11:28:49 +01:00
Michael Paquier
289198c0d9 Remove argument isprimary from index_build()
The flag was introduced in 3fdeb18, but f66e8bf actually forgot to
finish the cleanup as index_update_stats() has simplified its
interface.

Author: Michael Paquier
Discussion: https://postgr.es/m/20190122080852.GB3873@paquier.xyz
2019-01-24 07:57:09 +09:00
Heikki Linnakangas
95931133a9 Fix misc typos in comments.
Spotted mostly by Fabien Coelho.

Discussion: https://www.postgresql.org/message-id/alpine.DEB.2.21.1901230947050.16643@lancre
2019-01-23 13:39:00 +02:00
Andres Freund
346ed70b0a Rename RelationData.rd_amroutine to rd_indam.
The upcoming table AM support makes rd_amroutine to generic, as its
only about index AMs. The new name makes that clear, and is shorter to
boot.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-01-21 17:36:55 -08:00
Andres Freund
c91560defc Move remaining code from tqual.[ch] to heapam.h / heapam_visibility.c.
Given these routines are heap specific, and that there will be more
generic visibility support in via table AM, it makes sense to move the
prototypes to heapam.h (routines like HeapTupleSatisfiesVacuum will
not be exposed in a generic fashion, because they are too storage
specific).

Similarly, the code in tqual.c is specific to heap, so moving it into
access/heap/ makes sense.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-01-21 17:07:10 -08:00
Andres Freund
b7eda3e0e3 Move generic snapshot related code from tqual.h to snapmgr.h.
The code in tqual.c is largely heap specific. Due to the upcoming
pluggable storage work, it therefore makes sense to move it into
access/heap/ (as the file's header notes, the tqual name isn't very
good).

But the various statically allocated snapshot and snapshot
initialization functions are now (see previous commit) generic and do
not depend on functions declared in tqual.h anymore. Therefore move.
Also move XidInMVCCSnapshot as that's useful for future AMs, and
already used outside of tqual.c.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-01-21 17:06:41 -08:00
Andres Freund
63746189b2 Change snapshot type to be determined by enum rather than callback.
This is in preparation for allowing the same snapshot be used for
different table AMs. With the current callback based approach we would
need one callback for each supported AM, which clearly would not be
extensible.  Thus add a new Snapshot->snapshot_type field, and move
the dispatch into HeapTupleSatisfiesVisibility() (which is now a
function). Later work will then dispatch calls to
HeapTupleSatisfiesVisibility() and other AMs visibility functions
depending on the type of the table.  The central SnapshotType enum
also seems like a good location to centralize documentation about the
intended behaviour of various types of snapshots.

As tqual.h isn't included by bufmgr.h any more (as HeapTupleSatisfies*
isn't referenced by TestForOldSnapshot() anymore) a few files now need
to include it directly.

Author: Andres Freund, loosely based on earlier work by Haribabu Kommi
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
2019-01-21 17:03:15 -08:00
Tom Lane
ee27584c4a Second try at fixing ecpglib thread-safety problem.
While Windows (allegedly) has _configthreadlocale() pretty far back,
it seems MinGW didn't acquire support for that till more recently.
Fortunately, we can use an autoconf probe on that toolchain,
instead of guessing whether it's there.  (Hm, I wonder whether Cygwin
will need this also.)

Per buildfarm.

Discussion: https://postgr.es/m/20190121193512.tdmcnic2yjxlufaw@alap3.anarazel.de
2019-01-21 16:17:10 -05:00
Andres Freund
e7cc78ad43 Remove superfluous tqual.h includes.
Most of these had been obsoleted by 568d4138c / the SnapshotNow
removal.

This is is preparation for moving most of tqual.[ch] into either
snapmgr.h or heapam.h, which in turn is in preparation for pluggable
table AMs.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-01-21 12:15:02 -08:00
Andres Freund
4b21acf522 Introduce access/{table.h, relation.h}, for generic functions from heapam.h.
access/heapam contains functions that are very storage specific (say
heap_insert() and a lot of lower level functions), and fairly generic
infrastructure like relation_open(), heap_open() etc.  In the upcoming
pluggable storage work we're introducing a layer between table
accesses in general and heapam, to allow for different storage
methods. For a bit cleaner separation it thus seems advantageous to
move generic functions like the aforementioned to their own headers.

access/relation.h will contain relation_open() etc, and access/table.h
will contain table_open() (formerly known as heap_open()). I've decided
for table.h not to include relation.h, but we might change that at a
later stage.

relation.h already exists in another directory, but the other
plausible name (rel.h) also conflicts. It'd be nice if there were a
non-conflicting name, but nobody came up with a suggestion. It's
possible that the appropriate way to address the naming conflict would
be to rename nodes/relation.h, which isn't particularly well named.

To avoid breaking a lot of extensions that just use heap_open() etc,
table.h has macros mapping the old names to the new ones, and heapam.h
includes relation, table.h.  That also allows to keep the
bulk renaming of existing callers in a separate commit.

Author: Andres Freund
Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de
2019-01-21 10:51:36 -08:00
Tom Lane
8eb4a9312c Avoid thread-safety problem in ecpglib.
ecpglib attempts to force the LC_NUMERIC locale to "C" while reading
server output, to avoid problems with strtod() and related functions.
Historically it's just issued setlocale() calls to do that, but that
has major problems if we're in a threaded application.  setlocale()
itself is not required by POSIX to be thread-safe (and indeed is not,
on recent OpenBSD).  Moreover, its effects are process-wide, so that
we could cause unexpected results in other threads, or another thread
could change our setting.

On platforms having uselocale(), which is required by POSIX:2008,
we can avoid these problems by using uselocale() instead.  Windows
goes its own way as usual, but we can make it safe by using
_configthreadlocale().  Platforms having neither continue to use the
old code, but that should be pretty much nobody among current systems.

This should get back-patched, but let's see what the buildfarm
thinks of it first.

Michael Meskes and Tom Lane; thanks also to Takayuki Tsunakawa.

Discussion: https://postgr.es/m/31420.1547783697@sss.pgh.pa.us
2019-01-21 12:07:02 -05:00
Tomas Vondra
31f3817402 Allow COPY FROM to filter data using WHERE conditions
Extends the COPY FROM command with a WHERE condition, which allows doing
various types of filtering while importing the data (random sampling,
condition on a data column, etc.).  Until now such filtering required
either preprocessing of the input data, or importing all data and then
filtering in the database. COPY FROM ... WHERE is an easy-to-use and
low-overhead alternative for most simple cases.

Author: Surafel Temesgen
Reviewed-by: Tomas Vondra, Masahiko Sawada, Lim Myungkyu
Discussion: https://www.postgresql.org/message-id/flat/CALAY4q_DdpWDuB5-Zyi-oTtO2uSk8pmy+dupiRe3AvAc++1imA@mail.gmail.com
2019-01-20 00:22:14 +01:00
Magnus Hagander
0301db623d Replace @postgresql.org with @lists.postgresql.org for mailinglists
Commit c0d0e54084 replaced the ones in the documentation, but missed out
on the ones in the code. Replace those as well, but unlike c0d0e54084,
don't backpatch the code changes to avoid breaking translations.
2019-01-19 19:06:35 +01:00
Tom Lane
69bcd718df Use our own getopt() on OpenBSD.
Recent OpenBSD (at least 5.9 and up) has a version of getopt(3)
that will not cope with the "-:" spec we use to accept double-dash
options in postgres.c and postmaster.c.  Admittedly, that's a hack
because POSIX only requires getopt() to allow alphanumeric option
characters.  I have no desire to find another way, however, so
let's just do what we were already doing on Solaris: force use
of our own src/port/getopt.c implementation.

In passing, improve some of the comments around said implementation.

Per buildfarm and local testing.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/30197.1547835700@sss.pgh.pa.us
2019-01-18 15:06:26 -05:00
Alvaro Herrera
03afae201f Move CloneForeignKeyConstraints to tablecmds.c
My commit 3de241dba8 introduced some code to create a clone of a
foreign key to a partition, but I put it in pg_constraint.c because it
was too close to the contents of the pg_constraint row.  With the
previous commit that split out the constraint tuple deconstruction into
its own routine, it makes more sense to have the FK-cloning function in
tablecmds.c, mostly because its static subroutine can then be used by a
future bugfix.

My initial posting of this patch had this routine as static in
tablecmds.c, but sadly this function is already part of the Postgres 11
ABI as exported from pg_constraint.c, so keep it as exported also just
to avoid breaking any possible users of it.
2019-01-18 15:00:06 -03:00
Alvaro Herrera
0080396dad Refactor duplicate code into DeconstructFkConstraintRow
My commit 3de241dba8 introduced some code (in tablecmds.c) to obtain
data from a pg_constraint row for a foreign key, that already existed in
ri_triggers.c.  Split it out into its own routine in pg_constraint.c,
where it naturally belongs.

No functional code changes, only code movement.

Backpatch to pg11, because a future bugfix is simpler after this.
2019-01-18 14:59:44 -03:00
Michael Paquier
c5660e0aa5 Restrict the use of temporary namespace in two-phase transactions
Attempting to use a temporary table within a two-phase transaction is
forbidden for ages.  However, there have been uncovered grounds for
a couple of other object types and commands which work on temporary
objects with two-phase commit.  In short, trying to create, lock or drop
an object on a temporary schema should not be authorized within a
two-phase transaction, as it would cause its state to create
dependencies with other sessions, causing all sorts of side effects with
the existing session or other sessions spawned later on trying to use
the same temporary schema name.

Regression tests are added to cover all the grounds found, the original
report mentioned function creation, but monitoring closer there are many
other patterns with LOCK, DROP or CREATE EXTENSION which are involved.
One of the symptoms resulting in combining both is that the session
which used the temporary schema is not able to shut down completely,
waiting for being able to drop the temporary schema, something that it
cannot complete because of the two-phase transaction involved with
temporary objects.  In this case the client is able to disconnect but
the session remains alive on the backend-side, potentially blocking
connection backend slots from being used.  Other problems reported could
also involve server crashes.

This is back-patched down to v10, which is where 9b013dc has introduced
MyXactFlags, something that this patch relies on.

Reported-by: Alexey Bashtanov
Author: Michael Paquier
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/5d910e2e-0db8-ec06-dd5f-baec420513c3@imap.cc
Backpatch-through: 10
2019-01-18 09:21:44 +09:00
Alvaro Herrera
d723f56872 Reorganize planner code moved in b60c397599
It seems modules are better defined like this instead of the original
split.

Per complaints from David Rowley as well as Amit Langote's self review.
Discussion: https://postgr.es/m/CAKJS1f988rsyhwvLgfT-y1UCYUfXDOv67ENQk=v24OxhsZOzZw@mail.gmail.com
2019-01-16 16:27:44 -03:00
Andres Freund
285d8e1205 Move vacuumlazy.c into access/heap.
It's heap table storage specific code that can't realistically be
generalized into table AM agnostic code.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-01-15 12:06:19 -08:00
Tom Lane
1c53c4dec3 Finish reverting "recheck_on_update" patch.
This reverts commit c203d6cf8 and some follow-on fixes, completing the
task begun in commit 5d28c9bd7.  If that feature is ever resurrected,
the code will look quite a bit different from this, so it seems best
to start from a clean slate.

The v11 branch is not touched; in that branch, the recheck_on_update
storage option remains present, but nonfunctional and undocumented.

Discussion: https://postgr.es/m/20190114223409.3tcvejfhlvbucrv5@alap3.anarazel.de
2019-01-15 12:07:10 -05:00
Andres Freund
0944ec54de Don't include genam.h from execnodes.h and relscan.h anymore.
This is the genam.h equivalent of 4c850ecec6 (which removed
heapam.h from a lot of other headers).  There's still a few header
includes of genam.h, but not from central headers anymore.

As a few headers are not indirectly included anymore, execnodes.h and
relscan.h need a few additional includes. Some of the depended on
types were replacable by using the underlying structs, but e.g. for
Snapshot in execnodes.h that'd have gotten more invasive than
reasonable in this commit.

Like the aforementioned commit 4c850ecec6, this requires adding new
genam.h includes to a number of backend files, which likely is also
required in a few external projects.

Author: Andres Freund
Discussion: https://postgr.es/m/20190114000701.y4ttcb74jpskkcfb@alap3.anarazel.de
2019-01-14 17:02:12 -08:00
Andres Freund
774a975c9a Make naming of tupdesc related structs more consistent with the rest of PG.
We usually don't change the name of structs between the struct name
itself and the name of the typedef. Additionally, structs that are
usually used via a typedef that hides being a pointer, are commonly
suffixed Data.  Change tupdesc code to follow those convention.

This is triggered by a future patch that intends to forward declare
TupleDescData in another header - keeping with the naming scheme makes
that easier to understand.

Author: Andres Freund
Discussion: https://postgr.es/m/20190114000701.y4ttcb74jpskkcfb@alap3.anarazel.de
2019-01-14 16:25:50 -08:00
Andres Freund
e451dd5521 Remove too generically named MissingPtr typedef.
As there's only a single user of the typedef in the entire codebase,
just use the underlying struct directly.

Per complaint from Alvaro Herrera

Author: Andres Freund
Discussion: https://postgr.es/m/201901141836.oxtm4uzc63j3@alvherre.pgsql
2019-01-14 16:25:50 -08:00
Andres Freund
4c850ecec6 Don't include heapam.h from others headers.
heapam.h previously was included in a number of widely used
headers (e.g. execnodes.h, indirectly in executor.h, ...). That's
problematic on its own, as heapam.h contains a lot of low-level
details that don't need to be exposed that widely, but becomes more
problematic with the upcoming introduction of pluggable table storage
- it seems inappropriate for heapam.h to be included that widely
afterwards.

heapam.h was largely only included in other headers to get the
HeapScanDesc typedef (which was defined in heapam.h, even though
HeapScanDescData is defined in relscan.h). The better solution here
seems to be to just use the underlying struct (forward declared where
necessary). Similar for BulkInsertState.

Another problem was that LockTupleMode was used in executor.h - parts
of the file tried to cope without heapam.h, but due to the fact that
it indirectly included it, several subsequent violations of that goal
were not not noticed. We could just reuse the approach of declaring
parameters as int, but it seems nicer to move LockTupleMode to
lockoptions.h - that's not a perfect location, but also doesn't seem
bad.

As a number of files relied on implicitly included heapam.h, a
significant number of files grew an explicit include. It's quite
probably that a few external projects will need to do the same.

Author: Andres Freund
Reviewed-By: Alvaro Herrera
Discussion: https://postgr.es/m/20190114000701.y4ttcb74jpskkcfb@alap3.anarazel.de
2019-01-14 16:24:41 -08:00
Tom Lane
1db5667bac Avoid sharing PARAM_EXEC slots between different levels of NestLoop.
Up to now, createplan.c attempted to share PARAM_EXEC slots for
NestLoopParams across different plan levels, if the same underlying Var
was being fed down to different righthand-side subplan trees by different
NestLoops.  This was, I think, more of an artifact of using subselect.c's
PlannerParamItem infrastructure than an explicit design goal, but anyway
that was the end result.

This works well enough as long as the plan tree is executing synchronously,
but the feature whereby Gather can execute the parallelized subplan locally
breaks it.  An upper NestLoop node might execute for a row retrieved from
a parallel worker, and assign a value for a PARAM_EXEC slot from that row,
while the leader's copy of the parallelized subplan is suspended with a
different active value of the row the Var comes from.  When control
eventually returns to the leader's subplan, it gets the wrong answers if
the same PARAM_EXEC slot is being used within the subplan, as reported
in bug #15577 from Bartosz Polnik.

This is pretty reminiscent of the problem fixed in commit 46c508fbc, and
the proper fix seems to be the same: don't try to share PARAM_EXEC slots
across different levels of controlling NestLoop nodes.

This requires decoupling NestLoopParam handling from PlannerParamItem
handling, although the logic remains somewhat similar.  To avoid bizarre
division of labor between subselect.c and createplan.c, I decided to move
all the param-slot-assignment logic for both cases out of those files
and put it into a new file paramassign.c.  Hopefully it's a bit better
documented now, too.

A regression test case for this might be nice, but we don't know a
test case that triggers the problem with a suitably small amount
of data.

Back-patch to 9.6 where we added Gather nodes.  It's conceivable that
related problems exist in older branches; but without some evidence
for that, I'll leave the older branches alone.

Discussion: https://postgr.es/m/15577-ca61ab18904af852@postgresql.org
2019-01-11 15:54:06 -05:00
Tom Lane
eaf0380ecc Fix C++ compile failures in headers.
Avoid using "typeid" as a parameter name in header files, since that
is a C++ keyword.  These cases were introduced recently, in 04fe805a1
and 586b98fdf.

Since I'm an incurable neatnik, also rename these parameters in the
underlying function definitions.  That's not really necessary per
project rules, but I don't like function declarations that don't
quite agree with the underlying definitions.

Per src/tools/pginclude/cpluspluscheck.
2019-01-10 14:07:01 -05:00
Tom Lane
a968d54c5d Remove unnecessary #include.
Discussion: https://postgr.es/m/4380.1547143967@sss.pgh.pa.us
2019-01-10 13:38:02 -05:00
Alvaro Herrera
b60c397599 Move inheritance expansion code into its own file
This commit moves expand_inherited_tables and underlings from
optimizer/prep/prepunionc.c to optimizer/utils/inherit.c.
Also, all of the AppendRelInfo-based expression manipulation routines
are moved to optimizer/utils/appendinfo.c.

No functional code changes.  One exception is the introduction of
make_append_rel_info, but that's still just moving around code.

Also, stop including <limits.h> in prepunion.c, which no longer needs
it since 3fc6e2d7f5.  I (Álvaro) noticed this because Amit was copying
that to inherit.c, which likewise doesn't need it.

Author: Amit Langote
Discussion: https://postgr.es/m/3be67028-a00a-502c-199a-da00eec8fb6e@lab.ntt.co.jp
2019-01-10 14:54:31 -03:00
Alvaro Herrera
6260cc550b pgbench: add \cset and \gset commands
These commands allow assignment of values produced by queries to pgbench
variables, where they can be used by further commands.  \gset terminates
a command sequence (just like a bare semicolon); \cset separates
multiple queries in a compound command, like an escaped semicolon (\;).
A prefix can be provided to the \-command and is prepended to the name
of each output column to produce the final variable name.

This feature allows pgbench scripts to react meaningfully to the actual
database contents, allowing more powerful benchmarks to be written.

Authors: Fabien Coelho, Álvaro Herrera
Reviewed-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Reviewed-by: Stephen Frost <sfrost@snowman.net>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Tatsuo Ishii <ishii@sraoss.co.jp>
Reviewed-by: Rafia Sabih <rafia.sabih@enterprisedb.com>
Discussion: https://postgr.es/m/alpine.DEB.2.20.1607091005330.3412@sto
2019-01-10 13:42:20 -03:00
Tom Lane
c64d0cd5ce Use perfect hashing, instead of binary search, for keyword lookup.
We've been speculating for a long time that hash-based keyword lookup
ought to be faster than binary search, but up to now we hadn't found
a suitable tool for generating the hash function.  Joerg Sonnenberger
provided the inspiration, and sample code, to show us that rolling our
own generator wasn't a ridiculous idea.  Hence, do that.

The method used here requires a lookup table of approximately 4 bytes
per keyword, but that's less than what we saved in the predecessor commit
afb0d0712, so it's not a big problem.  The time savings is indeed
significant: preliminary testing suggests that the total time for raw
parsing (flex + bison phases) drops by ~20%.

Patch by me, but it owes its existence to Joerg Sonnenberger;
thanks also to John Naylor for review.

Discussion: https://postgr.es/m/20190103163340.GA15803@britannica.bec.de
2019-01-09 19:47:46 -05:00
Tom Lane
8ff5f824dc Reduce the size of the fmgr_builtin_oid_index[] array.
This index array was originally defined to have 10000 entries (ranging
up to FirstGenbkiObjectId), but we really only need entries up to the
last existing builtin function OID, currently 6121.  That saves close
to 8K of never-accessed space in the server executable, at the small
price of one more fetch in fmgr_isbuiltin().

We could reduce the array size still further by renumbering a few of
the highest-numbered builtin functions; but there's a small risk of
breaking clients that have chosen to hardwire those function OIDs,
so it's not clear if it'd be worth the trouble.  (We should, however,
discourage future patches from choosing function OIDs above 6K as long
as there's still lots of space below that.)

Discussion: https://postgr.es/m/12359.1547063064@sss.pgh.pa.us
2019-01-09 15:22:43 -05:00
Tom Lane
afb0d0712f Replace the data structure used for keyword lookup.
Previously, ScanKeywordLookup was passed an array of string pointers.
This had some performance deficiencies: the strings themselves might
be scattered all over the place depending on the compiler (and some
quick checking shows that at least with gcc-on-Linux, they indeed
weren't reliably close together).  That led to very cache-unfriendly
behavior as the binary search touched strings in many different pages.
Also, depending on the platform, the string pointers might need to
be adjusted at program start, so that they couldn't be simple constant
data.  And the ScanKeyword struct had been designed with an eye to
32-bit machines originally; on 64-bit it requires 16 bytes per
keyword, making it even more cache-unfriendly.

Redesign so that the keyword strings themselves are allocated
consecutively (as part of one big char-string constant), thereby
eliminating the touch-lots-of-unrelated-pages syndrome.  And get
rid of the ScanKeyword array in favor of three separate arrays:
uint16 offsets into the keyword array, uint16 token codes, and
uint8 keyword categories.  That reduces the overhead per keyword
to 5 bytes instead of 16 (even less in programs that only need
one of the token codes and categories); moreover, the binary search
only touches the offsets array, further reducing its cache footprint.
This also lets us put the token codes somewhere else than the
keyword strings are, which avoids some unpleasant build dependencies.

While we're at it, wrap the data used by ScanKeywordLookup into
a struct that can be treated as an opaque type by most callers.
That doesn't change things much right now, but it will make it
less painful to switch to a hash-based lookup method, as is being
discussed in the mailing list thread.

Most of the change here is associated with adding a generator
script that can build the new data structure from the same
list-of-PG_KEYWORD header representation we used before.
The PG_KEYWORD lists that plpgsql and ecpg used to embed in
their scanner .c files have to be moved into headers, and the
Makefiles have to be taught to invoke the generator script.
This work is also necessary if we're to consider hash-based lookup,
since the generator script is what would be responsible for
constructing a hash table.

Aside from saving a few kilobytes in each program that includes
the keyword table, this seems to speed up raw parsing (flex+bison)
by a few percent.  So it's worth doing even as it stands, though
we think we can gain even more with a follow-on patch to switch
to hash-based lookup.

John Naylor, with further hacking by me

Discussion: https://postgr.es/m/CAJVSVGXdFVU2sgym89XPL=Lv1zOS5=EHHQ8XWNzFL=mTXkKMLw@mail.gmail.com
2019-01-06 17:02:57 -05:00
Alvaro Herrera
807ae415c5 Don't create relfilenode for relations without storage
Some relation kinds had relfilenode set to some non-zero value, but
apparently the actual files did not really exist because creation was
prevented elsewhere.  Get rid of the phony pg_class.relfilenode values.

Catversion bumped, but only because the sanity_test check will fail if
run in a system initdb'd with the previous version.

Reviewed-by: Kyotaro HORIGUCHI, Michael Paquier
Discussion: https://postgr.es/m/20181206215552.fm2ypuxq6nhpwjuc@alvherre.pgsql
2019-01-04 14:51:17 -03:00
Alvaro Herrera
df5be63763 Rename macro to RELKIND_HAS_STORAGE
The original name was an unfortunate choice.

Discussion: https://postgr.es/m/20181218.145600.172055615.horiguchi.kyotaro@lab.ntt.co.jp
2019-01-04 14:34:18 -03:00
Tom Lane
d33faa285b Move the built-in conversions into the initial catalog data.
Instead of running a SQL script to create the standard conversion
functions and pg_conversion entries, put those entries into the
initial data in postgres.bki.

This shaves a few percent off the runtime of initdb, and also allows
accurate comments to be attached to the conversion functions; the
previous script labeled them with machine-generated comments that
were not quite right for multi-purpose conversion functions.
Also, we can get rid of the duplicative Makefile and MSVC perl
implementations of the generation code for that SQL script.

A functional change is that these pg_proc and pg_conversion entries
are now "pinned" by initdb.  Leaving them unpinned was perhaps a
good thing back while the conversions feature was under development,
but there seems no valid reason for it now.

Also, the conversion functions are now marked as immutable, where
before they were volatile by virtue of lacking any explicit
specification.  That seems like it was just an oversight.

To avoid using magic constants in pg_conversion.dat, extend
genbki.pl to allow encoding names to be converted, much as it
does for language, access method, etc names.

John Naylor

Discussion: https://postgr.es/m/CAJVSVGWtUqxpfAaxS88vEGvi+jKzWZb2EStu5io-UPc4p9rSJg@mail.gmail.com
2019-01-03 19:47:53 -05:00
Tom Lane
814c9019aa Use symbolic references for pg_language OIDs in the bootstrap data.
This patch teaches genbki.pl to replace pg_language names by OIDs
in much the same way as it already does for pg_am names etc, and
converts pg_proc.dat to use such symbolic references in the prolang
column.

Aside from getting rid of a few more magic numbers in the initial
catalog data, this means that Gen_fmgrtab.pl no longer needs to read
pg_language.dat, since it doesn't have to know the OID of the "internal"
language; now it's just looking for the string "internal".

No need for a catversion bump, since the contents of postgres.bki
don't actually change at all.

John Naylor

Discussion: https://postgr.es/m/CAJVSVGWtUqxpfAaxS88vEGvi+jKzWZb2EStu5io-UPc4p9rSJg@mail.gmail.com
2019-01-03 18:38:49 -05:00
Bruce Momjian
97c39498e5 Update copyright for 2019
Backpatch-through: certain files through 9.4
2019-01-02 12:44:25 -05:00
Michael Paquier
1707a0d2aa Remove configure switch --disable-strong-random
This removes a portion of infrastructure introduced by fe0a0b5 to allow
compilation of Postgres in environments where no strong random source is
available, meaning that there is no linking to OpenSSL and no
/dev/urandom (Windows having its own CryptoAPI).  No systems shipped
this century lack /dev/urandom, and the buildfarm is actually not
testing this switch at all, so just remove it.  This simplifies
particularly some backend code which included a fallback implementation
using shared memory, and removes a set of alternate regression output
files from pgcrypto.

Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/20181230063219.GG608@paquier.xyz
2019-01-01 20:05:51 +09:00