Add defenses against unexpected changes in the NodeTag enum list.

Having different build systems producing different contents of the
NodeTag enum would be catastrophic for extension ABI stability.
But that ordering depends on the order in which gen_node_support.pl
processes its input files.  It seems too fragile to let the Makefiles,
MSVC build scripts, and soon meson build scripts all set this order
independently.  As a klugy but serviceable solution, put a canonical
copy of the file list into gen_node_support.pl itself, and check that
against the files given on the command line.

Also, while it's fine to add and delete node tags during development,
we must not let the assigned NodeTag values change unexpectedly in
stable branches.  Add a cross-check that can be enabled when a branch
is forked off (or later, but that is a time when we're unlikely to
miss doing it).  It just checks that the last auto-assigned number
doesn't change, which is simplistic but will catch the most likely
sorts of mistakes.

From time to time we do need to add a node tag in a stable branch.
To support doing that without changing the branch's auto-assigned
tag numbers, invent pg_node_attr(nodetag_number(VALUE)) which can
be used to give such a node a hand-assigned tag above the last
auto-assigned one.

Discussion: https://postgr.es/m/1249010.1657574337@sss.pgh.pa.us
This commit is contained in:
Tom Lane 2022-07-12 11:22:52 -04:00
parent ca187d7455
commit eea9fa9b25
3 changed files with 113 additions and 22 deletions

View File

@ -34,6 +34,72 @@ sub elem
return grep { $_ eq $x } @_;
}
# This list defines the canonical set of header files to be read by this
# script, and the order they are to be processed in. We must have a stable
# processing order, else the NodeTag enum's order will vary, with catastrophic
# consequences for ABI stability across different builds.
#
# Currently, the various build systems also have copies of this list,
# so that they can do dependency checking properly. In future we may be
# able to make this list the only copy. For now, we just check that
# it matches the list of files passed on the command line.
my @all_input_files = qw(
nodes/nodes.h
nodes/primnodes.h
nodes/parsenodes.h
nodes/pathnodes.h
nodes/plannodes.h
nodes/execnodes.h
access/amapi.h
access/sdir.h
access/tableam.h
access/tsmapi.h
commands/event_trigger.h
commands/trigger.h
executor/tuptable.h
foreign/fdwapi.h
nodes/extensible.h
nodes/lockoptions.h
nodes/replnodes.h
nodes/supportnodes.h
nodes/value.h
utils/rel.h
);
# Nodes from these input files are automatically treated as nodetag_only.
# In the future we might add explicit pg_node_attr labeling to some of these
# files and remove them from this list, but for now this is the path of least
# resistance.
my @nodetag_only_files = qw(
nodes/execnodes.h
access/amapi.h
access/sdir.h
access/tableam.h
access/tsmapi.h
commands/event_trigger.h
commands/trigger.h
executor/tuptable.h
foreign/fdwapi.h
nodes/lockoptions.h
nodes/replnodes.h
nodes/supportnodes.h
);
# ARM ABI STABILITY CHECK HERE:
#
# In stable branches, set $last_nodetag to the name of the last node type
# that should receive an auto-generated nodetag number, and $last_nodetag_no
# to its number. (Find these values in the last line of the current
# nodetags.h file.) The script will then complain if those values don't
# match reality, providing a cross-check that we haven't broken ABI by
# adding or removing nodetags.
# In HEAD, these variables should be left undef, since we don't promise
# ABI stability during development.
my $last_nodetag = undef;
my $last_nodetag_no = undef;
# output file names
my @output_files;
@ -90,6 +156,9 @@ my @custom_copy_equal;
# Similarly for custom read/write implementations.
my @custom_read_write;
# Track node types with manually assigned NodeTag numbers.
my %manual_nodetag_number;
# EquivalenceClasses are never moved, so just shallow-copy the pointer
push @scalar_types, qw(EquivalenceClass* EquivalenceMember*);
@ -97,25 +166,6 @@ push @scalar_types, qw(EquivalenceClass* EquivalenceMember*);
# currently not required.
push @scalar_types, qw(QualCost);
# Nodes from these input files are automatically treated as nodetag_only.
# In the future we might add explicit pg_node_attr labeling to some of these
# files and remove them from this list, but for now this is the path of least
# resistance.
my @nodetag_only_files = qw(
nodes/execnodes.h
access/amapi.h
access/sdir.h
access/tableam.h
access/tsmapi.h
commands/event_trigger.h
commands/trigger.h
executor/tuptable.h
foreign/fdwapi.h
nodes/lockoptions.h
nodes/replnodes.h
nodes/supportnodes.h
);
# XXX various things we are not publishing right now to stay level
# with the manual system
push @no_read_write,
@ -134,8 +184,13 @@ push @no_read, qw(A_ArrayExpr A_Indices A_Indirection AlterStatsStmt
TriggerTransition TypeCast TypeName WindowDef WithClause XmlSerialize);
## check that we have the expected number of files on the command line
die "wrong number of input files, expected @all_input_files\n"
if ($#ARGV != $#all_input_files);
## read input
my $next_input_file = 0;
foreach my $infile (@ARGV)
{
my $in_struct;
@ -150,11 +205,17 @@ foreach my $infile (@ARGV)
my %my_field_types;
my %my_field_attrs;
# open file with name from command line, which may have a path prefix
open my $ifh, '<', $infile or die "could not open \"$infile\": $!";
# now shorten filename for use below
$infile =~ s!.*src/include/!!;
# check it against next member of @all_input_files
die "wrong input file ordering, expected @all_input_files\n"
if ($infile ne $all_input_files[$next_input_file]);
$next_input_file++;
my $raw_file_content = do { local $/; <$ifh> };
# strip C comments, preserving newlines so we can count lines correctly
@ -274,6 +335,10 @@ foreach my $infile (@ARGV)
# does in fact exist.
push @no_read_write, $in_struct;
}
elsif ($attr =~ /^nodetag_number\((\d+)\)$/)
{
$manual_nodetag_number{$in_struct} = $1;
}
else
{
die
@ -475,14 +540,31 @@ open my $nt, '>', 'nodetags.h' . $tmpext or die $!;
printf $nt $header_comment, 'nodetags.h';
my $i = 1;
my $tagno = 0;
my $last_tag = undef;
foreach my $n (@node_types, @extra_tags)
{
next if elem $n, @abstract_types;
print $nt "\tT_${n} = $i,\n";
$i++;
if (defined $manual_nodetag_number{$n})
{
# do not change $tagno or $last_tag
print $nt "\tT_${n} = $manual_nodetag_number{$n},\n";
}
else
{
$tagno++;
$last_tag = $n;
print $nt "\tT_${n} = $tagno,\n";
}
}
# verify that last auto-assigned nodetag stays stable
die "ABI stability break: last nodetag is $last_tag not $last_nodetag\n"
if (defined $last_nodetag && $last_nodetag ne $last_tag);
die
"ABI stability break: last nodetag number is $tagno not $last_nodetag_no\n"
if (defined $last_nodetag_no && $last_nodetag_no != $tagno);
close $nt;

View File

@ -66,6 +66,11 @@ typedef enum NodeTag
*
* - special_read_write: Has special treatment in outNode() and nodeRead().
*
* - nodetag_number(VALUE): assign the specified nodetag number instead of
* an auto-generated number. Typically this would only be used in stable
* branches, to give a newly-added node type a number without breaking ABI
* by changing the numbers of existing node types.
*
* Node types can be supertypes of other types whether or not they are marked
* abstract: if a node struct appears as the first field of another struct
* type, then it is the supertype of that type. The no_copy, no_equal, and

View File

@ -107,6 +107,10 @@ Starting a New Development Cycle
placeholder), "git rm" the previous one, and update release.sgml and
filelist.sgml to match.
* In the newly-made branch, change src/backend/nodes/gen_node_support.pl
to enforce ABI stability of the NodeTag list (see "ARM ABI STABILITY
CHECK HERE" therein).
* Notify the private committers email list, to ensure all committers
are aware of the new branch even if they're not paying close attention
to pgsql-hackers.