Clean up error messages relating to mistakes in .dat files: make sure they
provide the .dat file name and line number, not the place in the Perl
script that's reporting the problem. Adopt more uniform message phrasing,
too.
Make genbki.pl spit up on unrecognized field names in the input hashes.
Previously, it just silently ignored such fields, which could make a
misspelled field name into a very hard-to-decipher problem. (This is in
genbki.pl, *not* Catalog.pm, because we don't want reformat_dat_file.pl to
complain about unrecognized fields. We'd rather it silently dropped them,
to facilitate removing unwanted fields after a reorganization.)
Formerly, Catalog.pm turned a C array type declaration in the catalog
header files into a SQL type, e.g., 'foo[]'. Along the way, genbki.pl
turned this into '_foo' for the purpose of type lookups, but wrote 'foo[]'
to postgres.bki. During bootstrap, bootscanner.l had to have a special
case rule to tokenize this, and then MapArrayTypeName() would turn 'foo[]'
into '_foo' one more time.
This seems unnecessarily complicated, especially since nobody cares that
much about the readability of postgres.bki. Instead, make Catalog.pm
convert the C declaration into '_foo' to start with, and preserve that
representation of the type name throughout bootstrap data processing.
Then rip out the special-case code in bootscanner.l and bootstrap.c.
This changes postgres.bki to the extent that array fields are now
declared like
proconfig = _text ,
rather than
proconfig = text[] ,
No documentation update, since the SGML docs didn't mention any of this
in the first place, and it's all pretty transparent to writers of
catalog header files anyway.
John Naylor
Discussion: https://postgr.es/m/CAJVSVGUNao=-Q2-vAN3PYcdF5tnL5JAHwGwzZGuYHtq+Mk_9ng@mail.gmail.com
Make these scripts emit just one log message when they run, not one
per output file. The latter is way too verbose in the wake of
commit 372728b0d. The specific wording used is what already existed
in the MSVC scripts.
John Naylor
Discussion: https://postgr.es/m/11103.1523208822@sss.pgh.pa.us
Historically, the initial catalog data to be installed during bootstrap
has been written in DATA() lines in the catalog header files. This had
lots of disadvantages: the format was badly underdocumented, it was
very difficult to edit the data in any mechanized way, and due to the
lack of any abstraction the data was verbose, hard to read/understand,
and easy to get wrong.
Hence, move this data into separate ".dat" files and represent it in a way
that can easily be read and rewritten by Perl scripts. The new format is
essentially "key => value" for each column; while it's a bit repetitive,
explicit labeling of each value makes the data far more readable and less
error-prone. Provide a way to abbreviate entries by omitting field values
that match a specified default value for their column. This allows removal
of a large amount of repetitive boilerplate and also lowers the barrier to
adding new columns.
Also teach genbki.pl how to translate symbolic OID references into
numeric OIDs for more cases than just "regproc"-like pg_proc references.
It can now do that for regprocedure-like references (thus solving the
problem that regproc is ambiguous for overloaded functions), operators,
types, opfamilies, opclasses, and access methods. Use this to turn
nearly all OID cross-references in the initial data into symbolic form.
This represents a very large step forward in readability and error
resistance of the initial catalog data. It should also reduce the
difficulty of renumbering OID assignments in uncommitted patches.
Also, solve the longstanding problem that frontend code that would like to
use OID macros and other information from the catalog headers often had
difficulty with backend-only code in the headers. To do this, arrange for
all generated macros, plus such other declarations as we deem fit, to be
placed in "derived" header files that are safe for frontend inclusion.
(Once clients migrate to using these pg_*_d.h headers, it will be possible
to get rid of the pg_*_fn.h headers, which only exist to quarantine code
away from clients. That is left for follow-on patches, however.)
The now-automatically-generated macros include the Anum_xxx and Natts_xxx
constants that we used to have to update by hand when adding or removing
catalog columns.
Replace the former manual method of generating OID macros for pg_type
entries with an automatic method, ensuring that all built-in types have
OID macros. (But note that this patch does not change the way that
OID macros for pg_proc entries are built and used. It's not clear that
making that match the other catalogs would be worth extra code churn.)
Add SGML documentation explaining what the new data format is and how to
work with it.
Despite being a very large change in the catalog headers, there is no
catversion bump here, because postgres.bki and related output files
haven't changed at all.
John Naylor, based on ideas from various people; review and minor
additional coding by me; previous review by Alvaro Herrera
Discussion: https://postgr.es/m/CAJVSVGWO48JbbwXkJz_yBFyGYW-M9YWxnPdxJBUosDC9ou_F0Q@mail.gmail.com
Add the ability to label a column's default value in the catalog header,
and implement this for pg_attribute. A new function in Catalog.pm is
used to fill in a tuple with defaults. The build process will complain
loudly if a catalog entry is incomplete,
Commit 8137f2c323 labeled variable length columns for the C preprocessor.
Expose that label to genbki.pl so we can exclude those columns from schema
macros in a general fashion. Also, format schema macro entries according
to their types.
This means slightly less code maintenance, but more importantly it's a
proving ground for mechanisms intended to be used in later commits.
While at it, I (Álvaro) couldn't resist making some changes in
genbki.pl: rename some functions to actually indicate their purpose
instead of actively misleading onlookers; and don't iterate on the whole
of pg_type to find the entry for each catalog row, using a hash instead
of an array.
Author: John Naylor, some changes by Álvaro Herrera
Discussion: https://postgr.es/m/CAJVSVGVJHwD8sfDfZW9TbCHWKf=C1YDRM-rF=2JenRU_y+VcFg@mail.gmail.com
Formerly, the bootstrap backend looked up the OIDs corresponding to
names in regproc catalog entries using brute-force searches of pg_proc.
It was somewhat remarkable that that worked at all, since it was used
while populating other pretty-fundamental catalogs like pg_operator.
And it was also quite slow, and getting slower as pg_proc gets bigger.
This patch moves the lookup work into genbki.pl, so that the values in
postgres.bki for regproc columns are always numeric OIDs, an option
that regprocin() already supported. Perl isn't the world's speediest
language, so this about doubles the time needed to run genbki.pl (from
0.3 to 0.6 sec on my machine). But we only do that at most once per
build. The time needed to run initdb drops significantly --- on my
machine, initdb --no-sync goes from 1.8 to 1.3 seconds. So this is
a small net win even for just one initdb per build, and it becomes
quite a nice win for test sequences requiring many initdb runs.
Strip out the now-dead code for brute-force catalog searching in
regprocin. We'd also cargo-culted similar logic into regoperin
and some (not all) of the other reg*in functions. That is all
dead code too since we currently have no need to load such values
during bootstrap. I removed it all, reasoning that if we ever
need such functionality it'd be much better to do it in a similar
way to this patch.
There might be some simplifications possible in the backend now that
regprocin doesn't require doing catalog reads so early in bootstrap.
I've not looked into that, though.
Andreas Karlsson, with some small adjustments by me
Discussion: https://postgr.es/m/30896.1492006367@sss.pgh.pa.us
Fix all perlcritic warnings of severity level 5, except in
src/backend/utils/Gen_dummy_probes.pl, which is automatically generated.
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Bootstrap determines whether a column is null based on simple builtin
rules. Those work surprisingly well, but nonetheless a few existing
columns aren't set correctly. Additionally there is at least one patch
sent to hackers where forcing the nullness of a column would be helpful.
The boostrap format has gained FORCE [NOT] NULL for this, which will be
emitted by genbki.pl when BKI_FORCE_(NOT_)?NULL is specified for a
column in a catalog header.
This patch doesn't change the marking of any existing columns.
Discussion: 20150215170014.GE15326@awork2.anarazel.de
This reverts commit 1826987a46.
The overall design was deemed unacceptable, in discussion following the
previous commit message; we might find some parts of it still
salvageable, but I don't want to be on the hook for fixing it, so let's
wait until we have a new patch.
The previous representation using a boolean column for each attribute
would not scale as well as we want to add further attributes.
Extra auxilliary functions are added to go along with this change, to
make up for the lost convenience of access of the old representation.
Catalog version bumped due to change in catalogs and the new functions.
Author: Adam Brightwell, minor tweaks by Álvaro
Reviewed by: Stephen Frost, Andres Freund, Álvaro Herrera
Per discussion with Tom and Andrew, 64bit integers are no longer a
problem for the catalogs, so go ahead and add the mapping from the C
int64 type to the int8 SQL identification to allow using them.
Patch by Adam Brightwell
The latter was already the dominant use, and it's preferable because
in C the convention is that intXX means XX bits. Therefore, allowing
mixed use of int2, int4, int8, int16, int32 is obviously confusing.
Remove the typedefs for int2 and int4 for now. They don't seem to be
widely used outside of the PostgreSQL source tree, and the few uses
can probably be cleaned up by the time this ships.
Those fields only appear in the structs so that genbki.pl can create
the BKI bootstrap files for the catalogs. But they are not actually
usable from C. So hiding them can prevent coding mistakes, saves
stack space, and can help the compiler.
In certain catalogs, the first variable-length field has been kept
visible after manual inspection. These exceptions are noted in C
comments.
reviewed by Tom Lane
files when they haven't changed. This confuses make because the build fails
to update the file timestamps, and so it keeps on doing the action over again.
pg_attribute, by having genbki.pl derive the information from the various
catalog header files. This greatly simplifies modification of the
"bootstrapped" catalogs.
This patch finally kills genbki.sh and Gen_fmgrtab.sh; we now rely entirely on
Perl scripts for those build steps. To avoid creating a Perl build dependency
where there was not one before, the output files generated by these scripts
are now treated as distprep targets, ie, they will be built and shipped in
tarballs. But you will need a reasonably modern Perl (probably at least
5.6) if you want to build from a CVS pull.
The changes to the MSVC build process are untested, and may well break ---
we'll soon find out from the buildfarm.
John Naylor, based on ideas from Robert Haas and others