Fix lquery parsing to handle repeated flag characters correctly,
and to enforce the max label length correctly in some cases where
it did not before, and to detect empty labels in some cases where
it did not before.
In a more cosmetic vein, use a switch rather than if-then chains to
handle the different states, and avoid unnecessary checks on charlen
when looking for ASCII characters, and factor out multiple copies of
the label length checking code.
Tom Lane and Dmitry Belyavsky
Discussion: https://postgr.es/m/CADqLbzLVkBuPX0812o+z=c3i6honszsZZ6VQOSKR3VPbB56P3w@mail.gmail.com
Not much to say here --- does what it says on the tin. The "binary"
representation in each case is really just the same as the text format,
though we prefix a version-number byte in case anyone ever feels
motivated to change that. Thus, there's not any expectation of improved
speed or reduced space; the point here is just to allow clients to use
binary format for all columns of a query result or COPY data.
This makes use of the recently added ALTER TYPE support to add binary
I/O functions to an existing data type. As in commit a80818605,
we can piggy-back on there already being a new-for-v13 version of the
ltree extension, so we don't need a new update script file.
Nino Floris, reviewed by Alexander Korotkov and myself
Discussion: https://postgr.es/m/CANmj9Vxx50jOo1L7iSRxd142NyTz6Bdcgg7u9P3Z8o0=HGkYyQ@mail.gmail.com
The existing implementation of the ltree ~ lquery match operator is
sufficiently complex and undocumented that it's hard to tell exactly
what it does. But one thing it clearly gets wrong is the combination
of NOT symbols (!) and '*' symbols. A pattern such as '*.!foo.*'
should, by any ordinary understanding of regular expression behavior,
match any ltree that has at least one label that's not "foo". As best
we can tell by experimentation, what it's actually matching is any
ltree in which *no* label is "foo". That's surprising, and not at all
what the documentation says.
Now, that's arguably a useful behavior, so if we rewrite to fix the
bug we should provide some other way to get it. To do so, add the
ability to attach lquery quantifiers to non-'*' items as well as '*'s.
Then the pattern '!foo{,}' expresses "any ltree in which no label is
foo". For backwards compatibility, the default quantifier for non-'*'
items has to be "{1}", although the default for '*' items is '{,}'.
I wouldn't have done it like that in a green field, but it's not
totally horrible.
Armed with that, rewrite checkCond() from scratch. Treating '*' and
non-'*' items alike makes it simpler, not more complicated, so that
the function actually gets a lot shorter than it was.
Filip Rembiałkowski, Tom Lane, Nikita Glukhov, per a very
ancient bug report from M. Palm
Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
Ensure that the type name is mentioned in all cases (bare "syntax error"
isn't that helpful). Avoid using the term "level", since that's not
used in the documentation. Phrase error position reports as "at
character N" not "in position N"; the latter seems ambiguous, and it's
certainly not how we say it elsewhere. For the same reason, make the
character position values be 1-based not 0-based. Provide a position
in more cases. (I continued to leave that out of messages complaining
about end-of-input, where it seemed pointless, as well as messages
complaining about overall input complexity, where fingering any one part
of the input would be arbitrary.)
Discussion: https://postgr.es/m/15582.1585529626@sss.pgh.pa.us
Add more comments in ltree.h, and correct a misstatement or two.
Use a symbol, rather than hardwired constants, for the maximum length
of an ltree label. The max length is still hardwired in the associated
error messages, but I want to clean that up as part of a separate patch
to improve the error messages.
These uint16 fields could be overflowed by excessively long input,
producing strange results. Complain for invalid input.
Likewise check for out-of-range values of the repeat counts in lquery.
(We don't try too hard on that one, notably not bothering to detect
if atoi's result has overflowed.)
Also detect length overflow in ltree_concat.
In passing, be more consistent about whether "syntax error" messages
include the type name. Also, clarify the documentation about what
the size limit is.
This has been broken for a long time, so back-patch to all supported
branches.
Nikita Glukhov, reviewed by Benjie Gillam and Tomas Vondra
Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
The basic rule we follow here is to always first include 'postgres.h' or
'postgres_fe.h' whichever is applicable, then system header includes and
then Postgres header includes. In this, we also follow that all the
Postgres header includes are in order based on their ASCII value. We
generally follow these rules, but the code has deviated in many places.
This commit makes it consistent just for contrib modules. The later
commits will enforce similar rules in other parts of code.
Author: Vignesh C
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
By project convention, these names should include "P" when dealing with a
pointer type; that is, if the result of a GETARG macro is of type FOO *,
it should be called PG_GETARG_FOO_P not just PG_GETARG_FOO. Some newer
types such as JSONB and ranges had not followed the convention, and a
number of contrib modules hadn't gotten that memo either. Rename the
offending macros to improve consistency.
In passing, fix a few places that thought PG_DETOAST_DATUM() returns
a Datum; it does not, it returns "struct varlena *". Applying
DatumGetPointer to that happens not to cause any bad effects today,
but it's formally wrong. Also, adjust an ltree macro that was designed
without any thought for what pgindent would do with it.
This is all cosmetic and shouldn't have any impact on generated code.
Mark Dilger, some further tweaks by me
Discussion: https://postgr.es/m/EA5676F4-766F-4F38-8348-ECC7DB427C6A@gmail.com
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
Because of gcc -Wmissing-prototypes, all functions in dynamically
loadable modules must have a separate prototype declaration. This is
meant to detect global functions that are not declared in header files,
but in cases where the function is called via dfmgr, this is redundant.
Besides filling up space with boilerplate, this is a frequent source of
compiler warnings in extension modules.
We can fix that by creating the function prototype as part of the
PG_FUNCTION_INFO_V1 macro, which such modules have to use anyway. That
makes the code of modules cleaner, because there is one less place where
the entry points have to be listed, and creates an additional check that
functions have the right prototype.
Remove now redundant prototypes from contrib and other modules.
Several functions, mostly type input functions, calculated an allocation
size such that the calculation wrapped to a small positive value when
arguments implied a sufficiently-large requirement. Writes past the end
of the inadvertent small allocation followed shortly thereafter.
Coverity identified the path_in() vulnerability; code inspection led to
the rest. In passing, add check_stack_depth() to prevent stack overflow
in related functions.
Back-patch to 8.4 (all supported versions). The non-comment hstore
changes touch code that did not exist in 8.4, so that part stops at 9.0.
Noah Misch and Heikki Linnakangas, reviewed by Tom Lane.
Security: CVE-2014-0064
unnecessary #include lines in it. Also, move some tuple routine prototypes and
macros to htup.h, which allows removal of heapam.h inclusion from some .c
files.
For this to work, a new header file access/sysattr.h needed to be created,
initially containing attribute numbers of system columns, for pg_dump usage.
While at it, make contrib ltree, intarray and hstore header files more
consistent with our header style.
results to contain uninitialized, unpredictable values. While this was okay
as far as the datatypes themselves were concerned, it's a problem for the
parser because occurrences of the "same" literal might not be recognized as
equal by datumIsEqual (and hence not by equal()). It seems sufficient to fix
this in the input functions since the only critical use of equal() is in the
parser's comparisons of ORDER BY and DISTINCT expressions.
Per a trouble report from Marc Cousin.
Patch all the way back. Interestingly, array_in did not have the bug before
8.2, which may explain why the issue went unnoticed for so long.
ways. I'm not totally sure that I caught everything, but at least now they pass
their regression tests with VARSIZE/SET_VARSIZE defined to reverse byte order.
return true for exactly the characters treated as whitespace by their flex
scanners. Per report from Victor Snezhko and subsequent investigation.
Also fix a passel of unsafe usages of <ctype.h> functions, that is, ye olde
char-vs-unsigned-char issue. I won't miss <ctype.h> when we are finally
able to stop using it.
more compliant with the error message style guide. In particular,
errdetail should begin with a capital letter and end with a period,
whereas errmsg should not. I also fixed a few related issues in
passing, such as fixing the repeated misspelling of "lexeme" in
contrib/tsearch2 (per Tom's suggestion).
Christopher Kings-Lynne wrote:
> I'm still getting ltree failures on 64bit freebsd:
>
> sed 's,MODULE_PATHNAME,$libdir/ltree,g' ltree.sql.in >ltree.sql
> gcc -pipe -O -g -Wall -Wmissing-prototypes -Wmissing-declarations -fpic -DPI
> C -DLOWER_NODE -I. -I../../src/include -c -o ltree_io.o ltree_io.c -MMD
> ltree_io.c: In function `ltree_in':
> ltree_io.c:57: warning: int format, different type arg (arg 3)
> ltree_io.c:63: warning: int format, different type arg (arg 4)
> ltree_io.c:68: warning: int format, different type arg (arg 3)
Teodor Sigaev