/*-------------------------------------------------------------------------
 *
 * jsonapi.h
 *    Declarations for JSON API support.
 *
 * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/utils/jsonapi.h
 *
 *-------------------------------------------------------------------------
 */
#ifndef JSONAPI_H
#define JSONAPI_H

#include "jsonb.h"
#include "lib/stringinfo.h"

typedef enum
{
    JSON_TOKEN_INVALID,
    JSON_TOKEN_STRING,
    JSON_TOKEN_NUMBER,
    JSON_TOKEN_OBJECT_START,
    JSON_TOKEN_OBJECT_END,
    JSON_TOKEN_ARRAY_START,
    JSON_TOKEN_ARRAY_END,
    JSON_TOKEN_COMMA,
    JSON_TOKEN_COLON,
    JSON_TOKEN_TRUE,
    JSON_TOKEN_FALSE,
    JSON_TOKEN_NULL,
    JSON_TOKEN_END
} JsonTokenType;

/*
 * All the fields in this structure should be treated as read-only.
 *
 * If strval is not null, then it should contain the de-escaped value
 * of the lexeme if it's a string.  Otherwise most of these field names
 * should be self-explanatory.
 *
 * line_number and line_start are principally for use by the parser's
 * error reporting routines.
 * token_terminator and prev_token_terminator point to the character
 * AFTER the end of the token, i.e. where there would be a nul byte
 * if we were using nul-terminated strings.
 */
typedef struct JsonLexContext
{
    char       *input;
    int         input_length;
    char       *token_start;
    char       *token_terminator;
    char       *prev_token_terminator;
    JsonTokenType token_type;
    int         lex_level;
    int         line_number;
    char       *line_start;
    StringInfo  strval;
} JsonLexContext;
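
/*
 * Usage sketch: given the pointer semantics described above, the raw text of
 * the current lexeme can be recovered from token_start and token_terminator,
 * since the latter points just past the token's last byte.  The variable
 * names below are purely illustrative.
 *
 *     int     tok_len = lex->token_terminator - lex->token_start;
 *     char   *tok = pnstrdup(lex->token_start, tok_len);
 */
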
typedef void (*json_struct_action) (void *state);
typedef void (*json_ofield_action) (void *state, char *fname, bool isnull);
typedef void (*json_aelem_action) (void *state, bool isnull);
typedef void (*json_scalar_action) (void *state, char *token, JsonTokenType tokentype);

/*
 * Semantic Action structure for use in parsing json.
 * Any of these actions can be NULL, in which case nothing is done at that
 * point.  Likewise, semstate can be NULL.  Using an all-NULL structure
 * amounts to doing a pure parse with no side effects, and is therefore
 * exactly what the json input routines do.
 *
 * The 'fname' and 'token' strings passed to these actions are palloc'd.
 * They are not free'd or used further by the parser, so the action function
 * is free to do what it wishes with them.
 */
typedef struct JsonSemAction
{
    void       *semstate;
    json_struct_action object_start;
    json_struct_action object_end;
    json_struct_action array_start;
    json_struct_action array_end;
    json_ofield_action object_field_start;
    json_ofield_action object_field_end;
    json_aelem_action array_element_start;
    json_aelem_action array_element_end;
    json_scalar_action scalar;
} JsonSemAction;
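
/*
 * Usage sketch: a semantic action set that merely counts scalar tokens and
 * leaves every other callback NULL.  The struct, variable, and function
 * names below are purely illustrative.
 *
 *     typedef struct CountState { int nscalars; } CountState;
 *
 *     static void
 *     count_scalar(void *state, char *token, JsonTokenType tokentype)
 *     {
 *         ((CountState *) state)->nscalars++;
 *     }
 *
 * and then, in the calling code:
 *
 *     CountState  cstate = {0};
 *     JsonSemAction sem = {0};
 *
 *     sem.semstate = &cstate;
 *     sem.scalar = count_scalar;
 */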

/*
 * pg_parse_json will parse the string in the lex context, calling the
 * action functions in sem at the appropriate points.  It is up to them
 * to keep what state they need in semstate.  If they need access to the
 * state of the lexer, then its pointer should be passed to them as a
 * member of whatever semstate points to.  If the action pointers are
 * NULL, the parser does nothing at those points and just continues.
 */
extern void pg_parse_json(JsonLexContext *lex, JsonSemAction *sem);
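
/*
 * Usage sketch: a pure validating parse, using an all-NULL action structure
 * as the json input routines do.  "json_text" stands in for a text value
 * obtained elsewhere.
 *
 *     JsonLexContext *lex = makeJsonLexContext(json_text, false);
 *     JsonSemAction nullSem = {0};
 *
 *     pg_parse_json(lex, &nullSem);
 */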

/*
 * json_count_array_elements performs a fast secondary parse to determine
 * the number of elements in the passed array lex context.  It should be
 * called from an array_start action.
 */
extern int json_count_array_elements(JsonLexContext *lex);
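
/*
 * Usage sketch: an array_start action that captures the element count up
 * front.  The struct and function names are illustrative, and the lexer
 * pointer is assumed to have been stashed in the semstate beforehand.
 *
 *     typedef struct MyState { JsonLexContext *lex; int nelems; } MyState;
 *
 *     static void
 *     my_array_start(void *state)
 *     {
 *         MyState *ms = (MyState *) state;
 *
 *         ms->nelems = json_count_array_elements(ms->lex);
 *     }
 */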

/*
 * Constructors for JsonLexContext, with or without a strval element.
 * If supplied, the strval element will contain a de-escaped version of
 * the lexeme.  However, doing this imposes a performance penalty, so
 * it should be avoided if the de-escaped lexeme is not required.
 *
 * If you already have the json as a text* value, use the first of these
 * functions; otherwise use makeJsonLexContextCstringLen().
 */
extern JsonLexContext *makeJsonLexContext(text *json, bool need_escapes);
extern JsonLexContext *makeJsonLexContextCstringLen(char *json,
                                                    int len,
                                                    bool need_escapes);
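
/*
 * Usage sketch: building a lex context from a plain C string; "str" stands
 * in for a nul-terminated buffer.  Pass true for need_escapes only if the
 * de-escaped lexeme is actually needed.
 *
 *     JsonLexContext *lex =
 *         makeJsonLexContextCstringLen(str, strlen(str), true);
 */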

/*
 * Utility function to check if a string is a valid JSON number.
 *
 * The str argument does not need to be nul-terminated.
 */
extern bool IsValidJsonNumber(const char *str, int len);
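
/*
 * Usage sketch: because str need not be nul-terminated, a length is always
 * passed explicitly.  The variable names are illustrative.
 *
 *     const char *num = "-12.5e3";
 *     bool        ok = IsValidJsonNumber(num, strlen(num));
 */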

/*
 * Flag types for iterate_json(b)_values to specify what elements from a
 * json(b) document we want to iterate.
 */
typedef enum JsonToIndex
{
    jtiKey = 0x01,
    jtiString = 0x02,
    jtiNumeric = 0x04,
    jtiBool = 0x08,
    jtiAll = jtiKey | jtiString | jtiNumeric | jtiBool
} JsonToIndex;

/* an action that will be applied to each value in iterate_json(b)_values functions */
typedef void (*JsonIterateStringValuesAction) (void *state, char *elem_value, int elem_len);

/* an action that will be applied to each value in transform_json(b)_values functions */
typedef text *(*JsonTransformStringValuesAction) (void *state, char *elem_value, int elem_len);

extern uint32 parse_jsonb_index_flags(Jsonb *jb);
extern void iterate_jsonb_values(Jsonb *jb, uint32 flags, void *state,
                                 JsonIterateStringValuesAction action);
extern void iterate_json_values(text *json, uint32 flags, void *action_state,
                                JsonIterateStringValuesAction action);
extern Jsonb *transform_jsonb_string_values(Jsonb *jsonb, void *action_state,
                                            JsonTransformStringValuesAction transform_action);
extern text *transform_json_string_values(text *json, void *action_state,
                                          JsonTransformStringValuesAction transform_action);
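
/*
 * Usage sketch: collecting all string values from a jsonb datum into a
 * StringInfo.  "jb" stands in for a Jsonb pointer obtained elsewhere; the
 * action and state names are illustrative.
 *
 *     static void
 *     collect_value(void *state, char *elem_value, int elem_len)
 *     {
 *         appendBinaryStringInfo((StringInfo) state, elem_value, elem_len);
 *     }
 *
 * and then, in the calling code:
 *
 *     StringInfoData buf;
 *
 *     initStringInfo(&buf);
 *     iterate_jsonb_values(jb, jtiString, &buf, collect_value);
 */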

extern char *JsonEncodeDateTime(char *buf, Datum value, Oid typid,
                                const int *tzp);

#endif /* JSONAPI_H */