postgresql

Commit Graph

Author	SHA1	Message	Date
Peter Eisentraut	7a32ac8a66	Add procedure support to pg_get_functiondef This also makes procedures work in psql's \ef and \sf commands. Reported-by: Pavel Stehule <pavel.stehule@gmail.com>	2018-02-13 15:13:44 -05:00
Alvaro Herrera	8237f27b50	get_relid_attribute_name is dead, long live get_attname The modern way is to use a missing_ok argument instead of two separate almost-identical routines, so do that. Author: Michaël Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/20180201063212.GE6398@paquier.xyz	2018-02-12 19:33:15 -03:00
Tom Lane	0a459cec96	Support all SQL:2011 options for window frame clauses. This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com	2018-02-07 00:06:56 -05:00
Robert Haas	9fafa413ac	Avoid valgrind complaint about write() of uninitalized bytes. LogicalTapeFreeze() may write out its first block when it is dirty but not full, and then immediately read the first block back in from its BufFile as a BLCKSZ-width block. This can only occur in rare cases where very few tuples were written out, which is currently only possible with parallel external tuplesorts. To avoid valgrind complaints, tell it to treat the tail of logtape.c's buffer as defined. Commit `9da0cc3528` exposed this problem but did not create it. LogicalTapeFreeze() has always tended to write out some amount of garbage bytes, but previously never wrote less than one block of data in total, so the problem was masked. Per buildfarm members lousyjack and skink. Peter Geoghegan, based on a suggestion from Tom Lane and me. Some comment revisions by me.	2018-02-06 14:24:57 -05:00
Tom Lane	3492a0af0b	Fix RelationBuildPartitionKey's processing of partition key expressions. Failure to advance the list pointer while reading partition expressions from a list results in invoking an input function with inappropriate data, possibly leading to crashes or, with carefully crafted input, disclosure of arbitrary backend memory. Bug discovered independently by Álvaro Herrera and David Rowley. This patch is by Álvaro but owes something to David's proposed fix. Back-patch to v10 where the issue was introduced. Security: CVE-2018-1052	2018-02-05 10:37:30 -05:00
Robert Haas	9da0cc3528	Support parallel btree index builds. To make this work, tuplesort.c and logtape.c must also support parallelism, so this patch adds that infrastructure and then applies it to the particular case of parallel btree index builds. Testing to date shows that this can often be 2-3x faster than a serial index build. The model for deciding how many workers to use is fairly primitive at present, but it's better than not having the feature. We can refine it as we get more experience. Peter Geoghegan with some help from Rushabh Lathia. While Heikki Linnakangas is not an author of this patch, he wrote other patches without which this feature would not have been possible, and therefore the release notes should possibly credit him as an author of this feature. Reviewed by Claudio Freire, Heikki Linnakangas, Thomas Munro, Tels, Amit Kapila, me. Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com	2018-02-02 13:32:44 -05:00
Peter Eisentraut	a044378ce2	Add some noreturn attributes to help static analyzers	2018-01-29 20:44:35 -05:00
Tom Lane	97d4445a03	Save a few bytes by removing useless last argument to SearchCatCacheList. There's never any value in giving a fully specified cache key to SearchCatCacheList: you might as well call SearchCatCache instead, since there could be only one match. So the maximum useful number of key arguments is one less than the supported number of key columns. We might as well remove the useless extra argument and save some few bytes per call site, as well as a cycle or so per call. I believe the reason it was coded like this is that originally, callers had to write out all the dummy arguments in each call, and so it seemed less confusing if SearchCatCache and SearchCatCacheList took the same number of key arguments. But since commit `e26c539e9`, callers only write their live arguments explicitly, making that a non-factor; and there's surely been enough time for third-party modules to adapt to that coding style. So this is only an ABI break not an API break for callers. Per discussion with Oliver Ford, this might also make it less confusing how to use SearchCatCacheList correctly. Discussion: https://postgr.es/m/27788.1517069693@sss.pgh.pa.us	2018-01-29 15:13:17 -05:00
Peter Eisentraut	c1869542b3	Use abstracted SSL API in server connection log messages The existing "connection authorized" server log messages used OpenSSL API calls directly, even though similar abstracted API calls exist. Change to use the latter instead. Change the function prototype for the functions that return the TLS version and the cipher to return const char * directly instead of copying into a buffer. That makes them slightly easier to use. Add bits= to the message. psql shows that, so we might as well show the same information on the client and server. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2018-01-26 09:50:46 -05:00
Peter Eisentraut	0b5e33f667	Remove use of byte-masking macros in record_image_cmp These were introduced in `4cbb646334`, but after further analysis and testing, they should not be necessary and probably weren't the part of that commit that fixed anything. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2018-01-25 09:41:19 -05:00
Peter Eisentraut	7404e77cc1	Split out documentation of SSL parameters into their own section Split the "Authentication and Security" section into two separate sections "Authentication" and "SSL". The latter part has gotten much longer over time, and doesn't primarily have to do with authentication. Also, the row_security parameter was inconsistently categorized, so clean that up while we're here.	2018-01-23 07:11:38 -05:00
Peter Eisentraut	8561e4840c	Transaction control in PL procedures In each of the supplied procedural languages (PL/pgSQL, PL/Perl, PL/Python, PL/Tcl), add language-specific commit and rollback functions/commands to control transactions in procedures in that language. Add similar underlying functions to SPI. Some additional cleanup so that transaction commit or abort doesn't blow away data structures still used by the procedure call. Add execution context tracking to CALL and DO statements so that transaction control commands can only be issued in top-level procedure and block calls, not function calls or other procedure or block calls. - SPI Add a new function SPI_connect_ext() that is like SPI_connect() but allows passing option flags. The only option flag right now is SPI_OPT_NONATOMIC. A nonatomic SPI connection can execute transaction control commands, otherwise it's not allowed. This is meant to be passed down from CALL and DO statements which themselves know in which context they are called. A nonatomic SPI connection uses different memory management. A normal SPI connection allocates its memory in TopTransactionContext. For nonatomic connections we use PortalContext instead. As the comment in SPI_connect_ext() (previously SPI_connect()) indicates, one could potentially use PortalContext in all cases, but it seems safest to leave the existing uses alone, because this stuff is complicated enough already. SPI also gets new functions SPI_start_transaction(), SPI_commit(), and SPI_rollback(), which can be used by PLs to implement their transaction control logic. - portalmem.c Some adjustments were made in the code that cleans up portals at transaction abort. The portal code could already handle a command committing a transaction and continuing (e.g., VACUUM), but it was not quite prepared for a command aborting a transaction and continuing. In AtAbort_Portals(), remove the code that marks an active portal as failed. As the comment there already predicted, this doesn't work if the running command wants to keep running after transaction abort. And it's actually not necessary, because pquery.c is careful to run all portal code in a PG_TRY block and explicitly runs MarkPortalFailed() if there is an exception. So the code in AtAbort_Portals() is never used anyway. In AtAbort_Portals() and AtCleanup_Portals(), we need to be careful not to clean up active portals too much. This mirrors similar code in PreCommit_Portals(). - PL/Perl Gets new functions spi_commit() and spi_rollback() - PL/pgSQL Gets new commands COMMIT and ROLLBACK. Update the PL/SQL porting example in the documentation to reflect that transactions are now possible in procedures. - PL/Python Gets new functions plpy.commit and plpy.rollback. - PL/Tcl Gets new commands commit and rollback. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>	2018-01-22 08:43:06 -05:00
Magnus Hagander	1cc4f536ef	Support huge pages on Windows Add support for huge pages (called large pages on Windows) to the Windows build. This (probably) breaks compatibility with Windows versions prior to Windows 2003 or Windows Vista. Authors: Takayuki Tsunakawa and Thomas Munro Reviewed by: Magnus Hagander, Amit Kapila	2018-01-21 15:40:46 +01:00
Peter Eisentraut	8b9e9644dc	Replace AclObjectKind with ObjectType AclObjectKind was basically just another enumeration for object types, and we already have a preferred one for that. It's only used in aclcheck_error. By using ObjectType instead, we can also give some more precise error messages, for example "index" instead of "relation". Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2018-01-19 14:01:15 -05:00
Peter Eisentraut	2c6f37ed62	Replace GrantObjectType with ObjectType There used to be a lot of different Type and Kind symbol groups to address objects within different commands, most of which have been replaced by ObjectType, starting with `b256f24264`. But this conversion was never done for the ACL commands until now. This change ends up being just a plain replacement of the types and symbols, without any code restructuring needed, except deleting some now redundant code. Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Stephen Frost <sfrost@snowman.net>	2018-01-19 14:01:14 -05:00
Alvaro Herrera	8b08f7d482	Local partitioned indexes When CREATE INDEX is run on a partitioned table, create catalog entries for an index on the partitioned table (which is just a placeholder since the table proper has no data of its own), and recurse to create actual indexes on the existing partitions; create them in future partitions also. As a convenience gadget, if the new index definition matches some existing index in partitions, these are picked up and used instead of creating new ones. Whichever way these indexes come about, they become attached to the index on the parent table and are dropped alongside it, and cannot be dropped on isolation unless they are detached first. To support pg_dump'ing these indexes, add commands CREATE INDEX ON ONLY <table> (which creates the index on the parent partitioned table, without recursing) and ALTER INDEX ATTACH PARTITION (which is used after the indexes have been created individually on each partition, to attach them to the parent index). These reconstruct prior database state exactly. Reviewed-by: (in alphabetical order) Peter Eisentraut, Robert Haas, Amit Langote, Jesper Pedersen, Simon Riggs, David Rowley Discussion: https://postgr.es/m/20171113170646.gzweigyrgg6pwsg4@alvherre.pgsql	2018-01-19 11:49:22 -03:00
Andrew Dunstan	cc4feded0a	Centralize json and jsonb handling of datetime types The creates a single function JsonEncodeDateTime which will format these data types in an efficient and consistent manner. This will be all the more important when we come to jsonpath so we don't have to implement yet more code doing the same thing in two more places. This also extends the code to handle time and timetz types which were not previously handled specially. This requires exposing the time2tm and timetz2tm functions. Patch from Nikita Glukhov	2018-01-16 19:07:13 -05:00
Peter Eisentraut	d91da5eced	Remove useless use of bit-masking macros In this case, the macros SET_8_BYTES(), GET_8_BYTES(), SET_4_BYTES(), GET_4_BYTES() are no-ops, so we can just remove them. The plan is to perhaps remove them from the source code altogether, so we'll start here. Discussion: https://www.postgresql.org/message-id/5d51721a-69ef-2053-9172-599b539f0628@2ndquadrant.com	2018-01-16 17:12:16 -05:00
Peter Eisentraut	9e945f8626	Fix Latin spelling "c.f." should be "cf.".	2018-01-11 08:32:01 -05:00
Peter Eisentraut	acc67ffd0a	Give more accurate error message for dropping pinned portal The previous code gave the same error message for attempting to drop pinned and active portals, but those are separate states, so give separate error messages.	2018-01-10 09:22:07 -05:00
Andrew Dunstan	11b623dd0a	Implement TZH and TZM timestamp format patterns These are compatible with Oracle and required for the datetime template language for jsonpath in an upcoming patch. Nikita Glukhov and Andrew Dunstan, reviewed by Pavel Stehule.	2018-01-09 14:25:05 -05:00
Peter Eisentraut	0f7c49e855	Update portal-related memory context names and API Rename PortalMemory to TopPortalContext, to avoid confusion with PortalContext and align naming with similar top-level memory contexts. Rename PortalData's "heap" field to portalContext. The "heap" naming seems quite antiquated and confusing. Also get rid of the PortalGetHeapMemory() macro and access the field directly, which we do for other portal fields, so this abstraction doesn't buy anything. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-01-09 13:47:56 -05:00
Alvaro Herrera	bab2969867	Fix typo Author: Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/d8jefpk4jtd.fsf@dalvik.ping.uio.no	2018-01-03 19:12:06 -03:00
Bruce Momjian	9d4649ca49	Update copyright for 2018 Backpatch-through: certain files through 9.3	2018-01-02 23:30:12 -05:00
Alvaro Herrera	be2343221f	Protect against hypothetical memory leaks in RelationGetPartitionKey Also, fix a comment that commit `8a0596cb65` made obsolete. Reported-by: Robert Haas Discussion: http://postgr.es/m/CA+TgmoYbpuUUUp2GhYNwWm0qkah39spiU7uOiNXLz20ASfKYoA@mail.gmail.com	2017-12-27 18:06:14 -03:00
Teodor Sigaev	ff963b393c	Add polygon opclass for SP-GiST Polygon opclass uses compress method feature of SP-GiST added earlier. For now it's a single operator class which uses this feature. SP-GiST actually indexes a bounding boxes of input polygons, so part of supported operations are lossy. Opclass uses most methods of corresponding opclass over boxes of SP-GiST and treats bounding boxes as point in 4D-space. Bump catalog version. Authors: Nikita Glukhov, Alexander Korotkov with minor editorization by me Reviewed-By: all authors + Darafei Praliaskouski Discussion: https://www.postgresql.org/message-id/flat/54907069.1030506@sigaev.ru	2017-12-25 18:59:38 +03:00
Alvaro Herrera	9373baa0f7	Minor edits to catalog files and scripts This fixes a few typos and small mistakes; it also cleans a few minor stylistic issues. The biggest functional change is that Gen_fmgrtab.pl no longer knows the OID of language 'internal'. Author: John Naylor Discussion: https://postgr.es/m/CAJVSVGXAkwbk-A9QHHHf00N905kKisyQbaYwKqaRpze_gPXGfg@mail.gmail.com	2017-12-21 19:07:32 -03:00
Alvaro Herrera	8a0596cb65	Get rid of copy_partition_key That function currently exists to avoid leaking memory in CacheMemoryContext in case of trouble while the partition key is being built, but there's a better way: allocate everything in a memcxt that goes away if the current (sub)transaction fails, and once the partition key is built and no further errors can occur, make the memcxt permanent by making it a child of CacheMemoryContext. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/20171027172730.eh2domlkpn4ja62m@alvherre.pgsql	2017-12-21 14:21:39 -03:00
Tom Lane	c98c35cd08	Avoid putting build-location-dependent strings into generated files. Various Perl scripts we use to generate files were in the habit of printing things like "generated by $0" into their output files. That looks like a fine idea at first glance, but it results in non-reproducible output, because in VPATH builds $0 won't be just the name of the script file, but a full path for it. We'd prefer that you get identical results whether using VPATH or not, so this is a bad thing. Some of these places also printed their input file name(s), causing an additional hazard of the same type. Hence, establish a policy that thou shalt not print $0, nor input file pathnames, into output files (they're still allowed in error messages, though). Instead just write the script name verbatim. While we are at it, we can make these annotations more useful by giving the script's full relative path name within the PG source tree, eg instead of Gen_fmgrtab.pl let's print src/backend/utils/Gen_fmgrtab.pl. Not all of the changes made here actually affect any files shipped in finished tarballs today, but it seems best to apply the policy everyplace so that nobody copies unsafe code into places where it could matter. Christoph Berg and Tom Lane Discussion: https://postgr.es/m/20171215102223.GB31812@msg.df7cb.de	2017-12-21 10:57:06 -05:00
Andres Freund	1804284042	Add parallel-aware hash joins. Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel Hash Join with Parallel Hash. While hash joins could already appear in parallel queries, they were previously always parallel-oblivious and had a partial subplan only on the outer side, meaning that the work of the inner subplan was duplicated in every worker. After this commit, the planner will consider using a partial subplan on the inner side too, using the Parallel Hash node to divide the work over the available CPU cores and combine its results in shared memory. If the join needs to be split into multiple batches in order to respect work_mem, then workers process different batches as much as possible and then work together on the remaining batches. The advantages of a parallel-aware hash join over a parallel-oblivious hash join used in a parallel query are that it: * avoids wasting memory on duplicated hash tables * avoids wasting disk space on duplicated batch files * divides the work of building the hash table over the CPUs One disadvantage is that there is some communication between the participating CPUs which might outweigh the benefits of parallelism in the case of small hash tables. This is avoided by the planner's existing reluctance to supply partial plans for small scans, but it may be necessary to estimate synchronization costs in future if that situation changes. Another is that outer batch 0 must be written to disk if multiple batches are required. A potential future advantage of parallel-aware hash joins is that right and full outer joins could be supported, since there is a single set of matched bits for each hashtable, but that is not yet implemented. A new GUC enable_parallel_hash is defined to control the feature, defaulting to on. Author: Thomas Munro Reviewed-By: Andres Freund, Robert Haas Tested-By: Rafia Sabih, Prabhat Sahu Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com	2017-12-21 00:43:41 -08:00
Andres Freund	ab9e0e718a	Add shared tuplestores. SharedTuplestore allows multiple participants to write into it and then read the tuples back from it in parallel. Each reader receives partial results. For now it always uses disk files, but other buffering policies and other kinds of scans (ie each reader receives complete results) may be useful in future. The upcoming parallel hash join feature will use this facility. Author: Thomas Munro Reviewed-By: Peter Geoghegan, Andres Freund, Robert Haas Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com	2017-12-18 14:23:19 -08:00
Magnus Hagander	7731c32087	Fix typo on comment Author: David Rowley	2017-12-18 11:24:55 +01:00
Tom Lane	b31a9d7dd3	Suppress compiler warning about no function return value. Compilers that don't know that ereport(ERROR) doesn't return complained about the new coding in scanint8() introduced by commit `101c7ee3e`. Tweak coding to avoid the warning. Per buildfarm.	2017-12-17 00:41:41 -05:00
Andres Freund	9c2f0a6c3c	Fix pruning of locked and updated tuples. Previously it was possible that a tuple was not pruned during vacuum, even though its update xmax (i.e. the updating xid in a multixact with both key share lockers and an updater) was below the cutoff horizon. As the freezing code assumed, rightly so, that that's not supposed to happen, xmax would be preserved (as a member of a new multixact or xmax directly). That causes two problems: For one the tuple is below the xmin horizon, which can cause problems if the clog is truncated or once there's an xid wraparound. The bigger problem is that that will break HOT chains, which in turn can lead two to breakages: First, failing index lookups, which in turn can e.g lead to constraints being violated. Second, future hot prunes / vacuums can end up making invisible tuples visible again. There's other harmful scenarios. Fix the problem by recognizing that tuples can be DEAD instead of RECENTLY_DEAD, even if the multixactid has alive members, if the update_xid is below the xmin horizon. That's safe because newer versions of the tuple will contain the locking xids. A followup commit will harden the code somewhat against future similar bugs and already corrupted data. Author: Andres Freund, with changes by Alvaro Herrera Reported-By: Daniel Wood Analyzed-By: Andres Freund, Alvaro Herrera, Robert Haas, Peter Geoghegan, Daniel Wood, Yi Wen Wong, Michael Paquier Reviewed-By: Alvaro Herrera, Robert Haas, Michael Paquier Discussion: https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-	2017-12-14 18:20:47 -08:00
Tom Lane	9fa6f00b13	Rethink MemoryContext creation to improve performance. This patch makes a number of interrelated changes to reduce the overhead involved in creating/deleting memory contexts. The key ideas are: * Include the AllocSetContext header of an aset.c context in its first malloc request, rather than allocating it separately in TopMemoryContext. This means that we now always create an initial or "keeper" block in an aset, even if it never receives any allocation requests. * Create freelists in which we can save and recycle recently-destroyed asets (this idea is due to Robert Haas). * In the common case where the name of a context is a constant string, just store a pointer to it in the context header, rather than copying the string. The first change eliminates a palloc/pfree cycle per context, and also avoids bloat in TopMemoryContext, at the price that creating a context now involves a malloc/free cycle even if the context never receives any allocations. That would be a loser for some common usage patterns, but recycling short-lived contexts via the freelist eliminates that pain. Avoiding copying constant strings not only saves strlen() and strcpy() overhead, but is an essential part of the freelist optimization because it makes the context header size constant. Currently we make no attempt to use the freelist for contexts with non-constant names. (Perhaps someday we'll need to think harder about that, but in current usage, most contexts with custom names are long-lived anyway.) The freelist management in this initial commit is pretty simplistic, and we might want to refine it later --- but in common workloads that will never matter because the freelists will never get full anyway. To create a context with a non-constant name, one is now required to call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME option. AllocSetContextCreate becomes a wrapper macro, and it includes a test that will complain about non-string-literal context name parameters on gcc and similar compilers. An unfortunate side effect of making AllocSetContextCreate a macro is that one is now required to use the size parameter abstraction macros (ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of writing out individual size parameters no longer works unless you switch to AllocSetContextCreateExtended. Internally to the memory-context-related modules, the context creation APIs are simplified, removing the rather baroque original design whereby a context-type module called mcxt.c which then called back into the context-type module. That saved a bit of code duplication, but not much, and it prevented context-type modules from exercising control over the allocation of context headers. In passing, I converted the test-and-elog validation of aset size parameters into Asserts to save a few more cycles. The original thought was that callers might compute size parameters on the fly, but in practice nobody does that, so it's useless to expend cycles on checking those numbers in production builds. Also, mark the memory context method-pointer structs "const", just for cleanliness. Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us	2017-12-13 13:55:16 -05:00
Andres Freund	8e211f5391	Add float.h include to int8.c, for isnan(). port.h redirects isnan() to _isnan() on windows, which in turn is provided by float.h rather than math.h. Therefore include the latter as well. Per buildfarm.	2017-12-12 23:32:43 -08:00
Andres Freund	f512a6e132	Consistently use PG_INT(16\|32\|64)_(MIN\|MAX). Per buildfarm animal woodlouse.	2017-12-12 18:19:13 -08:00
Andres Freund	101c7ee3ee	Use new overflow aware integer operations. A previous commit added inline functions that provide fast(er) and correct overflow checks for signed integer math. Use them in a significant portion of backend code. There's more to touch in both backend and frontend code, but these were the easily identifiable cases. The old overflow checks are noticeable in integer heavy workloads. A secondary benefit is that getting rid of overflow checks that rely on signed integer overflow wrapping around, will allow us to get rid of -fwrapv in the future. Which in turn slows down other code. Author: Andres Freund Discussion: https://postgr.es/m/20171024103954.ztmatprlglz3rwke@alap3.anarazel.de	2017-12-12 16:55:37 -08:00
Robert Haas	95b52351fe	Remove obsolete comment. Commit `8b304b8b72` removed replacement selection, but left behind this comment text. The optimization to which the comment refers is not relevant without replacement selection, because if we had so few tuples as to require only one tape, we would have just completed the sort in memory. Peter Geoghegan Discussion: http://postgr.es/m/CAH2-WznqupLA8CMjp+vqzoe0yXu0DYYbQSNZxmgN76tLnAOZ_w@mail.gmail.com	2017-12-12 19:33:50 -05:00
Robert Haas	ab72716778	Support Parallel Append plan nodes. When we create an Append node, we can spread out the workers over the subplans instead of piling on to each subplan one at a time, which should typically be a bit more efficient, both because the startup cost of any plan executed entirely by one worker is paid only once and also because of reduced contention. We can also construct Append plans using a mix of partial and non-partial subplans, which may allow for parallelism in places that otherwise couldn't support it. Unfortunately, this patch doesn't handle the important case of parallelizing UNION ALL by running each branch in a separate worker; the executor infrastructure is added here, but more planner work is needed. Amit Khandekar, Robert Haas, Amul Sul, reviewed and tested by Ashutosh Bapat, Amit Langote, Rafia Sabih, Amit Kapila, and Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com	2017-12-05 17:28:39 -05:00
Tom Lane	2069e6faa0	Clean up assorted messiness around AllocateDir() usage. This patch fixes a couple of low-probability bugs that could lead to reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE) concerning directory-open or file-open failures. It also fixes places where we took shortcuts in reporting such errors, either by using elog instead of ereport or by using ereport but forgetting to specify an errcode. And it eliminates a lot of just plain redundant error-handling code. In service of all this, export fd.c's formerly-static function ReadDirExtended, so that external callers can make use of the coding pattern dir = AllocateDir(path); while ((de = ReadDirExtended(dir, path, LOG)) != NULL) if they'd like to treat directory-open failures as mere LOG conditions rather than errors. Also fix FreeDir to be a no-op if we reach it with dir == NULL, as such a coding pattern would cause. Then, remove code at many call sites that was throwing an error or log message for AllocateDir failure, as ReadDir or ReadDirExtended can handle that job just fine. Aside from being a net code savings, this gets rid of a lot of not-quite-up-to-snuff reports, as mentioned above. (In some places these changes result in replacing a custom error message such as "could not open tablespace directory" with more generic wording "could not open directory", but it was agreed that the custom wording buys little as long as we report the directory name.) In some other call sites where we can't just remove code, change the error reports to be fully project-style-compliant. Also reorder code in restoreTwoPhaseData that was acquiring a lock between AllocateDir and ReadDir; in the unlikely but surely not impossible case that LWLockAcquire changes errno, AllocateDir failures would be misreported. There is no great value in opening the directory before acquiring TwoPhaseStateLock, so just do it in the other order. Also fix CheckXLogRemoved to guarantee that it preserves errno, as quite a number of call sites are implicitly assuming. (Again, it's unlikely but I think not impossible that errno could change during a SpinLockAcquire. If so, this function was broken for its own purposes as well as breaking callers.) And change a few places that were using not-per-project-style messages, such as "could not read directory" when "could not open directory" is more correct. Back-patch the exporting of ReadDirExtended, in case we have occasion to back-patch some fix that makes use of it; it's not needed right now but surely making it global is pretty harmless. Also back-patch the restoreTwoPhaseData and CheckXLogRemoved fixes. The rest of this is essentially cosmetic and need not get back-patched. Michael Paquier, with a bit of additional work by me Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com	2017-12-04 17:02:56 -05:00
Peter Eisentraut	e4128ee767	SQL procedures This adds a new object type "procedure" that is similar to a function but does not have a return type and is invoked by the new CALL statement instead of SELECT or similar. This implementation is aligned with the SQL standard and compatible with or similar to other SQL implementations. This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well as ALTER/DROP ROUTINE that can refer to either a function or a procedure (or an aggregate function, as an extension to SQL). There is also support for procedures in various utility commands such as COMMENT and GRANT, as well as support in pg_dump and psql. Support for defining procedures is available in all the languages supplied by the core distribution. While this commit is mainly syntax sugar around existing functionality, future features will rely on having procedures as a separate object type. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>	2017-11-30 11:03:20 -05:00
Tom Lane	7ca25b7de6	Fix neqjoinsel's behavior for semi/anti join cases. Previously, this function estimated the selectivity as 1 minus eqjoinsel() for the negator equality operator, regardless of join type (I think there was an expectation that eqjoinsel would handle the join type). But actually this is completely wrong for semijoin cases: the fraction of the LHS that has a non-matching row is not one minus the fraction of the LHS that has a matching row. In reality a semijoin with <> will nearly always succeed: it can only fail when the RHS is empty, or it contains a single distinct value that is equal to the particular LHS value, or the LHS value is null. The only one of those things we should have much confidence in estimating is the fraction of LHS values that are null, so let's just take the selectivity as 1 minus outer nullfrac. Per coding convention, antijoin should be estimated the same as semijoin. Arguably this is a bug fix, but in view of the lack of field complaints and the risk of destabilizing plans, no back-patch. Thomas Munro, reviewed by Ashutosh Bapat Discussion: https://postgr.es/m/CAEepm=270ze2hVxWkJw-5eKzc3AB4C9KpH3L2kih75R5pdSogg@mail.gmail.com	2017-11-29 22:00:37 -05:00
Robert Haas	eaedf0df71	Update typedefs.list and re-run pgindent Discussion: http://postgr.es/m/CA+TgmoaA9=1RWKtBWpDaj+sF3Stgc8sHgf5z=KGtbjwPLQVDMA@mail.gmail.com	2017-11-29 09:24:24 -05:00
Joe Conway	752714dd9d	Make has_sequence_privilege support WITH GRANT OPTION The various has_*_privilege() functions all support an optional WITH GRANT OPTION added to the supported privilege types to test whether the privilege is held with grant option. That is, all except has_sequence_privilege() variations. Fix that. Back-patch to all supported branches. Discussion: https://postgr.es/m/005147f6-8280-42e9-5a03-dd2c1e4397ef@joeconway.com	2017-11-26 09:49:40 -08:00
Tom Lane	df3a66e282	Improve planner's handling of set-returning functions in grouping columns. Improve query_is_distinct_for() to accept SRFs in the targetlist when we can prove distinctness from a DISTINCT clause. In that case the de-duplication will surely happen after SRF expansion, so the proof still works. Continue to punt in the case where we'd try to prove distinctness from GROUP BY (or, in the future, source relations). To do that, we'd have to determine whether the SRFs were in the grouping columns or elsewhere in the tlist, and it still doesn't seem worth the trouble. But this trivial change allows us to recognize that "SELECT DISTINCT unnest(foo) FROM ..." produces unique-ified output, which seems worth having. Also, fix estimate_num_groups() to consider the possibility of SRFs in the grouping columns. Its failure to do so was masked before v10 because grouping_planner() scaled up plan rowcount estimates by the estimated SRF multiplier after performing grouping. That doesn't happen anymore, which is more correct, but it means we need an adjustment in the estimate for the number of groups. Failure to do this leads to an underestimate for the number of output rows of subqueries like "SELECT DISTINCT unnest(foo)" compared to what 9.6 and earlier estimated, thus breaking plan choices in some cases. Per report from Dmitry Shalashov. Back-patch to v10 to avoid degraded plan choices compared to previous releases. Discussion: https://postgr.es/m/CAKPeCUGAeHgoh5O=SvcQxREVkoX7UdeJUMj1F5=aBNvoTa+O8w@mail.gmail.com	2017-11-25 11:48:09 -05:00
Tom Lane	0f2458ff5f	Improve valgrind logic in aset.c, and fix multiple issues in generation.c. Revise aset.c so that all the "private" fields of chunk headers are marked NOACCESS when outside the module, improving on the previous coding which protected only requested_size. Fix a couple of corner case bugs, such as failing to re-protect the header during a failure exit from AllocSetRealloc, and wrong padding-size calculation for an oversize allocation request. Apply the same design to generation.c, and also fix several bugs therein that I found by dint of hacking the code to use generation.c as the standard allocator and then running the core regression tests with it. Notably, we have to track the actual size of each block, else the wipe_mem call in GenerationReset clears the wrong amount of memory for an oversize-chunk block; and GenerationCheck needs a way of identifying freed chunks that isn't fooled by palloc(0). I chose to fix the latter by resetting the context pointer to NULL in a freed chunk, roughly like what happens in a freed aset.c chunk. Discussion: https://postgr.es/m/E1eHa4J-0006hI-Q8@gemulon.postgresql.org	2017-11-24 19:28:19 -05:00
Tom Lane	f65d21b258	Mostly-cosmetic improvements in memory chunk header alignment coding. Add commentary about what we're doing and why. Apply the method used for padding in GenerationChunk to AllocChunkData, replacing the rather ad-hoc solution used in commit `7e3aa03b4`. Reorder fields in GenerationChunk so that the padding calculation will work even if sizeof(size_t) is different from sizeof(void *) --- likely that will never happen, but we don't need the assumption if we do it like this. Improve static assertions about alignment. In passing, fix a couple of oversights in the "large chunk" path in GenerationAlloc(). Discussion: https://postgr.es/m/E1eHa4J-0006hI-Q8@gemulon.postgresql.org	2017-11-24 15:50:22 -05:00
Tom Lane	cc3c4af4a9	Fix bug in generation.c's valgrind support. This doesn't look like the last such bug, but it's one that the test_decoding regression test is tripping over. Per buildfarm. Tomas Vondra Discussion: https://postgr.es/m/c903f275-2150-fa52-64bf-dca7b53ebf8d@fuzzy.cz	2017-11-24 13:43:34 -05:00
Tom Lane	07bd77b95a	Ensure sizeof(GenerationChunk) is maxaligned. Per buildfarm. Also improve some comments.	2017-11-23 17:02:15 -05:00
Simon Riggs	b99661c2ff	Tweak code for older compilers Attempt to quiesce build farm Author: Tomas Vondra	2017-11-23 06:55:18 +11:00
Simon Riggs	a4ccc1cef5	Generational memory allocator Add new style of memory allocator, known as Generational appropriate for use in cases where memory is allocated and then freed in roughly oldest first order (FIFO). Use new allocator for logical decoding’s reorderbuffer to significantly reduce memory usage and improve performance. Author: Tomas Vondra Reviewed-by: Simon Riggs	2017-11-23 05:45:07 +11:00
Simon Riggs	2ede45c3a4	Fix pg_control_checkpoint from commit `4b0d28de06` Author: Simon Riggs <simon@2ndQuadrant.com> Reported-By: Andreas Seltenreich <seltenreich@gmx.de>	2017-11-21 08:00:54 +11:00
Tom Lane	52f63bd916	Fix compiler warning in rangetypes_spgist.c. On gcc 7.2.0, comparing pointer to (Datum) 0 produces a warning. Treat it as a simple pointer to avoid that; this is more consistent with comparable code elsewhere, anyway. Tomas Vondra Discussion: https://postgr.es/m/99410021-61ef-9a9a-9bc8-f733ece637ee@2ndquadrant.com	2017-11-18 16:46:29 -05:00
Tom Lane	4797f9b519	Merge near-duplicate code in RI triggers. Merge ri_restrict_del and ri_restrict_upd into one function ri_restrict. Create a function ri_setnull that is the common implementation of RI_FKey_setnull_del and RI_FKey_setnull_upd. Likewise create a function ri_setdefault that is the common implementation of RI_FKey_setdefault_del and RI_FKey_setdefault_upd. All of these pairs of functions were identical except for needing to check for no-actual-key-change in the UPDATE cases; the one extra if-test is a small price to pay for saving so much code. Aside from removing about 400 lines of essentially duplicate code, this allows us to recognize that we were uselessly caching two identical plans whenever there were pairs of triggers using these duplicated functions (which is likely very common). Ildar Musin, reviewed by Ildus Kurbangaliev Discussion: https://postgr.es/m/ca7064a7-6adc-6f22-ca47-8615ba9425a5@postgrespro.ru	2017-11-18 16:24:05 -05:00
Tom Lane	976a1a48fc	Improve to_date/to_number/to_timestamp behavior with multibyte characters. The documentation says that these functions skip one input character per literal (non-pattern) format character. Actually, though, they skipped one input byte per literal byte, which could be hugely confusing if either data or format contained multibyte characters. To fix, adjust the FormatNode representation and parse_format() so that multibyte format characters are stored as one FormatNode not several, and adjust the data-skipping bits to advance by pg_mblen() not necessarily one byte. There's no user-visible behavior change on the to_char() side, although the internal representation changes. Commit `e87d4965b` had already fixed most places where we skip characters on the basis of non-literal format patterns to advance by characters not bytes, but this gets one more place, the SKIP_THth macro. I think everything in formatting.c gets that right now. It'd be nice to have some regression test cases covering this behavior; but of course there's no way to do so in an encoding-agnostic way, and many of the interesting aspects would also require unportable locale selections. So I've not bothered here. Discussion: https://postgr.es/m/28186.1510957703@sss.pgh.pa.us	2017-11-18 12:42:52 -05:00
Tom Lane	63ca86318d	Fix quoted-substring handling in format parsing for to_char/to_number/etc. This code evidently intended to treat backslash as an escape character within double-quoted substrings, but it was sufficiently confused that cases like ..."foo\\"... did not work right: the second backslash managed to quote the double-quote after it, despite being quoted itself. Rewrite to get that right, while preserving the existing behavior outside double-quoted substrings, which is that backslash isn't special except in the combination \". Comparing to Oracle, it seems that their version of to_char() for timestamps allows literal alphanumerics only within double quotes, while non-alphanumerics are allowed outside quotes; backslashes aren't special anywhere; there is no way at all to emit a literal double quote. (Bizarrely, their to_char() for numbers is different; it doesn't allow literal text at all AFAICT.) The fact that they don't treat backslash as special justifies our existing behavior for backslash outside double quotes. I considered making backslash inside double quotes act the same way (ie, special only if before "), which in a green field would be a more consistent behavior. But that would likely break more existing SQL code than what this patch does. Add some test cases illustrating this behavior. (Only the last new case actually changes behavior in this commit.) Little of this behavior was documented, either, so fix that. Discussion: https://postgr.es/m/3626.1510949486@sss.pgh.pa.us	2017-11-18 12:16:37 -05:00
Robert Haas	611fe7d479	Update postgresql.conf.sample comment for bgwriter_lru_maxpages Commit `14ca9abfbe` should have done this, but did not. Jeff Janes Discussion: http://postgr.es/m/CAMkU=1yWOvL+YFYzGM9yXSoWjxr_5_Ny78pPzLKQCkfgB7H-JQ@mail.gmail.com	2017-11-17 14:52:00 -05:00
Tom Lane	e87d4965bd	Prevent to_number() from losing data when template doesn't match exactly. Non-data template patterns would consume characters whether or not those characters were what the pattern expected, for example SELECT TO_NUMBER('1234', '9,999'); produced 134 because the '2' got eaten by the comma pattern. This seems undesirable, not least because it doesn't happen in Oracle. For the ',' and 'G' template patterns, we can fix this by consuming characters only if they match what the pattern would output. For non-data patterns such as 'L' and 'TH', it seems impractical to tighten things up to the point of consuming only exact matches to what the pattern would output; but we can improve matters quite a lot by redefining the behavior as "consume only characters that aren't digits, signs, decimal point, or comma". Also, fix it so that the behavior is to consume the number of characters the pattern would output, not the number of bytes. The old coding would do surprising things with non-ASCII currency symbols, for example. (It would be good to apply that rule for literal text as well, but this commit only fixes it for non-data patterns.) Oliver Ford, reviewed by Thomas Munro and Nathan Wagner, and whacked around a bit more by me Discussion: https://postgr.es/m/CAGMVOdvpbMqPf9XWNzOwBpzJfErkydr_fEGhmuDGa015z97mwg@mail.gmail.com	2017-11-17 12:04:13 -05:00
Tom Lane	687f096ea9	Make PL/Python handle domain-type conversions correctly. Fix PL/Python so that it can handle domains over composite, and so that it enforces domain constraints correctly in other cases that were not always done properly before. Notably, it didn't do arrays of domains right (oversight in commit `c12d570fa`), and it failed to enforce domain constraints when returning a composite type containing a domain field, and if a transform function is being used for a domain's base type then it failed to enforce domain constraints on the result. Also, in many places it missed checking domain constraints on null values, because the plpy_typeio code simply wasn't called for Py_None. Rather than try to band-aid these problems, I made a significant refactoring of the plpy_typeio logic. The existing design of recursing for array and composite members is extended to also treat domains as containers requiring recursion, and the APIs for the module are cleaned up and simplified. The patch also modifies plpy_typeio to rely on the typcache more than it did before (which was pretty much not at all). This reduces the need for repetitive lookups, and lets us get rid of an ad-hoc scheme for detecting changes in composite types. I added a couple of small features to typcache to help with that. Although some of this is fixing bugs that long predate v11, I don't think we should risk a back-patch: it's a significant amount of code churn, and there've been no complaints from the field about the bugs. Tom Lane, reviewed by Anthony Bykov Discussion: https://postgr.es/m/24449.1509393613@sss.pgh.pa.us	2017-11-16 16:23:04 -05:00
Robert Haas	79f2d63713	Update postgresql.conf.sample to match pg_settings classificaitons. A handful of settings, most notably shared_preload_libraries, were just plain the wrong place compared to their assigned config_group value in guc.c (and thus pg_settings). In other cases the names of the sections in postgresql.conf.sample were mildly different from the corresponding entries in config_group_names[]. Make it all consistent. Adrián Escoms, reviewed by me. Discussion: http://postgr.es/m/CACksPC2veEmFRYqwYepWYO9U7aFhAx6sYq+WqjTyHw7uV=E=pw@mail.gmail.com	2017-11-16 12:57:17 -05:00
Andrew Dunstan	98d54bb779	Back out the session_start and session_end hooks feature. It's become apparent during testing that there are problems with at least the testing regime. I don't think we should have it without a working test regime, and the difficulties might indicate implementation problems anyway, so I'm backing out the whole thing until that's sorted out. This reverts commits `7459484` `9989f92` `cd8ce3a`	2017-11-16 11:35:02 -05:00
Andrew Dunstan	cd8ce3a22c	Add hooks for session start and session end These hooks can be used in loadable modules. A simple test module is included. Discussion: https://postgr.es/m/20170720204733.40f2b7eb.nagata@sraoss.co.jp Fabrízio de Royes Mello and Yugo Nagata Reviewed by Michael Paquier and Aleksandr Parfenov	2017-11-15 10:16:34 -05:00
Robert Haas	ebc189e122	Fix typo. Jesper Pedersen Discussion: http://postgr.es/m/000f92d6-f623-95a5-b341-46e2c0495cea@redhat.com	2017-11-15 08:37:41 -05:00
Robert Haas	e5253fdc4f	Add parallel_leader_participation GUC. Sometimes, for testing, it's useful to have the leader do nothing but read tuples from workers; and it's possible that could work out better even in production. Thomas Munro, reviewed by Amit Kapila and by me. A few final tweaks by me. Discussion: http://postgr.es/m/CAEepm=2U++Lp3bNTv2Bv_kkr5NE2pOyHhxU=G0YTa4ZhSYhHiw@mail.gmail.com	2017-11-15 08:23:18 -05:00
Noah Misch	e02571b73f	Don't call pgwin32_message_to_UTF16() without CurrentMemoryContext. PostgreSQL running as a Windows service crashed upon calling write_stderr() before MemoryContextInit(). This fix completes work started in `5735efee15`. Messages this early contain only ASCII bytes; if we removed the CurrentMemoryContext requirement, the ensuing conversions would have no effect. Back-patch to 9.3 (all supported versions). Takayuki Tsunakawa, reviewed by Michael Paquier. Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F80CC73@G01JPEXMBYT05	2017-11-12 13:03:15 -08:00
Noah Misch	2918fcedbf	Ignore XML declaration in xpath_internal(), for UTF8 databases. When a value contained an XML declaration naming some other encoding, this function interpreted UTF8 bytes as the named encoding, yielding mojibake. xml_parse() already has similar logic. This would be necessary but not sufficient for non-UTF8 databases, so preserve behavior there until the xpath facility can support such databases comprehensively. Back-patch to 9.3 (all supported versions). Pavel Stehule and Noah Misch Discussion: https://postgr.es/m/CAFj8pRC-dM=tT=QkGi+Achkm+gwPmjyOayGuUfXVumCxkDgYWg@mail.gmail.com	2017-11-11 11:10:53 -08:00
Peter Eisentraut	0e1539ba0d	Add some const decorations to prototypes Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>	2017-11-10 13:38:57 -05:00
Robert Haas	1aba8e651a	Add hash partitioning. Hash partitioning is useful when you want to partition a growing data set evenly. This can be useful to keep table sizes reasonable, which makes maintenance operations such as VACUUM faster, or to enable partition-wise join. At present, we still depend on constraint exclusion for partitioning pruning, and the shape of the partition constraints for hash partitioning is such that that doesn't work. Work is underway to fix that, which should both improve performance and make partitioning pruning work with hash partitioning. Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me. A few final tweaks also by me. Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com	2017-11-09 18:07:44 -05:00
Tom Lane	ae20b23a9e	Refactor permissions checks for large objects. Up to now, ACL checks for large objects happened at the level of the SQL-callable functions, which led to CVE-2017-7548 because of a missing check. Push them down to be enforced in inv_api.c as much as possible, in hopes of preventing future bugs. This does have the effect of moving read and write permission errors to happen at lo_open time not loread or lowrite time, but that seems acceptable. Michael Paquier and Tom Lane Discussion: https://postgr.es/m/CAB7nPqRHmNOYbETnc_2EjsuzSM00Z+BWKv9sy6tnvSd5gWT_JA@mail.gmail.com	2017-11-09 12:56:07 -05:00
Tom Lane	6c3a7ba5bb	Fix typo in ALTER SYSTEM output. The header comment written into postgresql.auto.conf by ALTER SYSTEM should match what initdb put there originally. Feike Steenbergen Discussion: https://postgr.es/m/CAK_s-G0KcKdO=0hqZkwb3s+tqZuuHwWqmF5BDsmoO9FtX75r0g@mail.gmail.com	2017-11-09 11:57:20 -05:00
Peter Eisentraut	2eb4a831e5	Change TRUE/FALSE to true/false The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-11-08 11:37:28 -05:00
Simon Riggs	4b0d28de06	Remove secondary checkpoint Previously server reserved WAL for last two checkpoints, which used too much disk space for small servers. Bumps PG_CONTROL_VERSION Author: Simon Riggs <simon@2ndQuadrant.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-11-07 12:56:30 -05:00
Simon Riggs	98267ee83e	Exclude pg_internal.init from BASE_BACKUP Add docs to explain this for other backup mechanisms Author: David Steele <david@pgmasters.net> Reviewed-by: Petr Jelinek <petr.jelinek@2ndQuadrant.com> et al	2017-11-07 12:28:35 -05:00
Noah Misch	bab3a714b6	Ignore CatalogSnapshot when checking COPY FREEZE prerequisites. This restores the ability, essentially lost in commit `ffaa44cb55`, to use COPY FREEZE under REPEATABLE READ isolation. Back-patch to 9.4, like that commit. Reviewed by Tom Lane. Discussion: https://postgr.es/m/CA+TgmoahWDm-7fperBxzU9uZ99LPMUmEpSXLTw9TmrOgzwnORw@mail.gmail.com	2017-11-05 09:25:52 -08:00
Tom Lane	af20e2d728	Fix ALTER TABLE code to update domain constraints when needed. It's possible for dropping a column, or altering its type, to require changes in domain CHECK constraint expressions; but the code was previously only expecting to find dependent table CHECK constraints. Make the necessary adjustments. This is a fairly old oversight, but it's a lot easier to encounter the problem in the context of domains over composite types than it was before. Given the lack of field complaints, I'm not going to bother with a back-patch, though I'd be willing to reconsider that decision if someone does complain. Patch by me, reviewed by Michael Paquier Discussion: https://postgr.es/m/30656.1509128130@sss.pgh.pa.us	2017-11-01 13:32:23 -04:00
Robert Haas	846fcc8516	Fix problems with the "role" GUC and parallel query. Without this fix, dropping a role can sometimes result in parallel query failures in sessions that have used "SET ROLE" to assume the dropped role, even if that setting isn't active any more. Report by Pavan Deolasee. Patch by Amit Kapila, reviewed by me. Discussion: http://postgr.es/m/CABOikdOomRcZsLsLK+Z+qENM1zxyaWnAvFh3MJZzZnnKiF+REg@mail.gmail.com	2017-10-29 12:58:40 +05:30
Tom Lane	37a795a60b	Support domains over composite types. This is the last major omission in our domains feature: you can now make a domain over anything that's not a pseudotype. The major complication from an implementation standpoint is that places that might be creating tuples of a domain type now need to be prepared to apply domain_check(). It seems better that unprepared code fail with an error like "<type> is not composite" than that it silently fail to apply domain constraints. Therefore, relevant infrastructure like get_func_result_type() and lookup_rowtype_tupdesc() has been adjusted to treat domain-over-composite as a distinct case that unprepared code won't recognize, rather than just transparently treating it the same as plain composite. This isn't a 100% solution to the possibility of overlooked domain checks, but it catches most places. In passing, improve typcache.c's support for domains (it can now cache the identity of a domain's base type), and rewrite the argument handling logic in jsonfuncs.c's populate_record[set]_worker to reduce duplicative per-call lookups. I believe this is code-complete so far as the core and contrib code go. The PLs need varying amounts of work, which will be tackled in followup patches. Discussion: https://postgr.es/m/4206.1499798337@sss.pgh.pa.us	2017-10-26 13:47:45 -04:00
Andrew Dunstan	adee9e4e31	Undo inadvertent change in capitalization in commit `18fc4ec`.	2017-10-26 08:20:00 -04:00
Andrew Dunstan	18fc4ecf4a	Process variadic arguments consistently in json functions json_build_object and json_build_array and the jsonb equivalents did not correctly process explicit VARIADIC arguments. They are modified to use the new extract_variadic_args() utility function which abstracts away the details of the call method. Michael Paquier, reviewed by Tom Lane and Dmitry Dolgov. Backpatch to 9.5 for the jsonb fixes and 9.4 for the json fixes, as that's where they originated.	2017-10-25 07:34:00 -04:00
Andrew Dunstan	f3c6e8a27a	Add a utility function to extract variadic function arguments This is epecially useful in the case or "VARIADIC ANY" functions. The caller can get the artguments and types regardless of whether or not and explicit VARIADIC array argument has been used. The function also provides an option to convert arguments on type "unknown" to to "text". Michael Paquier and me, reviewed by Tom Lane. Backpatch to 9.4 in order to support the following json bug fix.	2017-10-25 07:13:11 -04:00
Tom Lane	36ea99c84d	Fix typcache's failure to treat ranges as container types. Like the similar logic for arrays and records, it's necessary to examine the range's subtype to decide whether the range type can support hashing. We can omit checking the subtype for btree-defined operations, though, since range subtypes are required to have those operations. (Possibly that simplification for btree cases led us to overlook that it does not apply for hash cases.) This is only an issue if the subtype lacks hash support, which is not true of any built-in range type, but it's easy to demonstrate a problem with a range type over, eg, money: you can get a "could not identify a hash function" failure when the planner is misled into thinking that hash join or aggregation would work. This was born broken, so back-patch to all supported branches.	2017-10-20 17:12:27 -04:00
Tom Lane	a8f1efc8ac	Fix misimplementation of typcache logic for extended hashing. The previous coding would report that an array type supports extended hashing if its element type supports regular hashing. This bug is only latent at the moment, since AFAICS there is not yet any code that depends on checking presence of extended-hashing support to make any decisions. (And in any case it wouldn't matter unless the element type has only regular hashing, which isn't true of any core data type.) But that doesn't make it less broken. Extend the cache_array_element_properties infrastructure to check this properly.	2017-10-20 16:08:17 -04:00
Peter Eisentraut	927e1ee2cb	UCS_to_most.pl: Process encodings in sorted order Otherwise the order depends on the Perl hash implementation, making it cumbersome to scan the output when debugging.	2017-10-19 05:58:39 -04:00
Peter Eisentraut	4211673622	Exclude flex-generated code from coverage testing Flex generates a lot of functions that are not actually used. In order to avoid coverage figures being ruined by that, mark up the part of the .l files where the generated code appears by lcov exclusion markers. That way, lcov will typically only reported on coverage for the .l file, which is under our control, but not for the .c file. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-10-16 16:28:11 -04:00
Tom Lane	be0ebb65f5	Allow the built-in ordered-set aggregates to share transition state. The built-in OSAs all share the same transition function, so they can share transition state as long as the final functions cooperate to not do the sort step more than once. To avoid running the tuplesort object in randomAccess mode unnecessarily, add a bit of infrastructure to nodeAgg.c to let the aggregate functions find out whether the transition state is actually being shared or not. This doesn't work for the hypothetical aggregates, since those inject a hypothetical row that isn't traceable to the shared input state. So they remain marked aggfinalmodify = 'w'. Discussion: https://postgr.es/m/CAB4ELO5RZhOamuT9Xsf72ozbenDLLXZKSk07FiSVsuJNZB861A@mail.gmail.com	2017-10-16 15:51:23 -04:00
Andres Freund	141fd1b66c	Improve sys/catcache performance. The following are the individual improvements: 1) Avoidance of FunctionCallInfo based function calls, replaced by more efficient functions with a native C argument interface. 2) Don't extract columns from a cache entry's tuple whenever matching entries - instead store them as a Datum array. This also allows to get rid of having to build dummy tuples for negative & list entries, and of a hack for dealing with cstring vs. text weirdness. 3) Reorder members of catcache.h struct, so imortant entries are more likely to be on one cacheline. 4) Allowing the compiler to specialize critical SearchCatCache for a specific number of attributes allows to unroll loops and avoid other nkeys dependant initialization. 5) Only initializing the ScanKey when necessary, i.e. catcache misses, greatly reduces cache unnecessary cpu cache misses. 6) Split of the cache-miss case from the hash lookup, reducing stack allocations etc in the common case. 7) CatCTup and their corresponding heaptuple are allocated in one piece. This results in making cache lookups themselves roughly three times as fast - full-system benchmarks obviously improve less than that. I've also evaluated further techniques: - replace open coded hash with simplehash - the list walk right now shows up in profiles. Unfortunately it's not easy to do so safely as an entry's memory location can change at various times, which doesn't work well with the refcounting and cache invalidation. - Cacheline-aligning CatCTup entries - helps some with performance, but the win isn't big and the code for it is ugly, because the tuples have to be freed as well. - add more proper functions, rather than macros for SearchSysCacheCopyN etc., but right now they don't show up in profiles. The reason the macro wrapper for syscache.c/h have to be changed, rather than just catcache, is that doing otherwise would require exposing the SysCache array to the outside. That might be a good idea anyway, but it's for another day. Author: Andres Freund Reviewed-By: Robert Haas Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de	2017-10-13 14:22:41 -07:00
Andres Freund	31079a4a8e	Replace remaining uses of pq_sendint with pq_sendint{8,16,32}. pq_sendint() remains, so extension code doesn't unnecessarily break. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de	2017-10-11 21:00:46 -07:00
Andres Freund	1de09ad8eb	Add more efficient functions to pqformat API. There's three prongs to achieve greater efficiency here: 1) Allow reusing a stringbuffer across pq_beginmessage/endmessage, with the new pq_beginmessage_reuse/endmessage_reuse. This can be beneficial both because it avoids allocating the initial buffer, and because it's more likely to already have an correctly sized buffer. 2) Replacing pq_sendint() with pq_sendint$width() inline functions. Previously unnecessary and unpredictable branches in pq_sendint() were needed. Additionally the replacement functions are implemented more efficiently. pq_sendint is now deprecated, a separate commit will convert all in-tree callers. 3) Add pq_writeint$width(), pq_writestring(). These rely on sufficient space in the StringInfo's buffer, avoiding individual space checks & potential individual resizing. To allow this to be used for strings, expose mbutil.c's MAX_CONVERSION_GROWTH. Followup commits will make use of these facilities. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de	2017-10-11 16:01:52 -07:00
Tom Lane	5fa6b0d102	Remove unnecessary PG_TRY overhead for CurrentResourceOwner changes. resowner/README contained advice to use a PG_TRY block to restore the old CurrentResourceOwner value anywhere that that variable is transiently changed. That advice was only inconsistently followed, however, and on reflection it seems like unnecessary overhead. We don't bother with such a convention for transient CurrentMemoryContext changes, on the grounds that any (sub)transaction abort will start out by resetting CurrentMemoryContext to what it wants. But the same is true of CurrentResourceOwner, so there seems no need to treat it differently. Hence, remove PG_TRY blocks that exist only to restore CurrentResourceOwner before re-throwing the error. There are a couple of places that restore it along with some other actions, and I left those alone; the restore is probably unnecessary but no noticeable gain will result from removing it. Discussion: https://postgr.es/m/5236.1507583529@sss.pgh.pa.us	2017-10-11 17:44:09 -04:00
Tom Lane	2860596832	Doc: fix missing explanation of default object privileges. The GRANT reference page, which lists the default privileges for new objects, failed to mention that USAGE is granted by default for data types and domains. As a lesser sin, it also did not specify anything about the initial privileges for sequences, FDWs, foreign servers, or large objects. Fix that, and add a comment to acldefault() in the probably vain hope of getting people to maintain this list in future. Noted by Laurenz Albe, though I editorialized on the wording a bit. Back-patch to all supported branches, since they all have this behavior. Discussion: https://postgr.es/m/1507620895.4152.1.camel@cybertec.at	2017-10-11 16:57:14 -04:00
Tom Lane	118e99c3d7	Fix low-probability loss of NOTIFY messages due to XID wraparound. Up to now async.c has used TransactionIdIsInProgress() to detect whether a notify message's source transaction is still running. However, that function has a quick-exit path that reports that XIDs before RecentXmin are no longer running. If a listening backend is doing nothing but listening, and not running any queries, there is nothing that will advance its value of RecentXmin. Once 2 billion transactions elapse, the RecentXmin check causes active transactions to be reported as not running. If they aren't committed yet according to CLOG, async.c decides they aborted and discards their messages. The timing for that is a bit tight but it can happen when multiple backends are sending notifies concurrently. The net symptom therefore is that a sufficiently-long-surviving listen-only backend starts to miss some fraction of NOTIFY traffic, but only under heavy load. The only function that updates RecentXmin is GetSnapshotData(). A brute-force fix would therefore be to take a snapshot before processing incoming notify messages. But that would add cycles, as well as contention for the ProcArrayLock. We can be smarter: having taken the snapshot, let's use that to check for running XIDs, and not call TransactionIdIsInProgress() at all. In this way we reduce the number of ProcArrayLock acquisitions from one per message to one per notify interrupt; that's the same under light load but should be a benefit under heavy load. Light testing says that this change is a wash performance-wise for normal loads. I looked around for other callers of TransactionIdIsInProgress() that might be at similar risk, and didn't find any; all of them are inside transactions that presumably have already taken a snapshot. Problem report and diagnosis by Marko Tiikkaja, patch by me. Back-patch to all supported branches, since it's been like this since 9.0. Discussion: https://postgr.es/m/20170926182935.14128.65278@wrigleys.postgresql.org	2017-10-11 14:28:33 -04:00
Andres Freund	fffd651e83	Rewrite strnlen replacement implementation from `8a241792f9`. The previous placement of the fallback implementation in libpgcommon was problematic, because libpqport functions need strnlen functionality. Move replacement into libpgport. Provide strnlen() under its posix name, instead of pg_strnlen(). Fix stupid configure bug, executing the test only when compiled with threading support. Author: Andres Freund Discussion: https://postgr.es/m/E1e1gR2-0005fB-SI@gemulon.postgresql.org	2017-10-10 14:50:30 -07:00
Andres Freund	82c117cb90	Fix pnstrdup() to not memcpy() the maximum allowed length. The previous behaviour was dangerous if the length passed wasn't the size of the underlying buffer, but the maximum size of the underlying buffer. Author: Andres Freund Discussion: https://postgr.es/m/20161003215524.mwz5p45pcverrkyk@alap3.anarazel.de	2017-10-09 15:20:42 -07:00
Robert Haas	f49842d1ee	Basic partition-wise join functionality. Instead of joining two partitioned tables in their entirety we can, if it is an equi-join on the partition keys, join the matching partitions individually. This involves teaching the planner about "other join" rels, which are related to regular join rels in the same way that other member rels are related to baserels. This can use significantly more CPU time and memory than regular join planning, because there may now be a set of "other" rels not only for every base relation but also for every join relation. In most practical cases, this probably shouldn't be a problem, because (1) it's probably unusual to join many tables each with many partitions using the partition keys for all joins and (2) if you do that scenario then you probably have a big enough machine to handle the increased memory cost of planning and (3) the resulting plan is highly likely to be better, so what you spend in planning you'll make up on the execution side. All the same, for now, turn this feature off by default. Currently, we can only perform joins between two tables whose partitioning schemes are absolutely identical. It would be nice to cope with other scenarios, such as extra partitions on one side or the other with no match on the other side, but that will have to wait for a future patch. Ashutosh Bapat, reviewed and tested by Rajkumar Raghuwanshi, Amit Langote, Rafia Sabih, Thomas Munro, Dilip Kumar, Antonin Houska, Amit Khandekar, and by me. A few final adjustments by me. Discussion: http://postgr.es/m/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@mail.gmail.com Discussion: http://postgr.es/m/CAFjFpRcitjfrULr5jfuKWRPsGUX0LQ0k8-yG0Qw2+1LBGNpMdw@mail.gmail.com	2017-10-06 11:11:10 -04:00
Peter Eisentraut	036166f26e	Document and use SPI_result_code_string() A lot of semi-internal code just prints out numeric SPI error codes, which is not very helpful. We already have an API function to convert the codes to a string, so let's make more use of that. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-10-04 22:14:21 -04:00
Peter Eisentraut	582bbcf37f	Move SPI error reporting out of ri_ReportViolation() These are two completely unrelated code paths, so it doesn't make sense to pack them into one function. Add attribute noreturn to ri_ReportViolation(). Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-10-04 22:14:21 -04:00
Andres Freund	212e6f34d5	Replace binary search in fmgr_isbuiltin with a lookup array. Turns out we have enough functions that the binary search is quite noticeable in profiles. Thus have Gen_fmgrtab.pl build a new mapping from a builtin function's oid to an index in the existing fmgr_builtins array. That keeps the additional memory usage at a reasonable amount. Author: Andres Freund, with input from Tom Lane Discussion: https://postgr.es/m/20170914065128.a5sk7z4xde5uy3ei@alap3.anarazel.de	2017-10-04 00:22:38 -07:00
Tom Lane	c12d570fa1	Support arrays over domains. Allowing arrays with a domain type as their element type was left un-done in the original domain patch, but not for any very good reason. This omission leads to such surprising results as array_agg() not working on a domain column, because the parser can't identify a suitable output type for the polymorphic aggregate. In order to fix this, first clean up the APIs of coerce_to_domain() and some internal functions in parse_coerce.c so that we consistently pass around a CoercionContext along with CoercionForm. Previously, we sometimes passed an "isExplicit" boolean flag instead, which is strictly less information; and coerce_to_domain() didn't even get that, but instead had to reverse-engineer isExplicit from CoercionForm. That's contrary to the documentation in primnodes.h that says that CoercionForm only affects display and not semantics. I don't think this change fixes any live bugs, but it makes things more consistent. The main reason for doing it though is that now build_coercion_expression() receives ccontext, which it needs in order to be able to recursively invoke coerce_to_target_type(). Next, reimplement ArrayCoerceExpr so that the node does not directly know any details of what has to be done to the individual array elements while performing the array coercion. Instead, the per-element processing is represented by a sub-expression whose input is a source array element and whose output is a target array element. This simplifies life in parse_coerce.c, because it can build that sub-expression by a recursive invocation of coerce_to_target_type(). The executor now handles the per-element processing as a compiled expression instead of hard-wired code. The main advantage of this is that we can use a single ArrayCoerceExpr to handle as many as three successive steps per element: base type conversion, typmod coercion, and domain constraint checking. The old code used two stacked ArrayCoerceExprs to handle type + typmod coercion, which was pretty inefficient, and adding yet another array deconstruction to do domain constraint checking seemed very unappetizing. In the case where we just need a single, very simple coercion function, doing this straightforwardly leads to a noticeable increase in the per-array-element runtime cost. Hence, add an additional shortcut evalfunc in execExprInterp.c that skips unnecessary overhead for that specific form of expression. The runtime speed of simple cases is within 1% or so of where it was before, while cases that previously required two levels of array processing are significantly faster. Finally, create an implicit array type for every domain type, as we do for base types, enums, etc. Everything except the array-coercion case seems to just work without further effort. Tom Lane, reviewed by Andrew Dunstan Discussion: https://postgr.es/m/9852.1499791473@sss.pgh.pa.us	2017-09-30 13:40:56 -04:00
Peter Eisentraut	5373bc2a08	Add background worker type Add bgw_type field to background worker structure. It is intended to be set to the same value for all workers of the same type, so they can be grouped in pg_stat_activity, for example. The backend_type column in pg_stat_activity now shows bgw_type for a background worker. The ps listing also no longer calls out that a process is a background worker but just show the bgw_type. That way, being a background worker is more of an implementation detail now that is not shown to the user. However, most log messages still refer to 'background worker "%s"'; otherwise constructing sensible and translatable log messages would become tricky. Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se>	2017-09-29 11:08:24 -04:00
Robert Haas	8b304b8b72	Remove replacement selection sort. At the time replacement_sort_tuples was introduced, there were still cases where replacement selection sort noticeably outperformed using quicksort even for the first run. However, those cases seem to have evaporated as a result of further improvements made since that time (and perhaps also advances in CPU technology). So remove replacement selection and the controlling GUC entirely. This makes tuplesort.c noticeably simpler and probably paves the way for further optimizations someone might want to do later. Peter Geoghegan, with review and testing by Tomas Vondra and me. Discussion: https://postgr.es/m/CAH2-WzmmNjG_K0R9nqYwMq3zjyJJK+hCbiZYNGhAy-Zyjs64GQ@mail.gmail.com	2017-09-29 10:25:44 -04:00
Tom Lane	7769fc000a	Fix behavior when converting a float infinity to numeric. float8_numeric() and float4_numeric() failed to consider the possibility that the input is an IEEE infinity. The results depended on the platform-specific behavior of sprintf(): on most platforms you'd get something like ERROR: invalid input syntax for type numeric: "inf" but at least on Windows it's possible for the conversion to succeed and deliver a finite value (typically 1), due to a nonstandard output format from sprintf and lack of syntax error checking in these functions. Since our numeric type lacks the concept of infinity, a suitable conversion is impossible; the best thing to do is throw an explicit error before letting sprintf do its thing. While at it, let's use snprintf not sprintf. Overrunning the buffer should be impossible if sprintf does what it's supposed to, but this is cheap insurance against a stack smash if it doesn't. Problem reported by Taiki Kondo. Patch by me based on fix suggestion from KaiGai Kohei. Back-patch to all supported branches. Discussion: https://postgr.es/m/12A9442FBAE80D4E8953883E0B84E088C8C7A2@BPXM01GP.gisp.nec.co.jp	2017-09-27 17:05:53 -04:00
Tom Lane	28e0727076	Revert to 9.6 treatment of ALTER TYPE enumtype ADD VALUE. This reverts commit `15bc038f9`, along with the followon commits `1635e80d3` and `984c92074` that tried to clean up the problems exposed by bug #14825. The result was incomplete because it failed to address parallel-query requirements. With 10.0 release so close upon us, now does not seem like the time to be adding more code to fix that. I hope we can un-revert this code and add the missing parallel query support during the v11 cycle. Back-patch to v10. Discussion: https://postgr.es/m/20170922185904.1448.16585@wrigleys.postgresql.org	2017-09-27 16:14:43 -04:00
Tom Lane	984c92074d	Remove heuristic same-transaction test from check_safe_enum_use(). The blacklist mechanism added by the preceding commit directly fixes most of the practical cases that the same-transaction test was meant to cover. What remains is use-cases like begin; create type e as enum('x'); alter type e add value 'y'; -- use 'y' somehow commit; However, because the same-transaction test is heuristic, it fails on small variants of that, such as renaming the type or changing its owner. Rather than try to explain the behavior to users, let's remove it and just have a rule that the newly added value can't be used before being committed, full stop. Perhaps later it will be worth the implementation effort and overhead to have a more accurate test for type-was-created-in-this-transaction. We'll wait for some field experience with v10 before deciding to do that. Back-patch to v10. Discussion: https://postgr.es/m/20170922185904.1448.16585@wrigleys.postgresql.org	2017-09-26 13:14:46 -04:00
Tom Lane	1635e80d30	Use a blacklist to distinguish original from add-on enum values. Commit `15bc038f9` allowed ALTER TYPE ADD VALUE to be executed inside transaction blocks, by disallowing the use of the added value later in the same transaction, except under limited circumstances. However, the test for "limited circumstances" was heuristic and could reject references to enum values that were created during CREATE TYPE AS ENUM, not just later. This breaks the use-case of restoring pg_dump scripts in a single transaction, as reported in bug #14825 from Balazs Szilfai. We can improve this by keeping a "blacklist" table of enum value OIDs created by ALTER TYPE ADD VALUE during the current transaction. Any visible-but-uncommitted value whose OID is not in the blacklist must have been created by CREATE TYPE AS ENUM, and can be used safely because it could not have a lifespan shorter than its parent enum type. This change also removes the restriction that a renamed enum value can't be used before being committed (unless it was on the blacklist). Andrew Dunstan, with cosmetic improvements by me. Back-patch to v10. Discussion: https://postgr.es/m/20170922185904.1448.16585@wrigleys.postgresql.org	2017-09-26 13:14:46 -04:00
Tom Lane	716ea626a8	Make construct_[md_]array return a valid empty array for zero-size input. If construct_array() or construct_md_array() were given a dimension of zero, they'd produce an array that contains no elements but has positive dimension. This violates a general expectation that empty arrays should have ndims = 0; in particular, while arrays like this print as empty, they don't compare equal to other empty arrays. Up to now we've expected callers to avoid making such calls and instead be careful to call construct_empty_array() if there would be no elements. But this has always been an easily missed case, and we've repeatedly had to fix callers to do it right. In bug #14826, Erwin Brandstetter pointed out yet another such oversight, in ts_lexize(); and a bit of examination of other call sites found at least two more with similar issues. So let's fix the problem centrally and permanently by changing these two functions to construct a proper zero-D empty array whenever the array would be empty. This renders a few explicit calls of construct_empty_array() redundant, but the only such place I found that really seemed worth changing was in ExecEvalArrayExpr(). Although this fixes some very old bugs, no back-patch: the problem is pretty minor and the risk of changing behavior seems to outweigh the benefit in stable branches. Discussion: https://postgr.es/m/20170923125723.1448.39412@wrigleys.postgresql.org Discussion: https://postgr.es/m/20570.1506198383@sss.pgh.pa.us	2017-09-25 11:55:24 -04:00
Peter Eisentraut	6dda0998af	Allow ICU to use SortSupport on Windows with UTF-8 There is no reason to ever prevent the use of SortSupport on Windows when ICU locales are used. We previously avoided SortSupport on Windows with UTF-8 server encoding and a non C-locale due to restrictions in Windows' libc functionality. This is now considered to be a restriction in one platform's libc collation provider, and not a more general platform restriction. Reported-by: Peter Geoghegan <pg@bowt.ie>	2017-09-24 07:55:24 -04:00
Peter Eisentraut	0c5803b450	Refactor new file permission handling The file handling functions from fd.c were called with a diverse mix of notations for the file permissions when they were opening new files. Almost all files created by the server should have the same permissions set. So change the API so that e.g. OpenTransientFile() automatically uses the standard permissions set, and OpenTransientFilePerm() is a new function that takes an explicit permissions set for the few cases where it is needed. This also saves an unnecessary argument for call sites that are just opening an existing file. While we're reviewing these APIs, get rid of the FileName typedef and use the standard const char * for the file name and mode_t for the file mode. This makes these functions match other file handling functions and removes an unnecessary layer of mysteriousness. We can also get rid of a few casts that way. Author: David Steele <david@pgmasters.net>	2017-09-23 10:16:18 -04:00
Tom Lane	85feb77aa0	Assume wcstombs(), towlower(), and sibling functions are always present. These functions are required by SUS v2, which is our minimum baseline for Unix platforms, and are present on all interesting Windows versions as well. Even our oldest buildfarm members have them. Thus, we were not testing the "!USE_WIDE_UPPER_LOWER" code paths, which explains why the bug fixed in commit `e6023ee7f` escaped detection. Per discussion, there seems to be no more real-world value in maintaining this option. Hence, remove the configure-time tests for wcstombs() and towlower(), remove the USE_WIDE_UPPER_LOWER symbol, and remove all the !USE_WIDE_UPPER_LOWER code. There's not actually all that much of the latter, but simplifying the #if nests is a win in itself. Discussion: https://postgr.es/m/20170921052928.GA188913@rfd.leadboat.com	2017-09-22 11:00:58 -04:00
Peter Eisentraut	e6023ee7fa	Fix build with !USE_WIDE_UPPER_LOWER The placement of the ifdef blocks in formatting.c was pretty bogus, so the code failed to compile if USE_WIDE_UPPER_LOWER was not defined. Reported-by: Peter Geoghegan <pg@bowt.ie> Reported-by: Noah Misch <noah@leadboat.com>	2017-09-22 09:26:38 -04:00
Tom Lane	7b86c2ac95	Improve dubious memory management in pg_newlocale_from_collation(). pg_newlocale_from_collation() used malloc() and strdup() directly, which is generally not per backend coding style, and it didn't bother to check for failure results, but would just SIGSEGV instead. Also, if one of the numerous error checks in the middle of the function failed, the already-allocated memory would be leaked permanently. Admittedly, it's not a lot of memory, but it could build up if this function were called repeatedly for a bad collation. The first two problems are easily cured by palloc'ing in TopMemoryContext instead of calling libc directly. We can fairly easily dodge the leakage problem for the struct pg_locale_struct by filling in a temporary variable and allocating permanent storage only once we reach the bottom of the function. It's harder to get rid of the potential leakage for ICU's copy of the collcollate string, but at least that's only allocated after most of the error checks; so live with that aspect. Back-patch to v10 where this code came in, with one or another of the ICU patches.	2017-09-20 13:52:36 -04:00
Andres Freund	fc49e24fa6	Make WAL segment size configurable at initdb time. For performance reasons a larger segment size than the default 16MB can be useful. A larger segment size has two main benefits: Firstly, in setups using archiving, it makes it easier to write scripts that can keep up with higher amounts of WAL, secondly, the WAL has to be written and synced to disk less frequently. But at the same time large segment size are disadvantageous for smaller databases. So far the segment size had to be configured at compile time, often making it unrealistic to choose one fitting to a particularly load. Therefore change it to a initdb time setting. This includes a breaking changes to the xlogreader.h API, which now requires the current segment size to be configured. For that and similar reasons a number of binaries had to be taught how to recognize the current segment size. Author: Beena Emerson, editorialized by Andres Freund Reviewed-By: Andres Freund, David Steele, Kuntal Ghosh, Michael Paquier, Peter Eisentraut, Robert Hass, Tushar Ahuja Discussion: https://postgr.es/m/CAOG9ApEAcQ--1ieKbhFzXSQPw_YLmepaa4hNdnY5+ZULpt81Mw@mail.gmail.com	2017-09-19 22:03:48 -07:00
Tom Lane	2d484f9b05	Remove no-op GiST support functions in the core GiST opclasses. The preceding patch allowed us to remove useless GiST support functions. This patch actually does that for all the no-op cases in the core GiST code. This buys us whatever performance gain is to be had, and more importantly exercises the preceding patch. There remain no-op functions in the contrib GiST opclasses, but those will take more work to remove. Discussion: https://postgr.es/m/CAJEAwVELVx9gYscpE=Be6iJxvdW5unZ_LkcAaVNSeOwvdwtD=A@mail.gmail.com	2017-09-19 23:32:59 -04:00
Andres Freund	54b6cd589a	Speedup pgstat_report_activity by moving mb-aware truncation to read side. Previously multi-byte aware truncation was done on every pgstat_report_activity() call - proving to be a bottleneck for workloads with long query strings that execute quickly. Instead move the truncation to the read side, which commonly is executed far less frequently. That's possible because all server encodings allow to determine the length of a multi-byte string from the first byte. Rename PgBackendStatus.st_activity to st_activity_raw so existing extension users of the field break - their code has to be adjusted to use pgstat_clip_activity(). Author: Andres Freund Tested-By: Khuntal Ghosh Reviewed-By: Robert Haas, Tom Lane Discussion: https://postgr.es/m/20170912071948.pa7igbpkkkviecpz@alap3.anarazel.de	2017-09-19 12:51:14 -07:00
Tom Lane	ed22fb8b00	Cache datatype-output-function lookup info across calls of concat(). Testing indicates this can save a third to a half of the runtime of the function. Pavel Stehule, reviewed by Alexander Kuzmenkov Discussion: https://postgr.es/m/CAFj8pRAT62pRgjoHbgTfJUc2uLmeQ4saUj+yVJAEZUiMwNCmdg@mail.gmail.com	2017-09-19 15:09:38 -04:00
Tom Lane	4bd1994650	Make DatumGetFoo/PG_GETARG_FOO/PG_RETURN_FOO macro names more consistent. By project convention, these names should include "P" when dealing with a pointer type; that is, if the result of a GETARG macro is of type FOO , it should be called PG_GETARG_FOO_P not just PG_GETARG_FOO. Some newer types such as JSONB and ranges had not followed the convention, and a number of contrib modules hadn't gotten that memo either. Rename the offending macros to improve consistency. In passing, fix a few places that thought PG_DETOAST_DATUM() returns a Datum; it does not, it returns "struct varlena ". Applying DatumGetPointer to that happens not to cause any bad effects today, but it's formally wrong. Also, adjust an ltree macro that was designed without any thought for what pgindent would do with it. This is all cosmetic and shouldn't have any impact on generated code. Mark Dilger, some further tweaks by me Discussion: https://postgr.es/m/EA5676F4-766F-4F38-8348-ECC7DB427C6A@gmail.com	2017-09-18 15:21:23 -04:00
Tom Lane	cad22075bc	Fix bogus size calculation introduced by commit `cc5f81366`. The elements of RecordCacheArray are TupleDesc, not TupleDesc *. Those are actually the same size, so that this error is harmless, but it's still wrong --- and it might bite us someday, if TupleDesc ever became a struct, say. Per Coverity.	2017-09-17 11:35:27 -04:00
Peter Eisentraut	3012061b86	Apply pg_get_serial_sequence() to identity column sequences as well Bug: #14813	2017-09-15 14:21:20 -04:00
Tom Lane	71aa4801a8	Get rid of shared_record_typmod_registry_worker_detach; it doesn't work. This code is unsafe, as proven by buildfarm failures, because it tries to access shared memory that might already be gone. It's also unnecessary, because we're about to exit the process anyway and so the record type cache should never be accessed again. The idea was to lay some foundations for someday recycling workers --- which would require attaching to a different shared tupdesc registry --- but that will require considerably more thought. In the meantime let's save some bytes by just removing the nonfunctional code. Problem identification, and proposal to fix by removing functionality from the detach function, by Thomas Munro. I went a bit further by removing the function altogether. Discussion: https://postgr.es/m/E1dsguX-00056N-9x@gemulon.postgresql.org	2017-09-15 10:52:30 -04:00
Tom Lane	eaa4070543	Don't use anonymous unions. Commit `cc5f81366c` introduced a language feature that is not acceptable to strict C89 compilers. Thomas Munro Per buildfarm.	2017-09-15 00:57:38 -04:00
Andres Freund	cc5f81366c	Add support for coordinating record typmods among parallel workers. Tuples can have type RECORDOID and a typmod number that identifies a blessed TupleDesc in a backend-private cache. To support the sharing of such tuples through shared memory and temporary files, provide a typmod registry in shared memory. To achieve that, introduce per-session DSM segments, created on demand when a backend first runs a parallel query. The per-session DSM segment has a table-of-contents just like the per-query DSM segment, and initially the contents are a shared record typmod registry and a DSA area to provide the space it needs to grow. State relating to the current session is accessed via a Session object reached through global variable CurrentSession that may require significant redesign further down the road as we figure out what else needs to be shared or remodelled. Author: Thomas Munro Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com	2017-09-14 19:59:21 -07:00
Tom Lane	7d08ce286c	Distinguish selectivity of < from <= and > from >=. Historically, the selectivity functions have simply not distinguished < from <=, or > from >=, arguing that the fraction of the population that satisfies the "=" aspect can be considered to be vanishingly small, if the comparison value isn't any of the most-common-values for the variable. (If it is, the code path that executes the operator against each MCV will take care of things properly.) But that isn't really true unless we're dealing with a continuum of variable values, and in practice we seldom are. If "x = const" would estimate a nonzero number of rows for a given const value, then it follows that we ought to estimate different numbers of rows for "x < const" and "x <= const", even if the const is not one of the MCVs. Handling this more honestly makes a significant difference in edge cases, such as the estimate for a tight range (x BETWEEN y AND z where y and z are close together). Hence, split scalarltsel into scalarltsel/scalarlesel, and similarly split scalargtsel into scalargtsel/scalargesel. Adjust <= and >= operator definitions to reference the new selectivity functions. Improve the core ineq_histogram_selectivity() function to make a correction for equality. (Along the way, I learned quite a bit about exactly why that function gives good answers, which I tried to memorialize in improved comments.) The corresponding join selectivity functions were, and remain, just stubs. But I chose to split them similarly, to avoid confusion and to prevent the need for doing this exercise again if someone ever makes them less stubby. In passing, change ineq_histogram_selectivity's clamp for extreme probability estimates so that it varies depending on the histogram size, instead of being hardwired at 0.0001. With the default histogram size of 100 entries, you still get the old clamp value, but bigger histograms should allow us to put more faith in edge values. Tom Lane, reviewed by Aleksander Alekseev and Kuntal Ghosh Discussion: https://postgr.es/m/12232.1499140410@sss.pgh.pa.us	2017-09-13 11:12:39 -04:00
Andres Freund	6e7baa3227	Introduce BYTES unit for GUCs. This is already useful for track_activity_query_size, and will further be used in a later commit making the WAL segment size configurable. Author: Beena Emerson Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAOG9ApEu8bXVwBxkOO9J7ZpM76TASK_vFMEEiCEjwhMmSLiaqQ@mail.gmail.com	2017-09-12 12:13:12 -07:00
Andres Freund	c1898c3e1e	Constify numeric.c. This allows the compiler/linker to move the static variables to a read-only segment. Not all the signature changes are necessary, but it seems better to apply const in a consistent manner. Reviewed-By: Tom Lane Discussion: https://postgr.es/m/20170910232154.asgml44ji2b7lv3d@alap3.anarazel.de	2017-09-11 13:44:37 -07:00
Peter Eisentraut	821fb8cdbf	Message style fixes	2017-09-11 11:21:27 -04:00
Robert Haas	6f6b99d133	Allow a partitioned table to have a default partition. Any tuples that don't route to any other partition will route to the default partition. Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert Haas, with review and testing at various stages by (at least) Rushabh Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar. Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com	2017-09-08 17:28:04 -04:00
Tom Lane	3cf17c9d47	Remove mention of password_encryption = plain in postgresql.conf.sample. Evidently missed in commit `eb61136dc`. Spotted by Oleg Bartunov. Discussion: https://postgr.es/m/CAF4Au4wz_iK5r4fnTnnd8XqioAZQs-P7-VsEAfivW34zMVpAmw@mail.gmail.com	2017-09-08 14:38:54 -04:00
Tom Lane	3ca930fc39	Improve performance of get_actual_variable_range with recently-dead tuples. In commit `fccebe421`, we hacked get_actual_variable_range() to scan the index with SnapshotDirty, so that if there are many uncommitted tuples at the end of the index range, it wouldn't laboriously scan through all of them looking for a live value to return. However, that didn't fix it for the case of many recently-dead tuples at the end of the index; SnapshotDirty recognizes those as committed dead and so we're back to the same problem. To improve the situation, invent a "SnapshotNonVacuumable" snapshot type and use that instead. The reason this helps is that, if the snapshot rejects a given index entry, we know that the indexscan will mark that index entry as killed. This means the next get_actual_variable_range() scan will proceed past that entry without visiting the heap, making the scan a lot faster. We may end up accepting a recently-dead tuple as being the estimated extremal value, but that doesn't seem much worse than the compromise we made before to accept not-yet-committed extremal values. The cost of the scan is still proportional to the number of dead index entries at the end of the range, so in the interval after a mass delete but before VACUUM's cleaned up the mess, it's still possible for get_actual_variable_range() to take a noticeable amount of time, if you've got enough such dead entries. But the constant factor is much much better than before, since all we need to do with each index entry is test its "killed" bit. We chose to back-patch commit `fccebe421` at the time, but I'm hesitant to do so here, because this form of the problem seems to affect many fewer people. Also, even when it happens, it's less bad than the case fixed by commit `fccebe421` because we don't get the contention effects from expensive TransactionIdIsInProgress tests. Dmitriy Sarafannikov, reviewed by Andrey Borodin Discussion: https://postgr.es/m/05C72CF7-B5F6-4DB9-8A09-5AC897653113@yandex.ru	2017-09-07 19:41:51 -04:00
Peter Eisentraut	1356f78ea9	Reduce excessive dereferencing of function pointers It is equivalent in ANSI C to write (*funcptr) () and funcptr(). These two styles have been applied inconsistently. After discussion, we'll use the more verbose style for plain function pointer variables, to make it clear that it's a variable, and the shorter style when the function pointer is in a struct (s.func() or s->func()), because then it's clear that it's not a plain function name, and otherwise the excessive punctuation makes some of those invocations hard to read. Discussion: https://www.postgresql.org/message-id/f52c16db-14ed-757d-4b48-7ef360b1631d@2ndquadrant.com	2017-09-07 13:56:09 -04:00
Simon Riggs	5b6d13eec7	Allow SET STATISTICS on expression indexes Index columns are referenced by ordinal number rather than name, e.g. CREATE INDEX coord_idx ON measured (x, y, (z + t)); ALTER INDEX coord_idx ALTER COLUMN 3 SET STATISTICS 1000; Incompatibility note for release notes: \d+ for indexes now also displays Stats Target Authors: Alexander Korotkov, with contribution by Adrien NAYRAT Review: Adrien NAYRAT, Simon Riggs Wordsmith: Simon Riggs	2017-09-06 13:46:01 -07:00
Tom Lane	8689e38263	Clean up handling of dropped columns in NAMEDTUPLESTORE RTEs. The NAMEDTUPLESTORE patch piggybacked on the infrastructure for TABLEFUNC/VALUES/CTE RTEs, none of which can ever have dropped columns, so the possibility was ignored most places. Fix that, including adding a specification to parsenodes.h about what it's supposed to look like. In passing, clean up assorted comments that hadn't been maintained properly by said patch. Per bug #14799 from Philippe Beaudoin. Back-patch to v10. Discussion: https://postgr.es/m/20170906120005.25630.84360@wrigleys.postgresql.org	2017-09-06 10:41:05 -04:00
Peter Eisentraut	17273d059c	Remove unnecessary parentheses in return statements The parenthesized style has only been used in a few modules. Change that to use the style that is predominant across the whole tree. Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Ryan Murphy <ryanfmurphy@gmail.com>	2017-09-05 14:52:55 -04:00
Robert Haas	7b69b6ceb8	Fix assorted carelessness about Datum vs. int64 vs. uint64 Bugs introduced by commit `81c5e46c49`	2017-09-01 00:14:54 -04:00
Robert Haas	0d9506d125	Try to repair poorly-considered code in previous commit.	2017-08-31 23:09:00 -04:00
Robert Haas	81c5e46c49	Introduce 64-bit hash functions with a 64-bit seed. This will be useful for hash partitioning, which needs a way to seed the hash functions to avoid problems such as a hash index on a hash partitioned table clumping all values into a small portion of the bucket space; it's also useful for anything that wants a 64-bit hash value rather than a 32-bit hash value. Just in case somebody wants a 64-bit hash value that is compatible with the existing 32-bit hash values, make the low 32-bits of the 64-bit hash value match the 32-bit hash value when the seed is 0. Robert Haas and Amul Sul Discussion: http://postgr.es/m/CA+Tgmoafx2yoJuhCQQOL5CocEi-w_uG4S2xT0EtgiJnPGcHW3g@mail.gmail.com	2017-08-31 22:21:21 -04:00
Robert Haas	bf11e7ee2e	Propagate sort instrumentation from workers back to leader. Up until now, when parallel query was used, no details about the sort method or space used by the workers were available; details were shown only for any sorting done by the leader. Fix that. Commit `1177ab1dab` forced the test case added by commit `1f6d515a67` to run without parallelism; now that we have this infrastructure, allow that again, with a little tweaking to make it pass with and without force_parallel_mode. Robert Haas and Tom Lane Discussion: http://postgr.es/m/CA+Tgmoa2VBZW6S8AAXfhpHczb=Rf6RqQ2br+zJvEgwJ0uoD_tQ@mail.gmail.com	2017-08-29 13:26:33 -04:00
Andres Freund	20fbf25533	Fix harmless thinko in dsa.c. Commit `16be2fd100` added DSA_ALLOC_HUGE, DSA_ALLOC_ZERO and DSA_ALLOC_NO_OOM which have the same numerical values and meanings as the similarly named MCXT_... macros. In one place we accidentally used MCXT_ALLOC_NO_OOM when DSA_ALLOC_NO_OOM is wanted, so tidy that up. Author: Thomas Munro Discussion: http://postgr.es/m/CAEepm=2AimHxVkkxnMfQvbZMkXy0uKbVa0-D38c5-qwrCm4CMQ@mail.gmail.com Backpatch: 10, where dsa was introduced.	2017-08-24 15:07:40 -07:00
Andres Freund	35ea75632a	Refactor typcache.c's record typmod hash table. Previously, tuple descriptors were stored in chains keyed by a fixed size array of OIDs. That meant there were effectively two levels of collision chain -- one inside and one outside the hash table. Instead, let dynahash.c look after conflicts for us by supplying a proper hash and equal function pair. This is a nice cleanup on its own, but also simplifies followup changes allowing blessed TupleDescs to be shared between backends participating in parallel query. Author: Thomas Munro Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAEepm%3D34GVhOL%2BarUx56yx7OPk7%3DqpGsv3CpO54feqjAwQKm5g%40mail.gmail.com	2017-08-22 16:11:54 -07:00
Andres Freund	2cd7084524	Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n). This is a mechanical change in preparation for a later commit that will change the layout of TupleDesc. Introducing a macro to abstract the details of where attributes are stored will allow us to change that in separate step and revise it in future. Author: Thomas Munro, editorialized by Andres Freund Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com	2017-08-20 11:19:07 -07:00
Tom Lane	2b74303637	Make the planner assume that the entries in a VALUES list are distinct. Previously, if we had to estimate the number of distinct values in a VALUES column, we fell back on the default behavior used whenever we lack statistics, which effectively is that there are Min(# of entries, 200) distinct values. This can be very badly off with a large VALUES list, as noted by Jeff Janes. We could consider actually running an ANALYZE-like scan on the VALUES, but that seems unduly expensive, and anyway it could not deliver reliable info if the entries are not all constants. What seems like a better choice is to assume that the values are all distinct. This will sometimes be just as wrong as the old code, but it seems more likely to be more nearly right in many common cases. Also, it is more consistent with what happens in some related cases, for example WHERE x = ANY(ARRAY[1,2,3,...,n]) and WHERE x = ANY(VALUES (1),(2),(3),...,(n)) now are estimated similarly. This was discussed some time ago, but consensus was it'd be better to slip it in at the start of a development cycle not near the end. (It should've gone into v10, really, but I forgot about it.) Discussion: https://postgr.es/m/CAMkU=1xHkyPa8VQgGcCNg3RMFFvVxUdOpus1gKcFuvVi0w6Acg@mail.gmail.com	2017-08-16 15:37:20 -04:00
Peter Eisentraut	77d05706be	Fix up some misusage of appendStringInfo() and friends Change to appendStringInfoChar() or appendStringInfoString() where those can be used. Author: David Rowley <david.rowley@2ndquadrant.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>	2017-08-15 23:34:39 -04:00
Tom Lane	4867d7f62f	Avoid out-of-memory in a hash join with many duplicate inner keys. The executor is capable of splitting buckets during a hash join if too much memory is being used by a small number of buckets. However, this only helps if a bucket's population is actually divisible; if all the hash keys are alike, the tuples still end up in the same new bucket. This can result in an OOM failure if there are enough inner keys with identical hash values. The planner's cost estimates will bias it against choosing a hash join in such situations, but not by so much that it will never do so. To mitigate the OOM hazard, explicitly estimate the hash bucket space needed by just the inner side's most common value, and if that would exceed work_mem then add disable_cost to the hash cost estimate. This approach doesn't account for the possibility that two or more common values would share the same hash value. On the other hand, work_mem is normally a fairly conservative bound, so that eating two or more times that much space is probably not going to kill us. If we have no stats about the inner side, ignore this consideration. There was some discussion of making a conservative assumption, but that would effectively result in disabling hash join whenever we lack stats, which seems like an overreaction given how seldom the problem manifests in the field. Per a complaint from David Hinkle. Although this could be viewed as a bug fix, the lack of similar complaints weighs against back- patching; indeed we waited for v11 because it seemed already rather late in the v10 cycle to be making plan choice changes like this one. Discussion: https://postgr.es/m/32013.1487271761@sss.pgh.pa.us	2017-08-15 14:05:53 -04:00
Robert Haas	e139f1953f	Assorted preparatory refactoring for partition-wise join. Instead of duplicating the logic to search for a matching ParamPathInfo in multiple places, factor it out into a separate function. Pass only the relevant bits of the PartitionKey to partition_bounds_equal instead of the whole thing, because partition-wise join will want to call this without having a PartitionKey available. Adjust allow_star_schema_join and calc_nestloop_required_outer to take relevant Relids rather than the entire Path, because partition-wise join will want to call it with the top-parent relids to determine whether a child join is allowable. Ashutosh Bapat. Review and testing of the larger patch set of which this is a part by Amit Langote, Rajkumar Raghuwanshi, Rafia Sabih, Thomas Munro, Dilip Kumar, and me. Discussion: http://postgr.es/m/CA+TgmobQK80vtXjAsPZWWXd7c8u13G86gmuLupN+uUJjA+i4nA@mail.gmail.com	2017-08-15 12:30:38 -04:00
Tom Lane	21d304dfed	Final pgindent + perltidy run for v10.	2017-08-14 17:29:33 -04:00
Tom Lane	5b6289c1e0	Handle elog(FATAL) during ROLLBACK more robustly. Stress testing by Andreas Seltenreich disclosed longstanding problems that occur if a FATAL exit (e.g. due to receipt of SIGTERM) occurs while we are trying to execute a ROLLBACK of an already-failed transaction. In such a case, xact.c is in TBLOCK_ABORT state, so that AbortOutOfAnyTransaction would skip AbortTransaction and go straight to CleanupTransaction. This led to an assert failure in an assert-enabled build (due to the ROLLBACK's portal still having a cleanup hook) or without assertions, to a FATAL exit complaining about "cannot drop active portal". The latter's not disastrous, perhaps, but it's messy enough to want to improve it. We don't really want to run all of AbortTransaction in this code path. The minimum required to clean up the open portal safely is to do AtAbort_Memory and AtAbort_Portals. It seems like a good idea to do AtAbort_Memory unconditionally, to be entirely sure that we are starting with a safe CurrentMemoryContext. That means that if the main loop in AbortOutOfAnyTransaction does nothing, we need an extra step at the bottom to restore CurrentMemoryContext = TopMemoryContext, which I chose to do by invoking AtCleanup_Memory. This'll result in calling AtCleanup_Memory twice in many of the paths through this function, but that seems harmless and reasonably inexpensive. The original motivation for the assertion in AtCleanup_Portals was that we wanted to be sure that any user-defined code executed as a consequence of the cleanup hook runs during AbortTransaction not CleanupTransaction. That still seems like a valid concern, and now that we've seen one case of the assertion firing --- which means that exactly that would have happened in a production build --- let's replace the Assert with a runtime check. If we see the cleanup hook still set, we'll emit a WARNING and just drop the hook unexecuted. This has been like this a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/877ey7bmun.fsf@ansel.ydns.eu	2017-08-14 15:43:20 -04:00
Tom Lane	004a9702e0	Remove AtEOXact_CatCache(). The sole useful effect of this function, to check that no catcache entries have positive refcounts at transaction end, has really been obsolete since we introduced ResourceOwners in PG 8.1. We reduced the checks to assertions years ago, so that the function was a complete no-op in production builds. There have been previous discussions about removing it entirely, but consensus up to now was that it had some small value as a cross-check for bugs in the ResourceOwner logic. However, it now emerges that it's possible to trigger these assertions if you hit an assert-enabled backend with SIGTERM during a call to SearchCatCacheList, because that function temporarily increases the refcounts of entries it's intending to add to a catcache list construct. In a normal ERROR scenario, the extra refcounts are cleaned up by SearchCatCacheList's PG_CATCH block; but in a FATAL exit we do a transaction abort and exit without ever executing PG_CATCH handlers. There's a case to be made that this is a generic hazard and we should consider restructuring elog(FATAL) handling so that pending PG_CATCH handlers do get run. That's pretty scary though: it could easily create more problems than it solves. Preliminary stress testing by Andreas Seltenreich suggests that there are not many live problems of this ilk, so we rejected that idea. There are more-localized ways to fix the problem; the most principled one would be to use PG_ENSURE_ERROR_CLEANUP instead of plain PG_TRY. But adding cycles to SearchCatCacheList isn't very appealing. We could also weaken the assertions in AtEOXact_CatCache in some more or less ad-hoc way, but that just makes its raison d'etre even less compelling. In the end, the most reasonable solution seems to be to just remove AtEOXact_CatCache altogether, on the grounds that it's not worth trying to fix it. It hasn't found any bugs for us in many years. Per report from Jeevan Chalke. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAM2+6=VEE30YtRQCZX7_sCFsEpoUkFBV1gZazL70fqLn8rcvBA@mail.gmail.com	2017-08-13 16:15:14 -04:00
Robert Haas	bb5d6e80b1	Improve the error message when creating an empty range partition. The previous message didn't mention the name of the table or the bounds. Put the table name in the primary error message and the bounds in the detail message. Amit Langote, changed slightly by me. Suggestions on the exac phrasing from Tom Lane, David G. Johnston, and Dean Rasheed. Discussion: http://postgr.es/m/CA+Tgmoae6bpwVa-1BMaVcwvCCeOoJ5B9Q9-RHWo-1gJxfPBZ5Q@mail.gmail.com	2017-08-10 13:46:56 -04:00
Tom Lane	9bf4068cc3	Fix datumSerialize infrastructure to not crash on non-varlena data. Commit `1efc7e538` did a poor job of emulating existing logic for touching Datums that might be expanded-object pointers. It didn't check for typlen being -1 first, which meant it could crash on fixed-length pass-by-ref values, and probably on cstring values as well. It also didn't use DatumGetPointer before VARATT_IS_EXTERNAL_EXPANDED, which while currently harmless is not according to documentation nor prevailing style. I also think the lack of any explanation as to why datumSerialize makes these particular nonobvious choices is pretty awful, so fix that. Per report from Jarred Ward. Back-patch to 9.6 where this code came in. Discussion: https://postgr.es/m/6F61E6D2-2F5E-4794-9479-A429BE1CEA4B@simple.com	2017-08-08 19:18:22 -04:00
Alvaro Herrera	f5d54ef97a	Fix typo in comment	2017-08-08 18:34:25 -04:00
Tom Lane	9d4e566999	Remove broken and useless entry-count printing in HASH_DEBUG code. init_htab(), with #define HASH_DEBUG, prints a bunch of hashtable parameters. It used to also print nentries, but commit `44ca4022f` changed that to "hash_get_num_entries(hctl)", which is wrong (the parameter should be "hashp"). Rather than correct the coding, though, let's just remove that field from the printout. The table must be empty, since we just finished building it, so expensively calculating the number of entries is rather pointless. Moreover hash_get_num_entries makes assumptions (about not needing locks) which we could do without in debugging code. Noted by Choi Doo-Won in bug #14764. Back-patch to 9.6 where the faulty code was introduced. Discussion: https://postgr.es/m/20170802032353.8424.12274@wrigleys.postgresql.org	2017-08-02 12:17:08 -04:00
Tom Lane	514f613293	Second try at getting useful errors out of newlocale/_create_locale. The early buildfarm returns for commit `1e165d05f` are pretty awful: not only does Windows not return a useful error, but it looks like a lot of Unix-ish platforms don't either. Given the number of different errnos seen so far, guess that what's really going on is that some newlocale() implementations fail to set errno at all. Hence, let's try zeroing errno just before newlocale() and then if it's still zero report as though it's ENOENT. That should cover the Windows case too. It's clear that we'll have to drop the regression test case, unless we want to maintain a separate expected-file for platforms without HAVE_LOCALE_T. But I'll leave it there awhile longer to see if this actually improves matters or not. Discussion: https://postgr.es/m/CAKKotZS-wcDcofXDCH=sidiuajE+nqHn2CGjLLX78anyDmi3gQ@mail.gmail.com	2017-08-01 17:17:20 -04:00
Tom Lane	1e165d05fe	Try to deliver a sane message for _create_locale() failure on Windows. We were just printing errno, which is certainly not gonna work on Windows. Now, it's not entirely clear from Microsoft's documentation whether _create_locale() adheres to standard Windows error reporting conventions, but let's assume it does and try to map the GetLastError result to an errno. If this turns out not to work, probably the best thing to do will be to assume the error is always ENOENT on Windows. This is a longstanding bug, but given the lack of previous field complaints, I'm not excited about back-patching it. Per report from Murtuza Zabuawala. Discussion: https://postgr.es/m/CAKKotZS-wcDcofXDCH=sidiuajE+nqHn2CGjLLX78anyDmi3gQ@mail.gmail.com	2017-08-01 16:11:51 -04:00
Heikki Linnakangas	c0a15e07cd	Always use 2048 bit DH parameters for OpenSSL ephemeral DH ciphers. 1024 bits is considered weak these days, but OpenSSL always passes 1024 as the key length to the tmp_dh callback. All the code to handle other key lengths is, in fact, dead. To remedy those issues: * Only include hard-coded 2048-bit parameters. * Set the parameters directly with SSL_CTX_set_tmp_dh(), without the callback * The name of the file containing the DH parameters is now a GUC. This replaces the old hardcoded "dh1024.pem" filename. (The files for other key lengths, dh512.pem, dh2048.pem, etc. were never actually used.) This is not a new problem, but it doesn't seem worth the risk and churn to backport. If you care enough about the strength of the DH parameters on old versions, you can create custom DH parameters, with as many bits as you wish, and put them in the "dh1024.pem" file. Per report by Nicolas Guini and Damian Quiroga. Reviewed by Michael Paquier. Discussion: https://www.postgresql.org/message-id/CAMxBoUyjOOautVozN6ofzym828aNrDjuCcOTcCquxjwS-L2hGQ@mail.gmail.com	2017-07-31 22:36:09 +03:00
Tatsuo Ishii	393d47ed0f	Add missing comment in postgresql.conf. current_source requires to restart server to reflect the new value. Per Yugo Nagata and Masahiko Sawada. Back patched to 9.2 and beyond.	2017-07-31 11:24:51 +09:00
Tatsuo Ishii	8b015dd723	Add missing comment in postgresql.conf. dynamic_shared_memory_type requires to restart server to reflect the new value. Per Yugo Nagata and Masahiko Sawada. Back pached to 9.4 and beyond.	2017-07-31 11:06:37 +09:00
Tatsuo Ishii	9fe63092b5	Add missing comment in postgresql.conf. max_logical_replication_workers requires to restart server to reflect the new value. Per Yugo Nagata. Minor editing by me.	2017-07-31 10:46:32 +09:00
Tom Lane	b4af9e3f37	Ensure that pg_get_ruledef()'s output matches pg_get_viewdef()'s. Various cases involving renaming of view columns are handled by having make_viewdef pass down the view's current relation tupledesc to get_query_def, which then takes care to use the column names from the tupledesc for the output column names of the SELECT. For some reason though, we'd missed teaching make_ruledef to do similarly when it is printing an ON SELECT rule, even though this is exactly the same case. The results from pg_get_ruledef would then be different and arguably wrong. In particular, this breaks pre-v10 versions of pg_dump, which in some situations would define views by means of emitting a CREATE RULE ... ON SELECT command. Third-party tools might not be happy either. In passing, clean up some crufty code in make_viewdef; we'd apparently modernized the equivalent code in make_ruledef somewhere along the way, and missed this copy. Per report from Gilles Darold. Back-patch to all supported versions. Discussion: https://postgr.es/m/ec05659a-40ff-4510-fc45-ca9d965d0838@dalibo.com	2017-07-24 15:16:31 -04:00
Tom Lane	278cb43411	Be more consistent about errors for opfamily member lookup failures. Add error checks in some places that were calling get_opfamily_member or get_opfamily_proc and just assuming that the call could never fail. Also, standardize the wording for such errors in some other places. None of these errors are expected in normal use, hence they're just elog not ereport. But they may be handy for diagnosing omissions in custom opclasses. Rushabh Lathia found the oversight in RelationBuildPartitionKey(); I found the others by grepping for all callers of these functions. Discussion: https://postgr.es/m/CAGPqQf2R9Nk8htpv0FFi+FP776EwMyGuORpc9zYkZKC8sFQE3g@mail.gmail.com	2017-07-24 11:23:27 -04:00
Tom Lane	ab2324fd46	Improve comments about partitioned hash table freelists. While I couldn't find any live bugs in commit `44ca4022f`, the comments seemed pretty far from adequate; in particular it was not made plain that "borrowing" entries from other freelists is critical for correctness. Try to improve the commentary. A couple of very minor code style tweaks, as well. Discussion: https://postgr.es/m/10593.1500670709@sss.pgh.pa.us	2017-07-22 18:02:26 -04:00
Dean Rasheed	d363d42bb9	Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition bounds. Previously, UNBOUNDED meant no lower bound when used in the FROM list, and no upper bound when used in the TO list, which was OK for single-column range partitioning, but problematic with multiple columns. For example, an upper bound of (10.0, UNBOUNDED) would not be collocated with a lower bound of (10.0, UNBOUNDED), thus making it difficult or impossible to define contiguous multi-column range partitions in some cases. Fix this by using MINVALUE and MAXVALUE instead of UNBOUNDED to represent a partition column that is unbounded below or above respectively. This syntax removes any ambiguity, and ensures that if one partition's lower bound equals another partition's upper bound, then the partitions are contiguous. Also drop the constraint prohibiting finite values after an unbounded column, and just document the fact that any values after MINVALUE or MAXVALUE are ignored. Previously it was necessary to repeat UNBOUNDED multiple times, which was needlessly verbose. Note: Forces a post-PG 10 beta2 initdb. Report by Amul Sul, original patch by Amit Langote with some additional hacking by me. Discussion: https://postgr.es/m/CAAJ_b947mowpLdxL3jo3YLKngRjrq9+Ej4ymduQTfYR+8=YAYQ@mail.gmail.com	2017-07-21 09:20:47 +01:00
Tom Lane	eb145fdfea	Fix dumping of outer joins with empty qual lists. Normally, a JoinExpr would have empty "quals" only if it came from CROSS JOIN syntax. However, it's possible to get to this state by specifying NATURAL JOIN between two tables with no common column names, and there might be other ways too. The code previously printed no ON clause if "quals" was empty; that's right for CROSS JOIN but syntactically invalid if it's some type of outer join. Fix by printing ON TRUE in that case. This got broken by commit `2ffa740be`, which stopped using NATURAL JOIN syntax in ruleutils output due to its brittleness in the face of column renamings. Back-patch to 9.3 where that commit appeared. Per report from Tushar Ahuja. Discussion: https://postgr.es/m/98b283cd-6dda-5d3f-f8ac-87db8c76a3da@enterprisedb.com	2017-07-20 11:29:36 -04:00
Tom Lane	04a2c7f412	Improve make_tsvector() to handle empty input, and simplify its callers. It seemed a bit silly that each caller of make_tsvector() was laboriously special-casing the situation where no lexemes were found, when it would be easy and much more bullet-proof to make make_tsvector() handle that.	2017-07-18 13:13:47 -04:00
Tom Lane	decb08ebdf	Code review for NextValueExpr expression node type. Add missing infrastructure for this node type, notably in ruleutils.c where its lack could demonstrably cause EXPLAIN to fail. Add outfuncs/readfuncs support. (outfuncs support is useful today for debugging purposes. The readfuncs support may never be needed, since at present it would only matter for parallel query and NextValueExpr should never appear in a parallelizable query; but it seems like a bad idea to have a primnode type that isn't fully supported here.) Teach planner infrastructure that NextValueExpr is a volatile, parallel-unsafe, non-leaky expression node with cost cpu_operator_cost. Given its limited scope of usage, there might be no live bug today from the lack of that knowledge, but it's certainly going to bite us on the rear someday. Teach pg_stat_statements about the new node type, too. While at it, also teach cost_qual_eval() that MinMaxExpr, SQLValueFunction, XmlExpr, and CoerceToDomain should be charged as cpu_operator_cost. Failing to do this for SQLValueFunction was an oversight in my commit `0bb51aa96`. The others are longer-standing oversights, but no time like the present to fix them. (In principle, CoerceToDomain could have cost much higher than this, but it doesn't presently seem worth trying to examine the domain's constraints here.) Modify execExprInterp.c to execute NextValueExpr as an out-of-line function; it seems quite unlikely to me that it's worth insisting that it be inlined in all expression eval methods. Besides, providing the out-of-line function doesn't stop anyone from inlining if they want to. Adjust some places where NextValueExpr support had been inserted with the aid of a dartboard rather than keeping it in the same order as elsewhere. Discussion: https://postgr.es/m/23862.1499981661@sss.pgh.pa.us	2017-07-14 15:25:43 -04:00
Tom Lane	a3ca72ae9a	Fix dumping of FUNCTION RTEs that contain non-function-call expressions. The grammar will only accept something syntactically similar to a function call in a function-in-FROM expression. However, there are various ways to input something that ruleutils.c won't deparse that way, potentially leading to a view or rule that fails dump/reload. Fix by inserting a dummy CAST around anything that isn't going to deparse as a function (which is one of the ways to get something like that in there in the first place). In HEAD, also make use of the infrastructure added by this to avoid emitting unnecessary parentheses in CREATE INDEX deparsing. I did not change that in back branches, thinking that people might find it to be unexpected/unnecessary behavioral change. In HEAD, also fix incorrect logic for when to add extra parens to partition key expressions. Somebody apparently thought they could get away with simpler logic than pg_get_indexdef_worker has, but they were wrong --- a counterexample is PARTITION BY LIST ((a[1])). Ignoring the prettyprint flag for partition expressions isn't exactly a nice solution anyway. This has been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/10477.1499970459@sss.pgh.pa.us	2017-07-13 19:25:03 -04:00
Tom Lane	bc2d716ad0	Fix ruleutils.c for domain-over-array cases, too. Further investigation shows that ruleutils isn't quite up to speed either for cases where we have a domain-over-array: it needs to be prepared to look past a CoerceToDomain at the top level of field and element assignments, else it decompiles them incorrectly. Potentially this would result in failure to dump/reload a rule, if it looked like the one in the new test case. (I also added a test for EXPLAIN; that output isn't broken, but clearly we need more test coverage here.) Like commit `b1cb32fb6`, this bug is reachable in cases we already support, so back-patch all the way.	2017-07-12 18:00:04 -04:00
Tom Lane	512f67c8d0	Avoid integer overflow while sifting-up a heap in tuplesort.c. If the number of tuples in the heap exceeds approximately INT_MAX/2, this loop's calculation "2*i+1" could overflow, resulting in a crash. Fix it by using unsigned int rather than int for the relevant local variables; that shouldn't cost anything extra on any popular hardware. Per bug #14722 from Sergey Koposov. Original patch by Sergey Koposov, modified by me per a suggestion from Heikki Linnakangas to use unsigned int not int64. Back-patch to 9.4, where tuplesort.c grew the ability to sort as many as INT_MAX tuples in-memory (commit `263865a48`). Discussion: https://postgr.es/m/20170629161637.1478.93109@wrigleys.postgresql.org	2017-07-12 13:24:16 -04:00
Peter Eisentraut	d8b3c81335	Refine memory allocation in ICU conversions The simple calculations done to estimate the size of the output buffers for ucnv_fromUChars() and ucnv_toUChars() could overflow int32_t for large strings. To avoid that, go the long way and run the function first without an output buffer to get the correct output buffer size requirement.	2017-07-01 23:08:37 -04:00
Peter Eisentraut	13a57710db	Prohibit creating ICU collation with different ctype ICU does not support "collate" and "ctype" being different, so the collctype catalog column is ignored. But for catalog neatness, ensure that they are the same.	2017-06-30 11:24:00 -04:00
Tom Lane	f13ea95f9e	Change pg_ctl to detect server-ready by watching status in postmaster.pid. Traditionally, "pg_ctl start -w" has waited for the server to become ready to accept connections by attempting a connection once per second. That has the major problem that connection issues (for instance, a kernel packet filter blocking traffic) can't be reliably told apart from server startup issues, and the minor problem that if server startup isn't quick, we accumulate "the database system is starting up" spam in the server log. We've hacked around many of the possible connection issues, but it resulted in ugly and complicated code in pg_ctl.c. In commit `c61559ec3`, I changed the probe rate to every tenth of a second. That prompted Jeff Janes to complain that the log-spam problem had become much worse. In the ensuing discussion, Andres Freund pointed out that we could dispense with connection attempts altogether if the postmaster were changed to report its status in postmaster.pid, which "pg_ctl start" already relies on being able to read. This patch implements that, teaching postmaster.c to report a status string into the pidfile at the same state-change points already identified as being of interest for systemd status reporting (cf commit `7d17e683f`). pg_ctl no longer needs to link with libpq at all; all its functions now depend on reading server files. In support of this, teach AddToDataDirLockFile() to allow addition of postmaster.pid lines in not-necessarily-sequential order. This is needed on Windows where the SHMEM_KEY line will never be written at all. We still have the restriction that we don't want to truncate the pidfile; document the reasons for that a bit better. Also, fix the pg_ctl TAP tests so they'll notice if "start -w" mode is broken --- before, they'd just wait out the sixty seconds until the loop gives up, and then report success anyway. (Yes, I found that out the hard way.) While at it, arrange for pg_ctl to not need to #include miscadmin.h; as a rather low-level backend header, requiring that to be compilable client-side is pretty dubious. This requires moving the #define's associated with the pidfile into a new header file, and moving PG_BACKEND_VERSIONSTR someplace else. For lack of a clearly better "someplace else", I put it into port.h, beside the declaration of find_other_exec(), since most users of that macro are passing the value to find_other_exec(). (initdb still depends on miscadmin.h, but at least pg_ctl and pg_upgrade no longer do.) In passing, fix main.c so that PG_BACKEND_VERSIONSTR actually defines the output of "postgres -V", which remarkably it had never done before. Discussion: https://postgr.es/m/CAMkU=1xJW8e+CTotojOMBd-yzUvD0e_JZu2xHo=MnuZ4__m7Pg@mail.gmail.com	2017-06-28 17:31:32 -04:00
Tom Lane	00c5e511b9	Minor code review for parse_phrase_operator(). Fix its header comment, which described the old behavior of the <N> phrase distance operator; we missed updating that in commit `028350f61`. Also, reset errno before strtol() call, to defend against the possibility that it was already ERANGE at entry. (The lack of complaints says that it generally isn't, but this is at least a latent bug.) Very minor stylistic improvements as well. Victor Drobny noted the obsolete comment, I noted the errno issue. Back-patch to 9.6 where this code was added, just in case the errno issue is a live bug in some cases. Discussion: https://postgr.es/m/2b5382fdff9b1f79d5eb2c99c4d2cbe2@postgrespro.ru	2017-06-26 10:31:10 -04:00
Simon Riggs	a15b47df35	Fix typo in comment in SerializeSnapshot Author: Masahiko Sawada	2017-06-24 13:51:26 +01:00
Simon Riggs	829f12e269	Revert `1f30295eab` Reported-by: Tom Lane	2017-06-24 13:03:55 +01:00
Tom Lane	b6159202c9	Fix memory leakage in ICU encoding conversion, and other code review. Callers of icu_to_uchar() neglected to pfree the result string when done with it. This results in catastrophic memory leaks in varstr_cmp(), because of our prevailing assumption that btree comparison functions don't leak memory. For safety, make all the call sites clean up leaks, though I suspect that we could get away without it in formatting.c. I audited callers of icu_from_uchar() as well, but found no places that seemed to have a comparable issue. Add function API specifications for icu_to_uchar() and icu_from_uchar(); the lack of any thought-through specification is perhaps not unrelated to the existence of this bug in the first place. Fix icu_to_uchar() to guarantee a nul-terminated result; although no existing caller appears to care, the fact that it would have been nul-terminated except in extreme corner cases seems ideally designed to bite someone on the rear someday. Fix ucnv_fromUChars() destCapacity argument --- in the worst case, that could perhaps have led to a non-nul-terminated result, too. Fix icu_from_uchar() to have a more reasonable definition of the function result --- no callers are actually paying attention, so this isn't a live bug, but it's certainly sloppily designed. Const-ify icu_from_uchar()'s input string for consistency. That is not the end of what needs to be done to these functions, but it's as much as I have the patience for right now. Discussion: https://postgr.es/m/1955.1498181798@sss.pgh.pa.us	2017-06-23 12:22:06 -04:00
Tom Lane	780b3a4c43	Manually un-break a few URLs that pgindent used to insist on splitting. These will no longer get re-split by pgindent runs, so it's worth cleaning them up now. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 16:02:08 -04:00
Tom Lane	382ceffdf7	Phase 3 of pgindent updates. Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:35:54 -04:00
Tom Lane	c7b8998ebb	Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit `e3860ffa4d` wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:19:25 -04:00
Tom Lane	e3860ffa4d	Initial pgindent run with pg_bsd_indent version 2.0. The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname varname", as well as some similar cases for const- or volatile-qualified pointers. Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 14:39:04 -04:00
Tom Lane	9ef2dbefc7	Final pgindent run with old pg_bsd_indent (version 1.3). This is just to have a clean basis for comparison with the results of the new version (which will indeed end up reverting some of these changes...) Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 14:09:24 -04:00
Tom Lane	a69dfe5f40	Don't downcase entries within shared_preload_libraries et al. load_libraries(), which processes the various xxx_preload_libraries GUCs, was parsing them using SplitIdentifierString() which isn't really appropriate for values that could be path names: it downcases unquoted text, and it doesn't allow embedded whitespace unless quoted. Use SplitDirectoriesString() instead. That also allows us to simplify load_libraries() a bit, since canonicalize_path() is now done for it. While this definitely seems like a bug fix, it has the potential to break configuration settings that accidentally worked before because of the downcasing behavior. Also, there's an easy workaround for the bug, namely to double-quote troublesome text. Hence, no back-patch. QL Zhuo, tweaked a bit by me Discussion: https://postgr.es/m/CAB-oJtxHVDc3H+Km3CjB9mY1VDzuyaVH_ZYSz7iXcRqCtb93Ew@mail.gmail.com	2017-06-20 13:03:29 -04:00
Peter Eisentraut	41839b7abc	Fix ICU collation use on Windows Windows uses a separate code path for libc locales. The code previously ended up there also if an ICU collation should be used, leading to a crash. Reported-by: Ashutosh Sharma <ashu.coek88@gmail.com>	2017-06-16 10:08:54 -04:00
Andres Freund	6c2003f8a1	Don't force-assign transaction id when exporting a snapshot. Previously we required every exported transaction to have an xid assigned. That was used to check that the exporting transaction is still running, which in turn is needed to guarantee that that necessary rows haven't been removed in between exporting and importing the snapshot. The exported xid caused unnecessary problems with logical decoding, because slot creation has to wait for all concurrent xid to finish, which in turn serializes concurrent slot creation. It also prohibited snapshots to be exported on hot-standby replicas. Instead export the virtual transactionid, which avoids the unnecessary serialization and the inability to export snapshots on standbys. This changes the file name of the exported snapshot, but since we never documented what that one means, that seems ok. Author: Petr Jelinek, slightly editorialized by me Reviewed-By: Andres Freund Discussion: https://postgr.es/m/f598b4b8-8cd7-0d54-0939-adda763d8c34@2ndquadrant.com	2017-06-14 11:57:21 -07:00
Robert Haas	b08df9cab7	Teach predtest.c about CHECK clauses to fix partitioning bugs. In a CHECK clause, a null result means true, whereas in a WHERE clause it means false. predtest.c provided different functions depending on which set of semantics applied to the predicate being proved, but had no option to control what a null meant in the clauses provided as axioms. Add one. Use that in the partitioning code when figuring out whether the validation scan on a new partition can be skipped. Rip out the old logic that attempted (not very successfully) to compensate for the absence of the necessary support in predtest.c. Ashutosh Bapat and Robert Haas, reviewed by Amit Langote and incorporating feedback from Tom Lane. Discussion: http://postgr.es/m/CAFjFpReT_kq_uwU_B8aWDxR7jNGE=P0iELycdq5oupi=xSQTOw@mail.gmail.com	2017-06-14 13:13:11 -04:00
Tom Lane	651902deb1	Re-run pgindent. This is just to have a clean base state for testing of Piotr Stefaniak's latest version of FreeBSD indent. I fixed up a couple of places where pgindent would have changed format not-nicely. perltidy not included. Discussion: https://postgr.es/m/VI1PR03MB119959F4B65F000CA7CD9F6BF2CC0@VI1PR03MB1199.eurprd03.prod.outlook.com	2017-06-13 13:05:59 -04:00
Peter Eisentraut	e11e24b1ed	Formatting improvements in config file samples	2017-06-09 14:38:33 -04:00
Robert Haas	3106829513	Use NIL rather than NULL to represent an empty list. Just to be tidy. Amit Langote Discussion: http://postgr.es/m/9297f80f-e4ab-7dda-33d4-8580bab6d634@lab.ntt.co.jp	2017-06-06 11:21:22 -04:00
Andres Freund	6e1dd2773e	Unify SIGHUP handling between normal and walsender backends. Because walsender and normal backends share the same main loop it's problematic to have two different flag variables, set in signal handlers, indicating a pending configuration reload. Only certain walsender commands reach code paths checking for the variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT ... LOGICAL, notably not base backups). This is a bug present since the introduction of walsender, but has gotten worse in releases since then which allow walsender to do more. A later patch, not slated for v10, will similarly unify SIGHUP handling in other types of processes as well. Author: Petr Jelinek, Andres Freund Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20170423235941.qosiuoyqprq4nu7v@alap3.anarazel.de Backpatch: 9.2-, bug is present since 9.0	2017-06-05 19:18:16 -07:00
Tom Lane	e7941a9766	Replace over-optimistic Assert in partitioning code with a runtime test. get_partition_parent felt that it could simply Assert that systable_getnext found a tuple. This is unlike any other caller of that function, and it's unsafe IMO --- in fact, the reason I noticed it was that the Assert failed. (OK, I was working with known-inconsistent catalog contents, but I wasn't expecting the DB to fall over quite that violently. The behavior in a non-assert-enabled build wouldn't be very nice, either.) Fix it to do what other callers do, namely an actual runtime-test-and-elog. Also, standardize the wording of elog messages that are complaining about unexpected failure of systable_getnext. 90% of them say "could not find tuple for <object>", so make the remainder do likewise. Many of the holdouts were using the phrasing "cache lookup failed", which is outright misleading since no catcache search is involved.	2017-06-04 16:20:03 -04:00
Alvaro Herrera	55a70a023c	Assorted translatable string fixes Mark our rusage reportage string translatable; remove quotes from type names; unify formatting of very similar messages.	2017-06-04 11:41:16 -04:00
Tom Lane	5936d25f81	Remove dead variables. Commit `512c7356b` left a couple of variables unused except for being set. My compiler didn't whine about this, but some buildfarm members did.	2017-06-03 20:35:52 -04:00
Tom Lane	512c7356b6	Fix <> and pattern-NOT-match estimators to handle nulls correctly. These estimators returned 1 minus the corresponding equality/match estimate, which is incorrect: we need to subtract off the fraction of nulls in the column, since those are neither equal nor not equal to the comparison value. The error only becomes obvious if the nullfrac is large, but it could be very bad in a mostly-nulls column, as reported in bug #14676 from Marko Tiikkaja. To fix the <> case, refactor eqsel() and neqsel() to call a common support routine, which can be made to account for nullfrac correctly. The pattern-match cases were already factored that way, and it was simply an oversight that patternsel() wasn't subtracting off nullfrac. neqjoinsel() has a similar problem, but since we're elsewhere discussing changing its behavior entirely, I left it alone for now. This is a very longstanding bug, but I'm hesitant to back-patch a fix for it. Given the lack of prior complaints, such cases must not come up often, so it's probably not worth the risk of destabilizing plans in stable branches. Discussion: https://postgr.es/m/20170529153847.4275.95416@wrigleys.postgresql.org	2017-06-03 14:36:25 -04:00
Magnus Hagander	acbd8375e9	Fix copy/paste mistake in comment Amit Langote	2017-06-02 11:18:24 +02:00
Tom Lane	54e839fe29	Sort syscache identifiers into alphabetical order. Not much point in having a convention about this if we don't enforce it. Mark Dilger Discussion: https://postgr.es/m/7F67FBEF-C3B3-404E-8EC6-E02ACB15D894@gmail.com	2017-05-30 18:47:13 -04:00
Tom Lane	68cff231e3	Make edge-case behavior of jsonb_populate_record match json_populate_record json_populate_record throws an error if asked to convert a JSON scalar or array into a composite type. jsonb_populate_record was returning a record full of NULL fields instead. It seems better to make it throw an error for this case as well. Nikita Glukhov Discussion: https://postgr.es/m/fbd1d566-bba0-a3de-d6d0-d3b1d7c24ff2@postgrespro.ru	2017-05-29 19:29:42 -04:00
Tom Lane	e45c5be99d	Fix thinko in JsObjectSize() macro. The macro gave the wrong answers for a JsObject with is_json == 0: it would return 1 if jsonb_cont == NULL, or if that wasn't NULL, it would return 1 for any non-zero size. We could fix that, but the only use of this macro at present is in the JsObjectIsEmpty() macro, so it seems simpler and clearer to get rid of JsObjectSize() and put corrected logic into JsObjectIsEmpty(). Thinko in commit `cf35346e8`, so no need for back-patch. Nikita Glukhov Discussion: https://postgr.es/m/fbd1d566-bba0-a3de-d6d0-d3b1d7c24ff2@postgrespro.ru	2017-05-29 18:51:56 -04:00
Tom Lane	76a3df6e5e	Code review focused on new node types added by partitioning support. Fix failure to check that we got a plain Const from const-simplification of a coercion request. This is the cause of bug #14666 from Tian Bing: there is an int4 to money cast, but it's only stable not immutable (because of dependence on lc_monetary), resulting in a FuncExpr that the code was miserably unequipped to deal with, or indeed even to notice that it was failing to deal with. Add test cases around this coercion behavior. In view of the above, sprinkle the code liberally with castNode() macros, in hope of catching the next such bug a bit sooner. Also, change some functions that were randomly declared to take Node* to take more specific pointer types. And change some struct fields that were declared Node* but could be given more specific types, allowing removal of assorted explicit casts. Place PARTITION_MAX_KEYS check a bit closer to the code it's protecting. Likewise check only-one-key-for-list-partitioning restriction in a less random place. Avoid not-per-project-style usages like !strcmp(...). Fix assorted failures to avoid scribbling on the input of parse transformation. I'm not sure how necessary this is, but it's entirely silly for these functions to be expending cycles to avoid that and not getting it right. Add guards against partitioning on system columns. Put backend/nodes/ support code into an order that matches handling of these node types elsewhere. Annotate the fact that somebody added location fields to PartitionBoundSpec and PartitionRangeDatum but forgot to handle them in outfuncs.c/readfuncs.c. This is fairly harmless for production purposes (since readfuncs.c would just substitute -1 anyway) but it's still bogus. It's not worth forcing a post-beta1 initdb just to fix this, but if we have another reason to force initdb before 10.0, we should go back and clean this up. Contrariwise, somebody added location fields to PartitionElem and PartitionSpec but forgot to teach exprLocation() about them. Consolidate duplicative code in transformPartitionBound(). Improve a couple of error messages. Improve assorted commentary. Re-pgindent the files touched by this patch; this affects a few comment blocks that must have been added quite recently. Report: https://postgr.es/m/20170524024550.29935.14396@wrigleys.postgresql.org	2017-05-28 23:20:28 -04:00
Tom Lane	9ae2661fe1	Tighten checks for whitespace in functions that parse identifiers etc. This patch replaces isspace() calls with scanner_isspace() in functions that are likely to be presented with non-ASCII input. isspace() has the small advantage that it will correctly recognize no-break space in single-byte encodings (such as LATIN1); but it cannot work successfully for any multibyte character, and depending on platform it might return false positive results for some fragments of multibyte characters. That's disastrous for functions that are trying to discard whitespace between valid strings, as noted in bug #14662 from Justin Muise. Even treating no-break space as whitespace is pretty questionable for the usages touched here, because the core scanner would think it is an identifier character. Affected functions are parse_ident(), parseNameAndArgTypes (underlying regprocedurein() and siblings), SplitIdentifierString (used for parsing GUCs and options that are qualified names or lists of names), and SplitDirectoriesString (used for parsing GUCs that are lists of directories). All the functions adjusted here are parsing SQL identifiers and similar constructs, so it's reasonable to insist that their definition of whitespace match the core scanner. So we can hope that this won't cause many backwards-compatibility problems. I've left alone isspace() calls in places that aren't really expecting any non-ASCII input characters, such as float8in(). Back-patch to all supported branches. Discussion: https://postgr.es/m/10129.1495302480@sss.pgh.pa.us	2017-05-24 15:28:34 -04:00
Tom Lane	d761fe2182	Fix precision and rounding issues in money multiplication and division. The cash_div_intX functions applied rint() to the result of the division. That's not merely useless (because the result is already an integer) but it causes precision loss for values larger than 2^52 or so, because of the forced conversion to float8. On the other hand, the cash_mul_fltX functions neglected to apply rint() to their multiplication results, thus possibly causing off-by-one outputs. Per C standard, arithmetic between any integral value and a float value is performed in float format. Thus, cash_mul_flt4 and cash_div_flt4 produced answers good to only about six digits, even when the float value is exact. We can improve matters noticeably by widening the float inputs to double. (It's tempting to consider using "long double" arithmetic if available, but that's probably too much of a stretch for a back-patched fix.) Also, document that cash_div_intX operators truncate rather than round. Per bug #14663 from Richard Pistole. Back-patch to all supported branches. Discussion: https://postgr.es/m/22403.1495223615@sss.pgh.pa.us	2017-05-21 13:05:16 -04:00
Tom Lane	cf5389f5b5	Fix misspelled struct tag. This was evidently intended to match the struct's typedef name, but it didn't quite. Noted while testing find_typedefs.	2017-05-19 15:05:58 -04:00
Peter Eisentraut	7f17ae0ad0	Fix argument name differences Different names were used between function declaration and definition.	2017-05-19 14:47:56 -04:00
Heikki Linnakangas	94884e1c27	Make slab allocator work on platforms with MAXIMUM_ALIGNOF < sizeof(int). Notably, m68k only needs 2-byte alignment. Per report from Christoph Berg. Discussion: https://www.postgresql.org/message-id/20170517193957.fwntkgi6epuso5l2@msg.df7cb.de	2017-05-18 22:22:13 +03:00
Heikki Linnakangas	2df537e43f	Fix typo in comment. Daniel Gustafsson	2017-05-18 10:33:16 +03:00
Bruce Momjian	ce55481032	Post-PG 10 beta1 pgperltidy run	2017-05-17 19:01:23 -04:00
Bruce Momjian	a6fd7b7a5f	Post-PG 10 beta1 pgindent run perltidy run not included.	2017-05-17 16:31:56 -04:00
Tom Lane	c079673dcb	Preventive maintenance in advance of pgindent run. Reformat various places in which pgindent will make a mess, and fix a few small violations of coding style that I happened to notice while perusing the diffs from a pgindent dry run. There is one actual bug fix here: the need-to-enlarge-the-buffer code path in icu_convert_case was obviously broken. Perhaps it's unreachable in our usage? Or maybe this is just sadly undertested.	2017-05-16 20:36:35 -04:00
Tom Lane	f04c9a6146	Standardize terminology for pg_statistic_ext entries. Consistently refer to such an entry as a "statistics object", not just "statistics" or "extended statistics". Previously we had a mismash of terms, accompanied by utter confusion as to whether the term was singular or plural. That's not only grating (at least to the ear of a native English speaker) but could be outright misleading, eg in error messages that seemed to be referring to multiple objects where only one could be meant. This commit fixes the code and a lot of comments (though I may have missed a few). I also renamed two new SQL functions, pg_get_statisticsextdef -> pg_get_statisticsobjdef pg_statistic_ext_is_visible -> pg_statistics_obj_is_visible to conform better with this terminology. I have not touched the SGML docs other than fixing those function names; the docs certainly need work but it seems like a separable task. Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us	2017-05-14 10:55:01 -04:00
Tom Lane	9aab83fc50	Redesign get_attstatsslot()/free_attstatsslot() for more safety and speed. The mess cleaned up in commit `da0759600` is clear evidence that it's a bug hazard to expect the caller of get_attstatsslot()/free_attstatsslot() to provide the correct type OID for the array elements in the slot. Moreover, we weren't even getting any performance benefit from that, since get_attstatsslot() was extracting the real type OID from the array anyway. So we ought to get rid of that requirement; indeed, it would make more sense for get_attstatsslot() to pass back the type OID it found, in case the caller isn't sure what to expect, which is likely in binary- compatible-operator cases. Another problem with the current implementation is that if the stats array element type is pass-by-reference, we incur a palloc/memcpy/pfree cycle for each element. That seemed acceptable when the code was written because we were targeting O(10) array sizes --- but these days, stats arrays are almost always bigger than that, sometimes much bigger. We can save a significant number of cycles by doing one palloc/memcpy/pfree of the whole array. Indeed, in the now-probably-common case where the array is toasted, that happens anyway so this method is basically free. (Note: although the catcache code will inline any out-of-line toasted values, it doesn't decompress them. At the other end of the size range, it doesn't expand short-header datums either. In either case, DatumGetArrayTypeP would have to make a copy. We do end up using an extra array copy step if the element type is pass-by-value and the array length is neither small enough for a short header nor large enough to have suffered compression. But that seems like a very acceptable price for winning in pass-by-ref cases.) Hence, redesign to take these insights into account. While at it, convert to an API in which we fill a struct rather than passing a bunch of pointers to individual output arguments. That will make it less painful if we ever want further expansion of what get_attstatsslot can pass back. It's certainly arguable that this is new development and not something to push post-feature-freeze. However, I view it as primarily bug-proofing and therefore something that's better to have sooner not later. Since we aren't quite at beta phase yet, let's put it in. Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us	2017-05-13 15:14:39 -04:00
Robert Haas	1848b73d45	Teach \d+ to show partitioning constraints. The fact that we didn't have this in the first place is likely why the problem fixed by `f8bffe9e6d` escaped detection. Patch by Amit Langote, reviewed and slightly adjusted by me. Discussion: http://postgr.es/m/CA+TgmoYWnV2GMnYLG-Czsix-E1WGAbo4D+0tx7t9NdfYBDMFsA@mail.gmail.com	2017-05-13 12:04:53 -04:00
Alvaro Herrera	d99d58cdc8	Complete tab completion for DROP STATISTICS Tab-completing DROP STATISTICS would only work if you started writing the schema name containing the statistics object, because the visibility clause was missing. To add it, we need to add SQL-callable support for testing visibility of a statistics object, like all other object types already have. Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us	2017-05-13 01:05:48 -03:00
Tom Lane	2df5d46555	Avoid searching for callback functions in CallSyscacheCallbacks(). We have now grown enough registerable syscache-invalidation callback functions that the original assumption that there would be few of them is causing performance problems. In particular, let's fix things so that CallSyscacheCallbacks doesn't have to search the whole array to find which callback(s) to invoke for a given cache ID. Preserve the original behavior that callbacks are called in order of registration, just in case there's someplace that depends on that (which I doubt). In support of this, export the number of syscaches from syscache.h. People could have found that out anyway from the enum, but adding a #define makes that much safer. This provides a useful additional speedup in Mathieu Fenniak's logical-decoding test case, although we're reaching the point of diminishing returns there. I think any further improvement will have to come from reducing the number of cache invalidations that are triggered in the first place. Still, we can hope that this change gives some incremental benefit for all invalidation scenarios. Back-patch to 9.4 where logical decoding was introduced. Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com	2017-05-12 19:05:27 -04:00
Tom Lane	8085a4f751	Reduce initial size of RelfilenodeMapHash. A test case provided by Mathieu Fenniak shows that hash_seq_search'ing this hashtable can consume a very significant amount of overhead during logical decoding, which triggers frequent cache invalidation. Testing suggests that the actual population of the hashtable is often no more than a few dozen entries, so we can cut the overhead just by dropping the initial number of buckets down from 1024 --- I chose to cut it to 64. (In situations where we do have a significant number of entries, we shouldn't get any real penalty from doing this, as the dynahash.c code will resize the hashtable automatically.) This gives a further factor-of-two savings in Mathieu's test case. That may be overly optimistic for real-world benefit, as real cases may have larger average table populations, but it's hard to see it turning into a net negative for any workload. Back-patch to 9.4 where relfilenodemap.c was introduced. Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com	2017-05-12 18:30:17 -04:00
Tom Lane	50ee1c7462	Avoid searching for the target catcache in CatalogCacheIdInvalidate. A test case provided by Mathieu Fenniak shows that the initial search for the target catcache in CatalogCacheIdInvalidate consumes a very significant amount of overhead in cases where cache invalidation is triggered but has little useful work to do. There is no good reason for that search to exist at all, as the index array maintained by syscache.c allows direct lookup of the catcache from its ID. We just need a frontend function in syscache.c, matching the division of labor for most other cache-accessing operations. While there's more that can be done in this area, this patch alone reduces the runtime of Mathieu's example by 2X. We can hope that it offers some useful benefit in other cases too, although usually cache invalidation overhead is not such a striking fraction of the total runtime. Back-patch to 9.4 where logical decoding was introduced. It might be worth going further back, but presently the only case we know of where cache invalidation is really a significant burden is in logical decoding. Also, older branches have fewer catcaches, reducing the possible benefit. (Note: although this nominally changes catcache's API, we have always documented CatalogCacheIdInvalidate as a private function, so I would have little sympathy for an external module calling it directly. So backpatching should be fine.) Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com	2017-05-12 18:17:29 -04:00
Alvaro Herrera	bc085205c8	Change CREATE STATISTICS syntax Previously, we had the WITH clause in the middle of the command, where you'd specify both generic options as well as statistic types. Few people liked this, so this commit changes it to remove the WITH keyword from that clause and makes it accept statistic types only. (We currently don't have any generic options, but if we invent in the future, we will gain a new WITH clause, probably at the end of the command). Also, the column list is now specified without parens, which makes the whole command look more similar to a SELECT command. This change will let us expand the command to supporting expressions (not just columns names) as well as multiple tables and their join conditions. Tom added lots of code comments and fixed some parts of the CREATE STATISTICS reference page, too; more changes in this area are forthcoming. He also fixed a potential problem in the alter_generic regression test, reducing verbosity on a cascaded drop to avoid dependency on message ordering, as we do in other tests. Tom also closed a security bug: we documented that table ownership was required in order to create a statistics object on it, but didn't actually implement it. Implement tab-completion for statistics objects. This can stand some more improvement. Authors: Alvaro Herrera, with lots of cleanup by Tom Lane Discussion: https://postgr.es/m/20170420212426.ltvgyhnefvhixm6i@alvherre.pgsql	2017-05-12 14:59:35 -03:00
Tom Lane	596a7c8df7	Increase MAX_SYSCACHE_CALLBACKS to provide more room for extensions. Increase from the historical value of 32 to 64. We are up to 31 callers of CacheRegisterSyscacheCallback() in HEAD, so if they were all to be exercised in one process that would leave only one slot for add-on modules. It's probably not possible for that to happen, but still we clearly need more daylight here. (At some point it might be worth making the array dynamically resizable; but since we've never heard a complaint of "out of syscache_callback_list slots" happening in the field, I doubt it's worth it yet.) Back-patch as far as 9.4, which is where we increased the companion limit MAX_RELCACHE_CALLBACKS (cf commit `f01d1ae3a`). It's not as urgent in released branches, which have only a couple dozen call sites in core, but it still seems that somebody might hit the limit before these branches die. Discussion: https://postgr.es/m/12184.1494450131@sss.pgh.pa.us	2017-05-11 14:51:21 -04:00
Tom Lane	d10c626de4	Rename WAL-related functions and views to use "lsn" not "location". Per discussion, "location" is a rather vague term that could refer to multiple concepts. "LSN" is an unambiguous term for WAL locations and should be preferred. Some function names, view column names, and function output argument names used "lsn" already, but others used "location", as well as yet other terms such as "wal_position". Since we've already renamed a lot of things in this area from "xlog" to "wal" for v10, we may as well incur a bit more compatibility pain and make these names all consistent. David Rowley, minor additional docs hacking by me Discussion: https://postgr.es/m/CAKJS1f8O0njDKe8ePFQ-LK5-EjwThsDws6ohJ-+c6nWK+oUxtg@mail.gmail.com	2017-05-11 11:49:59 -04:00
Robert Haas	622c82279d	Avoid theoretical infinite loop loading relcache partition key. Amit Langote, per report from 甄明洋 Discussion: http://postgr.es/m/57bd1e1.1886.15bd7b79cee.Coremail.18612389267@yeah.net	2017-05-09 23:53:35 -04:00
Peter Eisentraut	489b96e80b	Improve memory use in logical replication apply Previously, the memory used by the logical replication apply worker for processing messages would never be freed, so that could end up using a lot of memory. To improve that, change the existing ApplyContext memory context to ApplyMessageContext and reset that after every message (similar to MessageContext used elsewhere). For consistency of naming, rename the ApplyCacheContext to ApplyContext. Author: Stas Kelvich <s.kelvich@postgrespro.ru>	2017-05-09 14:51:49 -04:00
Tom Lane	da07596006	Further patch rangetypes_selfuncs.c's statistics slot management. Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8, not of the type of the column the statistics are for. This bug is at least partly the fault of sloppy specification comments for get_attstatsslot()/free_attstatsslot(): the type OID they want is that of the stavalues entries, not of the underlying column. (I double-checked other callers and they seem to get this right.) Adjust the comments to be more correct. Per buildfarm. Security: CVE-2017-7484	2017-05-08 15:03:14 -04:00
Tom Lane	b6576e5914	Fix possibly-uninitialized variable. Oversight in `e2d4ef8de` et al (my fault not Peter's). Per buildfarm. Security: CVE-2017-7484	2017-05-08 11:18:40 -04:00
Peter Eisentraut	e2d4ef8de8	Add security checks to selectivity estimation functions Some selectivity estimation functions run user-supplied operators over data obtained from pg_statistic without security checks, which allows those operators to leak pg_statistic data without having privileges on the underlying tables. Fix by checking that one of the following is satisfied: (1) the user has table or column privileges on the table underlying the pg_statistic data, or (2) the function implementing the user-supplied operator is leak-proof. If neither is satisfied, planning will proceed as if there are no statistics available. At least one of these is satisfied in most cases in practice. The only situations that are negatively impacted are user-defined or not-leak-proof operators on a security-barrier view. Reported-by: Robert Haas <robertmhaas@gmail.com> Author: Peter Eisentraut <peter_e@gmx.net> Author: Tom Lane <tgl@sss.pgh.pa.us> Security: CVE-2017-7484	2017-05-08 09:26:32 -04:00
Heikki Linnakangas	eb61136dc7	Remove support for password_encryption='off' / 'plain'. Storing passwords in plaintext hasn't been a good idea for a very long time, if ever. Now seems like a good time to finally forbid it, since we're messing with this in PostgreSQL 10 anyway. Remove the CREATE/ALTER USER UNENCRYPTED PASSSWORD 'foo' syntax, since storing passwords unencrypted is no longer supported. ENCRYPTED PASSWORD 'foo' is still accepted, but ENCRYPTED is now just a noise-word, it does the same as just PASSWORD 'foo'. Likewise, remove the --unencrypted option from createuser, but accept --encrypted as a no-op for backward compatibility. AFAICS, --encrypted was a no-op even before this patch, because createuser encrypted the password before sending it to the server even if --encrypted was not specified. It added the ENCRYPTED keyword to the SQL command, but since the password was already in encrypted form, it didn't make any difference. The documentation was not clear on whether that was intended or not, but it's moot now. Also, while password_encryption='on' is still accepted as an alias for 'md5', it is now marked as hidden, so that it is not listed as an accepted value in error hints, for example. That's not directly related to removing 'plain', but it seems better this way. Reviewed by Michael Paquier Discussion: https://www.postgresql.org/message-id/16e9b768-fd78-0b12-cfc1-7b6b7f238fde@iki.fi	2017-05-08 11:26:07 +03:00
Simon Riggs	1f30295eab	Remove poorly worded and duplicated comment Move line of code to avoid need for duplicated comment Brought to attention by Masahiko Sawada	2017-05-08 08:49:28 +01:00
Andres Freund	b58c433ef9	Fix duplicated words in comment. Reported-By: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wzn3rY2N0gTWndaApD113T+O8L6oz8cm7_F3P8y4awdoOg@mail.gmail.com Backpatch: no, only present in master	2017-05-06 17:03:45 -07:00
Peter Eisentraut	0de791ed76	Fix cursor_to_xml in tableforest false mode It only produced <row> elements but no wrapping <table> element. By contrast, cursor_to_xmlschema produced a schema that is now correct but did not previously match the XML data produced by cursor_to_xml. In passing, also fix a minor misunderstanding about moving cursors in the tests related to this. Reported-by: filip@jirsak.org Based-on-patch-by: Thomas Munro <thomas.munro@enterprisedb.com>	2017-05-03 21:41:10 -04:00
Tom Lane	23c6eb0336	Remove create_singleton_array(), hard-coding the case in its sole caller. create_singleton_array() was not really as useful as we perhaps thought when we added it. It had never accreted more than one call site, and is only saving a dozen lines of code at that one, which is considerably less bulk than the function itself. Moreover, because of its insistence on using the caller's fn_extra cache space, it's arguably a coding hazard. text_to_array_internal() does not currently use fn_extra in any other way, but if it did it would be subtly broken, since the conflicting fn_extra uses could be needed within a single query, in the seldom-tested case that the field separator varies during the query. The same objection seems likely to apply to any other potential caller. The replacement code is a bit uglier, because it hardwires knowledge of the storage parameters of type TEXT, but it's not like we haven't got dozens or hundreds of other places that do the same. Uglier seems like a good tradeoff for smaller, faster, and safer. Per discussion with Neha Khatri. Discussion: https://postgr.es/m/CAFO0U+_fS5SRhzq6uPG+4fbERhoA9N2+nPrtvaC9mmeWivxbsA@mail.gmail.com	2017-05-02 20:41:37 -04:00
Magnus Hagander	34fc616738	Change hot_standby default value to 'on' This goes together with the changes made to enable replication on the sending side by default (wal_level, max_wal_senders etc) by making the receiving stadby node also enable it by default. Huong Dangminh	2017-05-02 11:12:30 +02:00
Tom Lane	54affb41e7	Improve function header comment for create_singleton_array(). Mentioning the caller is neither future-proof nor an adequate substitute for giving an API specification. Per gripe from Neha Khatri, though I changed the patch around some. Discussion: https://postgr.es/m/CAFO0U+_fS5SRhzq6uPG+4fbERhoA9N2+nPrtvaC9mmeWivxbsA@mail.gmail.com	2017-05-01 15:31:41 -04:00
Robert Haas	5e1ccd4844	In load_relcache_init_file, initialize rd_pdcxt. Oversight noted by Gao Zeng Qi. Discussion: http://postgr.es/m/CAFmBtr1N3-SbepJbnGpaYp=jw-FvWMnYY7-bTtRgvjvbyB8YJA@mail.gmail.com	2017-04-28 14:05:13 -04:00
Stephen Frost	0c76c2463e	pg_get_partkeydef: return NULL for non-partitions Our general rule for pg_get_X(oid) functions is to simply return NULL when passed an invalid or inappropriate OID. Teach pg_get_partkeydef to do this also, making it easier for users to use this function when querying against tables with both partitions and non-partitions (such as pg_class). As a concrete example, this makes pg_dump's life a little easier. Author: Amit Langote	2017-04-26 14:59:22 -04:00
Robert Haas	914ae8d3cb	Adjust outdated comment. Commit `5dfc198146` removed the only existing caller of hash_freeze, but left behind a comment indicating that hash_freeze was still used. Adjust. Kyotaro Horiguchi Discussion: http://postgr.es/m/20170424.165541.230634914.horiguchi.kyotaro@lab.ntt.co.jp	2017-04-25 10:58:45 -04:00
Fujii Masao	7cc14ae9d8	Update copyright in recently added files. This commit also fixes copyright line missed by the automated script. Author: Masahiko Sawada	2017-04-25 23:38:41 +09:00
Heikki Linnakangas	b977780a9b	Also fix comment in sample postgresql.conf file, for "scram-sha-256". Reported offlist by hubert depesz lubaczewski.	2017-04-18 17:38:32 +03:00
Heikki Linnakangas	c727f120ff	Rename "scram" to "scram-sha-256" in pg_hba.conf and password_encryption. Per discussion, plain "scram" is confusing because we actually implement SCRAM-SHA-256 rather than the original SCRAM that uses SHA-1 as the hash algorithm. If we add support for SCRAM-SHA-512 or some other mechanism in the SCRAM family in the future, that would become even more confusing. Most of the internal files and functions still use just "scram" as a shorthand for SCRMA-SHA-256, but I did change PASSWORD_TYPE_SCRAM to PASSWORD_TYPE_SCRAM_SHA_256, as that could potentially be used by 3rd party extensions that hook into the password-check hook. Michael Paquier did this in an earlier version of the SCRAM patch set already, but I didn't include that in the version that was committed. Discussion: https://www.postgresql.org/message-id/fde71ff1-5858-90c8-99a9-1c2427e7bafb@iki.fi	2017-04-18 14:50:50 +03:00
Alvaro Herrera	ee6922112e	Rename columns in new pg_statistic_ext catalog The new catalog reused a column prefix "sta" from pg_statistic, but this is undesirable, so change the catalog to use prefix "stx" instead. Also, rename the column that lists enabled statistic kinds as "stxkind" rather than "enabled". Discussion: https://postgr.es/m/CAKJS1f_2t5jhSN7huYRFH3w3rrHfG2QU7hiUHsu-Vdjd1rYT3w@mail.gmail.com	2017-04-17 18:34:29 -03:00
Peter Eisentraut	6275f5d28a	Fix new warnings from GCC 7 This addresses the new warning types -Wformat-truncation -Wformat-overflow that are part of -Wall, via -Wformat, in GCC 7.	2017-04-17 13:59:46 -04:00
Tom Lane	32470825d3	Avoid passing function pointers across process boundaries. We'd already recognized that we can't pass function pointers across process boundaries for functions in loadable modules, since a shared library could get loaded at different addresses in different processes. But actually the practice doesn't work for functions in the core backend either, if we're using EXEC_BACKEND. This is the cause of recent failures on buildfarm member culicidae. Switch to passing a string function name in all cases. Something like this needs to be back-patched into 9.6, but let's see if the buildfarm likes it first. Petr Jelinek, with a bunch of basically-cosmetic adjustments by me Discussion: https://postgr.es/m/548f9c1d-eafa-e3fa-9da8-f0cc2f654e60@2ndquadrant.com	2017-04-14 23:50:16 -04:00
Tom Lane	5e39f06cfe	Move bootstrap-time lookup of regproc OIDs into genbki.pl. Formerly, the bootstrap backend looked up the OIDs corresponding to names in regproc catalog entries using brute-force searches of pg_proc. It was somewhat remarkable that that worked at all, since it was used while populating other pretty-fundamental catalogs like pg_operator. And it was also quite slow, and getting slower as pg_proc gets bigger. This patch moves the lookup work into genbki.pl, so that the values in postgres.bki for regproc columns are always numeric OIDs, an option that regprocin() already supported. Perl isn't the world's speediest language, so this about doubles the time needed to run genbki.pl (from 0.3 to 0.6 sec on my machine). But we only do that at most once per build. The time needed to run initdb drops significantly --- on my machine, initdb --no-sync goes from 1.8 to 1.3 seconds. So this is a small net win even for just one initdb per build, and it becomes quite a nice win for test sequences requiring many initdb runs. Strip out the now-dead code for brute-force catalog searching in regprocin. We'd also cargo-culted similar logic into regoperin and some (not all) of the other reg*in functions. That is all dead code too since we currently have no need to load such values during bootstrap. I removed it all, reasoning that if we ever need such functionality it'd be much better to do it in a similar way to this patch. There might be some simplifications possible in the backend now that regprocin doesn't require doing catalog reads so early in bootstrap. I've not looked into that, though. Andreas Karlsson, with some small adjustments by me Discussion: https://postgr.es/m/30896.1492006367@sss.pgh.pa.us	2017-04-13 12:07:57 -04:00
Simon Riggs	2c2ecddcff	Mention pg_index changes also cause relcache invalidation Amit Langote, additional line by me	2017-04-13 10:07:21 +01:00
Robert Haas	6599c9ac33	Add an Assert() to max_parallel_workers enforcement. To prevent future bugs along the lines of the one corrected by commit `8ff518699f`, or find any that remain in the current code, add an Assert() that the difference between parallel_register_count and parallel_terminate_count is in a sane range. Kuntal Ghosh, with considerable tidying-up by me, per a suggestion from Neha Khatri. Reviewed by Tomas Vondra. Discussion: http://postgr.es/m/CAFO0U+-E8yzchwVnvn5BeRDPgX2z9vZUxQ8dxx9c0XFGBC7N1Q@mail.gmail.com	2017-04-11 13:03:44 -04:00
Fujii Masao	ff7bce1743	Add max_sync_workers_per_subscription to postgresql.conf.sample. This commit also does - add REPLICATION_SUBSCRIBERS into config_group - mark max_logical_replication_workers and max_sync_workers_per_subscription as REPLICATION_SUBSCRIBERS parameters - move those parameters into "Subscribers" section in postgresql.conf.sample Author: Masahiko Sawada, Petr Jelinek and me Reported-by: Masahiko Sawada Discussion: http://postgr.es/m/CAD21AoAonSCoa=v=87ZO3vhfUZA1k_E2XRNHTt=xioWGUa+0ug@mail.gmail.com	2017-04-12 00:10:54 +09:00
Andres Freund	c45b1d2283	Fix initialization of dsa.c free area counter. The backend local copy of dsa_area_control->freed_segment_counter was not properly initialized / maintained. This could, if unlucky, lead to keeping attached to a segment for too long. Found via valgrind bleat on buildfarm animal skink. Author: Thomas Munro Discussion: https://postgr.es/m/20170407164935.obsf2jipjfos5zei@alap3.anarazel.de	2017-04-10 11:56:46 -07:00
Tom Lane	8f0530f580	Improve castNode notation by introducing list-extraction-specific variants. This extends the castNode() notation introduced by commit `5bcab1114` to provide, in one step, extraction of a list cell's pointer and coercion to a concrete node type. For example, "lfirst_node(Foo, lc)" is the same as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode that have appeared so far include a list extraction call, so this is pretty widely useful, and it saves a few more keystrokes compared to the old way. As with the previous patch, back-patch the addition of these macros to pg_list.h, so that the notation will be available when back-patching. Patch by me, after an idea of Andrew Gierth's. Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us	2017-04-10 13:51:53 -04:00
Tom Lane	511540dadf	Move isolationtester's is-blocked query into C code for speed. Commit `4deb41381` modified isolationtester's query to see whether a session is blocked to also check for waits occurring in GetSafeSnapshot. However, it did that in a way that enormously increased the query's runtime under CLOBBER_CACHE_ALWAYS, causing the buildfarm members that use that to run about four times slower than before, and in some cases fail entirely. To fix, push the entire logic into a dedicated backend function. This should actually reduce the CLOBBER_CACHE_ALWAYS runtime from what it was previously, though I've not checked that. In passing, expose a SQL function to check for safe-snapshot blockage, comparable to pg_blocking_pids. This is more or less free given the infrastructure built to solve the other problem, so we might as well. Thomas Munro Discussion: https://postgr.es/m/20170407165749.pstcakbc637opkax@alap3.anarazel.de	2017-04-10 10:26:54 -04:00
Kevin Grittner	c63172d60f	Add GUCs for predicate lock promotion thresholds. Defaults match the fixed behavior of prior releases, but now DBAs have better options to tune serializable workloads. It might be nice to be able to set this per relation, but that part will need to wait for another release. Author: Dagfinn Ilmari Mannsåker	2017-04-07 21:38:05 -05:00
Robert Haas	5c4488478b	Use English, instead of internal names, for translatable messages. Discussion: http://postgr.es/m/CA+Tgmobuz2C-YiQ87h8h0gECCV=F+SE=HBNaAU75rR5FEwtEhQ@mail.gmail.com	2017-04-07 15:38:46 -04:00
Heikki Linnakangas	0c732850d2	Remove duplicate assignment. Harmless, but clearly wrong. Kyotaro Horiguchi	2017-04-07 19:19:50 +03:00
Andrew Dunstan	88dd4e4831	Remove extraneous comma to satisfy picky compiler per buildfarm	2017-04-06 23:28:14 -04:00
Andrew Dunstan	cf35346e81	Make json_populate_record and friends operate recursively With this change array fields are populated from json(b) arrays, and composite fields are populated from json(b) objects. Along the way, some significant code refactoring is done to remove redundancy in the way to populate_record[_set] and to_record[_set] functions operate, and some significant efficiency gains are made by caching tuple descriptors. Nikita Glukhov, edited some by me. Reviewed by Aleksander Alekseev and Tom Lane.	2017-04-06 22:22:13 -04:00
Simon Riggs	ac2b095088	Reset API of clause_selectivity() Discussion: https://postgr.es/m/CAKJS1f9yurJQW9pdnzL+rmOtsp2vOytkpXKGnMFJEO-qz5O5eA@mail.gmail.com	2017-04-06 19:10:51 -04:00
Andres Freund	fa117ee403	Allow avoiding tuple copy within tuplesort_gettupleslot(). Add a "copy" argument to make it optional to receive a copy of caller tuple that is safe to use following a subsequent manipulating of tuplesort's state. This is a performance optimization. Most existing tuplesort_gettupleslot() callers are made to opt out of copying. Existing callers that happen to rely on the validity of tuple memory beyond subsequent manipulations of the tuplesort request their own copy. This brings tuplesort_gettupleslot() in line with tuplestore_gettupleslot(). In the future, a "copy" tuplesort_getdatum() argument may be added, that similarly allows callers to opt out of receiving their own copy of tuple. In passing, clarify assumptions that callers of other tuplesort fetch routines may make about tuple memory validity, per gripe from Tom Lane. Author: Peter Geoghegan Discussion: CAM3SWZQWZZ_N=DmmL7tKy_OUjGH_5mN=N=A6h7kHyyDvEhg2DA@mail.gmail.com	2017-04-06 14:48:59 -07:00
Alvaro Herrera	7e534adcdc	Fix BRIN cost estimation The original code was overly optimistic about the cost of scanning a BRIN index, leading to BRIN indexes being selected when they'd be a worse choice than some other index. This complete rewrite should be more accurate. Author: David Rowley, based on an earlier patch by Emre Hasegeli Reviewed-by: Emre Hasegeli Discussion: https://postgr.es/m/CAKJS1f9n-Wapop5Xz1dtGdpdqmzeGqQK4sV2MK-zZugfC14Xtw@mail.gmail.com	2017-04-06 17:51:53 -03:00
Alvaro Herrera	b1fc51a36e	Comment fixes for extended statistics Clean up some code comments in new extended statistics code, from `7b504eb282`.	2017-04-06 12:28:50 -03:00
Simon Riggs	cd0cebaf7d	Always SnapshotResetXmin() during ClearTransaction() Avoid corner cases during 2PC with `6bad580d9e`	2017-04-06 10:30:22 -04:00
Peter Eisentraut	3217327053	Identity columns This is the SQL standard-conforming variant of PostgreSQL's serial columns. It fixes a few usability issues that serial columns have: - CREATE TABLE / LIKE copies default but refers to same sequence - cannot add/drop serialness with ALTER TABLE - dropping default does not drop sequence - need to grant separate privileges to sequence - other slight weirdnesses because serial is some kind of special macro Reviewed-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>	2017-04-06 08:41:37 -04:00
Simon Riggs	6bad580d9e	Avoid SnapshotResetXmin() during AtEOXact_Snapshot() For normal commits and aborts we already reset PgXact->xmin, so we can simply avoid running SnapshotResetXmin() twice. During performance tests by Alexander Korotkov, diagnosis by Andres Freund showed PgXact array as a bottleneck. After manual analysis by me of the code paths that touch those memory locations, I was able to identify extraneous code in the main transaction commit path. Avoiding touching highly contented shmem improves concurrent performance slightly on all workloads, confirmed by tests run by Ashutosh Sharma and Alexander Korotkov. Simon Riggs Discussion: CANP8+jJdXE9b+b9F8CQT-LuxxO0PBCB-SZFfMVAdp+akqo4zfg@mail.gmail.com	2017-04-06 08:31:52 -04:00
Tom Lane	df1a699e5b	Fix integer-overflow problems in interval comparison. When using integer timestamps, the interval-comparison functions tried to compute the overall magnitude of an interval as an int64 number of microseconds. As reported by Frazer McLean, this overflows for intervals exceeding about 296000 years, which is bad since we nominally allow intervals many times larger than that. That results in wrong comparison results, and possibly in corrupted btree indexes for columns containing such large interval values. To fix, compute the magnitude as int128 instead. Although some compilers have native support for int128 calculations, many don't, so create our own support functions that can do 128-bit addition and multiplication if the compiler support isn't there. These support functions are designed with an eye to allowing the int128 code paths in numeric.c to be rewritten for use on all platforms, although this patch doesn't do that, or even provide all the int128 primitives that will be needed for it. Back-patch as far as 9.4. Earlier releases did not guard against overflow of interval values at all (commit `146604ec4` fixed that), so it seems not very exciting to worry about overly-large intervals for them. Before 9.6, we did not assume that unreferenced "static inline" functions would not draw compiler warnings, so omit functions not directly referenced by timestamp.c, the only present consumer of int128.h. (We could have omitted these functions in HEAD too, but since they were written and debugged on the way to the present patch, and they look likely to be needed by numeric.c, let's keep them in HEAD.) I did not bother to try to prevent such warnings in a --disable-integer-datetimes build, though. Before 9.5, configure will never define HAVE_INT128, so the part of int128.h that exploits a native int128 implementation is dead code in the 9.4 branch. I didn't bother to remove it, thinking that keeping the file looking similar in different branches is more useful. In HEAD only, add a simple test harness for int128.h in src/tools/. In back branches, this does not change the float-timestamps code path. That's not subject to the same kind of overflow risk, since it computes the interval magnitude as float8. (No doubt, when this code was originally written, overflow was disregarded for exactly that reason.) There is a precision hazard instead :-(, but we'll avert our eyes from that question, since no complaints have been reported and that code's deprecated anyway. Kyotaro Horiguchi and Tom Lane Discussion: https://postgr.es/m/1490104629.422698.918452336.26FA96B7@webmail.messagingengine.com	2017-04-05 23:51:27 -04:00
Simon Riggs	2686ee1b7c	Collect and use multi-column dependency stats Follow on patch in the multi-variate statistics patch series. CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t; ANALYZE; will collect dependency stats on (a, b) and then use the measured dependency in subsequent query planning. Commit `7b504eb282` added CREATE STATISTICS with n-distinct coefficients. These are now specified using the mutually exclusive option WITH (ndistinct). Author: Tomas Vondra, David Rowley Reviewed-by: Kyotaro HORIGUCHI, Álvaro Herrera, Dean Rasheed, Robert Haas and many other comments and contributions Discussion: https://postgr.es/m/56f40b20-c464-fad2-ff39-06b668fac47c@2ndquadrant.com	2017-04-05 18:00:42 -04:00
Simon Riggs	9a3215026b	Make min_wal_size/max_wal_size use MB internally Previously they were defined using multiples of XLogSegSize. Remove GUC_UNIT_XSEGS. Introduce GUC_UNIT_MB Extracted from patch series on XLogSegSize infrastructure. Beena Emerson	2017-04-04 18:00:01 -04:00
Andres Freund	490e9a98ff	Fix two valgrind issues in slab allocator. During allocation VALGRIND_MAKE_MEM_DEFINED was called with a pointer as size. That kind of works, but makes valgrind exceedingly slow for workloads involving the slab allocator. Secondly there was an access to memory marked as unreachable within SlabCheck(). Fix that too. Author: Tomas Vondra Discussion: https://postgr.es/m/a6543b6d-6015-99b1-63ef-3ed55a76a730@2ndquadrant.com	2017-04-04 14:26:42 -07:00
Robert Haas	ea69a0dead	Expand hash indexes more gradually. Since hash indexes typically have very few overflow pages, adding a new splitpoint essentially doubles the on-disk size of the index, which can lead to large and abrupt increases in disk usage (and perhaps long delays on occasion). To mitigate this problem to some degree, divide larger splitpoints into four equal phases. This means that, for example, instead of growing from 4GB to 8GB all at once, a hash index will now grow from 4GB to 5GB to 6GB to 7GB to 8GB, which is perhaps still not as smooth as we'd like but certainly an improvement. This changes the on-disk format of the metapage, so bump HASH_VERSION from 2 to 3. This will force a REINDEX of all existing hash indexes, but that's probably a good idea anyway. First, hash indexes from pre-10 versions of PostgreSQL could easily be corrupted, and we don't want to confuse corruption carried over from an older release with any corruption caused despite the new write-ahead logging in v10. Second, it will let us remove some backward-compatibility code added by commit `293e24e507`. Mithun Cy, reviewed by Amit Kapila, Jesper Pedersen and me. Regression test outputs updated by me. Discussion: http://postgr.es/m/CAD__OuhG6F1gQLCgMQNnMNgoCvOLQZz9zKYJQNYvYmmJoM42gA@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoYty0jCf-pa+m+vYUJ716+AxM7nv_syvyanyf5O-L_i2A@mail.gmail.com	2017-04-03 23:46:33 -04:00
Robert Haas	7a39b5e4d1	Abstract logic to allow for multiple kinds of child rels. Currently, the only type of child relation is an "other member rel", which is the child of a baserel, but in the future joins and even upper relations may have child rels. To facilitate that, introduce macros that test to test for particular RelOptKind values, and use them in various places where they help to clarify the sense of a test. (For example, a test may allow RELOPT_OTHER_MEMBER_REL either because it intends to allow child rels, or because it intends to allow simple rels.) Also, remove find_childrel_top_parent, which will not work for a child rel that is not a baserel. Instead, add a new RelOptInfo member top_parent_relids to track the same kind of information in a more generic manner. Ashutosh Bapat, slightly tweaked by me. Review and testing of the patch set from which this was taken by Rajkumar Raghuwanshi and Rafia Sabih. Discussion: http://postgr.es/m/CA+TgmoagTnF2yqR3PT2rv=om=wJiZ4-A+ATwdnriTGku1CLYxA@mail.gmail.com	2017-04-03 22:41:31 -04:00
Kevin Grittner	41bd155dd6	Fix two undocumented parameters to functions from ENR patch. On ProcessUtility document the parameter, to match others. On CreateCachedPlan drop the queryEnv parameter. It was not referenced within the function, and had been added on the assumption that with some unknown future usage of QueryEnvironment it might be useful to do something there. We have avoided other "just in case" implementation of unused paramters, so drop it here. Per gripe from Tom Lane	2017-04-01 15:21:05 -05:00
Kevin Grittner	18ce3a4ab2	Add infrastructure to support EphemeralNamedRelation references. A QueryEnvironment concept is added, which allows new types of objects to be passed into queries from parsing on through execution. At this point, the only thing implemented is a collection of EphemeralNamedRelation objects -- relations which can be referenced by name in queries, but do not exist in the catalogs. The only type of ENR implemented is NamedTuplestore, but provision is made to add more types fairly easily. An ENR can carry its own TupleDesc or reference a relation in the catalogs by relid. Although these features can be used without SPI, convenience functions are added to SPI so that ENRs can easily be used by code run through SPI. The initial use of all this is going to be transition tables in AFTER triggers, but that will be added to each PL as a separate commit. An incidental effect of this patch is to produce a more informative error message if an attempt is made to modify the contents of a CTE from a referencing DML statement. No tests previously covered that possibility, so one is added. Kevin Grittner and Thomas Munro Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro with valuable comments and suggestions from many others	2017-03-31 23:17:18 -05:00
Robert Haas	9a12ad042d	Fix typos. Brandur Leach	2017-03-31 20:18:11 -04:00
Andrew Dunstan	c80b9920fc	Transform or iterate over json(b) string values Dmitry Dolgov, reviewed and lightly edited by me.	2017-03-31 14:25:25 -04:00
Simon Riggs	25fff40798	Default monitoring roles Three nologin roles with non-overlapping privs are created by default * pg_read_all_settings - read all GUCs. * pg_read_all_stats - pg_stat_, pg_database_size(), pg_tablespace_size() pg_stat_scan_tables - may lock/scan tables Top level role - pg_monitor includes all of the above by default, plus others Author: Dave Page Reviewed-by: Stephen Frost, Robert Haas, Peter Eisentraut, Simon Riggs	2017-03-30 14:18:53 -04:00
Andres Freund	5ded4bd214	Remove support for version-0 calling conventions. The V0 convention is failure prone because we've so far assumed that a function is V0 if PG_FUNCTION_INFO_V1 is missing, leading to crashes if a function was coded against the V1 interface. V0 doesn't allow proper NULL, SRF and toast handling. V0 doesn't offer features that V1 doesn't. Thus remove V0 support and obsolete fmgr README contents relating to it. Author: Andres Freund, with contributions by Peter Eisentraut & Craig Ringer Reviewed-By: Peter Eisentraut, Craig Ringer Discussion: https://postgr.es/m/20161208213441.k3mbno4twhg2qf7g@alap3.anarazel.de	2017-03-30 06:25:46 -07:00
Teodor Sigaev	f90d23d0c5	Implement SortSupport for macaddr data type Introduces a scheme to produce abbreviated keys for the macaddr type. Bump catalog version. Author: Brandur Leach Reviewed-by: Julien Rouhaud, Peter Geoghegan https://commitfest.postgresql.org/13/743/	2017-03-29 23:28:56 +03:00
Robert Haas	fddf45b380	Plug race in dsa_attach. With sufficiently bad luck, it was possible for a parallel worker to attempt attach to a DSA area after all other backends have detached from it, which is not legal. If the worker had waited a little longer to get started, the DSM itself would have been destroyed, which is why this wasn't noticed before. Thomas Munro, per a report from Andreas Seltenreich Discussion: http://postgr.es/m/87h92g83t3.fsf@credativ.de	2017-03-29 09:48:39 -04:00
Peter Eisentraut	4cb824699e	Cast result of copyObject() to correct type copyObject() is declared to return void , which allows easily assigning the result independent of the input, but it loses all type checking. If the compiler supports typeof or something similar, cast the result to the input type. This creates a greater amount of type safety. In some cases, where the result is assigned to a generic type such as Node or Expr *, new casts are now necessary, but in general casts are now unnecessary in the normal case and indicate that something unusual is happening. Reviewed-by: Mark Dilger <hornschnorter@gmail.com>	2017-03-28 21:59:23 -04:00
Alvaro Herrera	ce96ce60ca	Remove direct uses of ItemPointer.{ip_blkid,ip_posid} There are no functional changes here; this simply encapsulates knowledge of the ItemPointerData struct so that a future patch can change things without more breakage. All direct users of ip_blkid and ip_posid are changed to use existing macros ItemPointerGetBlockNumber and ItemPointerGetOffsetNumber respectively. For callers where that's inappropriate (because they Assert that the itempointer is is valid-looking), add ItemPointerGetBlockNumberNoCheck and ItemPointerGetOffsetNumberNoCheck, which lack the assertion but are otherwise identical. Author: Pavan Deolasee Discussion: https://postgr.es/m/CABOikdNnFon4cJiL=h1mZH3bgUeU+sWHuU4Yr8AB=j3A2p1GiA@mail.gmail.com	2017-03-28 19:02:23 -03:00
Alvaro Herrera	1f171a1803	Fix thinko in estimate_num_groups The code for the reworked n-distinct estimation on commit `7b504eb282` was written differently in a previous version of the patch, prior to commit; on rewriting it, we missed updating an initializer. This caused the code to (mistakenly) apply a fudge factor even in the case where a single value is applied, leading to incorrect results. This means that the 'relvarcount' variable name is now wrong. Add a comment to try and make the situation clearer, and remove an incorrect comment I added. Problem noticed, and code patch, by Tomas Vondra. Additional commentary by Álvaro.	2017-03-27 13:14:23 -03:00
Peter Eisentraut	3371e4d9b1	Change default of log_directory to 'log' The previous default 'pg_log' might have indicated by its "pg_" prefix that it is an internal system directory. The new default is more in line with the typical naming of directories with user-facing log files. Together with the renaming of pg_clog and pg_xlog, this should clear up that difference. Author: Andreas Karlsson <andreas@proxel.se>	2017-03-27 10:34:33 -04:00
Peter Eisentraut	facde2a98f	Clean up Perl code according to perlcritic Fix all perlcritic warnings of severity level 5, except in src/backend/utils/Gen_dummy_probes.pl, which is automatically generated. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Daniel Gustafsson <daniel@yesql.se>	2017-03-27 08:18:22 -04:00
Alvaro Herrera	2c3e47527a	Fix a couple of problems in pg_get_statisticsextdef There was a thinko whereby we tested the wrong tuple after fetching it from cache; avoid that by using generate_relation_name instead, which is simpler. Also, the statistics name was not qualified, so add that. (It could be argued that qualification should be conditional on the schema not being on search path. We can add that later, but at least this form is correct.) Author: David Rowley, Álvaro Herrera Discussion: https://postgr.es/m/CAKJS1f8RjLeVZJ2+93pdQGuZJeBF-ifsHaFMR-q-6-Z0qxA8cA@mail.gmail.com	2017-03-27 01:03:50 -03:00
Robert Haas	fc70a4b0df	Show more processes in pg_stat_activity. Previously, auxiliary processes and background workers not connected to a database (such as the logical replication launcher) weren't shown. Include them, so that we can see the associated wait state information. Add a new column to identify the processes type, so that people can filter them out easily using SQL if they wish. Before this patch was written, there was discussion about whether we should expose this information in a separate view, so as to avoid contaminating pg_stat_activity with things people might not want to see. But putting everything in pg_stat_activity was a more popular choice, so that's what the patch does. Kuntal Ghosh, reviewed by Amit Langote and Michael Paquier. Some revisions and bug fixes by me. Discussion: http://postgr.es/m/CA+TgmoYES5nhkEGw9nZXU8_FhA8XEm8NTm3-SO+3ML1B81Hkww@mail.gmail.com	2017-03-26 22:02:22 -04:00
Tom Lane	244dd95ce9	Update some obsolete comments. Fix a few stray references to expression eval functions that don't exist anymore or don't take the same input representation they used to.	2017-03-26 11:36:46 -04:00
Andres Freund	b8d7f053c5	Faster expression evaluation and targetlist projection. This replaces the old, recursive tree-walk based evaluation, with non-recursive, opcode dispatch based, expression evaluation. Projection is now implemented as part of expression evaluation. This both leads to significant performance improvements, and makes future just-in-time compilation of expressions easier. The speed gains primarily come from: - non-recursive implementation reduces stack usage / overhead - simple sub-expressions are implemented with a single jump, without function calls - sharing some state between different sub-expressions - reduced amount of indirect/hard to predict memory accesses by laying out operation metadata sequentially; including the avoidance of nearly all of the previously used linked lists - more code has been moved to expression initialization, avoiding constant re-checks at evaluation time Future just-in-time compilation (JIT) has become easier, as demonstrated by released patches intended to be merged in a later release, for primarily two reasons: Firstly, due to a stricter split between expression initialization and evaluation, less code has to be handled by the JIT. Secondly, due to the non-recursive nature of the generated "instructions", less performance-critical code-paths can easily be shared between interpreted and compiled evaluation. The new framework allows for significant future optimizations. E.g.: - basic infrastructure for to later reduce the per executor-startup overhead of expression evaluation, by caching state in prepared statements. That'd be helpful in OLTPish scenarios where initialization overhead is measurable. - optimizing the generated "code". A number of proposals for potential work has already been made. - optimizing the interpreter. Similarly a number of proposals have been made here too. The move of logic into the expression initialization step leads to some backward-incompatible changes: - Function permission checks are now done during expression initialization, whereas previously they were done during execution. In edge cases this can lead to errors being raised that previously wouldn't have been, e.g. a NULL array being coerced to a different array type previously didn't perform checks. - The set of domain constraints to be checked, is now evaluated once during expression initialization, previously it was re-built every time a domain check was evaluated. For normal queries this doesn't change much, but e.g. for plpgsql functions, which caches ExprStates, the old set could stick around longer. The behavior around might still change. Author: Andres Freund, with significant changes by Tom Lane, changes by Heikki Linnakangas Reviewed-By: Tom Lane, Heikki Linnakangas Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de	2017-03-25 14:52:06 -07:00
Peter Eisentraut	066e3a68ae	Fix locale pointer use in WIN32 code path Author: David Rowley <david.rowley@2ndquadrant.com>	2017-03-25 00:38:12 -04:00
Alvaro Herrera	2e0c919bce	Fix typo in comment	2017-03-24 17:20:55 -03:00
Simon Riggs	3428ef7911	Reverting `42b4b0b241` Buildfarm issues and other reported issues	2017-03-24 17:56:17 +00:00
Alvaro Herrera	7b504eb282	Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql	2017-03-24 14:06:10 -03:00
Robert Haas	857ee8e391	Add a txid_status function. If your connection to the database server is lost while a COMMIT is in progress, it may be difficult to figure out whether the COMMIT was successful or not. This function will tell you, provided that you don't wait too long to ask. It may be useful in other situations, too. Craig Ringer, reviewed by Simon Riggs and by me Discussion: http://postgr.es/m/CAMsr+YHQiWNEi0daCTboS40T+V5s_+dst3PYv_8v2wNVH+Xx4g@mail.gmail.com	2017-03-24 12:00:53 -04:00
Simon Riggs	42b4b0b241	Avoid SnapshotResetXmin() during AtEOXact_Snapshot() For normal commits and aborts we already reset PgXact->xmin Avoiding touching highly contented shmem improves concurrent performance. Simon Riggs Discussion: CANP8+jJdXE9b+b9F8CQT-LuxxO0PBCB-SZFfMVAdp+akqo4zfg@mail.gmail.com	2017-03-24 14:20:59 +00:00
Peter Eisentraut	524e0f7ac8	Fix crash in ICU patch This only happened with single-byte encodings.	2017-03-23 16:31:39 -04:00
Peter Eisentraut	eccfef81e1	ICU support Add a column collprovider to pg_collation that determines which library provides the collation data. The existing choices are default and libc, and this adds an icu choice, which uses the ICU4C library. The pg_locale_t type is changed to a union that contains the provider-specific locale handles. Users of locale information are changed to look into that struct for the appropriate handle to use. Also add a collversion column that records the version of the collation when it is created, and check at run time whether it is still the same. This detects potentially incompatible library upgrades that can corrupt indexes and other structures. This is currently only supported by ICU-provided collations. initdb initializes the default collation set as before from the `locale -a` output but also adds all available ICU locales with a "-x-icu" appended. Currently, ICU-provided collations can only be explicitly named collations. The global database locales are still always libc-provided. ICU support is enabled by configure --with-icu. Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2017-03-23 15:28:48 -04:00
Simon Riggs	232c532213	Minor spelling correction in comment Jon Nelson	2017-03-23 15:29:42 +00:00
Peter Eisentraut	7c4f52409a	Logical replication support for initial data copy Add functionality for a new subscription to copy the initial data in the tables and then sync with the ongoing apply process. For the copying, add a new internal COPY option to have the COPY source data provided by a callback function. The initial data copy works on the subscriber by receiving COPY data from the publisher and then providing it locally into a COPY that writes to the destination table. A WAL receiver can now execute full SQL commands. This is used here to obtain information about tables and publications. Several new options were added to CREATE and ALTER SUBSCRIPTION to control whether and when initial table syncing happens. Change pg_dump option --no-create-subscription-slots to --no-subscription-connect and use the new CREATE SUBSCRIPTION ... NOCONNECT option for that. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Tested-by: Erik Rijkers <er@xs4all.nl>	2017-03-23 08:55:37 -04:00
Robert Haas	d3cc37f1d8	Don't scan partitioned tables. Partitioned tables do not contain any data; only their unpartitioned descendents need to be scanned. However, the partitioned tables still need to be locked, even though they're not scanned. To make that work, Append and MergeAppend relations now need to carry a list of (unscanned) partitioned relations that must be locked, and InitPlan must lock all partitioned result relations. Aside from the obvious advantage of avoiding some work at execution time, this has two other advantages. First, it may improve the planner's decision-making in some cases since the empty relation might throw things off. Second, it paves the way to getting rid of the storage for partitioned tables altogether. Amit Langote, reviewed by me. Discussion: http://postgr.es/m/6837c359-45c4-8044-34d1-736756335a15@lab.ntt.co.jp	2017-03-21 09:48:04 -04:00
Teodor Sigaev	d5286aa905	Fix support for some operators (&<, &>, $<\|, \|&>) in box operator class of SP-GiST. Bug exists since initial commit of box opclass for SP-GiST, so backpath to 9.6 Author: Nikita Glukhov with minor editorization of tests by me Reviewed-by: Kyotaro Horiguchi, Anastasia Lubennikova https://commitfest.postgresql.org/13/981/	2017-03-21 16:23:10 +03:00
Andrew Dunstan	29bf501683	Add a direct function call mechanism using the caller's context. The current DirectFunctionCall functions use NULL as the flinfo in initializing the FunctionCallInfoData for the call. That means the called function has no fn_mcxt or fn_extra to work with, and attempting to do so will result in an access violation. These functions instead use the provided flinfo, which will usually be the caller's own flinfo. The caller needs to ensure that it doesn't use the fn_extra in way that is incompatible with the way the called function will use it. The called function should not rely on anything else in the provided context, as it will be relevant to the caller, not the callee. Original code from Tom Lane. Discussion: https://postgr.es/m/db2b70a4-78d7-294a-a315-8e7f506c5978@2ndQuadrant.com	2017-03-21 08:57:46 -04:00
Robert Haas	249cf070e3	Create and use wait events for read, write, and fsync operations. Previous commits, notably `53be0b1add` and `6f3bd98ebf`, made it possible to see from pg_stat_activity when a backend was stuck waiting for another backend, but it's also fairly common for a backend to be stuck waiting for an I/O. Add wait events for those operations, too. Rushabh Lathia, with further hacking by me. Reviewed and tested by Michael Paquier, Amit Kapila, Rajkumar Raghuwanshi, and Rahila Syed. Discussion: http://postgr.es/m/CAGPqQf0LsYHXREPAZqYGVkDqHSyjf=KsD=k0GTVPAuzyThh-VQ@mail.gmail.com	2017-03-18 07:43:01 -04:00
Robert Haas	88e66d193f	Rename "pg_clog" directory to "pg_xact". Names containing the letters "log" sometimes confuse users into believing that only non-critical data is present. It is hoped this renaming will discourage ill-considered removals of transaction status data. Michael Paquier Discussion: http://postgr.es/m/CA+Tgmoa9xFQyjRZupbdEFuwUerFTvC6HjZq1ud6GYragGDFFgA@mail.gmail.com	2017-03-17 09:48:38 -04:00
Robert Haas	befd73c50f	Add pg_ls_logdir() and pg_ls_waldir() functions. These functions are intended to be used by monitoring tools, and, unlike pg_ls_dir(), access to them can be granted to non-superusers, so that those monitoring tools can observe the principle of least privilege. Dave Page, revised by me, and also reviewed a bit by Thomas Munro. Discussion: http://postgr.es/m/CA+OCxow-X=D2fWdKy+HP+vQ1LtrgbsYQ=CshzZBqyFT5jOYrFw@mail.gmail.com	2017-03-16 15:05:02 -04:00
Stephen Frost	cccbddeb14	Be more careful about signed vs. unsigned char The buildfarm has reminded me that not all systems consider char to be signed and we need to be explicit. Adjust the various bits of mac8.c for what we intend, mostly using casts to unsigned char as suggested by Tom, and adjust the tests for valid input accordingly. Explicitly make the hexlookup table signed as it's useful to use -1 there to indicate an invalid value.	2017-03-16 00:13:37 -04:00
Stephen Frost	7821f7229c	Clean up overly paranoid checks in mac8.c Andres' compiler points out, quite correctly, that there's no need for some of the overly paranoid checks which were put into mac8.c. Remove those, as they're useless, add some comments and make a few other minor improvements- reduce the size of hexlookup by making it a char array instead of an int array, and pass in the ptr location directly instead of making hex2_to_uchar re-calculate the location based off the offset every time.	2017-03-15 23:23:28 -04:00
Stephen Frost	c7a9fa399d	Add support for EUI-64 MAC addresses as macaddr8 This adds in support for EUI-64 MAC addresses by adding a new data type called 'macaddr8' (using our usual convention of indicating the number of bytes stored). This was largely a copy-and-paste from the macaddr data type, with appropriate adjustments for having 8 bytes instead of 6 and adding support for converting a provided EUI-48 (6 byte format) to the EUI-64 format. Conversion from EUI-48 to EUI-64 inserts FFFE as the 4th and 5th bytes but does not perform the IPv6 modified EUI-64 action of flipping the 7th bit, but we add a function to perform that specific action for the user as it may be commonly done by users who wish to calculate their IPv6 address based on their network prefix and 48-bit MAC address. Author: Haribabu Kommi, with a good bit of rework of macaddr8_in by me. Reviewed by: Vitaly Burovoy, Kuntal Ghosh Discussion: https://postgr.es/m/CAJrrPGcUi8ZH+KkK+=TctNQ+EfkeCEHtMU_yo1mvX8hsk_ghNQ@mail.gmail.com	2017-03-15 11:16:25 -04:00
Robert Haas	c11453ce0a	hash: Add write-ahead logging support. The warning about hash indexes not being write-ahead logged and their use being discouraged has been removed. "snapshot too old" is now supported for tables with hash indexes. Most importantly, barring bugs, hash indexes will now be crash-safe and usable on standbys. This commit doesn't yet add WAL consistency checking for hash indexes, as we now have for other index types; a separate patch has been submitted to cure that lack. Amit Kapila, reviewed and slightly modified by me. The larger patch series of which this is a part has been reviewed and tested by Álvaro Herrera, Ashutosh Sharma, Mark Kirkwood, Jeff Janes, and Jesper Pedersen. Discussion: http://postgr.es/m/CAA4eK1JOBX=YU33631Qh-XivYXtPSALh514+jR8XeD7v+K3r_Q@mail.gmail.com	2017-03-14 13:27:02 -04:00
Peter Eisentraut	f97a028d8e	Spelling fixes in code comments From: Josh Soref <jsoref@gmail.com>	2017-03-14 12:58:39 -04:00
Heikki Linnakangas	dd12bef58c	Include array size in forward declaration. Some compilers require it. At least Visual Studio, according to the buildfarm, and gcc with the -pedantic flag.	2017-03-13 21:53:38 +02:00
Peter Eisentraut	1e6de941e3	Change xlog to WAL in some error messages	2017-03-13 15:42:10 -04:00

... 4 5 6 7 8 ...

6571 Commits