postgresql

Commit Graph

Author	SHA1	Message	Date
Heikki Linnakangas	b0bea38705	Use FD_CLOEXEC on ListenSockets It's good hygiene if e.g. an extension launches a subprogram when being loaded. We went through some effort to close them in the child process in EXEC_BACKEND mode, but it's better to not hand them down to the child process in the first place. We still need to close them after fork when !EXEC_BACKEND, but it's a little simpler. In the passing, LOG a message if closing the client connection or listen socket fails. Shouldn't happen, but if it does, would be nice to know. Reviewed-by: Tristan Partin, Andres Freund, Thomas Munro Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi	2023-08-24 17:03:05 +03:00
Nathan Bossart	f4b54e1ed9	Introduce macros for protocol characters. This commit introduces descriptively-named macros for the identifiers used in wire protocol messages. These new macros are placed in a new header file so that they can be easily used by third-party code. Author: Dave Cramer Reviewed-by: Alvaro Herrera, Tatsuo Ishii, Peter Smith, Robert Haas, Tom Lane, Peter Eisentraut, Michael Paquier Discussion: https://postgr.es/m/CADK3HHKbBmK-PKf1bPNFoMC%2BoBt%2BpD9PH8h5nvmBQskEHm-Ehw%40mail.gmail.com	2023-08-22 19:16:12 -07:00
Amit Kapila	02c1b64fb1	Refactor to split Apply and Tablesync Workers code. Both apply and tablesync workers were using ApplyWorkerMain() as entry point. As the name implies, ApplyWorkerMain() should be considered as the main function for apply workers. Tablesync worker's path was hidden and does not have enough in common to share the same main function with apply worker. Also, most of the code shared by both worker types is already combined in LogicalRepApplyLoop(). There is no need to combine the rest in ApplyWorkerMain() anymore. This patch introduces TablesyncWorkerMain() as a new entry point for tablesync workers. This aims to increase code readability and would help with future improvements like the reuse of tablesync workers in the initial synchronization. Author: Melih Mutlu based on suggestions by Melanie Plageman Reviewed-by: Peter Smith, Kuroda Hayato, Amit Kapila Discussion: http://postgr.es/m/CAGPVpCTq=rUDd4JUdaRc1XUWf4BrH2gdSNf3rtOMUGj9rPpfzQ@mail.gmail.com	2023-08-03 08:59:50 +05:30
Nathan Bossart	884eee5bfb	Remove db_user_namespace. This feature was intended to be a temporary measure to support per-database user names. A better one hasn't materialized in the ~21 years since it was added, and nobody claims to be using it, so let's just remove it. Reviewed-by: Michael Paquier, Magnus Hagander Discussion: https://postgr.es/m/20230630200509.GA2830328%40nathanxps13 Discussion: https://postgr.es/m/20230630215608.GD2941194%40nathanxps13	2023-07-17 11:44:59 -07:00
Andres Freund	c66a7d75e6	Handle DROP DATABASE getting interrupted Until now, when DROP DATABASE got interrupted in the wrong moment, the removal of the pg_database row would also roll back, even though some irreversible steps have already been taken. E.g. DropDatabaseBuffers() might have thrown out dirty buffers, or files could have been unlinked. But we continued to allow connections to such a corrupted database. To fix this, mark databases invalid with an in-place update, just before starting to perform irreversible steps. As we can't add a new column in the back branches, we use pg_database.datconnlimit = -2 for this purpose. An invalid database cannot be connected to anymore, but can still be dropped. Unfortunately we can't easily add output to psql's \l to indicate that some database is invalid, it doesn't fit in any of the existing columns. Add tests verifying that a interrupted DROP DATABASE is handled correctly in the backend and in various tools. Reported-by: Evgeny Morozov <postgresql3@realityexists.net> Author: Andres Freund <andres@anarazel.de> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/20230509004637.cgvmfwrbht7xm7p6@awork3.anarazel.de Discussion: https://postgr.es/m/20230314174521.74jl6ffqsee5mtug@awork3.anarazel.de Backpatch: 11-, bug present in all supported versions	2023-07-13 13:03:28 -07:00
Nathan Bossart	957845789b	Increase size of bgw_library_name. This commit increases the size of the bgw_library_name member of the BackgroundWorker struct from BGW_MAXLEN (96) bytes to MAXPGPATH (default of 1024) bytes so that it can store longer file names (e.g., absolute paths). Author: Yurii Rashkovskii Reviewed-by: Daniel Gustafsson, Aleksander Alekseev Discussion: https://postgr.es/m/CA%2BRLCQyjFV5Y8tG5QgUb6gjteL4S3p%2B1gcyqWTqigyM93WZ9Pg%40mail.gmail.com	2023-07-03 15:02:16 -07:00
Nathan Bossart	562bee0fc1	Don't truncate database and user names in startup packets. Unlike commands such as CREATE DATABASE, ProcessStartupPacket() does not perform multibyte-aware truncation of overlength names. This means that connection attempts might fail even if the user provides the same overlength names that were used in CREATE DATABASE, CREATE ROLE, etc. Ideally, we'd do the same multibyte- aware truncation in both code paths, but it doesn't seem worth the added complexity of trying to discover the encoding of the names. Instead, let's simply skip truncating the names in the startup packet and let the user/database lookup fail later on. With this change, users must provide the exact names stored in the catalogs, even if the names were truncated. This reverts commit `d18c1d1f51`. Author: Bertrand Drouvot Reviewed-by: Kyotaro Horiguchi, Tom Lane Discussion: https://postgr.es/m/07436793-1426-29b2-f924-db7422a05fb7%40gmail.com	2023-07-03 13:18:05 -07:00
Tom Lane	0245f8db36	Pre-beta mechanical code beautification. Run pgindent, pgperltidy, and reformat-dat-files. This set of diffs is a bit larger than typical. We've updated to pg_bsd_indent 2.1.2, which properly indents variable declarations that have multi-line initialization expressions (the continuation lines are now indented one tab stop). We've also updated to perltidy version 20230309 and changed some of its settings, which reduces its desire to add whitespace to lines to make assignments etc. line up. Going forward, that should make for fewer random-seeming changes to existing code. Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql	2023-05-19 17:24:48 -04:00
Thomas Munro	63932a6d38	Fix wal_writer_flush_after initializer value. Commit `a73952b795` (new in 16) required default values in guc_table.c and C variable initializers to match. This one only matched when XLOG_BLCKSZ == 8kB. Fix by using the same expression in both places with a new DEFAULT_XXX macro, as done for other GUCs. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA+hUKGLNmLV=VrT==5MqnbARgx2ifRSFtdd8ofdfrdSLL3yv5A@mail.gmail.com	2023-05-15 11:19:54 +12:00
Daniel Gustafsson	bfac8f8bc4	Fix vacuum_cost_delay check for balance calculation. Commit `1021bd6a89` excluded autovacuum workers from cost-limit balance calculations when per-relation options were set. The code checks for limit and cost_delay being greater than zero, but since cost_delay can be set to -1 the test needs to check for greater than or zero. Backpatch to all supported branches since `1021bd6a89` was backpatched all the way at the time. Author: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B=vVYczMugKMQ@mail.gmail.com Backpatch-through: v11 (all supported branches)	2023-04-25 13:54:10 +02:00
Daniel Gustafsson	a9781ae11b	Fix autovacuum cost debug logging Commit `7d71d3dd0` introduced finer grained updates of autovacuum option changes by increasing the frequency of reading the configuration file. The debug logging of cost parameter was however changed such that some initial values weren't logged. Fix by changing logging to use the old frequency of logging regardless of them changing. Also avoid taking a log for rendering the log message unless the set loglevel is such that the log entry will be emitted. Author: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B=vVYczMugKMQ@mail.gmail.com	2023-04-20 15:45:44 +02:00
David Rowley	1cbbee0338	Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option Add new options to the VACUUM and ANALYZE commands called BUFFER_USAGE_LIMIT to allow users more control over how large to make the buffer access strategy that is used to limit the usage of buffers in shared buffers. Larger rings can allow VACUUM to run more quickly but have the drawback of VACUUM possibly evicting more buffers from shared buffers that might be useful for other queries running on the database. Here we also add a new GUC named vacuum_buffer_usage_limit which controls how large to make the access strategy when it's not specified in the VACUUM/ANALYZE command. This defaults to 256KB, which is the same size as the access strategy was prior to this change. This setting also controls how large to make the buffer access strategy for autovacuum. Per idea by Andres Freund. Author: Melanie Plageman Reviewed-by: David Rowley Reviewed-by: Andres Freund Reviewed-by: Justin Pryzby Reviewed-by: Bharath Rupireddy Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb@awork3.anarazel.de	2023-04-07 11:40:31 +12:00
Daniel Gustafsson	7d71d3dd08	Refresh cost-based delay params more frequently in autovacuum Allow autovacuum to reload the config file more often so that cost-based delay parameters can take effect while VACUUMing a relation. Previously, autovacuum workers only reloaded the config file once per relation vacuumed, so config changes could not take effect until beginning to vacuum the next table. Now, check if a reload is pending roughly once per block, when checking if we need to delay. In order for autovacuum workers to safely update their own cost delay and cost limit parameters without impacting performance, we had to rethink when and how these values were accessed. Previously, an autovacuum worker's wi_cost_limit was set only at the beginning of vacuuming a table, after reloading the config file. Therefore, at the time that autovac_balance_cost() was called, workers vacuuming tables with no cost-related storage parameters could still have different values for their wi_cost_limit_base and wi_cost_delay. Now that the cost parameters can be updated while vacuuming a table, workers will (within some margin of error) have no reason to have different values for cost limit and cost delay (in the absence of cost-related storage parameters). This removes the rationale for keeping cost limit and cost delay in shared memory. Balancing the cost limit requires only the number of active autovacuum workers vacuuming a table with no cost-based storage parameters. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com	2023-04-07 01:00:21 +02:00
Daniel Gustafsson	a85c60a945	Separate vacuum cost variables from GUCs Vacuum code run both by autovacuum workers and a backend doing VACUUM/ANALYZE previously inspected VacuumCostLimit and VacuumCostDelay, which are the global variables backing the GUCs vacuum_cost_limit and vacuum_cost_delay. Autovacuum workers needed to override these variables with their own values, derived from autovacuum_vacuum_cost_limit and autovacuum_vacuum_cost_delay and worker cost limit balancing logic. This led to confusing code which, in some cases, both derived and set a new value of VacuumCostLimit from VacuumCostLimit. In preparation for refreshing these GUC values more often, introduce new, independent global variables and add a function to update them using the GUCs and existing logic. Per suggestion by Kyotaro Horiguchi Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com	2023-04-07 00:54:53 +02:00
David Rowley	b9b125b9c1	Move various prechecks from vacuum() into ExecVacuum() vacuum() is used for both the VACUUM command and for autovacuum. There were many prechecks being done inside vacuum() that were just not relevant to autovacuum. Let's move the bulk of these into ExecVacuum() so that they're only executed when running the VACUUM command. This removes a small amount of overhead when autovacuum vacuums a table. While we are at it, allocate VACUUM's BufferAccessStrategy in ExecVacuum() and pass it into vacuum() instead of expecting vacuum() to make it if it's not already made by the calling function. To make this work, we need to create the vacuum memory context slightly earlier, so we now need to pass that down to vacuum() so that it's available for use in other memory allocations. Author: Melanie Plageman Reviewed-by: David Rowley Discussion: https://postgr.es/m/20230405211534.4skgskbilnxqrmxg@awork3.anarazel.de	2023-04-06 15:44:52 +12:00
Andres Freund	12f3867f55	bufmgr: Support multiple in-progress IOs by using resowner A future patch will add support for extending relations by multiple blocks at once. To be concurrency safe, the buffers for those blocks need to be marked as BM_IO_IN_PROGRESS. Until now we only had infrastructure for recovering from an IO error for a single buffer. This commit extends that infrastructure to multiple buffers by using the resource owner infrastructure. This commit increases the size of the ResourceOwnerData struct, which appears to have a just about measurable overhead in very extreme workloads. Medium term we are planning to substantially shrink the size of ResourceOwnerData. Short term the increase is small enough to not worry about it for now. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de Discussion: https://postgr.es/m/20221029200025.w7bvlgvamjfo6z44@awork3.anarazel.de	2023-04-05 14:17:55 -07:00
Noah Misch	e33967b13b	Comment on expectations for AutoVacuumWorkItem handlers. This might prevent a repeat of the brin_summarize_range() vulnerability that commit `a117cebd63` fixed.	2023-03-25 13:00:27 -07:00
Thomas Munro	10b6745d31	Small tidyup for commit `d41a178b`, part II. Further to commit `6a9229da`, checking for NULL is now redundant. An "out of memory" error would have been thrown already by palloc() and treated as FATAL, so we can delete a few more lines. Back-patch to all releases, like those other commits. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/4040668.1679013388%40sss.pgh.pa.us	2023-03-17 14:44:12 +13:00
Thomas Munro	6a9229da65	Small tidyup for commit `d41a178b`. A comment was left behind claiming that we needed to use malloc() rather than palloc() because the corresponding free would run in another thread, but that's not true anymore. Remove that comment. And, with the reason being gone, we might as well actually use palloc(). Back-patch to supported releases, like `d41a178b`. Discussion: https://postgr.es/m/CA%2BhUKG%2BpdM9v3Jv4tc2BFx2jh_daY3uzUyAGBhtDkotEQDNPYw%40mail.gmail.com	2023-03-17 10:44:46 +13:00
Thomas Munro	d41a178b3a	Fix waitpid() emulation on Windows. Our waitpid() emulation didn't prevent a PID from being recycled by the OS before the call to waitpid(). The postmaster could finish up tracking more than one child process with the same PID, and confuse them. Fix, by moving the guts of pgwin32_deadchild_callback() into waitpid(), so that resources are released synchronously. The process and PID continue to exist until we close the process handle, which only happens once we're ready to adjust our book-keeping of running children. This seems to explain a couple of failures on CI. It had never been reported before, despite the code being as old as the Windows port. Perhaps Windows started recycling PIDs more rapidly, or perhaps timing changes due to commit `7389aad6` made it more likely to break. Thanks to Alexander Lakhin for analysis and Andres Freund for tracking down the root cause. Back-patch to all supported branches. Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de	2023-03-15 13:24:47 +13:00
Michael Paquier	4211fbd841	Add PROCESS_MAIN to VACUUM Disabling this option is useful to run VACUUM (with or without FULL) on only the toast table of a relation, bypassing the main relation. This option is enabled by default. Running directly VACUUM on a toast table was already possible without this feature, by using the non-deterministic name of a toast relation (as of pg_toast.pg_toast_N, where N would be the OID of the parent relation) in the VACUUM command, and it required a scan of pg_class to know the name of the toast table. So this feature is basically a shortcut to be able to run VACUUM or VACUUM FULL on a toast relation, using only the name of the parent relation. A new switch called --no-process-main is added to vacuumdb, to work as an equivalent of PROCESS_MAIN. Regression tests are added to cover VACUUM and VACUUM FULL, looking at pg_stat_all_tables.vacuum_count to see how many vacuums have run on each table, main or toast. Author: Nathan Bossart Reviewed-by: Masahiko Sawada Discussion: https://postgr.es/m/20221230000028.GA435655@nathanxps13	2023-03-06 16:41:05 +09:00
Michael Paquier	35739b87dc	Redesign archive modules A new callback named startup_cb, called shortly after a module is loaded, is added. This makes possible the initialization of any additional state data required by a module. This initial state data can be saved in a ArchiveModuleState, that is now passed down to all the callbacks that can be defined in a module. With this design, it is possible to have a per-module state, aimed at opening the door to the support of more than one archive module. The initialization of the callbacks is changed so as _PG_archive_module_init() does not anymore give in input a ArchiveModuleCallbacks that a module has to fill in with callback definitions. Instead, a module now needs to return a const ArchiveModuleCallbacks. All the structure and callback definitions of archive modules are moved into their own header, named archive_module.h, from pgarch.h. Command-based archiving follows the same line, with a new set of files named shell_archive.{c,h}. There are a few more items that are under discussion to improve the design of archive modules, like the fact that basic_archive calls sigsetjmp() by itself to define its own error handling flow. These will be adjusted later, the changes done here cover already a good portion of what has been discussed. Any modules created for v15 will need to be adjusted to this new design. Author: Nathan Bossart Reviewed-by: Andres Freund Discussion: https://postgr.es/m/20230130194810.6fztfgbn32e7qarj@awork3.anarazel.de	2023-02-17 14:26:42 +09:00
Andres Freund	30b789eafe	Remove uses of AssertVariableIsOfType() obsoleted by `f2b73c8` Author: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/20230208172705.GA451849@nathanxps13	2023-02-08 21:06:46 -08:00
Robert Haas	8a2f783cc4	Disable STARTUP_PROGRESS_TIMEOUT in standby mode. In standby mode, we don't actually report progress of recovery, but up until now, startup_progress_timeout_handler() nevertheless got called every log_startup_progress_interval seconds. That's an unnecessary expense, so avoid it. Report by Thomas Munro. Patch by Bharath Rupireddy, reviewed by Simon Riggs, Thomas Munro, and me. Back-patch to v15, where the problem was introduced. Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com	2023-02-06 10:51:08 -05:00
Thomas Munro	cdf6518ef0	Retire PG_SETMASK() macro. In the 90s we needed to deal with computers that still had the pre-standard signal masking APIs. That hasn't been relevant for a very long time on Unix systems, and `c94ae9d8` got rid of a remaining dependency in our Windows porting code. PG_SETMASK didn't expose save/restore functionality, so we'd already started using sigprocmask() directly in places, creating the visual distraction of having two ways to spell it. It's not part of the API that extensions are expected to be using (but if they are, the change will be trivial). It seems like a good time to drop the old macro and just call the standard POSIX function. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BKfQgrhHP2DLTohX1WwubaCBHmTzGnAEDPZ-Gug-Xskg%40mail.gmail.com	2023-02-03 11:29:46 +13:00
Michael Paquier	38cc085464	Simplify main waiting loop of the archiver process As coded, the timeout given to WaitLatch() was always equal to PGARCH_AUTOWAKE_INTERVAL, as time() was called two times repeatedly. This simplification could have been done in `d75288f`. While on it, this adjusts a comment in pgarch.c to describe the archiver in a more neutral way. Author: Sravan Kumar, Nathan Bossart Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/CA+=NbjjqYE9-Lnw7H7DAiS5jebmoMikwZQb_sBP7kgBCn9q6Hg@mail.gmail.com	2023-02-01 15:46:04 +09:00
Thomas Munro	8d2c1913ed	Remove unneeded volatile qualifiers from postmaster.c. Several flags were marked volatile and in some cases used sig_atomic_t because they were accessed from signal handlers. After commit `7389aad6`, we can just use unqualified bool. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA%2BhUKGLMoeZNZY6gYdLUQmuoW_a8bKyLvtuZkd_zHcGVOfDzBA%40mail.gmail.com	2023-01-28 15:06:23 +13:00
Tom Lane	3a28d78089	Improve TimestampDifferenceMilliseconds to cope with overflow sanely. We'd like to use TimestampDifferenceMilliseconds with the stop_time possibly being TIMESTAMP_INFINITY, but up to now it's disclaimed responsibility for overflow cases. Define it to clamp its output to the range [0, INT_MAX], handling overflow correctly. (INT_MAX rather than LONG_MAX seems appropriate, because the function is already described as being intended for calculating wait times for WaitLatch et al, and that infrastructure only handles waits up to INT_MAX. Also, this choice gets rid of cross-platform behavioral differences.) Having done that, we can replace some ad-hoc code in walreceiver.c with a simple call to TimestampDifferenceMilliseconds. While at it, fix some buglets in existing callers of TimestampDifferenceMilliseconds: basebackup_copy.c had not read the memo about TimestampDifferenceMilliseconds never returning a negative value, and postmaster.c had not read the memo about Min() and Max() being macros with multiple-evaluation hazards. Neither of these quite seem worth back-patching. Patch by me; thanks to Nathan Bossart for review. Discussion: https://postgr.es/m/3126727.1674759248@sss.pgh.pa.us	2023-01-26 17:09:12 -05:00
Peter Geoghegan	6c6b497266	Revert "Add eager and lazy freezing strategies to VACUUM." This reverts commit `4d41799261`. Broad concerns about regressions caused by eager freezing strategy have been raised. Whether or not these concerns can be worked through in any time frame is far from certain. Discussion: https://postgr.es/m/20230126004347.gepcmyenk2csxrri@awork3.anarazel.de	2023-01-25 22:22:27 -08:00
Peter Geoghegan	4d41799261	Add eager and lazy freezing strategies to VACUUM. Eager freezing strategy avoids large build-ups of all-visible pages. It makes VACUUM trigger page-level freezing whenever doing so will enable the page to become all-frozen in the visibility map. This is useful for tables that experience continual growth, particularly strict append-only tables such as pgbench's history table. Eager freezing significantly improves performance stability by spreading out the cost of freezing over time, rather than doing most freezing during aggressive VACUUMs. It complements the insert autovacuum mechanism added by commit `b07642db`. VACUUM determines its freezing strategy based on the value of the new vacuum_freeze_strategy_threshold GUC (or reloption) with logged tables. Tables that exceed the size threshold use the eager freezing strategy. Unlogged tables and temp tables always use eager freezing strategy, since the added cost is negligible there. Non-permanent relations won't incur any extra overhead in WAL written (for the obvious reason), nor in pages dirtied (since any extra freezing will only take place on pages whose PD_ALL_VISIBLE bit needed to be set either way). VACUUM uses lazy freezing strategy for logged tables that fall under the GUC size threshold. Page-level freezing triggers based on the criteria established in commit `1de58df4`, which added basic page-level freezing. Eager freezing is strictly more aggressive than lazy freezing. Settings like vacuum_freeze_min_age still get applied in just the same way in every VACUUM, independent of the strategy in use. The only mechanical difference between eager and lazy freezing strategies is that only the former applies its own additional criteria to trigger freezing pages. Note that even lazy freezing strategy will trigger freezing whenever a page happens to have required that an FPI be written during pruning, provided that the page will thereby become all-frozen in the visibility map afterwards (due to the FPI optimization from commit `1de58df4`). The vacuum_freeze_strategy_threshold default setting is 4GB. This is a relatively low setting that prioritizes performance stability. It will be reviewed at the end of the Postgres 16 beta period. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Jeff Davis <pgsql@j-davis.com> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com	2023-01-25 14:15:38 -08:00
Thomas Munro	239b175342	Process pending postmaster work before connections. Modify the new event loop code from commit `7389aad6` so that it checks for work requested by signal handlers even if it doesn't see a latch event yet. This gives priority to shutdown and reload requests where the latch will be reported later in the event array, or in a later call to WaitEventSetWait(), due to scheduling details. In particular, this guarantees that a SIGHUP-then-connect sequence (as seen in authentication tests) causes the postmaster to process the reload before accepting the connection. If the WaitEventSetWait() call saw the socket as ready, and the reload signal was generated before the connection, then the latest time the signal handler should be able to run is after poll/epoll_wait/kevent returns but before we check the pending_pm_reload_request flag. While here, also shift the handling of child exit below reload requests, per Tom Lane's observation that that might start new processes, so we should make sure we pick up new settings first. This probably explains the one-off failure of build farm animal malleefowl. Reported-by: Hou Zhijie <houzj.fnst@fujitsu.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/OS0PR01MB57163D3BF2AB42ECAA94E5C394C29%40OS0PR01MB5716.jpnprd01.prod.outlook.com	2023-01-25 15:00:05 +13:00
Noah Misch	e52daaabf8	Reject CancelRequestPacket having unexpected length. When the length was too short, the server read outside the allocation. That yielded the same log noise as sending the correct length with (backendPID,cancelAuthCode) matching nothing. Change to a message about the unexpected length. Given the attacker's lack of control over the memory layout and the general lack of diversity in memory layouts at the code in question, we doubt a would-be attacker could cause a segfault. Hence, while the report arrived via security@postgresql.org, this is not a vulnerability. Back-patch to v11 (all supported versions). Andrey Borodin, reviewed by Tom Lane. Reported by Andrey Borodin.	2023-01-21 06:08:00 -08:00
Michael Paquier	8eba3e3f02	Move queryjumble.c code to src/backend/nodes/ This will ease a follow-up move that will generate automatically this code. The C file is renamed, for consistency with the node-related files whose code are generated by gen_node_support.pl: - queryjumble.c -> queryjumblefuncs.c - utils/queryjumble.h -> nodes/queryjumble.h Per a suggestion from Peter Eisentraut. Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/Y5BHOUhX3zTH/ig6@paquier.xyz	2023-01-21 11:48:37 +09:00
Robert Haas	6e2775e4d4	Add new GUC reserved_connections. This provides a way to reserve connection slots for non-superusers. The slots reserved via the new GUC are available only to users who have the new predefined role pg_use_reserved_connections. superuser_reserved_connections remains as a final reserve in case reserved_connections has been exhausted. Patch by Nathan Bossart. Reviewed by Tushar Ahuja and by me. Discussion: http://postgr.es/m/20230119194601.GA4105788@nathanxps13	2023-01-20 15:39:13 -05:00
Robert Haas	fe00fec1f5	Rename ReservedBackends variable to SuperuserReservedConnections. This is in preparation for adding a new reserved_connections GUC, but aligning the GUC name with the variable name is also a good idea on general principle. Patch by Nathan Bossart. Reviewed by Tushar Ahuja and by me. Discussion: http://postgr.es/m/20230119194601.GA4105788@nathanxps13	2023-01-20 15:32:08 -05:00
Thomas Munro	5a26c7b310	Refactor DetermineSleepTime() to use milliseconds. Since we're not using select() anymore, we don't need to bother with struct timeval. We can work directly in milliseconds, which the latch API wants. Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com	2023-01-12 16:32:30 +13:00
Thomas Munro	7389aad636	Use WaitEventSet API for postmaster's event loop. Switch to a design similar to regular backends, instead of the previous arrangement where signal handlers did non-trivial state management and called fork(). The main changes are: * The postmaster now has its own local latch to wait on. (For now, we don't want other backends setting its latch directly, but that could probably be made to work with more research on robustness.) * The existing signal handlers are cut in two: a handle_pm_XXX() part that just sets pending_pm_XXX flags and the latch, and a process_pm_XXX() part that runs later when the latch is seen. * Signal handlers are now installed with the regular pqsignal() function rather than the special pqsignal_pm() function; historical portability concerns about the effect of SA_RESTART on select() are no longer relevant, and we don't need to block signals anymore. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com	2023-01-12 16:32:20 +13:00
Peter Eisentraut	c96de2ce17	Common function for percent placeholder replacement There are a number of places where a shell command is constructed with percent-placeholders (like %x). It's cumbersome to have to open-code this several times. This factors out this logic into a separate function. This also allows us to ensure consistency for and document some subtle behaviors, such as what to do with unrecognized placeholders. The unified handling is now that incorrect and unknown placeholders are an error, where previously in most cases they were skipped or ignored. This affects the following settings: - archive_cleanup_command - archive_command - recovery_end_command - restore_command - ssl_passphrase_command The following settings are part of this refactoring but already had stricter error handling and should be unchanged in their behavior: - basebackup_to_shell.command Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5238bbed-0b01-83a6-d4b2-7eb0562a054e%40enterprisedb.com	2023-01-11 10:42:35 +01:00
Amit Kapila	216a784829	Perform apply of large transactions by parallel workers. Currently, for large transactions, the publisher sends the data in multiple streams (changes divided into chunks depending upon logical_decoding_work_mem), and then on the subscriber-side, the apply worker writes the changes into temporary files and once it receives the commit, it reads from those files and applies the entire transaction. To improve the performance of such transactions, we can instead allow them to be applied via parallel workers. In this approach, we assign a new parallel apply worker (if available) as soon as the xact's first stream is received and the leader apply worker will send changes to this new worker via shared memory. The parallel apply worker will directly apply the change instead of writing it to temporary files. However, if the leader apply worker times out while attempting to send a message to the parallel apply worker, it will switch to "partial serialize" mode - in this mode, the leader serializes all remaining changes to a file and notifies the parallel apply workers to read and apply them at the end of the transaction. We use a non-blocking way to send the messages from the leader apply worker to the parallel apply to avoid deadlocks. We keep this parallel apply assigned till the transaction commit is received and also wait for the worker to finish at commit. This preserves commit ordering and avoid writing to and reading from files in most cases. We still need to spill if there is no worker available. This patch also extends the SUBSCRIPTION 'streaming' parameter so that the user can control whether to apply the streaming transaction in a parallel apply worker or spill the change to disk. The user can set the streaming parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means the streaming will be applied via a parallel apply worker, if available. The parameter value 'on' means the streaming transaction will be spilled to disk. The default value is 'off' (same as current behaviour). In addition, the patch extends the logical replication STREAM_ABORT message so that abort_lsn and abort_time can also be sent which can be used to update the replication origin in parallel apply worker when the streaming transaction is aborted. Because this message extension is needed to support parallel streaming, parallel streaming is not supported for publications on servers < PG16. Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com	2023-01-09 07:52:45 +05:30
Tom Lane	a46a7011b2	Add options to control whether VACUUM runs vac_update_datfrozenxid. VACUUM normally ends by running vac_update_datfrozenxid(), which requires a scan of pg_class. Therefore, if one attempts to vacuum a database one table at a time --- as vacuumdb has done since v12 --- we will spend O(N^2) time in vac_update_datfrozenxid(). That causes serious performance problems in databases with tens of thousands of tables, and indeed the effect is measurable with only a few hundred. To add insult to injury, only one process can run vac_update_datfrozenxid at the same time per DB, so this behavior largely defeats vacuumdb's -j option. Hence, invent options SKIP_DATABASE_STATS and ONLY_DATABASE_STATS to allow applications to postpone vac_update_datfrozenxid() until the end of a series of VACUUM requests, and teach vacuumdb to use them. Per bug #17717 from Gunnar L. Sadly, this answer doesn't seem like something we'd consider back-patching, so the performance problem will remain in v12-v15. Tom Lane and Nathan Bossart Discussion: https://postgr.es/m/17717-6c50eb1c7d23a886@postgresql.org	2023-01-06 14:17:25 -05:00
Bruce Momjian	c8e1ba736b	Update copyright for 2023 Backpatch-through: 11	2023-01-02 15:00:37 -05:00
Andrew Dunstan	8284cf5f74	Add copyright notices to meson files Discussion: https://postgr.es/m/222b43a5-2fb3-2c1b-9cd0-375d376c8246@dunslane.net	2022-12-20 07:54:39 -05:00
Peter Eisentraut	df8b8968d4	Order getopt arguments Order the letters in the arguments of getopt() and getopt_long(), as well as in the subsequent switch statements. In most cases, I used alphabetical with lower case first. In a few cases, existing different orders (e.g., upper case first) was kept to reduce the diff size. Discussion: https://www.postgresql.org/message-id/flat/3efd0fe8-351b-f836-9122-886002602357%40enterprisedb.com	2022-12-12 15:20:00 +01:00
Michael Paquier	83a1a1b566	Generate pg_stat_get*() functions for tables using macros The same code pattern is repeated 17 times for int64 counters (0 for missing entry) and 5 times for timestamps (NULL for missing entry) on table entries. This code is switched to use a macro for the basic code instead, shaving a few hundred lines of originally-duplicated code. The function names remain the same, but some fields of PgStat_StatTabEntry have to be renamed to cope with the new style. Author: Bertrand Drouvot Reviewed-by: Nathan Bossart Discussion: https:/postgr.es/m/20221204173207.GA2669116@nathanxps13	2022-12-06 10:46:35 +09:00
Michael Paquier	af205152ef	Add the database name to the ps display of logical WAL senders Logical WAL senders display now as follows, gaining a database name: postgres: walsender USER DATABASE HOST(PORT) STATE Physical WAL senders show up the same, as of: postgres: walsender USER HOST(PORT) STATE This information was missing, hence it was not possible to know from ps if a WAL sender was a logical or a physical one, and on which database it is connected when it is logical. Author: Tatsuhiro Nakamori Reviewed-by: Fujii Masao, Bharath Rupireddy Discussion: https://postgr.es/m/36a3b137e82e0ea9fe7e4234f03b64a1@oss.nttdata.com	2022-11-24 16:07:59 +09:00
Tom Lane	51b5834cd5	Provide options for postmaster to kill child processes with SIGABRT. The postmaster normally sends SIGQUIT to force-terminate its child processes after a child crash or immediate-stop request. If that doesn't result in child exit within a few seconds, we follow it up with SIGKILL. This patch provides GUC flags that allow either of these signals to be replaced with SIGABRT. On typically-configured Unix systems, that will result in a core dump being produced for each such child. This can be useful for debugging problems, although it's not something you'd want to have on in production due to the risk of disk space bloat from lots of core files. The old postmaster -T switch, which sent SIGSTOP in place of SIGQUIT, is changed to be the same as send_abort_for_crash. As far as I can tell from the code comments, the intent of that switch was just to block things for long enough to force core dumps manually, which seems like an unnecessary extra step. (Maybe at the time, there was no way to get most kernels to produce core files with per-PID names, requiring manual core file renaming after each one. But now it's surely the hard way.) I also took the opportunity to remove the old postmaster -n (skip shmem reinit) switch, which hasn't actually done anything in decades, though the documentation still claimed it did. Discussion: https://postgr.es/m/2251016.1668797294@sss.pgh.pa.us	2022-11-21 11:59:29 -05:00
Peter Eisentraut	d627ce3b70	Disallow setting archive_library and archive_command at the same time Setting archive_library and archive_command at the same time is now an error. Before, archive_library would take precedence over archive_command. Author: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://www.postgresql.org/message-id/20220914222736.GA3042279%40nathanxps13	2022-11-15 10:03:47 +01:00
Thomas Munro	b28ac1d24d	Provide sigaction() for Windows. Commit `9abb2bfc` left behind code to block signals inside signal handlers on Windows, because our signal porting layer didn't have sigaction(). Provide a minimal implementation that is capable of blocking signals, to get rid of platform differences. See also related commit `c94ae9d8`. Discussion: https://postgr.es/m/CA%2BhUKGKKKfcgx6jzok9AYenp2TNti_tfs8FMoJpL8%2B0Gsy%3D%3D_A%40mail.gmail.com	2022-11-09 13:06:31 +13:00
Michael Paquier	d9d873bac6	Clean up some inconsistencies with GUC declarations This is similar to `7d25958`, and this commit takes care of all the remaining inconsistencies between the initial value used in the C variable associated to a GUC and its default value stored in the GUC tables (as of pg_settings.boot_val). Some of the initial values of the GUCs updated rely on a compile-time default. These are refactored so as the GUC table and its C declaration use the same values. This makes everything consistent with other places, backend_flush_after, bgwriter_flush_after, port, checkpoint_flush_after doing so already, for example. Extracted from a larger patch by Peter Smith. The spots updated in the modules are from me. Author: Peter Smith, Michael Paquier Reviewed-by: Nathan Bossart, Tom Lane, Justin Pryzby Discussion: https://postgr.es/m/CAHut+PtHE0XSfjjRQ6D4v7+dqzCw=d+1a64ujra4EX8aoc_Z+w@mail.gmail.com	2022-10-31 12:44:48 +09:00
Peter Eisentraut	b1099eca8f	Remove AssertArg and AssertState These don't offer anything over plain Assert, and their usage had already been declared obsolescent. Author: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/20221009210148.GA900071@nathanxps13	2022-10-28 09:19:06 +02:00

1 2 3 4 5 ...

1792 Commits