From fa91d4c91f28f4819dc54f93adbd413a685e366a Mon Sep 17 00:00:00 2001
From: Andres Freund
Date: Wed, 31 Mar 2021 00:50:28 -0700
Subject: [PATCH] Make parallel worker shutdown complete entirely via
 before_shmem_exit().

This is a step toward storing stats in dynamic shared memory. As dynamic
shared memory segments are detached from just after before_shmem_exit()
callbacks are processed, but before on_shmem_exit() callbacks are, no stats
can be collected after before_shmem_exit() callbacks have been processed.

Parallel worker shutdown can cause stats to be emitted during DSM detach
callbacks, e.g. for SharedFileSet (which closes its files, which can cause
fd.c to emit stats about temporary files). Therefore parallel worker shutdown
needs to complete during the processing of before_shmem_exit() callbacks.

One might think this problem could instead be solved by carefully ordering
the attaching to DSM segments, so that the pgstats segments get detached from
later than the parallel query ones. That turns out not to work because the
stats hash might need to grow, which can cause new segments to be allocated,
which then will be detached from earlier.

There are two code changes:

First, call ParallelWorkerShutdown() via before_shmem_exit(). That's a good
idea on its own, because other shutdown callbacks like ShutdownPostgres and
ShutdownAuxiliaryProcess are called via before_*.

Second, explicitly detach from the parallel query DSM segment, thereby
ensuring all stats are emitted during ParallelWorkerShutdown().

There are nicer solutions to these problems, but it's not obvious which of
those solutions is the correct one. As the shared memory stats work already
is a huge amount of work...
Author: Andres Freund
Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de
Discussion: https://postgr.es/m/20210803023612.iziacxk5syn2r4ut@alap3.anarazel.de
---
 src/backend/access/transam/parallel.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 3550ef13ba..5f690393fc 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1305,7 +1305,7 @@ ParallelWorkerMain(Datum main_arg)
 	/* Arrange to signal the leader if we exit. */
 	ParallelLeaderPid = fps->parallel_leader_pid;
 	ParallelLeaderBackendId = fps->parallel_leader_backend_id;
-	on_shmem_exit(ParallelWorkerShutdown, (Datum) 0);
+	before_shmem_exit(ParallelWorkerShutdown, PointerGetDatum(seg));
 
 	/*
 	 * Now we can find and attach to the error queue provided for us. That's
@@ -1507,6 +1507,16 @@ ParallelWorkerReportLastRecEnd(XLogRecPtr last_xlog_end)
  * This guards against the case where we exit uncleanly without sending an
  * ErrorResponse to the leader, for example because some code calls proc_exit
  * directly.
+ *
+ * Also explicitly detach from the dsm segment so that subsystems using
+ * on_dsm_detach() have a chance to send stats before the stats subsystem is
+ * shut down as part of a before_shmem_exit() hook.
+ *
+ * One might think this could instead be solved by carefully ordering the
+ * attaching to dsm segments, so that the pgstats segments get detached from
+ * later than the parallel query one. That turns out not to work because the
+ * stats hash might need to grow which can cause new segments to be allocated,
+ * which then will be detached from earlier.
  */
 static void
 ParallelWorkerShutdown(int code, Datum arg)
@@ -1514,6 +1524,8 @@ ParallelWorkerShutdown(int code, Datum arg)
 {
 	SendProcSignal(ParallelLeaderPid, PROCSIG_PARALLEL_MESSAGE,
 				   ParallelLeaderBackendId);
+
+	dsm_detach((dsm_segment *) DatumGetPointer(arg));
 }
 
 /*