postgresql/doc/src/sgml/bgworker.sgml

<!-- doc/src/sgml/bgworker.sgml -->

<chapter id="bgworker">
 <title>Background Worker Processes</title>

 <indexterm zone="bgworker">
  <primary>Background workers</primary>
 </indexterm>

 <para>
  PostgreSQL can be extended to run user-supplied code in separate processes.
  Such processes are started, stopped and monitored by <command>postgres</command>,
  which permits them to have a lifetime closely linked to the server's status.
  These processes have the option to attach to <productname>PostgreSQL</>'s
  shared memory area and to connect to databases internally; they can also run
  multiple transactions serially, just like a regular client-connected server
  process.  Also, by linking to <application>libpq</> they can connect to the
  server and behave like a regular client application.
 </para>

 <warning>
  <para>
   There are considerable robustness and security risks in using background
   worker processes because, being written in the <literal>C</> language,
   they have unrestricted access to data.  Administrators wishing to enable
   modules that include background worker process should exercise extreme
   caution.  Only carefully audited modules should be permitted to run
   background worker processes.
  </para>
 </warning>

 <para>
  Background workers can be initialized at the time that
  <productname>PostgreSQL</> is started including the module name in
  <varname>shared_preload_libraries</>.  A module wishing to run a background
  worker can register it by calling
  <function>RegisterBackgroundWorker(<type>BackgroundWorker *worker</type>)</function>
  from its <function>_PG_init()</>.  Background workers can also be started
  after the system is up and running by calling the function
  <function>RegisterDynamicBackgroundWorker</function>(<type>BackgroundWorker
  *worker</type>).  Unlike <function>RegisterBackgroundWorker</>, which can
  only be called from within the postmaster,
  <function>RegisterDynamicBackgroundWorker</function> must be called from
  a regular backend.
 </para>

 <para>
  The structure <structname>BackgroundWorker</structname> is defined thus:
<programlisting>
typedef void (*bgworker_main_type)(void *main_arg);
typedef struct BackgroundWorker
{
    char        bgw_name[BGW_MAXLEN];
    int         bgw_flags;
    BgWorkerStartTime bgw_start_time;
    int         bgw_restart_time;       /* in seconds, or BGW_NEVER_RESTART */
    bgworker_main_type bgw_main;
    char        bgw_library_name[BGW_MAXLEN];   /* only if bgw_main is NULL */
    char        bgw_function_name[BGW_MAXLEN];  /* only if bgw_main is NULL */
    Datum       bgw_main_arg;
} BackgroundWorker;
</programlisting>
  </para>

  <para>
   <structfield>bgw_name</> is a string to be used in log messages, process
   listings and similar contexts.
  </para>

  <para>
   <structfield>bgw_flags</> is a bitwise-or'd bitmask indicating the
   capabilities that the module wants.  Possible values are
   <literal>BGWORKER_SHMEM_ACCESS</literal> (requesting shared memory access)
   and <literal>BGWORKER_BACKEND_DATABASE_CONNECTION</literal> (requesting the
   ability to establish a database connection, through which it can later run
   transactions and queries). A background worker using
   <literal>BGWORKER_BACKEND_DATABASE_CONNECTION</literal> to connect to
   a database must also attach shared memory using
   <literal>BGWORKER_SHMEM_ACCESS</literal>, or worker start-up will fail.
  </para>

  <para>
   <structfield>bgw_start_time</structfield> is the server state during which
   <command>postgres</> should start the process; it can be one of
   <literal>BgWorkerStart_PostmasterStart</> (start as soon as
   <command>postgres</> itself has finished its own initialization; processes
   requesting this are not eligible for database connections),
   <literal>BgWorkerStart_ConsistentState</> (start as soon as a consistent state
   has been reached in a hot standby, allowing processes to connect to
   databases and run read-only queries), and
   <literal>BgWorkerStart_RecoveryFinished</> (start as soon as the system has
   entered normal read-write state).  Note the last two values are equivalent
   in a server that's not a hot standby.  Note that this setting only indicates
   when the processes are to be started; they do not stop when a different state
   is reached.
  </para>

  <para>
   <structfield>bgw_restart_time</structfield> is the interval, in seconds, that
   <command>postgres</command> should wait before restarting the process, in
   case it crashes.  It can be any positive value,
   or <literal>BGW_NEVER_RESTART</literal>, indicating not to restart the
   process in case of a crash.
  </para>

  <para>
   <structfield>bgw_main</structfield> is a pointer to the function to run when
   the process is started.  This function must take a single argument of type
   <type>void *</> and return <type>void</>.
   <structfield>bgw_main_arg</structfield> will be passed to it as its only
   argument.  Note that the global variable <literal>MyBgworkerEntry</literal>
   points to a copy of the <structname>BackgroundWorker</structname> structure
   passed at registration time. <structfield>bgw_main</structfield> may be
   NULL; in that case, <structfield>bgw_library_name</structfield> and
   <structfield>bgw_function_name</structfield> will be used to determine
   the entrypoint.  This is useful for background workers launched after
   postmaster startup, where the postmaster does not have the requisite
   library loaded.
  </para>

  <para>
   <structfield>bgw_library_name</structfield> is the name of a library in
   which the initial entrypoint for the background worker should be sought.
   It is ignored unless <structfield>bgw_main</structfield> is NULL.
   But if <structfield>bgw_main</structfield> is NULL, then the named library
   will be dynamically loaded by the worker process and
   <structfield>bgw_function_name</structfield> will be used to identify
   the function to be called.
  </para>

  <para>
   <structfield>bgw_function_name</structfield> is the name of a function in
   a dynamically loaded library which should be used as the initial entrypoint
   for a new background worker.  It is ignored unless
   <structfield>bgw_main</structfield> is NULL.
  </para>

  <para>Once running, the process can connect to a database by calling
   <function>BackgroundWorkerInitializeConnection(<parameter>char *dbname</parameter>, <parameter>char *username</parameter>)</function>.
   This allows the process to run transactions and queries using the
   <literal>SPI</literal> interface.  If <varname>dbname</> is NULL,
   the session is not connected to any particular database, but shared catalogs
   can be accessed.  If <varname>username</> is NULL, the process will run as
   the superuser created during <command>initdb</>.
   BackgroundWorkerInitializeConnection can only be called once per background
   process, it is not possible to switch databases.
  </para>

  <para>
   Signals are initially blocked when control reaches the
   <structfield>bgw_main</> function, and must be unblocked by it; this is to
   allow the process to customize its signal handlers, if necessary.
   Signals can be unblocked in the new process by calling
   <function>BackgroundWorkerUnblockSignals</> and blocked by calling
   <function>BackgroundWorkerBlockSignals</>.
  </para>

  <para>
   Background workers are expected to be continuously running; if they exit
   cleanly, <command>postgres</> will restart them immediately.  Consider doing
   interruptible sleep when they have nothing to do; this can be achieved by
   calling <function>WaitLatch()</function>.  Make sure the
   <literal>WL_POSTMASTER_DEATH</> flag is set when calling that function, and
   verify the return code for a prompt exit in the emergency case that
   <command>postgres</> itself has terminated.
  </para>

  <para>
   The <filename>worker_spi</> contrib module contains a working example,
   which demonstrates some useful techniques.
  </para>

  <para>
   The maximum number of registered background workers is limited by
   <xref linkend="guc-max-worker-processes">.
  </para>
</chapter>