Frontend/Backend Protocol

PostgreSQL uses a message-based protocol for communication between frontends and backends (clients and servers). The protocol is supported over TCP/IP and also over Unix-domain sockets. Port number 5432 has been registered with IANA as the customary TCP port number for servers supporting this protocol, but in practice any non-privileged port number may be used.

This document describes version 3.0 of the protocol, implemented in PostgreSQL 7.4 and later. For descriptions of the earlier protocol versions, see previous releases of the PostgreSQL documentation. A single server can support multiple protocol versions. The initial startup-request message tells the server which protocol version the client is attempting to use, and then the server follows that protocol if it is able.

Higher level features built on this protocol (for example, how libpq passes certain environment variables when the connection is established) are covered elsewhere.

In order to serve multiple clients efficiently, the server launches a new backend process for each client. In the current implementation, a new child process is created immediately after an incoming connection is detected. This is transparent to the protocol, however. For purposes of the protocol, the terms backend and server are interchangeable; likewise frontend and client are interchangeable.

Overview

The protocol has separate phases for startup and normal operation. In the startup phase, the frontend opens a connection to the server and authenticates itself to the satisfaction of the server. (This might involve a single message, or multiple messages depending on the authentication method being used.) If all goes well, the server then sends status information to the frontend, and finally enters normal operation. Except for the initial startup-request message, this part of the protocol is driven by the server.
During normal operation, the frontend sends queries and other commands to the backend, and the backend sends back query results and other responses. There are a few cases (such as NOTIFY) wherein the backend will send unsolicited messages, but for the most part this portion of a session is driven by frontend requests.

Termination of the session is normally by frontend choice, but can be forced by the backend in certain cases. In any case, when the backend closes the connection, it will roll back any open (incomplete) transaction before exiting.

Within normal operation, SQL commands can be executed through either of two sub-protocols. In the simple query protocol, the frontend just sends a textual query string, which is parsed and immediately executed by the backend. In the extended query protocol, processing of queries is separated into multiple steps: parsing, binding of parameter values, and execution. This offers flexibility and performance benefits, at the cost of extra complexity.

Normal operation has additional sub-protocols for special operations such as COPY.

Messaging Overview

All communication is through a stream of messages. The first byte of a message identifies the message type, and the next four bytes specify the length of the rest of the message (this length count includes itself, but not the message-type byte). The remaining contents of the message are determined by the message type. For historical reasons, the very first message sent by the client (the startup message) has no initial message-type byte.

To avoid losing synchronization with the message stream, both servers and clients typically read an entire message into a buffer (using the byte count) before attempting to process its contents. This allows easy recovery if an error is detected while processing the contents.
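As a minimal sketch of the framing rule just described, the following Python frames a payload with its type byte and length word, and splits a receive buffer back into complete messages. The function names are hypothetical; only the layout (one type byte, then a four-byte big-endian length that counts itself but not the type byte) comes from the text above. The startup message, which has no type byte, is not handled here.

```python
import struct


def encode_message(msg_type: bytes, payload: bytes) -> bytes:
    """Frame a payload as: 1 type byte, Int32 length (counts itself), payload."""
    if len(msg_type) != 1:
        raise ValueError("message type must be exactly one byte")
    return msg_type + struct.pack("!i", len(payload) + 4) + payload


def decode_messages(buffer: bytes):
    """Return (complete_messages, leftover_bytes) for a receive buffer.

    Each complete message is a (type_byte, payload) pair. Bytes belonging to
    a message that has not fully arrived are returned untouched so the caller
    can append more input and try again.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 5:          # type byte + length word present
        msg_type = buffer[offset:offset + 1]
        (length,) = struct.unpack_from("!i", buffer, offset + 1)
        if length < 4:
            raise ValueError("length field smaller than its own size")
        end = offset + 1 + length
        if end > len(buffer):
            break                             # incomplete message: wait for more
        messages.append((msg_type, buffer[offset + 5:end]))
        offset = end
    return messages, buffer[offset:]
```

Buffering whole messages this way is what lets a receiver skip or discard a message it cannot process without losing its place in the stream.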
In extreme situations (such as not having enough memory to buffer the message), the receiver may use the byte count to determine how much input to skip before it resumes reading messages.

Conversely, both servers and clients must take care never to send an incomplete message. This is commonly done by marshaling the entire message in a buffer before beginning to send it. If a communications failure occurs partway through sending or receiving a message, the only sensible response is to abandon the connection, since there is little hope of recovering message-boundary synchronization.

Extended Query Overview

In the extended-query protocol, execution of SQL commands is divided into multiple steps. The state retained between steps is represented by two types of objects: prepared statements and portals. A prepared statement represents the result of parsing, semantic analysis, and planning of a textual query string. A prepared statement is not necessarily ready to execute, because it may lack specific values for parameters. A portal represents a ready-to-execute or already-partially-executed statement, with any missing parameter values filled in. (For SELECT statements, a portal is equivalent to an open cursor, but we choose to use a different term since cursors don't handle non-SELECT statements.)

The overall execution cycle consists of a parse step, which creates a prepared statement from a textual query string; a bind step, which creates a portal given a prepared statement and values for any needed parameters; and an execute step that runs a portal's query. In the case of a query that returns rows (SELECT, SHOW, etc), the execute step can be told to fetch only a limited number of rows, so that multiple execute steps may be needed to complete the operation.

The backend can keep track of multiple prepared statements and portals (but note that these exist only within a session, and are never shared across sessions).
Existing prepared statements and portals are referenced by names assigned when they were created. In addition, an unnamed prepared statement and portal exist. Although these behave largely the same as named objects, operations on them are optimized for the case of executing a query only once and then discarding it, whereas operations on named objects are optimized on the expectation of multiple uses.

Formats and Format Codes

Data of a particular datatype might be transmitted in any of several different formats. As of PostgreSQL 7.4 the only supported formats are text and binary, but the protocol makes provision for future extensions. The desired format for any value is specified by a format code. Clients may specify a format code for each transmitted parameter value and for each column of a query result. Text has format code zero, binary has format code one, and all other format codes are reserved for future definition.

The text representation of values is whatever strings are produced and accepted by the input/output conversion functions for the particular datatype. In the transmitted representation, there is no trailing null character; the frontend must add one to received values if it wants to process them as C strings. (The text format does not allow embedded nulls, by the way.)

Binary representations for integers use network byte order (most significant byte first). For other datatypes consult the documentation or source code to learn about the binary representation. Keep in mind that binary representations for complex datatypes may change across server versions; the text format is usually the more portable choice.

Message Flow

This section describes the message flow and the semantics of each message type. (The exact representation of each message is documented separately.) There are several different sub-protocols depending on the state of the connection: start-up, query, function call, COPY, and termination.
There are also special provisions for asynchronous operations (including notification responses and command cancellation), which can occur at any time after the start-up phase.

Start-Up

To begin a session, a frontend opens a connection to the server and sends a startup message. This message includes the names of the user and of the database the user wants to connect to; it also identifies the particular protocol version to be used. (Optionally, the startup message can include additional settings for run-time parameters.) The server then uses this information and the contents of its configuration files (such as pg_hba.conf) to determine whether the connection is provisionally acceptable, and what additional authentication is required (if any).

The server then sends an appropriate authentication request message, to which the frontend must reply with an appropriate authentication response message (such as a password). In principle the authentication request/response cycle could require multiple iterations, but none of the present authentication methods use more than one request and response. In some methods, no response at all is needed from the frontend, and so no authentication request occurs.

The authentication cycle ends with the server either rejecting the connection attempt (ErrorResponse), or sending AuthenticationOk.

The possible messages from the server in this phase are:

ErrorResponse
    The connection attempt has been rejected. The server then immediately closes the connection.

AuthenticationOk
    The authentication exchange is successfully completed.

AuthenticationKerberosV4
    The frontend must now take part in a Kerberos V4 authentication dialog (not described here, part of the Kerberos specification) with the server. If this is successful, the server responds with an AuthenticationOk, otherwise it responds with an ErrorResponse.
AuthenticationKerberosV5
    The frontend must now take part in a Kerberos V5 authentication dialog (not described here, part of the Kerberos specification) with the server. If this is successful, the server responds with an AuthenticationOk, otherwise it responds with an ErrorResponse.

AuthenticationCleartextPassword
    The frontend must now send a PasswordMessage containing the password in clear-text form. If this is the correct password, the server responds with an AuthenticationOk, otherwise it responds with an ErrorResponse.

AuthenticationCryptPassword
    The frontend must now send a PasswordMessage containing the password encrypted via crypt(3), using the 2-character salt specified in the AuthenticationCryptPassword message. If this is the correct password, the server responds with an AuthenticationOk, otherwise it responds with an ErrorResponse.

AuthenticationMD5Password
    The frontend must now send a PasswordMessage containing the password encrypted via MD5, using the 4-character salt specified in the AuthenticationMD5Password message. If this is the correct password, the server responds with an AuthenticationOk, otherwise it responds with an ErrorResponse.

AuthenticationSCMCredential
    This response is only possible for local Unix-domain connections on platforms that support SCM credential messages. The frontend must issue an SCM credential message and then send a single data byte. (The contents of the data byte are uninteresting; it's only used to ensure that the server waits long enough to receive the credential message.) If the credential is acceptable, the server responds with an AuthenticationOk, otherwise it responds with an ErrorResponse.

If the frontend does not support the authentication method requested by the server, then it should immediately close the connection.

After having received AuthenticationOk, the frontend must wait for further messages from the server.
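The startup message that opens this exchange can be assembled as in the sketch below. The field layout used here (Int32 length, Int32 protocol version with the major number in the high 16 bits, null-terminated name/value pairs, and a final zero byte) and the parameter names user and database are taken from the protocol's message format reference rather than from this section, so treat them as stated assumptions. build_startup_message is a hypothetical helper name.

```python
import struct

# Protocol 3.0: major version 3 in the high 16 bits, minor version 0 in the low 16.
PROTOCOL_VERSION_3_0 = (3 << 16) | 0


def build_startup_message(user: str, database: str, **params: str) -> bytes:
    """Build a startup message; note that it carries no message-type byte."""
    fields = {"user": user, "database": database, **params}
    body = struct.pack("!i", PROTOCOL_VERSION_3_0)
    for name, value in fields.items():
        # Each name and each value is a null-terminated string.
        body += name.encode() + b"\x00" + value.encode() + b"\x00"
    body += b"\x00"  # a lone zero byte ends the name/value list
    # The length word counts itself plus the body.
    return struct.pack("!i", len(body) + 4) + body
```

Extra keyword arguments stand in for the optional run-time parameter settings that the startup message may carry.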
In this phase, which follows AuthenticationOk, a backend process is being started, and the frontend is just an interested bystander. It is still possible for the startup attempt to fail (ErrorResponse), but in the normal case the backend will send some ParameterStatus messages, BackendKeyData, and finally ReadyForQuery.

During this phase the backend will attempt to apply any additional run-time parameter settings that were given in the startup message. If successful, these values become session defaults. An error causes ErrorResponse and exit.

The possible messages from the backend in this phase are:

BackendKeyData
    This message provides secret-key data that the frontend must save if it wants to be able to issue cancel requests later. The frontend should not respond to this message, but should continue listening for a ReadyForQuery message.

ParameterStatus
    This message informs the frontend about the current (initial) setting of backend parameters, such as client_encoding or DateStyle. The frontend may ignore this message, or record the settings for its future use; see Asynchronous Operations for more detail. The frontend should not respond to this message, but should continue listening for a ReadyForQuery message.

ReadyForQuery
    Start-up is completed. The frontend may now issue commands.

ErrorResponse
    Start-up failed. The connection is closed after sending this message.

NoticeResponse
    A warning message has been issued. The frontend should display the message but continue listening for ReadyForQuery or ErrorResponse.

The ReadyForQuery message is the same one that the backend will issue after each command cycle. Depending on the coding needs of the frontend, it is reasonable to consider ReadyForQuery as starting a command cycle, or to consider ReadyForQuery as ending the start-up phase and each subsequent command cycle.

Simple Query

A simple query cycle is initiated by the frontend sending a Query message to the backend. The message includes an SQL command (or commands) expressed as a text string.
The backend then sends one or more response messages depending on the contents of the query command string, and finally a ReadyForQuery response message. ReadyForQuery informs the frontend that it may safely send a new command. (It is not actually necessary for the frontend to wait for ReadyForQuery before issuing another command, but the frontend must then take responsibility for figuring out what happens if the earlier command fails and already-issued later commands succeed.)

The possible response messages from the backend are:

CommandComplete
    An SQL command completed normally.

CopyInResponse
    The backend is ready to copy data from the frontend to a table; see COPY Operations.

CopyOutResponse
    The backend is ready to copy data from a table to the frontend; see COPY Operations.

RowDescription
    Indicates that rows are about to be returned in response to a SELECT, FETCH, etc. query. The contents of this message describe the column layout of the rows. This will be followed by a DataRow message for each row being returned to the frontend.

DataRow
    One of the set of rows returned by a SELECT, FETCH, etc. query.

EmptyQueryResponse
    An empty query string was recognized.

ErrorResponse
    An error has occurred.

ReadyForQuery
    Processing of the query string is complete. A separate message is sent to indicate this because the query string may contain multiple SQL commands. (CommandComplete marks the end of processing one SQL command, not the whole string.) ReadyForQuery will always be sent, whether processing terminates successfully or with an error.

NoticeResponse
    A warning message has been issued in relation to the query. Notices are in addition to other responses, i.e., the backend will continue processing the command.

The response to a SELECT query (or other queries that return rowsets, such as EXPLAIN or SHOW) normally consists of RowDescription, zero or more DataRow messages, and then CommandComplete. COPY to or from the frontend invokes special protocol as described in COPY Operations.
All other query types normally produce only a CommandComplete message.

Since a query string could contain several queries (separated by semicolons), there might be several such response sequences before the backend finishes processing the query string. ReadyForQuery is issued when the entire string has been processed and the backend is ready to accept a new query string.

If a completely empty (no contents other than whitespace) query string is received, the response is EmptyQueryResponse followed by ReadyForQuery.

In the event of an error, ErrorResponse is issued followed by ReadyForQuery. All further processing of the query string is aborted by ErrorResponse (even if more queries remained in it). Note that this may occur partway through the sequence of messages generated by an individual query.

In simple Query mode, the format of retrieved values is always text, except when the given command is a FETCH from a cursor declared with the BINARY option. In that case, the retrieved values are in binary format. The format codes given in the RowDescription message tell which format is being used.

A frontend must be prepared to accept ErrorResponse and NoticeResponse messages whenever it is expecting any other type of message. See also Asynchronous Operations concerning messages that the backend may generate due to outside events.

Recommended practice is to code frontends in a state-machine style that will accept any message type at any time that it could make sense, rather than wiring in assumptions about the exact sequence of messages.

Extended Query

The extended query protocol breaks down the above-described simple query protocol into multiple steps. The results of preparatory steps can be re-used multiple times for improved efficiency. Furthermore, additional features are available, such as the possibility of supplying data values as separate parameters instead of having to insert them directly into a query string.
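The state-machine style recommended above for handling simple-query responses might look like the following sketch. It consumes already-framed (type byte, payload) pairs and groups them per SQL command until ReadyForQuery arrives. The one-byte type codes ('T', 'D', 'C', 'I', 'E', 'N', 'Z') come from the protocol's message format reference and are assumptions as far as this section is concerned; payloads are kept as opaque bytes, COPY responses are not handled, and collect_simple_query_results is a hypothetical name.

```python
def collect_simple_query_results(messages):
    """Consume (type, payload) pairs up to ReadyForQuery.

    Returns (results, notices, error): one result dict per completed SQL
    command, every NoticeResponse payload seen, and the ErrorResponse
    payload if the query string was aborted (otherwise None).
    """
    results, notices, error = [], [], None
    current = None  # row set currently being assembled, if any
    for msg_type, payload in messages:
        if msg_type == b"N":          # NoticeResponse: record it, keep going
            notices.append(payload)
        elif msg_type == b"T":        # RowDescription: a row set begins
            current = {"description": payload, "rows": [], "tag": None}
        elif msg_type == b"D":        # DataRow: belongs to the open row set
            if current is None:
                raise ValueError("DataRow received without RowDescription")
            current["rows"].append(payload)
        elif msg_type == b"C":        # CommandComplete: one SQL command is done
            if current is None:
                current = {"description": None, "rows": [], "tag": None}
            current["tag"] = payload
            results.append(current)
            current = None
        elif msg_type == b"I":        # EmptyQueryResponse
            results.append({"description": None, "rows": [], "tag": None})
        elif msg_type == b"E":        # ErrorResponse: rest of the string is aborted
            error = payload
            current = None
        elif msg_type == b"Z":        # ReadyForQuery: the cycle is over
            return results, notices, error
        # Any other type (for example ParameterStatus) is skipped in this sketch.
    raise ValueError("message stream ended before ReadyForQuery")
```

Because the loop dispatches on whatever type arrives rather than expecting a fixed sequence, a notice in the middle of a row set or an error partway through a multi-command string does not derail it.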
In the extended protocol, the frontend first sends a Parse message, which contains a textual query string, optionally some information about datatypes of parameter placeholders, and the name of a destination prepared-statement object (an empty string selects the unnamed prepared statement). The response is either ParseComplete or ErrorResponse. Parameter datatypes may be specified by OID; if not given, the parser attempts to infer the datatypes in the same way as it would do for untyped literal string constants.

The query string contained in a Parse message cannot include more than one SQL statement; else a syntax error is reported. This restriction does not exist in the simple-query protocol, but it does exist in the extended protocol, because allowing prepared statements or portals to contain multiple commands would complicate the protocol unduly.

If successfully created, a named prepared-statement object lasts till the end of the current session, unless explicitly destroyed. An unnamed prepared statement lasts only until the next Parse statement specifying the unnamed statement as destination is issued. (Note that a simple Query message also destroys the unnamed statement.) Named prepared statements must be explicitly closed before they can be redefined by a Parse message, but this is not required for the unnamed statement. Named prepared statements can also be created and accessed at the SQL command level, using PREPARE and EXECUTE.

Once a prepared statement exists, it can be readied for execution using a Bind message. The Bind message gives the name of the source prepared statement (empty string denotes the unnamed prepared statement), the name of the destination portal (empty string denotes the unnamed portal), and the values to use for any parameter placeholders present in the prepared statement. The supplied parameter set must match those needed by the prepared statement.
Bind also specifies the format to use for any data returned by the query; the format can be specified overall, or per-column. The response is either BindComplete or ErrorResponse.

The choice between text and binary output is determined by the format codes given in Bind, regardless of the SQL command involved. The BINARY attribute in cursor declarations is irrelevant when using extended query protocol.

If successfully created, a named portal object lasts till the end of the current transaction, unless explicitly destroyed. An unnamed portal is destroyed at the end of the transaction, or as soon as the next Bind statement specifying the unnamed portal as destination is issued. (Note that a simple Query message also destroys the unnamed portal.) Named portals must be explicitly closed before they can be redefined by a Bind message, but this is not required for the unnamed portal. Named portals can also be created and accessed at the SQL command level, using DECLARE CURSOR and FETCH.

Once a portal exists, it can be executed using an Execute message. The Execute message specifies the portal name (empty string denotes the unnamed portal) and a maximum result-row count (zero meaning fetch all rows). The result-row count is only meaningful for portals containing commands that return rowsets; in other cases the command is always executed to completion, and the row count is ignored. The possible responses to Execute are the same as those described above for queries issued via simple query protocol, except that Execute doesn't cause ReadyForQuery to be issued.

If Execute terminates before completing the execution of a portal (due to reaching a nonzero result-row count), it will send a PortalSuspended message; the appearance of this message tells the frontend that another Execute should be issued against the same portal to complete the operation. The CommandComplete message indicating completion of the source SQL command is not sent until the portal's execution is completed.
Therefore, an Execute phase is always terminated by the appearance of exactly one of these messages: CommandComplete, EmptyQueryResponse (if the portal was created from an empty query string), ErrorResponse, or PortalSuspended.

At completion of each series of extended-query messages, the frontend should issue a Sync message. This parameterless message causes the backend to close the current transaction if it's not inside a BEGIN/COMMIT transaction block (close meaning to commit if no error, or roll back if error). Then a ReadyForQuery response is issued. The purpose of Sync is to provide a resynchronization point for error recovery. When an error is detected while processing any extended-query message, the backend issues ErrorResponse, then reads and discards messages until a Sync is reached, then issues ReadyForQuery and returns to normal message processing. (But note that no skipping occurs if an error is detected while processing Sync; this ensures that there is one and only one ReadyForQuery sent for each Sync.)

Sync does not cause a transaction block opened with BEGIN to be closed. It is possible to detect this situation since the ReadyForQuery message includes transaction status information.

In addition to these fundamental, required operations, there are several optional operations that can be used with extended-query protocol.

The Describe message (portal variant) specifies the name of an existing portal (or an empty string for the unnamed portal). The response is a RowDescription message describing the rows that will be returned by executing the portal; or a NoData message if the portal does not contain a query that will return rows; or ErrorResponse if there is no such portal.

The Describe message (statement variant) specifies the name of an existing prepared statement (or an empty string for the unnamed prepared statement).
The response is a ParameterDescription message describing the parameters needed by the statement, followed by a RowDescription message describing the rows that will be returned when the statement is eventually executed (or a NoData message if the statement will not return rows). ErrorResponse is issued if there is no such prepared statement. Note that since Bind has not yet been issued, the formats to be used for returned columns are not yet known to the backend; the format code fields in the RowDescription message will be zeroes in this case.

In most scenarios the frontend should issue one or the other variant of Describe before issuing Execute, to ensure that it knows how to interpret the results it will get back.

The Close message closes an existing prepared statement or portal and releases resources. It is not an error to issue Close against a nonexistent statement or portal name. The response is normally CloseComplete, but could be ErrorResponse if some difficulty is encountered while releasing resources. Note that closing a prepared statement implicitly closes any open portals that were constructed from that statement.

The Flush message does not cause any specific output to be generated, but forces the backend to deliver any data pending in its output buffers. A Flush must be sent after any extended-query command except Sync, if the frontend wishes to examine the results of that command before issuing more commands. Without Flush, messages returned by the backend will be combined into the minimum possible number of packets to minimize network overhead.

The simple Query message is approximately equivalent to the series Parse, Bind, portal Describe, Execute, Close, Sync, using the unnamed prepared statement and portal objects and no parameters. One difference is that it will accept multiple SQL statements in the query string, automatically performing the bind/describe/execute sequence for each one in succession.
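To make the Parse, Bind, Execute, Sync series concrete, here is a sketch that builds those four frontend messages. The type bytes ('P', 'B', 'E', 'S') and the field order inside each message are taken from the protocol's message format reference, not from this section, so they are stated assumptions here; the helper names are hypothetical. For simplicity every parameter is sent in text format and a single result format code is applied to all columns.

```python
import struct


def _frame(msg_type: bytes, payload: bytes = b"") -> bytes:
    # One type byte, then an Int32 length that counts itself but not the type byte.
    return msg_type + struct.pack("!i", len(payload) + 4) + payload


def _cstr(text: str) -> bytes:
    return text.encode() + b"\x00"


def parse_message(query: str, statement: str = "", param_oids=()) -> bytes:
    """Parse: destination statement name, query text, optional parameter type OIDs."""
    payload = _cstr(statement) + _cstr(query) + struct.pack("!h", len(param_oids))
    for oid in param_oids:
        payload += struct.pack("!I", oid)
    return _frame(b"P", payload)


def bind_message(params, statement: str = "", portal: str = "",
                 result_format: int = 0) -> bytes:
    """Bind: portal name, source statement name, parameter values, result format."""
    payload = _cstr(portal) + _cstr(statement)
    payload += struct.pack("!h", 0)              # no parameter format codes: all text
    payload += struct.pack("!h", len(params))
    for value in params:
        if value is None:
            payload += struct.pack("!i", -1)     # length -1 marks a NULL parameter
        else:
            data = str(value).encode()
            payload += struct.pack("!i", len(data)) + data
    payload += struct.pack("!hh", 1, result_format)  # one code for every result column
    return _frame(b"B", payload)


def execute_message(portal: str = "", max_rows: int = 0) -> bytes:
    """Execute: portal name and maximum row count (zero means fetch all rows)."""
    return _frame(b"E", _cstr(portal) + struct.pack("!i", max_rows))


def sync_message() -> bytes:
    return _frame(b"S")
```

A frontend would typically concatenate the four messages and send them in one write, for example `parse_message("SELECT $1::int4") + bind_message([42]) + execute_message() + sync_message()`, then read responses until ReadyForQuery. Empty names select the unnamed statement and portal.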
A further difference is that the simple Query message will not return ParseComplete, BindComplete, CloseComplete, or NoData messages.

Function Call

The Function Call sub-protocol allows the client to request a direct call of any function that exists in the database's pg_proc system catalog. The client must have execute permission for the function.

The Function Call sub-protocol is a legacy feature that is probably best avoided in new code. Similar results can be accomplished by setting up a prepared statement that does SELECT function($1, ...). The Function Call cycle can then be replaced with Bind/Execute.

A Function Call cycle is initiated by the frontend sending a FunctionCall message to the backend. The backend then sends one or more response messages depending on the results of the function call, and finally a ReadyForQuery response message. ReadyForQuery informs the frontend that it may safely send a new query or function call.

The possible response messages from the backend are:

ErrorResponse
    An error has occurred.

FunctionCallResponse
    The function call was completed and returned the result given in the message. (Note that the Function Call protocol can only handle a single scalar result, not a rowtype or set of results.)

ReadyForQuery
    Processing of the function call is complete. ReadyForQuery will always be sent, whether processing terminates successfully or with an error.

NoticeResponse
    A warning message has been issued in relation to the function call. Notices are in addition to other responses, i.e., the backend will continue processing the command.

COPY Operations

The COPY command allows high-speed bulk data transfer to or from the server. Copy-in and copy-out operations each switch the connection into a distinct sub-protocol, which lasts until the operation is completed.

Copy-in mode (data transfer to the server) is initiated when the backend executes a COPY FROM STDIN SQL statement. The backend sends a CopyInResponse message to the frontend.
The frontend should then send zero or more CopyData messages, forming a stream of input data. (The message boundaries are not required to have anything to do with row boundaries, although that is often a reasonable choice.) The frontend can terminate the copy-in mode by sending either a CopyDone message (allowing successful termination) or a CopyFail message (which will cause the COPY SQL statement to fail with an error). The backend then reverts to the command-processing mode it was in before the COPY started, which will be either simple or extended query protocol. It will next send either CommandComplete (if successful) or ErrorResponse (if not).

In the event of a backend-detected error during copy-in mode (including receipt of a CopyFail message), the backend will issue an ErrorResponse message. If the COPY command was issued via an extended-query message, the backend will now discard frontend messages until a Sync message is received, then it will issue ReadyForQuery and return to normal processing. If the COPY command was issued in a simple Query message, the rest of that message is discarded and ReadyForQuery is issued. In either case, any subsequent CopyData, CopyDone, or CopyFail messages issued by the frontend will simply be dropped.

The backend will ignore Flush and Sync messages received during copy-in mode. Receipt of any other non-copy message type constitutes an error that will abort the copy-in state as described above. (The exception for Flush and Sync is for the convenience of client libraries that always send Flush or Sync after an Execute message, without checking whether the command to be executed is a COPY FROM STDIN.)

Copy-out mode (data transfer from the server) is initiated when the backend executes a COPY TO STDOUT SQL statement. The backend sends a CopyOutResponse message to the frontend, followed by zero or more CopyData messages (always one per row), followed by CopyDone.
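The frontend side of copy-in mode can be sketched as a generator that wraps data chunks in CopyData messages and finishes with either CopyDone or CopyFail. The type bytes used ('d', 'c', 'f') and the null-terminated error string in CopyFail come from the protocol's message format reference and are assumptions relative to this section; copy_in_messages is a hypothetical name.

```python
import struct


def _frame(msg_type: bytes, payload: bytes = b"") -> bytes:
    # One type byte, then an Int32 length that counts itself but not the type byte.
    return msg_type + struct.pack("!i", len(payload) + 4) + payload


def copy_in_messages(chunks, fail_reason=None):
    """Yield the frontend messages for one copy-in stream.

    chunks is an iterable of bytes objects; chunk boundaries do not have to
    line up with row boundaries. If fail_reason is given, the stream ends
    with CopyFail so that the COPY statement fails with an error.
    """
    for chunk in chunks:
        yield _frame(b"d", chunk)                                 # CopyData
    if fail_reason is None:
        yield _frame(b"c")                                        # CopyDone
    else:
        yield _frame(b"f", fail_reason.encode() + b"\x00")        # CopyFail
```

After the final message the frontend goes back to reading responses, expecting CommandComplete on success or ErrorResponse on failure.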
After CopyDone, the backend reverts to the command-processing mode it was in before the COPY started, and sends CommandComplete. The frontend cannot abort the transfer (except by closing the connection or issuing a Cancel request), but it can discard unwanted CopyData and CopyDone messages.

In the event of a backend-detected error during copy-out mode, the backend will issue an ErrorResponse message and revert to normal processing. The frontend should treat receipt of ErrorResponse (or indeed any message type other than CopyData or CopyDone) as terminating the copy-out mode.

The CopyInResponse and CopyOutResponse messages include fields that inform the frontend of the number of columns per row and the format codes being used for each column. (As of the present implementation, all columns in a given COPY operation will use the same format, but the message design does not assume this.)

Asynchronous Operations

There are several cases in which the backend will send messages that are not specifically prompted by the frontend's command stream. Frontends must be prepared to deal with these messages at any time, even when not engaged in a query. At minimum, one should check for these cases before beginning to read a query response.

It is possible for NoticeResponse messages to be generated due to outside activity; for example, if the database administrator commands a fast database shutdown, the backend will send a NoticeResponse indicating this fact before closing the connection. Accordingly, frontends should always be prepared to accept and display NoticeResponse messages, even when the connection is nominally idle.

ParameterStatus messages will be generated whenever the active value changes for any of the parameters the backend believes the frontend should know about.
Most commonly this occurs in response to a SET SQL command executed by the frontend, and this case is effectively synchronous, but it is also possible for parameter status changes to occur because the administrator changed a configuration file and then sent SIGHUP to the postmaster. Also, if a SET command is rolled back, an appropriate ParameterStatus message will be generated to report the current effective value.

At present there is a hard-wired set of parameters for which ParameterStatus will be generated: server_version (a pseudo-parameter that cannot change after startup), client_encoding, is_superuser, session_authorization, and DateStyle. This set might change in the future, or even become configurable. Accordingly, a frontend should simply ignore ParameterStatus for parameters that it does not understand or care about.

If a frontend issues a LISTEN command, then the backend will send a NotificationResponse message (not to be confused with NoticeResponse!) whenever a NOTIFY command is executed for the same notification name.

At present, NotificationResponse can only be sent outside a transaction, and thus it will not occur in the middle of a command-response series, though it may occur just before ReadyForQuery. It is unwise to design frontend logic that assumes that, however. Good practice is to be able to accept NotificationResponse at any point in the protocol.

Cancelling Requests in Progress

During the processing of a query, the frontend may request cancellation of the query. The cancel request is not sent directly on the open connection to the backend for reasons of implementation efficiency: we don't want to have the backend constantly checking for new input from the frontend during query processing. Cancel requests should be relatively infrequent, so we make them slightly cumbersome in order to avoid a penalty in the normal case.
To issue a cancel request, the frontend opens a new connection to the server and sends a CancelRequest message, rather than the StartupMessage message that would ordinarily be sent across a new connection. The server will process this request and then close the connection. For security reasons, no direct reply is made to the cancel request message. A CancelRequest message will be ignored unless it contains the same key data (PID and secret key) passed to the frontend during connection start-up. If the request matches the PID and secret key for a currently executing backend, the processing of the current query is aborted. (In the existing implementation, this is done by sending a special signal to the backend process that is processing the query.) The cancellation signal may or may not have any effect --- for example, if it arrives after the backend has finished processing the query, then it will have no effect. If the cancellation is effective, it results in the current command being terminated early with an error message. The upshot of all this is that for reasons of both security and efficiency, the frontend has no direct way to tell whether a cancel request has succeeded. It must continue to wait for the backend to respond to the query. Issuing a cancel simply improves the odds that the current query will finish soon, and improves the odds that it will fail with an error message instead of succeeding. Since the cancel request is sent across a new connection to the server and not across the regular frontend/backend communication link, it is possible for the cancel request to be issued by any process, not just the frontend whose query is to be canceled. This may have some benefits of flexibility in building multiple-process applications. It also introduces a security risk, in that unauthorized persons might try to cancel queries. The security risk is addressed by requiring a dynamically generated secret key to be supplied in cancel requests. 
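The cancel procedure above can be sketched in Python. This is a minimal illustration, not how any particular client library does it: the helper names `build_cancel_request` and `send_cancel_request` are hypothetical, and the PID and secret key are assumed to have been saved from the BackendKeyData message received during startup. The 16-byte layout follows the CancelRequest entry under Message Formats.

```python
import socket
import struct

# Cancel request code: 1234 in the high 16 bits, 5678 in the low 16 bits.
CANCEL_REQUEST_CODE = (1234 << 16) | 5678  # 80877102


def build_cancel_request(pid: int, secret_key: int) -> bytes:
    """Pack a CancelRequest: Int32(16), Int32(80877102), Int32 pid, Int32 key."""
    # Mask to 32 bits so values held as signed or unsigned both pack cleanly.
    return struct.pack(
        "!IIII", 16, CANCEL_REQUEST_CODE, pid & 0xFFFFFFFF, secret_key & 0xFFFFFFFF
    )


def send_cancel_request(host: str, port: int, pid: int, secret_key: int) -> None:
    """Open a fresh connection, send the request, and wait for the server to close."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_cancel_request(pid, secret_key))
        # No reply is sent; recv() returns b"" once the server closes the connection.
        while sock.recv(1024):
            pass
```

Because no reply is made, the caller learns nothing from `send_cancel_request` returning; the outcome only becomes visible on the original connection, as either a normal result or an error for the running query.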
Termination The normal, graceful termination procedure is that the frontend sends a Terminate message and immediately closes the connection. On receipt of this message, the backend closes the connection and terminates. In rare cases (such as an administrator-commanded database shutdown) the backend may disconnect without any frontend request to do so. In such cases the backend will attempt to send an error or notice message giving the reason for the disconnection before it closes the connection. Other termination scenarios arise from various failure cases, such as core dump at one end or the other, loss of the communications link, loss of message-boundary synchronization, etc. If either frontend or backend sees an unexpected closure of the connection, it should clean up and terminate. The frontend has the option of launching a new backend by recontacting the server if it doesn't want to terminate itself. Closing the connection is also advisable if an unrecognizable message type is received, since this probably indicates loss of message-boundary sync. For either normal or abnormal termination, any open transaction is rolled back, not committed. One should note however that if a frontend disconnects while a non-SELECT query is being processed, the backend will probably finish the query before noticing the disconnection. If the query is outside any transaction block (BEGIN ... COMMIT sequence) then its results may be committed before the disconnection is recognized. SSL Session Encryption If PostgreSQL was built with SSL support, frontend/backend communications can be encrypted using SSL. This provides communication security in environments where attackers might be able to capture the session traffic. To initiate an SSL-encrypted connection, the frontend initially sends an SSLRequest message rather than a StartupMessage. The server then responds with a single byte containing S or N, indicating that it is willing or unwilling to perform SSL, respectively. 
The frontend may close the connection at this point if it is dissatisfied with the response. To continue after S, perform an SSL startup handshake (not described here, part of the SSL specification) with the server. If this is successful, continue with sending the usual StartupMessage. In this case the StartupMessage and all subsequent data will be SSL-encrypted. To continue after N, send the usual StartupMessage and proceed without encryption. The frontend should also be prepared to handle an ErrorResponse message sent by the server in reply to SSLRequest. This would only occur if the server predates the addition of SSL support to PostgreSQL. In this case the connection must be closed, but the frontend may choose to open a fresh connection and proceed without requesting SSL. An initial SSLRequest may also be used in a connection that is being opened to send a CancelRequest message. While the protocol itself does not provide a way for the server to force SSL encryption, the administrator may configure the server to reject unencrypted sessions as a byproduct of authentication checking. Message Data Types This section describes the base data types used in messages. Intn(i) An n-bit integer in network byte order (most significant byte first). If i is specified it is the exact value that will appear, otherwise the value is variable. Eg. Int16, Int32(42). Intn[k] An array of k n-bit integers, each in network byte order. The array length k is always determined by an earlier field in the message. Eg. Int16[M]. String(s) A null-terminated string (C-style string). There is no specific length limitation on strings. If s is specified it is the exact value that will appear, otherwise the value is variable. Eg. String, String("user"). There is no predefined limit on the length of a string that can be returned by the backend. Good coding strategy for a frontend is to use an expandable buffer so that anything that fits in memory can be accepted. 
If that's not feasible, read the full string and discard trailing characters that don't fit into your fixed-size buffer. Byten(c) Exactly n bytes. If the field width n is not a constant, it is always determinable from an earlier field in the message. If c is specified it is the exact value. Eg. Byte2, Byte1('\n'). Message Formats This section describes the detailed format of each message. Each is marked to indicate that it may be sent by a frontend (F), a backend (B), or both (F & B). Notice that although each message includes a byte count at the beginning, the message format is defined so that the message end can be found without reference to the byte count. This aids validity checking. (The CopyData message is an exception, because it forms part of a data stream; the contents of any individual CopyData message may not be interpretable on their own.) AuthenticationOk (B) Byte1('R') Identifies the message as an authentication request. Int32(8) Length of message contents in bytes, including self. Int32(0) Specifies that the authentication was successful. AuthenticationKerberosV4 (B) Byte1('R') Identifies the message as an authentication request. Int32(8) Length of message contents in bytes, including self. Int32(1) Specifies that Kerberos V4 authentication is required. AuthenticationKerberosV5 (B) Byte1('R') Identifies the message as an authentication request. Int32(8) Length of message contents in bytes, including self. Int32(2) Specifies that Kerberos V5 authentication is required. AuthenticationCleartextPassword (B) Byte1('R') Identifies the message as an authentication request. Int32(8) Length of message contents in bytes, including self. Int32(3) Specifies that a cleartext password is required. AuthenticationCryptPassword (B) Byte1('R') Identifies the message as an authentication request. Int32(10) Length of message contents in bytes, including self. Int32(4) Specifies that a crypt()-encrypted password is required. 
Byte2 The salt to use when encrypting the password. AuthenticationMD5Password (B) Byte1('R') Identifies the message as an authentication request. Int32(12) Length of message contents in bytes, including self. Int32(5) Specifies that an MD5-encrypted password is required. Byte4 The salt to use when encrypting the password. AuthenticationSCMCredential (B) Byte1('R') Identifies the message as an authentication request. Int32(8) Length of message contents in bytes, including self. Int32(6) Specifies that an SCM credentials message is required. BackendKeyData (B) Byte1('K') Identifies the message as cancellation key data. The frontend must save these values if it wishes to be able to issue CancelRequest messages later. Int32(12) Length of message contents in bytes, including self. Int32 The process ID of this backend. Int32 The secret key of this backend. Bind (F) Byte1('B') Identifies the message as a Bind command. Int32 Length of message contents in bytes, including self. String The name of the destination portal (an empty string selects the unnamed portal). String The name of the source prepared statement (an empty string selects the unnamed prepared statement). Int16 The number of parameter format codes that follow (denoted C below). This can be zero to indicate that there are no parameters or that the parameters all use the default format (text); or one, in which case the specified format code is applied to all parameters; or it can equal the actual number of parameters. Int16[C] The parameter format codes. Each must presently be zero (text) or one (binary). Int16 The number of parameter values that follow (possibly zero). This must match the number of parameters needed by the query. Next, the following pair of fields appear for each parameter: Int32 The length of the parameter value, in bytes (this count does not include itself). Can be zero. As a special case, -1 indicates a NULL parameter value. No value bytes follow in the NULL case. 
Byten The value of the parameter, in the format indicated by the associated format code. n is the above length. After the last parameter, the following fields appear: Int16 The number of result-column format codes that follow (denoted R below). This can be zero to indicate that there are no result columns or that the result columns should all use the default format (text); or one, in which case the specified format code is applied to all result columns (if any); or it can equal the actual number of result columns of the query. Int16[R] The result-column format codes. Each must presently be zero (text) or one (binary). BindComplete (B) Byte1('2') Identifies the message as a Bind-complete indicator. Int32(4) Length of message contents in bytes, including self. CancelRequest (F) Int32(16) Length of message contents in bytes, including self. Int32(80877102) The cancel request code. The value is chosen to contain 1234 in the most significant 16 bits, and 5678 in the least significant 16 bits. (To avoid confusion, this code must not be the same as any protocol version number.) Int32 The process ID of the target backend. Int32 The secret key for the target backend. Close (F) Byte1('C') Identifies the message as a Close command. Int32 Length of message contents in bytes, including self. Byte1 'S' to close a prepared statement; or 'P' to close a portal. String The name of the prepared statement or portal to close (an empty string selects the unnamed prepared statement or portal). CloseComplete (B) Byte1('3') Identifies the message as a Close-complete indicator. Int32(4) Length of message contents in bytes, including self. CommandComplete (B) Byte1('C') Identifies the message as a command-completed response. Int32 Length of message contents in bytes, including self. String The command tag. This is usually a single word that identifies which SQL command was completed. For an INSERT command, the tag is INSERT oid rows, where rows is the number of rows inserted. 
oid is the object ID of the inserted row if rows is 1 and the target table has OIDs; otherwise oid is 0. For a DELETE command, the tag is DELETE rows where rows is the number of rows deleted. For an UPDATE command, the tag is UPDATE rows where rows is the number of rows updated. For a MOVE command, the tag is MOVE rows where rows is the number of rows the cursor's position has been changed by. For a FETCH command, the tag is FETCH rows where rows is the number of rows that have been retrieved from the cursor. CopyData (F & B) Byte1('d') Identifies the message as COPY data. Int32 Length of message contents in bytes, including self. Byten Data that forms part of a COPY datastream. Messages sent from the backend will always correspond to single data rows, but messages sent by frontends may divide the datastream arbitrarily. CopyDone (F & B) Byte1('c') Identifies the message as a COPY-complete indicator. Int32(4) Length of message contents in bytes, including self. CopyFail (F) Byte1('f') Identifies the message as a COPY-failure indicator. Int32 Length of message contents in bytes, including self. String An error message to report as the cause of failure. CopyInResponse (B) Byte1('G') Identifies the message as a Start Copy In response. The frontend must now send copy-in data (if not prepared to do so, send a CopyFail message). Int32 Length of message contents in bytes, including self. Int8 0 indicates the overall copy format is textual (rows separated by newlines, columns separated by separator characters, etc). 1 indicates the overall copy format is binary (similar to DataRow format). See the COPY command documentation for more information. Int16 The number of columns in the data to be copied (denoted N below). Int16[N] The format codes to be used for each column. Each must presently be zero (text) or one (binary). All must be zero if the overall copy format is textual. CopyOutResponse (B) Byte1('H') Identifies the message as a Start Copy Out response. 
This message will be followed by copy-out data. Int32 Length of message contents in bytes, including self. Int8 0 indicates the overall copy format is textual (rows separated by newlines, columns separated by separator characters, etc). 1 indicates the overall copy format is binary (similar to DataRow format). See the COPY command documentation for more information. Int16 The number of columns in the data to be copied (denoted N below). Int16[N] The format codes to be used for each column. Each must presently be zero (text) or one (binary). All must be zero if the overall copy format is textual. DataRow (B) Byte1('D') Identifies the message as a data row. Int32 Length of message contents in bytes, including self. Int16 The number of column values that follow (possibly zero). Next, the following pair of fields appear for each column: Int32 The length of the column value, in bytes (this count does not include itself). Can be zero. As a special case, -1 indicates a NULL column value. No value bytes follow in the NULL case. Byten The value of the column, in the format indicated by the associated format code. n is the above length. Describe (F) Byte1('D') Identifies the message as a Describe command. Int32 Length of message contents in bytes, including self. Byte1 'S' to describe a prepared statement; or 'P' to describe a portal. String The name of the prepared statement or portal to describe (an empty string selects the unnamed prepared statement or portal). EmptyQueryResponse (B) Byte1('I') Identifies the message as a response to an empty query string. (This substitutes for CommandComplete.) Int32(4) Length of message contents in bytes, including self. ErrorResponse (B) Byte1('E') Identifies the message as an error. Int32 Length of message contents in bytes, including self. The message body consists of one or more identified fields, followed by a zero byte as a terminator. Fields may appear in any order. 
For each field there is the following: Byte1 A code identifying the field type; if zero, this is the message terminator and no string follows. The presently defined field types are listed in Error and Notice Message Fields. Since more field types may be added in future, frontends should silently ignore fields of unrecognized type. String The field value. Execute (F) Byte1('E') Identifies the message as an Execute command. Int32 Length of message contents in bytes, including self. String The name of the portal to execute (an empty string selects the unnamed portal). Int32 Maximum number of rows to return, if portal contains a query that returns rows (ignored otherwise). Zero denotes no limit. Flush (F) Byte1('H') Identifies the message as a Flush command. Int32(4) Length of message contents in bytes, including self. FunctionCall (F) Byte1('F') Identifies the message as a function call. Int32 Length of message contents in bytes, including self. Int32 Specifies the object ID of the function to call. Int16 The number of argument format codes that follow (denoted C below). This can be zero to indicate that there are no arguments or that the arguments all use the default format (text); or one, in which case the specified format code is applied to all arguments; or it can equal the actual number of arguments. Int16[C] The argument format codes. Each must presently be zero (text) or one (binary). Int16 Specifies the number of arguments being supplied to the function. Next, the following pair of fields appear for each argument: Int32 The length of the argument value, in bytes (this count does not include itself). Can be zero. As a special case, -1 indicates a NULL argument value. No value bytes follow in the NULL case. Byten The value of the argument, in the format indicated by the associated format code. n is the above length. After the last argument, the following field appears: Int16 The format code for the function result. Must presently be zero (text) or one (binary). 
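The FunctionCall layout above combines most of the encoding conventions used by frontend messages: a type byte, a self-inclusive Int32 length, a format-code array, and length-prefixed values with -1 marking NULL. The following Python sketch shows one way to assemble it. The names `frame` and `build_function_call` are hypothetical, and the sketch assumes a single format code applied to all arguments, which is one of the three options the format allows.

```python
import struct
from typing import Optional, Sequence


def frame(msg_type: bytes, payload: bytes) -> bytes:
    """Prefix a payload with its type byte and an Int32 length that counts itself."""
    return msg_type + struct.pack("!i", len(payload) + 4) + payload


def build_function_call(
    func_oid: int,
    args: Sequence[Optional[bytes]],
    arg_format: int = 0,     # 0 = text, 1 = binary, applied to every argument
    result_format: int = 0,  # 0 = text, 1 = binary
) -> bytes:
    parts = [struct.pack("!I", func_oid)]
    # One format code follows; it applies to all arguments.
    parts.append(struct.pack("!hh", 1, arg_format))
    parts.append(struct.pack("!h", len(args)))
    for value in args:
        if value is None:
            # NULL argument: length -1 and no value bytes.
            parts.append(struct.pack("!i", -1))
        else:
            parts.append(struct.pack("!i", len(value)) + value)
    parts.append(struct.pack("!h", result_format))
    return frame(b"F", b"".join(parts))
```

For instance, `build_function_call(1234, [b"42", None])` would encode one text argument and one NULL; the OID 1234 is a placeholder, not a real function.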
FunctionCallResponse (B) Byte1('V') Identifies the message as a function call result. Int32 Length of message contents in bytes, including self. Int32 The length of the function result value, in bytes (this count does not include itself). Can be zero. As a special case, -1 indicates a NULL function result. No value bytes follow in the NULL case. Byten The value of the function result, in the format indicated by the associated format code. n is the above length. NoData (B) Byte1('n') Identifies the message as a no-data indicator. Int32(4) Length of message contents in bytes, including self. NoticeResponse (B) Byte1('N') Identifies the message as a notice. Int32 Length of message contents in bytes, including self. The message body consists of one or more identified fields, followed by a zero byte as a terminator. Fields may appear in any order. For each field there is the following: Byte1 A code identifying the field type; if zero, this is the message terminator and no string follows. The presently defined field types are listed in Error and Notice Message Fields. Since more field types may be added in future, frontends should silently ignore fields of unrecognized type. String The field value. NotificationResponse (B) Byte1('A') Identifies the message as a notification response. Int32 Length of message contents in bytes, including self. Int32 The process ID of the notifying backend process. String The name of the condition that the notify has been raised on. String Additional information passed from the notifying process. (Currently, this feature is unimplemented so the field is always an empty string.) ParameterDescription (B) Byte1('t') Identifies the message as a parameter description. Int32 Length of message contents in bytes, including self. Int16 The number of parameters used by the statement (may be zero). Then, for each parameter, there is the following: Int32 Specifies the object ID of the parameter datatype. 
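The identified-field body shared by NoticeResponse and ErrorResponse can be decoded with a short loop. The Python sketch below assumes the type byte and length word have already been stripped, so `body` is only the field list. The function name `parse_identified_fields` is hypothetical, and decoding the values as UTF-8 is an assumption made for the example; in practice the encoding depends on the session's client_encoding.

```python
from typing import Dict


def parse_identified_fields(body: bytes) -> Dict[str, str]:
    """Split an ErrorResponse/NoticeResponse body into {field code: value}."""
    fields: Dict[str, str] = {}
    pos = 0
    while pos < len(body):
        code = body[pos:pos + 1]
        pos += 1
        if code == b"\x00":
            # A zero code is the message terminator; no string follows it.
            break
        # Each value is a null-terminated string; index() raises ValueError
        # if the terminator is missing, which signals a malformed message.
        end = body.index(b"\x00", pos)
        fields[code.decode("ascii")] = body[pos:end].decode("utf-8", errors="replace")
        pos = end + 1
    return fields
```

The parser keeps every field it finds, including codes it does not recognize. A frontend can then pick out the codes it understands (such as S, C, and M) and leave the rest unused, which satisfies the requirement to ignore unrecognized field types silently.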
ParameterStatus (B) Byte1('S') Identifies the message as a run-time parameter status report. Int32 Length of message contents in bytes, including self. String The name of the run-time parameter being reported. String The current value of the parameter. Parse (F) Byte1('P') Identifies the message as a Parse command. Int32 Length of message contents in bytes, including self. String The name of the destination prepared statement (an empty string selects the unnamed prepared statement). String The query string to be parsed. Int16 The number of parameter datatypes specified (may be zero). Note that this is not an indication of the number of parameters that might appear in the query string, only the number that the frontend wants to prespecify types for. Then, for each parameter, there is the following: Int32 Specifies the object ID of the parameter datatype. Placing a zero here is equivalent to leaving the type unspecified. ParseComplete (B) Byte1('1') Identifies the message as a Parse-complete indicator. Int32(4) Length of message contents in bytes, including self. PasswordMessage (F) Byte1('p') Identifies the message as a password response. Int32 Length of message contents in bytes, including self. String The password (encrypted, if requested). PortalSuspended (B) Byte1('s') Identifies the message as a portal-suspended indicator. Note this only appears if an Execute message's row-count limit was reached. Int32(4) Length of message contents in bytes, including self. Query (F) Byte1('Q') Identifies the message as a simple query. Int32 Length of message contents in bytes, including self. String The query string itself. ReadyForQuery (B) Byte1('Z') Identifies the message type. ReadyForQuery is sent whenever the backend is ready for a new query cycle. Int32(5) Length of message contents in bytes, including self. Byte1 Current backend transaction status indicator. 
Possible values are 'I' if idle (not in a transaction block); 'T' if in a transaction block; or 'E' if in a failed transaction block (queries will be rejected until block is ended). RowDescription (B) Byte1('T') Identifies the message as a row description. Int32 Length of message contents in bytes, including self. Int16 Specifies the number of fields in a row (may be zero). Then, for each field, there is the following: String The field name. Int32 If the field can be identified as a column of a specific table, the object ID of the table; otherwise zero. Int16 If the field can be identified as a column of a specific table, the attribute number of the column; otherwise zero. Int32 The object ID of the field's datatype. Int16 The datatype size (see pg_type.typlen). Note that negative values denote variable-width types. Int32 The type modifier (see pg_attribute.atttypmod). The meaning of the modifier is type-specific. Int16 The format code being used for the field. Currently will be zero (text) or one (binary). In a RowDescription returned from the statement variant of Describe, the format code is not yet known and will always be zero. SSLRequest (F) Int32(8) Length of message contents in bytes, including self. Int32(80877103) The SSL request code. The value is chosen to contain 1234 in the most significant 16 bits, and 5679 in the least significant 16 bits. (To avoid confusion, this code must not be the same as any protocol version number.) StartupMessage (F) Int32 Length of message contents in bytes, including self. Int32(196608) The protocol version number. The most significant 16 bits are the major version number (3 for the protocol described here). The least significant 16 bits are the minor version number (0 for the protocol described here). The protocol version number is followed by one or more pairs of parameter name and value strings. A zero byte is required as a terminator after the last name/value pair. Parameters can appear in any order. 
user is required, others are optional. Each parameter is specified as: String The parameter name. Currently recognized names are: user The database user name to connect as. Required; there is no default. database The database to connect to. Defaults to the user name. options Command-line arguments for the backend. (This is deprecated in favor of setting individual run-time parameters.) In addition to the above, any run-time parameter that can be set at backend start time may be listed. Such settings will be applied during backend start (after parsing the command-line options if any). The values will act as session defaults. String The parameter value. Sync (F) Byte1('S') Identifies the message as a Sync command. Int32(4) Length of message contents in bytes, including self. Terminate (F) Byte1('X') Identifies the message as a termination. Int32(4) Length of message contents in bytes, including self. Error and Notice Message Fields This section describes the fields that may appear in ErrorResponse and NoticeResponse messages. Each field type has a single-byte identification token. Note that any given field type should appear at most once per message. S Severity: the field contents are ERROR, FATAL, or PANIC (in an error message), or WARNING, NOTICE, DEBUG, INFO, or LOG (in a notice message), or a localized translation of one of these. Always present. C Code: the SQLSTATE code for the error (a 5-character string following SQL spec conventions). Not localizable. Always present. M Message: the primary human-readable error message. This should be accurate but terse (typically one line). Always present. D Detail: an optional secondary error message carrying more detail about the problem. May run to multiple lines. H Hint: an optional suggestion what to do about the problem. This is intended to differ from Detail in that it offers advice (potentially inappropriate) rather than hard facts. May run to multiple lines. 
P Position: the field value is a decimal ASCII integer, indicating an error cursor position as an index into the original query string. The first character has index 1, and positions are measured in characters not bytes. W Where: an indication of the context in which the error occurred. Presently this includes a call stack traceback of active PL functions. The trace is one entry per line, most recent first. F File: the file name of the source-code location where the error was reported. L Line: the line number of the source-code location where the error was reported. R Routine: the name of the source-code routine reporting the error. The client is responsible for formatting displayed information to meet its needs; in particular it should break long lines as needed. Newline characters appearing in the error message fields should be treated as paragraph breaks, not line breaks. Summary of Changes since Protocol 2.0 This section provides a quick checklist of changes, for the benefit of developers trying to update existing client libraries to protocol 3.0. The initial startup packet uses a flexible list-of-strings format instead of a fixed format. Notice that session default values for run-time parameters can now be specified directly in the startup packet. (Actually, you could do that before using the options field, but given the limited width of options and the lack of any way to quote whitespace in the values, it wasn't a very safe technique.) All messages now have a length count immediately following the message type byte (except for startup packets, which have no type byte). Also note that PasswordMessage now has a type byte. ErrorResponse and NoticeResponse ('E' and 'N') messages now contain multiple fields, from which the client code may assemble an error message of the desired level of verbosity. Note that individual fields will typically not end with a newline, whereas the single string sent in the older protocol always did. 
The ReadyForQuery ('Z') message includes a transaction status indicator. The distinction between BinaryRow and DataRow message types is gone; the single DataRow message type serves for returning data in all formats. Note that the layout of DataRow has changed to make it easier to parse. Also, the representation of binary values has changed: it is no longer directly tied to the server's internal representation. There is a new extended query sub-protocol, which adds the frontend message types Parse, Bind, Execute, Describe, Close, Flush, and Sync, and the backend message types ParseComplete, BindComplete, PortalSuspended, ParameterDescription, NoData, and CloseComplete. Existing clients do not have to concern themselves with this sub-protocol, but making use of it may allow improvements in performance or functionality. COPY data is now encapsulated into CopyData and CopyDone messages. There is a well-defined way to recover from errors during COPY. The special \. last line is not needed anymore, and is not sent during COPY OUT. (It is still recognized as a terminator during COPY IN, but its use is deprecated and will eventually be removed.) Binary COPY is supported. The CopyInResponse and CopyOutResponse messages include fields indicating the number of columns and the format of each column. The layout of FunctionCall and FunctionCallResponse messages has changed. FunctionCall can now support passing NULL arguments to functions. It also can handle passing parameters and retrieving results in either text or binary format. There is no longer any reason to consider FunctionCall a potential security hole, since it does not offer direct access to internal server data representations. The backend sends ParameterStatus ('S') messages during connection startup for all parameters it considers interesting to the client library. Subsequently, a ParameterStatus message is sent whenever the active value changes for any of these parameters. 
The RowDescription ('T') message carries new table OID and column number fields for each column of the described row. It also shows the format code for each column. The CursorResponse ('P') message is no longer generated by the backend. The NotificationResponse ('A') message has an additional string field, which is presently empty but may someday carry additional data passed from the NOTIFY event sender. The EmptyQueryResponse ('I') message used to include an empty string parameter; this has been removed.
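Several items in this checklist come down to uniform framing: every backend message is a type byte followed by a self-inclusive Int32 length. The Python sketch below reads one such message from a byte stream. The names `read_exact` and `read_message` are hypothetical, and the stream is assumed to be any object with a blocking `read(n)` method, such as the file object returned by `socket.makefile("rb")`.

```python
import io
import struct
from typing import BinaryIO, Tuple


def read_exact(stream: BinaryIO, n: int) -> bytes:
    """Read exactly n bytes, or raise EOFError if the stream ends early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("connection closed in mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_message(stream: BinaryIO) -> Tuple[bytes, bytes]:
    """Return (type byte, body) for the next backend message."""
    msg_type = read_exact(stream, 1)
    (length,) = struct.unpack("!i", read_exact(stream, 4))
    if length < 4:
        # The length counts itself, so anything below 4 means lost sync.
        raise ValueError("invalid message length")
    return msg_type, read_exact(stream, length - 4)


# Example input: ReadyForQuery (idle) followed by EmptyQueryResponse.
sample = io.BytesIO(b"Z\x00\x00\x00\x05I" + b"I\x00\x00\x00\x04")
first = read_message(sample)   # type b"Z", one-byte transaction status body
second = read_message(sample)  # type b"I", empty body
```

Reading the whole body before interpreting it keeps the reader in step with the stream even when the message type is one it does not handle.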