From 347d2b07fcc250680f75b5f89ba49d4805782c6b Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Fri, 3 Apr 2020 13:23:20 -0300 Subject: [PATCH] Add a glossary to the documentation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit More work is still needed, but this is a good start. Co-authored-by: Corey Huinker Co-authored-by: Jürgen Purtz Co-authored-by: Roger Harkavy Co-authored-by: Álvaro Herrera Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/CADkLM=eP6HOeqDjn0FdXuGRusQu4oWH_LFsKjjafmhvWD=aSpQ@mail.gmail.com --- doc/src/sgml/filelist.sgml | 1 + doc/src/sgml/glossary.sgml | 1775 ++++++++++++++++++++++++++++++++++++ doc/src/sgml/postgres.sgml | 1 + 3 files changed, 1777 insertions(+) create mode 100644 doc/src/sgml/glossary.sgml diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml index 1043d0f7ab..cf21ef857e 100644 --- a/doc/src/sgml/filelist.sgml +++ b/doc/src/sgml/filelist.sgml @@ -170,6 +170,7 @@ + diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml new file mode 100644 index 0000000000..8c6cb6e942 --- /dev/null +++ b/doc/src/sgml/glossary.sgml @@ -0,0 +1,1775 @@ + + Glossary + + This is a list of terms and their meaning in the context of + PostgreSQL and relational database + systems in general. + + + + + Aggregate Function + + + A function that + combines (aggregates) multiple input values, + for example by counting, averaging or adding, + yielding a single output value. + + + For more information, see + . + + + + + + + Analyze (operation) + + + The process of collecting statistics from data in + tables + and other relations + to help the query planner + to make decisions about how to execute + queries. + + + + + + Analytic Function + + + + + Atomic + + + In reference to a datum: + the fact that its value cannot be broken down into smaller + components. + + + + + In reference to a + database transaction: + see atomicity. 
+ + + + + + Atomicity + + + The property of a transaction + that either all its operations complete as a single unit or none do. + In addition, if a system failure occurs during the execution of a + transaction, no partial results are visible after recovery. + This is one of the ACID properties. + + + + + + Attribute + + + An element with a certain name and data type found within a + tuple or + table. + + + + + + Autovacuum + + + A set of background processes that routinely perform + vacuum + and analyze + operations. + + + For more information, see + . + + + + + + Backend (process) + + + Process of an instance + which acts on behalf of client sessions + and handles their requests. + + + (Don't confuse this term with the similar terms + Background Worker or + Background Writer). + + + + + + Background Worker (process) + + + Process within an instance, + which runs system- or user-supplied code. + Serves as infrastructure for several features in + PostgreSQL, such as + logical replication + and parallel queries. + In addition, Extensions can add + custom background worker processes. + + + For more information, see + . + + + + + + Background Writer (process) + + + A process that continuously writes dirty pages from + shared memory to + the file system. It wakes up periodically, but works only for a short + period in order to distribute its expensive I/O + activity over time to avoid generating larger + I/O peaks which could block other processes. + + + For more information, see + . + + + + + + Cast + + + A conversion of a datum + from its current data type to another data type. + + + For more information, see + . + + + + + + Catalog + + + The SQL standard uses this term to + indicate what is called a + database in + PostgreSQL's terminology. + + + (Don't confuse this term with + system catalog). + + + For more information, see + . 
+ + + + + + Check Constraint + + + A type of constraint + defined on a relation + which restricts the values allowed in one or more + attributes. The + check constraint can make reference to any attribute of the same row in + the relation, but cannot reference other rows of the same relation or + other relations. + + + For more information, see + . + + + + + + Checkpointer (process) + + + A specialized process responsible for executing checkpoints. + + + + + + Checkpoint + + + A point in the WAL sequence + at which it is guaranteed that the heap and index data files have been + updated with all information from + shared memory + modified before that checkpoint; + a checkpoint record is written and flushed to WAL + to mark that point. + + + A checkpoint is also the act of carrying out all the actions that + are necessary to reach a checkpoint as defined above. + This process is initiated when predefined conditions are met, + such as when a specified amount of time has passed, or a certain volume + of records has been written; or it can be invoked by the user + with the command CHECKPOINT. + + + For more information, see + . + + + + + + Class (archaic) + + + + + Client (process) + + + Any process, possibly remote, that establishes a + session + by connecting to an + instance + to interact with a database. + + + + + + Cluster + + + A group of databases plus their + global SQL objects. The + cluster is managed by exactly one + instance. A newly created + Cluster will have three databases created automatically. They are + template0, template1, and + postgres. It is expected that an application will + create one or more additional databases aside from these three. + + + (Don't confuse the PostgreSQL-specific term + Cluster with the SQL + command CLUSTER). + + + + + + Column + + + An attribute found in + a table or + view. 
+ + + + + + Commit + + + The act of finalizing a + transaction within + the database, which + makes it visible to other transactions and assures its + durability. + + + For more information, see + . + + + + + + Concurrency + + + The concept that multiple independent operations happen within the + database at the same time. + In PostgreSQL, concurrency is controlled by + the multiversion concurrency control + mechanism. + + + + + + Connection + + + An established line of communication between a client process and a + backend process, + usually over a network, supporting a + session. This term is + sometimes used as a synonym for session. + + + For more information, see + . + + + + + + Consistency + + + The property that the data in the + database + is always in compliance with + integrity constraints. + A transaction may be allowed to violate some of the constraints + transiently before it commits, but if such violations are not resolved + by the time it commits, the transaction is automatically + rolled back. + This is one of the ACID properties. + + + + + + Constraint + + + A restriction on the values of data allowed within a + Table. + + + For more information, see + . + + + + + + Data Area + + + + + Data Directory + + + The base directory on the filesystem of a + server that contains all + data files and subdirectories associated with a + cluster with the + exception of tablespaces. + The environment variable PGDATA is commonly used to + refer to the + data directory. + + + An instance's storage + space comprises the data directory plus any additional tablespaces. + + + For more information, see + . + + + + + + Database + + + A named collection of + SQL objects. + + + For more information, see + . + + + + + + Database Server + + + + + Datum + + + The internal representation of one value of a SQL + data type. + + + + + + Delete + + + A SQL command which removes + rows from a given + table + or relation. + + + For more information, see + . 
+ + + + + + Durability + + + The assurance that once a + transaction has + been committed, the + changes remain even after a system failure or crash. + This is one of the ACID properties. + + + + + + Extension + + + A software add-on package that can be installed on an + instance to + get extra features. + + + For more information, see + . + + + + + + File Segment + + + A physical file which stores data for a given + relation. + File segments are limited in size by a configuration value, + so if a relation exceeds that size, it is split into multiple segments. + + + For more information, see + . + + + (Don't confuse this term with the similar term + WAL segment). + + + + + + Foreign Data Wrapper + + + A means of representing data that is not contained in the local + database so that it appears as if it were in local + table(s). With a Foreign Data Wrapper it is + possible to define a foreign server and + foreign tables. + + + For more information, see + . + + + + + + Foreign Key + + + A type of constraint + defined on one or more columns + in a table which + requires the value(s) in those columns to + identify zero or one row + in another (or, infrequently, the same) + table. + + + + + + Foreign Server + + + A named collection of + foreign tables which + all use the same + foreign data wrapper + and have other configuration values in common. + + + For more information, see + . + + + + + + Foreign Table + + + A relation which appears to have + rows and + columns similar to a + regular table, but will forward + requests for data through its + foreign data wrapper, + which will return result sets + structured according to the definition of the + foreign table. + + + For more information, see + . + + + + + + Function + + + Any defined transformation of data. Many functions are already defined + within PostgreSQL itself, but user-defined + ones can also be added. + + + For more information, see + . 
+ + + + + + Global SQL Object + + + SQL objects which do + not belong to a specific + database. + + + These objects are + roles, + tablespaces, + replication origins, and subscriptions for logical replication. + + + + + + Grant + + + A SQL command that is used to allow + users or + roles to access + specific objects within the database. + + + For more information, see + . + + + + + + Heap + + + Contains the values of row + attributes (i.e. the data) for a + relation. + The heap is realized within + segment files. + + + + + + Host + + + A computer that communicates with other hosts over a network. + This term can be used to refer to either a + client + or a server. + + + + + + Index + + + A relation that contains + data derived from a table + (or relation types + such as a materialized view). + Its internal structure supports fast retrieval of and access to the original + data. + + + For more information, see + . + + + + + + Insert + + + A SQL command used to add new data into a + table. + + + For more information, see + . + + + + + + Instance + + + An instance is a group of processes, + its supporting storage space, + plus their + common shared memory, + running on a single server. + The instance + handles all key features of a DBMS: read and write + access to files and shared memory, assurance of + the ACID paradigm, MVCC, + connections to client programs, backup, + recovery, replication, privileges, etc. + + + An instance manages exactly one + cluster. + + + Many instances can run on the same server as + long as their TCP/IP ports do not conflict. + Different instances on a server may use the + same or different versions of PostgreSQL. + + + + + + Isolation + + + The property that the effects of a transaction are not visible to + concurrent transactions + before it commits. + This is one of the ACID properties. + + + For more information, see . + + + + + + Join + + + A SQL keyword used in SELECT statements for + combining data from multiple relations. 
+ + + + + + Key + + + A means of identifying a row within a + table or + relation by + values contained within one or more + attributes + in that table. + + + + + + Lock + + + A mechanism that allows a process to limit or prevent simultaneous + access to a resource. + + + + + + Log File + + + Log files contain human-readable text lines about events. + Examples include login failures, long-running queries, etc. + + + For more information, see + . + + + + + + Logger (process) + + + If activated, the + Logger process + writes information about database events into the current + log file. + When reaching certain time- or + volume-dependent criteria, a new log file is created. + Also called syslogger. + + + For more information, see + . + + + + + + Log Record + + + Archaic term for a WAL record. + + + + + + Logged + + + A table is considered + logged if changes to it are sent to the + WAL. By default, all regular + tables are logged. A table can be specified as + unlogged either at + creation time or via the ALTER TABLE command. + + + + + + Master (server) + + + + + Materialized + + + The property that some information has been pre-computed and stored + for later use, rather than computing it on-the-fly. + + + This term is used in + materialized view, + to mean that the data derived from the view's query is stored on + disk separately from the sources of that data. + + + This term is also used to refer to some multi-step queries to mean that + the data resulting from executing a given step is stored in memory + (with the possibility of spilling to disk), so that it can be read multiple + times by another step. + + + + + + Materialized View + + + A relation that is + defined in the same way that a view + is, but stores data in the same way that a + table does. It cannot be + modified via INSERT, UPDATE, or + DELETE operations. + + + For more information, see + . 
+ + + + + + Multi-version concurrency control (MVCC) + + + A mechanism designed to allow several + transactions to be + reading and writing the same rows without one process causing other + processes to stall. + In PostgreSQL, MVCC is implemented by + creating copies (versions) of + tuples as they are + modified; after transactions that can see the old versions terminate, + those old versions need to be removed. + + + + + + Null + + + A concept of non-existence that is a central tenet of Relational + Database Theory. It represents the absence of value. + + + + + + Optimizer + + + + + Parallel Query + + + The ability to handle parts of executing a + query to take advantage + of parallel processes on servers with multiple CPUs. + + + + + + Partition + + + One of several disjoint (not overlapping) subsets of a larger set. + + + In reference to a + partitioned table: + One of the tables that each contain part of the data of the partitioned table, + which is said to be the parent. + The partition is itself a table, so it can also be queried directly; + at the same time, a partition can sometimes be a partitioned table, + allowing hierarchies to be created. + + + + + In reference to a window function: + a partition is a user-defined criterion that identifies which neighboring + rows can be considered by the + function. + + + + + + Partitioned Table + + + A relation that is + in semantic terms the same as a table, + but whose storage is distributed across several + partitions. + + + + + + Postmaster (process) + + + The very first process of an instance. + It starts and manages the other auxiliary processes and creates + backend processes + on demand. + + + For more information, see + . + + + + + + Primary (server) + + + When two or more databases + are linked via replication, + the server + that is considered the authoritative source of information is called + the primary, + also known as a master. 
+ + + + + + Primary Key + + + A special case of a + unique constraint + defined on a + table or other + relation that also + guarantees that all of the + attributes + within the primary key + do not have null values. + As the name implies, there can be only one + primary key per table, though it is possible to have multiple unique + constraints that also have no null-capable attributes. + + + + + + Procedure + + + A defined set of instructions for manipulating data within a + database. + Procedures can + be written in a variety of programming languages. They are + similar to functions, + but are different in that they must be invoked via the CALL + command rather than the SELECT or PERFORM + commands, and they are allowed to make transactional statements such + as COMMIT and ROLLBACK. + + + For more information, see + . + + + + + + Query + + + A request sent by a client to a backend, + usually to return results or to modify data on the database. + + + + + + Query Planner + + + The part of PostgreSQL that is devoted to + determining (planning) the most efficient way to + execute queries. + Also known as query optimizer, + optimizer, or simply planner. + + + + + + Record + + + + + Recycling + + + + + Referential Integrity + + + A means of restricting data in one relation + by a foreign key + so that it must have matching data in another + relation. + + + + + + Relation + + + The generic term for all objects in a + database + that have a name and a list of + attributes + defined in a specific order. + Tables, + views, + foreign tables, + materialized views, and + indexes are all relations. + + + Class is an archaic synonym for + relation. + + + + + + Replica + + + A database that is paired + with a primary + database and is maintaining a copy of some or all of the primary database's + data. The foremost reasons for doing this are to allow for greater access + to that data, and to maintain availability of the data in the event that + the primary + becomes unavailable. 
+ + + + + + Replication + + + The act of reproducing data on one + server onto another + server called a replica. + This can take the form of physical replication, + where all file changes from one server are copied verbatim, + or logical replication where a defined subset + of data changes are conveyed using a higher-level representation. + + + + + + Result Set + + + A data structure transmitted from a + backend process to a + client program upon the completion of a SQL + command, usually a SELECT but it can be an + INSERT, UPDATE, or + DELETE command if the RETURNING + clause is specified. The data structure consists of zero or more + rows with the same ordered set of + attributes. + + + + + + Revoke + + + A command to prevent access to a named set of + database objects for a + named list of roles. + + + For more information, see + . + + + + + + Role + + + A collection of access privileges to the + instance. + Roles are themselves a privilege that can be granted to other roles. + This is often done for convenience or to ensure completeness + when multiple users need + the same privileges. + + + For more information, see + . + + + + + + Rollback + + + A command to undo all of the operations performed since the beginning + of a transaction. + + + For more information, see + . + + + + + + Row + + + + + Savepoint + + + A special mark inside the sequence of steps in a + transaction. + Data modifications after this point in time may be reverted + to the time of the savepoint. + + + For more information, see + . + + + + + + Schema + + + A schema is a namespace for SQL objects, + which all reside in the same + database. Each + SQL object must reside in exactly one schema. + + + The names of SQL objects of the same type in the same schema are enforced to be unique. + There is no restriction on reusing a name in multiple schemas. 
+ + + All system-defined SQL objects reside in schema pg_catalog, + and commonly many user-defined SQL objects reside in the default schema + public, + but it is common and recommended that other schemas are created to hold + application-specific SQL objects. + + + + + More generically, the term Schema is used to mean + all data descriptions (table definitions, + constraints, comments, etc) + for a given database or + subset thereof. + + + For more information, see + . + + + + + + Segment + + + + + Select + + + The SQL command used to request data from a + database. + Normally, SELECT commands are not expected to modify the + database in any way, + but it is possible that + functions invoked within + the query could have side effects that do modify data. + + + For more information, see + . + + + + + + + + Server + + + A computer on which PostgreSQL + instances run. + The term server denotes real hardware, a + container, or a Virtual Machine. + + + + + + Session + + + A state that allows a client and a backend to interact, + communicating over a connection. + + + + + + + + Shared Memory + + + RAM which is used by the processes common to an + instance. + It mirrors parts of database + files, provides a transient area for + WAL records, + and stores additional common information. + Note that shared memory belongs to the complete instance, not to a single + database. + + + The largest part of shared memory is known as shared buffers + and is used to mirror part of data files, organized into pages. + When a page is modified, it is called a dirty page until it is + written back to the file system. + + + For more information, see + . + + + + + + SQL Object + + + A table, + view, + materialized view, + index, + constraint, + sequence, + function, + procedure, + trigger, + data type, or operator. Every one of those SQL objects + belongs to exactly one Schema. 
+ + + There also exist SQL objects that do not belong to schemas; those include + extensions, + data type casts, + and + foreign data wrappers. + + + For more information, see + . + + + + + + SQL Standard + + + A series of documents that define the SQL language. + + + + + + Stats Collector + + + This process collects statistical information about the + Cluster's activities. + + + For more information, see + . + + + + + + System Catalog + + + A collection of tables + which describe the structure of all + SQL objects + of each database + and the global SQL objects + of the cluster. + The system catalog resides in the schema pg_catalog. + These tables contain data in internal representation and are + not typically considered useful for user examination; + a number of user-friendlier views + also in schema pg_catalog offer more convenient access to + some of that information, while additional tables and views + exist in schema information_schema that expose some + of the same and additional information as mandated by the + SQL standard. + + + For more information, see + . + + + + + + Table + + + A collection of tuples having + a common data structure (the same number of + attributes, in the same + order, having the same name and type per position). + A table is the most common form of + Relation in + PostgreSQL. + + + For more information, see + . + + + + + + Tablespace + + + A named location on the server filesystem. + All SQL objects + which require storage beyond their definition in the + system catalog + must belong to a single tablespace. + Initially, an instance contains a single usable tablespace which is + used as the default one for all SQL objects, called pg_default. + + + For more information, see + . + + + + + + Temporary Table + + + Tables that exist either + for the lifetime of a + session or a + transaction, as + specified at the time of creation. + The data in them is not visible to other sessions, and is not + logged. 
+ Temporary tables are often used to store intermediate data for a + multi-step operation. + + + For more information, see + . + + + + + + Transaction + + + A combination of commands that must act as a single + atomic command: they all + succeed or all fail as a single unit, and their effects are not visible to + other sessions until + the transaction is complete, and possibly even later, depending on the + isolation level. + + + For more information, see + . + + + + + + Trigger + + + A function which can + be defined to execute whenever a certain operation (INSERT, + UPDATE, DELETE, + TRUNCATE) is applied to a + relation. + A Trigger executes within the same + transaction as the + statement which invoked it, and if the function fails, then the invoking + statement also fails. + + + For more information, see + . + + + + + + Tuple + + + A collection of attributes + in a fixed order. + That order may be defined by the table + where the tuple is contained, in which case the tuple is often called a + row. It may also be defined by the structure of a + result set, in which case it is sometimes called a record. + + + + + + Unique Constraint + + + A type of constraint + defined on a relation + which restricts the values allowed in one or a combination of columns + so that each value or combination of values can only appear once in the + relation — that is, no other row in the relation contains values + that are equal to those. + + + Because null values are + not considered equal to each other, multiple rows with null values are + allowed to exist without violating the unique constraint. + + + + + + Unlogged + + + The property of certain relations + that the changes to them are not reflected in the + WAL. + This disables replication and crash recovery for these relations. + + + The primary use of unlogged tables is for storing + transient work data that must be shared across processes. + + + Temporary tables + are always unlogged. 
+ + + + + + Update + + + A SQL command used to modify + rows + that may already exist in a specified table. + It cannot create or remove rows. + + + For more information, see + . + + + + + + User + + + A role that has the + LOGIN privilege. + + + + + + User mapping + + + The translation of login credentials in the local + database to credentials + in a remote data system defined by a + foreign data wrapper. + + + For more information, see + . + + + + + + Vacuum + + + The process of removing outdated tuple + versions from tables, and other closely related + garbage-collection-like processing required by PostgreSQL's + implementation of MVCC. + This can be initiated through the use of + the VACUUM command, but can also be handled automatically + via autovacuum processes. + + + For more information, see + . + + + + + + View + + + A relation that is defined by a + SELECT statement, but has no storage of its own. + Any time a query references a view, the definition of the view is + substituted into the query as if the user had typed it as a subquery + instead of the name of the view. + + + For more information, see + . + + + + + + WAL Archiver (process) + + + A process that saves copies of WAL files + for the purposes of creating backups or keeping + replicas current. + + + For more information, see + . + + + + + + WAL File + + + Also known as WAL segment or + WAL segment file. + Each of the sequentially-numbered files that provide storage space for + WAL. + The files are all of the same predefined size + and are written in sequential order, interspersing changes + as they occur in multiple simultaneous sessions. + If the system crashes, the files are read in order, and each of the + changes is replayed to restore the system to the state as it was + before the crash. + + + Each WAL file can be released after a + checkpoint + writes all the changes in it to the corresponding data files. 
+ Releasing the file can be done either by deleting it, or by changing its + name so that it will be used in the future, which is called + recycling. + + + For more information, see + . + + + + + + WAL + + + + + WAL Record + + + A low-level description of an individual data change. + It contains sufficient information for the data change to be + re-executed (replayed) in case a system failure + causes the change to be lost. + WAL records use a non-printable binary format. + + + For more information, see + . + + + + + + WAL Segment + + + + + WAL Writer (process) + + + A process that writes WAL records + from shared memory to + WAL files. + + + For more information, see + . + + + + + + Window Function + + + A type of function whose + result is based on values found in + rows of the same + partition. + All aggregate functions + can be used as window functions, but window functions can also be + used to, for example, give ranks to each of the rows in the partition. + Also known as analytic functions. + + + For more information, see + . + + + + + + Write-Ahead Log + + + The journal that keeps track of the changes in the + instance as user- and + system-invoked operations take place. + It comprises many individual + WAL records written + sequentially to WAL files. + + + + + diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml index 1f7bd32878..ba3d626102 100644 --- a/doc/src/sgml/postgres.sgml +++ b/doc/src/sgml/postgres.sgml @@ -278,6 +278,7 @@ &docguide; &limits; &acronyms; + &glossary; &color;