2005-03-02 00:45:00 +01:00
|
|
|
<!-- $PostgreSQL: pgsql/doc/src/sgml/dml.sgml,v 1.11 2005/03/01 23:45:00 neilc Exp $ -->
|
2002-08-05 21:44:58 +02:00
|
|
|
|
|
|
|
<chapter id="dml">
|
|
|
|
<title>Data Manipulation</title>
|
|
|
|
|
2003-11-24 20:08:02 +01:00
|
|
|
<remark>
|
2002-08-05 21:44:58 +02:00
|
|
|
This chapter is still quite incomplete.
|
2003-11-24 20:08:02 +01:00
|
|
|
</remark>
|
2002-08-05 21:44:58 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
The previous chapter discussed how to create tables and other
|
|
|
|
structures to hold your data. Now it is time to fill the tables
|
|
|
|
with data. This chapter covers how to insert, update, and delete
|
|
|
|
table data. We also introduce ways to effect automatic data changes
|
|
|
|
when certain events occur: triggers and rewrite rules. The chapter
|
|
|
|
after this will finally explain how to extract your long-lost data
|
|
|
|
back out of the database.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<sect1 id="dml-insert">
|
|
|
|
<title>Inserting Data</title>
|
|
|
|
|
2003-08-31 19:32:24 +02:00
|
|
|
<indexterm zone="dml-insert">
|
|
|
|
<primary>inserting</primary>
|
|
|
|
</indexterm>
|
|
|
|
|
|
|
|
<indexterm zone="dml-insert">
|
|
|
|
<primary>INSERT</primary>
|
|
|
|
</indexterm>
|
|
|
|
|
2002-08-05 21:44:58 +02:00
|
|
|
<para>
|
|
|
|
When a table is created, it contains no data. The first thing to
|
|
|
|
do before a database can be of much use is to insert data. Data is
|
2002-11-11 21:14:04 +01:00
|
|
|
conceptually inserted one row at a time. Of course you can also
|
|
|
|
insert more than one row, but there is no way to insert less than
|
|
|
|
one row at a time. Even if you know only some column values, a
|
|
|
|
complete row must be created.
|
2002-08-05 21:44:58 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2005-03-02 00:45:00 +01:00
|
|
|
To create a new row, use the <xref linkend="sql-insert"
|
|
|
|
xreflabel="sql-insert-title"> command. The command requires the
|
|
|
|
table name and a value for each of the columns of the table. For
|
|
|
|
example, consider the products table from <xref linkend="ddl">:
|
2002-08-05 21:44:58 +02:00
|
|
|
<programlisting>
|
|
|
|
CREATE TABLE products (
|
|
|
|
product_no integer,
|
|
|
|
name text,
|
|
|
|
price numeric
|
|
|
|
);
|
|
|
|
</programlisting>
|
|
|
|
An example command to insert a row would be:
|
|
|
|
<programlisting>
|
|
|
|
INSERT INTO products VALUES (1, 'Cheese', 9.99);
|
|
|
|
</programlisting>
|
|
|
|
The data values are listed in the order in which the columns appear
|
|
|
|
in the table, separated by commas. Usually, the data values will
|
|
|
|
be literals (constants), but scalar expressions are also allowed.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The above syntax has the drawback that you need to know the order
|
|
|
|
of the columns in the table. To avoid that you can also list the
|
|
|
|
columns explicitly. For example, both of the following commands
|
|
|
|
have the same effect as the one above:
|
|
|
|
<programlisting>
|
|
|
|
INSERT INTO products (product_no, name, price) VALUES (1, 'Cheese', 9.99);
|
|
|
|
INSERT INTO products (name, price, product_no) VALUES ('Cheese', 9.99, 1);
|
|
|
|
</programlisting>
|
|
|
|
Many users consider it good practice to always list the column
|
|
|
|
names.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
If you don't have values for all the columns, you can omit some of
|
|
|
|
them. In that case, the columns will be filled with their default
|
|
|
|
values. For example,
|
|
|
|
<programlisting>
|
|
|
|
INSERT INTO products (product_no, name) VALUES (1, 'Cheese');
|
|
|
|
INSERT INTO products VALUES (1, 'Cheese');
|
|
|
|
</programlisting>
|
|
|
|
The second form is a <productname>PostgreSQL</productname>
|
|
|
|
extension. It fills the columns from the left with as many values
|
|
|
|
as are given, and the rest will be defaulted.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
For clarity, you can also request default values explicitly, for
|
|
|
|
individual columns or for the entire row:
|
|
|
|
<programlisting>
|
|
|
|
INSERT INTO products (product_no, name, price) VALUES (1, 'Cheese', DEFAULT);
|
|
|
|
INSERT INTO products DEFAULT VALUES;
|
|
|
|
</programlisting>
|
|
|
|
</para>
|
2002-11-11 21:14:04 +01:00
|
|
|
|
|
|
|
<tip>
|
|
|
|
<para>
|
|
|
|
To do <quote>bulk loads</quote>, that is, inserting a lot of data,
|
2003-08-10 03:20:34 +02:00
|
|
|
take a look at the <xref linkend="sql-copy"
|
|
|
|
endterm="sql-copy-title"> command. It is not as flexible as the
|
2005-03-02 00:45:00 +01:00
|
|
|
<command>INSERT</command> command, but is more efficient. Refer to
|
|
|
|
<xref linkend="populate"> for more information on improving bulk
|
|
|
|
loading performance.
|
2002-11-11 21:14:04 +01:00
|
|
|
</para>
|
|
|
|
</tip>
|
2002-08-05 21:44:58 +02:00
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="dml-update">
|
|
|
|
<title>Updating Data</title>
|
|
|
|
|
2003-08-31 19:32:24 +02:00
|
|
|
<indexterm zone="dml-update">
|
|
|
|
<primary>updating</primary>
|
|
|
|
</indexterm>
|
|
|
|
|
|
|
|
<indexterm zone="dml-update">
|
|
|
|
<primary>UPDATE</primary>
|
|
|
|
</indexterm>
|
|
|
|
|
2002-08-05 21:44:58 +02:00
|
|
|
<para>
|
|
|
|
The modification of data that is already in the database is
|
|
|
|
referred to as updating. You can update individual rows, all the
|
|
|
|
rows in a table, or a subset of all rows. Each column can be
|
|
|
|
updated separately; the other columns are not affected.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
To perform an update, you need three pieces of information:
|
|
|
|
<orderedlist spacing=compact>
|
|
|
|
<listitem>
|
|
|
|
<para>The name of the table and column to update,</para>
|
|
|
|
</listitem>
|
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>The new value of the column,</para>
|
|
|
|
</listitem>
|
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>Which row(s) to update.</para>
|
|
|
|
</listitem>
|
|
|
|
</orderedlist>
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Recall from <xref linkend="ddl"> that SQL does not, in general,
|
|
|
|
provide a unique identifier for rows. Therefore it is not
|
|
|
|
necessarily possible to directly specify which row to update.
|
|
|
|
Instead, you specify which conditions a row must meet in order to
|
|
|
|
be updated. Only if you have a primary key in the table (no matter
|
2002-10-20 07:05:46 +02:00
|
|
|
whether you declared it or not) can you reliably address individual rows,
|
|
|
|
by choosing a condition that matches the primary key.
|
2002-08-05 21:44:58 +02:00
|
|
|
Graphical database access tools rely on this fact to allow you to
|
|
|
|
update rows individually.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
For example, this command updates all products that have a price of
|
|
|
|
5 to have a price of 10:
|
|
|
|
<programlisting>
|
|
|
|
UPDATE products SET price = 10 WHERE price = 5;
|
|
|
|
</programlisting>
|
|
|
|
This may cause zero, one, or many rows to be updated. It is not
|
|
|
|
an error to attempt an update that does not match any rows.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2004-12-23 06:37:40 +01:00
|
|
|
Let's look at that command in detail. First is the key word
|
2002-08-05 21:44:58 +02:00
|
|
|
<literal>UPDATE</literal> followed by the table name. As usual,
|
|
|
|
the table name may be schema-qualified, otherwise it is looked up
|
|
|
|
in the path. Next is the key word <literal>SET</literal> followed
|
|
|
|
by the column name, an equals sign and the new column value. The
|
|
|
|
new column value can be any scalar expression, not just a constant.
|
|
|
|
For example, if you want to raise the price of all products by 10%
|
|
|
|
you could use:
|
|
|
|
<programlisting>
|
|
|
|
UPDATE products SET price = price * 1.10;
|
|
|
|
</programlisting>
|
2004-12-23 06:37:40 +01:00
|
|
|
As you see, the expression for the new value can refer to the existing
|
|
|
|
value(s) in the row. We also left out the <literal>WHERE</literal> clause.
|
2002-08-05 21:44:58 +02:00
|
|
|
If it is omitted, it means that all rows in the table are updated.
|
2004-12-23 06:37:40 +01:00
|
|
|
If it is present, only those rows that match the
|
|
|
|
<literal>WHERE</literal> condition are updated. Note that the equals
|
2002-08-05 21:44:58 +02:00
|
|
|
sign in the <literal>SET</literal> clause is an assignment while
|
|
|
|
the one in the <literal>WHERE</literal> clause is a comparison, but
|
2004-12-23 06:37:40 +01:00
|
|
|
this does not create any ambiguity. Of course, the
|
|
|
|
<literal>WHERE</literal> condition does
|
2002-08-05 21:44:58 +02:00
|
|
|
not have to be an equality test. Many other operators are
|
|
|
|
available (see <xref linkend="functions">). But the expression
|
|
|
|
needs to evaluate to a Boolean result.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2004-12-23 06:37:40 +01:00
|
|
|
You can update more than one column in an
|
2005-03-02 00:45:00 +01:00
|
|
|
<command>UPDATE</command> command by listing more than one
|
2002-08-05 21:44:58 +02:00
|
|
|
assignment in the <literal>SET</literal> clause. For example:
|
|
|
|
<programlisting>
|
2005-01-22 23:56:36 +01:00
|
|
|
UPDATE mytable SET a = 5, b = 3, c = 1 WHERE a > 0;
|
2002-08-05 21:44:58 +02:00
|
|
|
</programlisting>
|
|
|
|
</para>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="dml-delete">
|
|
|
|
<title>Deleting Data</title>
|
|
|
|
|
2003-08-31 19:32:24 +02:00
|
|
|
<indexterm zone="dml-delete">
|
|
|
|
<primary>deleting</primary>
|
|
|
|
</indexterm>
|
|
|
|
|
|
|
|
<indexterm zone="dml-delete">
|
|
|
|
<primary>DELETE</primary>
|
|
|
|
</indexterm>
|
|
|
|
|
2002-08-05 21:44:58 +02:00
|
|
|
<para>
|
|
|
|
So far we have explained how to add data to tables and how to
|
|
|
|
change data. What remains is to discuss how to remove data that is
|
|
|
|
no longer needed. Just as adding data is only possible in whole
|
|
|
|
rows, you can only remove entire rows from a table. In the
|
2004-12-23 06:37:40 +01:00
|
|
|
previous section we explained that SQL does not provide a way to
|
2002-08-05 21:44:58 +02:00
|
|
|
directly address individual rows. Therefore, removing rows can
|
|
|
|
only be done by specifying conditions that the rows to be removed
|
|
|
|
have to match. If you have a primary key in the table then you can
|
|
|
|
specify the exact row. But you can also remove groups of rows
|
|
|
|
matching a condition, or you can remove all rows in the table at
|
|
|
|
once.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2005-03-02 00:45:00 +01:00
|
|
|
You use the <xref linkend="sql-delete"
|
|
|
|
xreflabel="sql-delete-title"> command to remove rows; the syntax is
|
|
|
|
very similar to the <command>UPDATE</command> command. For
|
|
|
|
instance, to remove all rows from the products table that have a
|
|
|
|
price of 10, use
|
2002-08-05 21:44:58 +02:00
|
|
|
<programlisting>
|
|
|
|
DELETE FROM products WHERE price = 10;
|
|
|
|
</programlisting>
|
|
|
|
</para>
|
2002-10-20 07:05:46 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
If you simply write
|
|
|
|
<programlisting>
|
|
|
|
DELETE FROM products;
|
|
|
|
</programlisting>
|
|
|
|
then all rows in the table will be deleted! Caveat programmer.
|
|
|
|
</para>
|
2002-08-05 21:44:58 +02:00
|
|
|
</sect1>
|
|
|
|
</chapter>
|