postgresql/doc/src/sgml/hstore.sgml

656 lines
23 KiB
Plaintext
Raw Normal View History

2010-09-20 22:08:53 +02:00
<!-- doc/src/sgml/hstore.sgml -->
<sect1 id="hstore" xreflabel="hstore">
<title>hstore</title>
<indexterm zone="hstore">
<primary>hstore</primary>
</indexterm>
<para>
This module implements the <type>hstore</> data type for storing sets of
key/value pairs within a single <productname>PostgreSQL</> value.
This can be useful in various scenarios, such as rows with many attributes
that are rarely examined, or semi-structured data. Keys and values are
simply text strings.
</para>
<sect2>
<title><type>hstore</> External Representation</title>
<para>
The text representation of an <type>hstore</>, used for input and output,
includes zero or more <replaceable>key</> <literal>=&gt;</>
<replaceable>value</> pairs separated by commas. Some examples:
<synopsis>
k =&gt; v
foo =&gt; bar, baz =&gt; whatever
"1-a" =&gt; "anything at all"
</synopsis>
The order of the pairs is not significant (and may not be reproduced on
output). Whitespace between pairs or around the <literal>=&gt;</> sign is
ignored. Double-quote keys and values that include whitespace, commas,
<literal>=</>s or <literal>&gt;</>s. To include a double quote or a
backslash in a key or value, escape it with a backslash.
</para>
<para>
Each key in an <type>hstore</> is unique. If you declare an <type>hstore</>
with duplicate keys, only one will be stored in the <type>hstore</> and
there is no guarantee as to which will be kept:
<programlisting>
SELECT 'a=&gt;1,a=&gt;2'::hstore;
hstore
----------
"a"=&gt;"1"
</programlisting>
</para>
<para>
A value (but not a key) can be an SQL <literal>NULL</>. For example:
<programlisting>
key =&gt; NULL
</programlisting>
The <literal>NULL</> keyword is case-insensitive. Double-quote the
<literal>NULL</> to treat it as the ordinary string <quote>NULL</quote>.
</para>
<note>
<para>
Keep in mind that the <type>hstore</> text format, when used for input,
applies <emphasis>before</> any required quoting or escaping. If you are
passing an <type>hstore</> literal via a parameter, then no additional
processing is needed. But if you're passing it as a quoted literal
constant, then any single-quote characters and (depending on the setting of
the <varname>standard_conforming_strings</> configuration parameter)
backslash characters need to be escaped correctly. See
<xref linkend="sql-syntax-strings"> for more on the handling of string
constants.
</para>
</note>
<para>
On output, double quotes always surround keys and values, even when it's
not strictly necessary.
</para>
</sect2>
<sect2>
<title><type>hstore</> Operators and Functions</title>
<para>
The operators provided by the <literal>hstore</literal> module are
shown in <xref linkend="hstore-op-table">, the functions
in <xref linkend="hstore-func-table">.
</para>
<table id="hstore-op-table">
<title><type>hstore</> Operators</title>
<tgroup cols="4">
<thead>
<row>
<entry>Operator</entry>
<entry>Description</entry>
<entry>Example</entry>
<entry>Result</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>hstore</> <literal>-&gt;</> <type>text</></entry>
<entry>get value for key (<literal>NULL</> if not present)</entry>
<entry><literal>'a=&gt;x, b=&gt;y'::hstore -&gt; 'a'</literal></entry>
<entry><literal>x</literal></entry>
</row>
<row>
<entry><type>hstore</> <literal>-&gt;</> <type>text[]</></entry>
<entry>get values for keys (<literal>NULL</> if not present)</entry>
<entry><literal>'a=&gt;x, b=&gt;y, c=&gt;z'::hstore -&gt; ARRAY['c','a']</literal></entry>
<entry><literal>{"z","x"}</literal></entry>
</row>
<row>
<entry><type>hstore</> <literal>||</> <type>hstore</></entry>
<entry>concatenate <type>hstore</>s</entry>
<entry><literal>'a=&gt;b, c=&gt;d'::hstore || 'c=&gt;x, d=&gt;q'::hstore</literal></entry>
<entry><literal>"a"=&gt;"b", "c"=&gt;"x", "d"=&gt;"q"</literal></entry>
</row>
<row>
<entry><type>hstore</> <literal>?</> <type>text</></entry>
<entry>does <type>hstore</> contain key?</entry>
<entry><literal>'a=&gt;1'::hstore ? 'a'</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry><type>hstore</> <literal>?&amp;</> <type>text[]</></entry>
<entry>does <type>hstore</> contain all specified keys?</entry>
<entry><literal>'a=&gt;1,b=&gt;2'::hstore ?&amp; ARRAY['a','b']</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry><type>hstore</> <literal>?|</> <type>text[]</></entry>
<entry>does <type>hstore</> contain any of the specified keys?</entry>
<entry><literal>'a=&gt;1,b=&gt;2'::hstore ?| ARRAY['b','c']</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry><type>hstore</> <literal>@&gt;</> <type>hstore</></entry>
<entry>does left operand contain right?</entry>
<entry><literal>'a=&gt;b, b=&gt;1, c=&gt;NULL'::hstore @&gt; 'b=&gt;1'</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry><type>hstore</> <literal>&lt;@</> <type>hstore</></entry>
<entry>is left operand contained in right?</entry>
<entry><literal>'a=&gt;c'::hstore &lt;@ 'a=&gt;b, b=&gt;1, c=&gt;NULL'</literal></entry>
<entry><literal>f</literal></entry>
</row>
<row>
<entry><type>hstore</> <literal>-</> <type>text</></entry>
<entry>delete key from left operand</entry>
<entry><literal>'a=&gt;1, b=&gt;2, c=&gt;3'::hstore - 'b'::text</literal></entry>
<entry><literal>"a"=&gt;"1", "c"=&gt;"3"</literal></entry>
</row>
<row>
<entry><type>hstore</> <literal>-</> <type>text[]</></entry>
<entry>delete keys from left operand</entry>
<entry><literal>'a=&gt;1, b=&gt;2, c=&gt;3'::hstore - ARRAY['a','b']</literal></entry>
<entry><literal>"c"=&gt;"3"</literal></entry>
</row>
<row>
<entry><type>hstore</> <literal>-</> <type>hstore</></entry>
<entry>delete matching pairs from left operand</entry>
<entry><literal>'a=&gt;1, b=&gt;2, c=&gt;3'::hstore - 'a=&gt;4, b=&gt;2'::hstore</literal></entry>
<entry><literal>"a"=&gt;"1", "c"=&gt;"3"</literal></entry>
</row>
<row>
<entry><type>record</> <literal>#=</> <type>hstore</></entry>
<entry>replace fields in <type>record</> with matching values from <type>hstore</></entry>
<entry>see Examples section</entry>
<entry></entry>
</row>
<row>
<entry><literal>%%</> <type>hstore</></entry>
<entry>convert <type>hstore</> to array of alternating keys and values</entry>
<entry><literal>%% 'a=&gt;foo, b=&gt;bar'::hstore</literal></entry>
<entry><literal>{a,foo,b,bar}</literal></entry>
</row>
<row>
<entry><literal>%#</> <type>hstore</></entry>
<entry>convert <type>hstore</> to two-dimensional key/value array</entry>
<entry><literal>%# 'a=&gt;foo, b=&gt;bar'::hstore</literal></entry>
<entry><literal>{{a,foo},{b,bar}}</literal></entry>
</row>
</tbody>
</tgroup>
</table>
<note>
<para>
Prior to PostgreSQL 8.2, the containment operators <literal>@&gt;</>
and <literal>&lt;@</> were called <literal>@</> and <literal>~</>,
respectively. These names are still available, but are deprecated and will
eventually be removed. Notice that the old names are reversed from the
2010-08-17 06:37:21 +02:00
convention formerly followed by the core geometric data types!
</para>
</note>
<table id="hstore-func-table">
<title><type>hstore</> Functions</title>
<tgroup cols="5">
<thead>
<row>
<entry>Function</entry>
<entry>Return Type</entry>
<entry>Description</entry>
<entry>Example</entry>
<entry>Result</entry>
</row>
</thead>
<tbody>
<row>
<entry><function>hstore(record)</function><indexterm><primary>hstore</primary></indexterm></entry>
<entry><type>hstore</type></entry>
<entry>construct an <type>hstore</> from a record or row</entry>
<entry><literal>hstore(ROW(1,2))</literal></entry>
<entry><literal>f1=&gt;1,f2=&gt;2</literal></entry>
</row>
<row>
<entry><function>hstore(text[])</function></entry>
<entry><type>hstore</type></entry>
<entry>construct an <type>hstore</> from an array, which may be either
a key/value array, or a two-dimensional array</entry>
<entry><literal>hstore(ARRAY['a','1','b','2']) || hstore(ARRAY[['c','3'],['d','4']])</literal></entry>
<entry><literal>a=&gt;1, b=&gt;2, c=&gt;3, d=&gt;4</literal></entry>
</row>
<row>
<entry><function>hstore(text[], text[])</function></entry>
<entry><type>hstore</type></entry>
<entry>construct an <type>hstore</> from separate key and value arrays</entry>
<entry><literal>hstore(ARRAY['a','b'], ARRAY['1','2'])</literal></entry>
<entry><literal>"a"=&gt;"1","b"=&gt;"2"</literal></entry>
</row>
<row>
<entry><function>hstore(text, text)</function></entry>
<entry><type>hstore</type></entry>
<entry>make single-item <type>hstore</></entry>
<entry><literal>hstore('a', 'b')</literal></entry>
<entry><literal>"a"=&gt;"b"</literal></entry>
</row>
<row>
<entry><function>akeys(hstore)</function><indexterm><primary>akeys</primary></indexterm></entry>
<entry><type>text[]</type></entry>
<entry>get <type>hstore</>'s keys as an array</entry>
<entry><literal>akeys('a=&gt;1,b=&gt;2')</literal></entry>
<entry><literal>{a,b}</literal></entry>
</row>
<row>
<entry><function>skeys(hstore)</function><indexterm><primary>skeys</primary></indexterm></entry>
<entry><type>setof text</type></entry>
<entry>get <type>hstore</>'s keys as a set</entry>
<entry><literal>skeys('a=&gt;1,b=&gt;2')</literal></entry>
<entry>
<programlisting>
a
b
</programlisting></entry>
</row>
<row>
<entry><function>avals(hstore)</function><indexterm><primary>avals</primary></indexterm></entry>
<entry><type>text[]</type></entry>
<entry>get <type>hstore</>'s values as an array</entry>
<entry><literal>avals('a=&gt;1,b=&gt;2')</literal></entry>
<entry><literal>{1,2}</literal></entry>
</row>
<row>
<entry><function>svals(hstore)</function><indexterm><primary>svals</primary></indexterm></entry>
<entry><type>setof text</type></entry>
<entry>get <type>hstore</>'s values as a set</entry>
<entry><literal>svals('a=&gt;1,b=&gt;2')</literal></entry>
<entry>
<programlisting>
1
2
</programlisting></entry>
</row>
<row>
<entry><function>hstore_to_array(hstore)</function><indexterm><primary>hstore_to_array</primary></indexterm></entry>
<entry><type>text[]</type></entry>
<entry>get <type>hstore</>'s keys and values as an array of alternating
keys and values</entry>
<entry><literal>hstore_to_array('a=&gt;1,b=&gt;2')</literal></entry>
<entry><literal>{a,1,b,2}</literal></entry>
</row>
<row>
<entry><function>hstore_to_matrix(hstore)</function><indexterm><primary>hstore_to_matrix</primary></indexterm></entry>
<entry><type>text[]</type></entry>
<entry>get <type>hstore</>'s keys and values as a two-dimensional array</entry>
<entry><literal>hstore_to_matrix('a=&gt;1,b=&gt;2')</literal></entry>
<entry><literal>{{a,1},{b,2}}</literal></entry>
</row>
<row>
<entry><function>hstore_to_json(hstore)</function><indexterm><primary>hstore_to_json</primary></indexterm></entry>
<entry><type>json</type></entry>
<entry>get <type>hstore</type> as a <type>json</type> value, converting
all non-null values to JSON strings</entry>
<entry><literal>hstore_to_json('"a key"=&gt;1, b=&gt;t, c=&gt;null, d=&gt;12345, e=&gt;012345, f=&gt;1.234, g=&gt;2.345e+4')</literal></entry>
<entry><literal>{"a key": "1", "b": "t", "c": null, "d": "12345", "e": "012345", "f": "1.234", "g": "2.345e+4"}</literal></entry>
</row>
<row>
<entry><function>hstore_to_jsonb(hstore)</function><indexterm><primary>hstore_to_jsonb</primary></indexterm></entry>
<entry><type>jsonb</type></entry>
<entry>get <type>hstore</type> as a <type>jsonb</type> value, converting
all non-null values to JSON strings</entry>
<entry><literal>hstore_to_jsonb('"a key"=&gt;1, b=&gt;t, c=&gt;null, d=&gt;12345, e=&gt;012345, f=&gt;1.234, g=&gt;2.345e+4')</literal></entry>
<entry><literal>{"a key": "1", "b": "t", "c": null, "d": "12345", "e": "012345", "f": "1.234", "g": "2.345e+4"}</literal></entry>
</row>
<row>
<entry><function>hstore_to_json_loose(hstore)</function><indexterm><primary>hstore_to_json_loose</primary></indexterm></entry>
<entry><type>json</type></entry>
<entry>get <type>hstore</type> as a <type>json</type> value, but attempt to distinguish numerical and Boolean values so they are unquoted in the JSON</entry>
<entry><literal>hstore_to_json_loose('"a key"=&gt;1, b=&gt;t, c=&gt;null, d=&gt;12345, e=&gt;012345, f=&gt;1.234, g=&gt;2.345e+4')</literal></entry>
<entry><literal>{"a key": 1, "b": true, "c": null, "d": 12345, "e": "012345", "f": 1.234, "g": 2.345e+4}</literal></entry>
</row>
<row>
<entry><function>hstore_to_jsonb_loose(hstore)</function><indexterm><primary>hstore_to_jsonb_loose</primary></indexterm></entry>
<entry><type>jsonb</type></entry>
<entry>get <type>hstore</type> as a <type>jsonb</type> value, but attempt to distinguish numerical and Boolean values so they are unquoted in the JSON</entry>
<entry><literal>hstore_to_jsonb_loose('"a key"=&gt;1, b=&gt;t, c=&gt;null, d=&gt;12345, e=&gt;012345, f=&gt;1.234, g=&gt;2.345e+4')</literal></entry>
<entry><literal>{"a key": 1, "b": true, "c": null, "d": 12345, "e": "012345", "f": 1.234, "g": 2.345e+4}</literal></entry>
</row>
<row>
<entry><function>slice(hstore, text[])</function><indexterm><primary>slice</primary></indexterm></entry>
<entry><type>hstore</type></entry>
<entry>extract a subset of an <type>hstore</></entry>
<entry><literal>slice('a=&gt;1,b=&gt;2,c=&gt;3'::hstore, ARRAY['b','c','x'])</literal></entry>
<entry><literal>"b"=&gt;"2", "c"=&gt;"3"</literal></entry>
</row>
<row>
<entry><function>each(hstore)</function><indexterm><primary>each</primary></indexterm></entry>
<entry><type>setof(key text, value text)</type></entry>
<entry>get <type>hstore</>'s keys and values as a set</entry>
<entry><literal>select * from each('a=&gt;1,b=&gt;2')</literal></entry>
<entry>
<programlisting>
key | value
-----+-------
a | 1
b | 2
</programlisting></entry>
</row>
<row>
<entry><function>exist(hstore,text)</function><indexterm><primary>exist</primary></indexterm></entry>
<entry><type>boolean</type></entry>
<entry>does <type>hstore</> contain key?</entry>
<entry><literal>exist('a=&gt;1','a')</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry><function>defined(hstore,text)</function><indexterm><primary>defined</primary></indexterm></entry>
<entry><type>boolean</type></entry>
<entry>does <type>hstore</> contain non-<literal>NULL</> value for key?</entry>
<entry><literal>defined('a=&gt;NULL','a')</literal></entry>
<entry><literal>f</literal></entry>
</row>
<row>
<entry><function>delete(hstore,text)</function><indexterm><primary>delete</primary></indexterm></entry>
<entry><type>hstore</type></entry>
<entry>delete pair with matching key</entry>
<entry><literal>delete('a=&gt;1,b=&gt;2','b')</literal></entry>
<entry><literal>"a"=&gt;"1"</literal></entry>
</row>
<row>
<entry><function>delete(hstore,text[])</function></entry>
<entry><type>hstore</type></entry>
<entry>delete pairs with matching keys</entry>
<entry><literal>delete('a=&gt;1,b=&gt;2,c=&gt;3',ARRAY['a','b'])</literal></entry>
<entry><literal>"c"=&gt;"3"</literal></entry>
</row>
<row>
<entry><function>delete(hstore,hstore)</function></entry>
<entry><type>hstore</type></entry>
<entry>delete pairs matching those in the second argument</entry>
<entry><literal>delete('a=&gt;1,b=&gt;2','a=&gt;4,b=&gt;2'::hstore)</literal></entry>
<entry><literal>"a"=&gt;"1"</literal></entry>
</row>
<row>
<entry><function>populate_record(record,hstore)</function><indexterm><primary>populate_record</primary></indexterm></entry>
<entry><type>record</type></entry>
<entry>replace fields in <type>record</> with matching values from <type>hstore</></entry>
<entry>see Examples section</entry>
<entry></entry>
</row>
</tbody>
</tgroup>
</table>
<note>
<para>
The function <function>hstore_to_json</function> is used when
an <type>hstore</type> value is cast to <type>json</type>.
Likewise, <function>hstore_to_jsonb</function> is used when
an <type>hstore</type> value is cast to <type>jsonb</type>.
</para>
</note>
<note>
<para>
The function <function>populate_record</function> is actually declared
with <type>anyelement</>, not <type>record</>, as its first argument,
2010-08-17 06:37:21 +02:00
but it will reject non-record types with a run-time error.
</para>
</note>
</sect2>
<sect2>
<title>Indexes</title>
<para>
<type>hstore</> has GiST and GIN index support for the <literal>@&gt;</>,
<literal>?</>, <literal>?&amp;</> and <literal>?|</> operators. For example:
</para>
<programlisting>
CREATE INDEX hidx ON testhstore USING GIST (h);
CREATE INDEX hidx ON testhstore USING GIN (h);
</programlisting>
<para>
<type>hstore</> also supports <type>btree</> or <type>hash</> indexes for
the <literal>=</> operator. This allows <type>hstore</> columns to be
declared <literal>UNIQUE</>, or to be used in <literal>GROUP BY</>,
<literal>ORDER BY</> or <literal>DISTINCT</> expressions. The sort ordering
for <type>hstore</> values is not particularly useful, but these indexes
may be useful for equivalence lookups. Create indexes for <literal>=</>
comparisons as follows:
</para>
<programlisting>
CREATE INDEX hidx ON testhstore USING BTREE (h);
CREATE INDEX hidx ON testhstore USING HASH (h);
</programlisting>
</sect2>
<sect2>
<title>Examples</title>
<para>
Add a key, or update an existing key with a new value:
<programlisting>
UPDATE tab SET h = h || hstore('c', '3');
</programlisting>
</para>
<para>
Delete a key:
<programlisting>
UPDATE tab SET h = delete(h, 'k1');
</programlisting>
</para>
<para>
Convert a <type>record</> to an <type>hstore</>:
<programlisting>
CREATE TABLE test (col1 integer, col2 text, col3 text);
INSERT INTO test VALUES (123, 'foo', 'bar');
SELECT hstore(t) FROM test AS t;
hstore
---------------------------------------------
"col1"=&gt;"123", "col2"=&gt;"foo", "col3"=&gt;"bar"
(1 row)
</programlisting>
</para>
<para>
Convert an <type>hstore</> to a predefined <type>record</> type:
<programlisting>
CREATE TABLE test (col1 integer, col2 text, col3 text);
SELECT * FROM populate_record(null::test,
'"col1"=&gt;"456", "col2"=&gt;"zzz"');
col1 | col2 | col3
------+------+------
456 | zzz |
(1 row)
</programlisting>
</para>
<para>
Modify an existing record using the values from an <type>hstore</>:
<programlisting>
CREATE TABLE test (col1 integer, col2 text, col3 text);
INSERT INTO test VALUES (123, 'foo', 'bar');
SELECT (r).* FROM (SELECT t #= '"col3"=&gt;"baz"' AS r FROM test t) s;
col1 | col2 | col3
------+------+------
123 | foo | baz
(1 row)
</programlisting>
</para>
</sect2>
<sect2>
<title>Statistics</title>
<para>
The <type>hstore</> type, because of its intrinsic liberality, could
contain a lot of different keys. Checking for valid keys is the task of the
application. The following examples demonstrate several techniques for
checking keys and obtaining statistics.
</para>
<para>
Simple example:
<programlisting>
SELECT * FROM each('aaa=&gt;bq, b=&gt;NULL, ""=&gt;1');
</programlisting>
</para>
<para>
Using a table:
<programlisting>
SELECT (each(h)).key, (each(h)).value INTO stat FROM testhstore;
</programlisting>
</para>
<para>
Online statistics:
<programlisting>
SELECT key, count(*) FROM
(SELECT (each(h)).key FROM testhstore) AS stat
GROUP BY key
ORDER BY count DESC, key;
key | count
-----------+-------
line | 883
query | 207
pos | 203
node | 202
space | 197
status | 195
public | 194
title | 190
org | 189
...................
</programlisting>
</para>
</sect2>
<sect2>
<title>Compatibility</title>
<para>
As of PostgreSQL 9.0, <type>hstore</> uses a different internal
representation than previous versions. This presents no obstacle for
dump/restore upgrades since the text representation (used in the dump) is
unchanged.
</para>
<para>
In the event of a binary upgrade, upward compatibility is maintained by
having the new code recognize old-format data. This will entail a slight
performance penalty when processing data that has not yet been modified by
the new code. It is possible to force an upgrade of all values in a table
column by doing an <literal>UPDATE</> statement as follows:
<programlisting>
UPDATE tablename SET hstorecol = hstorecol || '';
</programlisting>
</para>
<para>
Another way to do it is:
<programlisting>
ALTER TABLE tablename ALTER hstorecol TYPE hstore USING hstorecol || '';
</programlisting>
The <command>ALTER TABLE</> method requires an exclusive lock on the table,
but does not result in bloating the table with old row versions.
</para>
</sect2>
<sect2>
<title>Transforms</title>
<para>
Additional extensions are available that implement transforms for
the <type>hstore</type> type for the languages PL/Perl and PL/Python. The
extensions for PL/Perl are called <literal>hstore_plperl</literal>
and <literal>hstore_plperlu</literal>, for trusted and untrusted PL/Perl.
If you install these transforms and specify them when creating a
function, <type>hstore</type> values are mapped to Perl hashes. The
extensions for PL/Python are
called <literal>hstore_plpythonu</literal>, <literal>hstore_plpython2u</literal>,
and <literal>hstore_plpython3u</literal>
(see <xref linkend="plpython-python23"> for the PL/Python naming
convention). If you use them, <type>hstore</type> values are mapped to
Python dictionaries.
</para>
</sect2>
<sect2>
<title>Authors</title>
<para>
Oleg Bartunov <email>oleg@sai.msu.su</email>, Moscow, Moscow University, Russia
</para>
<para>
Teodor Sigaev <email>teodor@sigaev.ru</email>, Moscow, Delta-Soft Ltd., Russia
</para>
<para>
Additional enhancements by Andrew Gierth <email>andrew@tao11.riddles.org.uk</email>,
United Kingdom
</para>
</sect2>
</sect1>