ltree
ltree
This module implements a data type ltree for representing
labels of data stored in a hierarchical tree-like structure.
Extensive facilities for searching through label trees are provided.
Definitions
A label is a sequence of alphanumeric characters
and underscores (for example, in C locale the characters
A-Za-z0-9_ are allowed). Labels must be less than 256 bytes
long.
Examples: 42, Personal_Services
A label path is a sequence of zero or more
labels separated by dots, for example L1.L2.L3, representing
a path from the root of a hierarchical tree to a particular node. The
length of a label path must be less than 65kB, but keeping it under 2kB is
preferable.
Example: Top.Countries.Europe.Russia
The ltree module provides several data types:
ltree stores a label path.
lquery represents a regular-expression-like pattern
for matching ltree values. A simple word matches that
label within a path. A star symbol (*) matches zero
or more labels. For example:
foo Match the exact label path foo
*.foo.* Match any label path containing the label foo
*.foo Match any label path whose last label is foo
Star symbols can also be quantified to restrict how many labels
they can match:
*{n} Match exactly n labels
*{n,} Match at least n labels
*{n,m} Match at least n but not more than m labels
*{,m} Match at most m labels — same as *{0,m}
There are several modifiers that can be put at the end of a non-star
label in lquery to make it match more than just the exact match:
@ Match case-insensitively, for example a@ matches A
* Match any label with this prefix, for example foo* matches foobar
% Match initial underscore-separated words
The behavior of % is a bit complicated. It tries to match
words rather than the entire label. For example
foo_bar% matches foo_bar_baz but not
foo_barbaz. If combined with *, prefix
matching applies to each word separately, for example
foo_bar%* matches foo1_bar2_baz but
not foo1_br2_baz.
Also, you can write several possibly-modified labels separated with
| (OR) to match any of those labels, and you can put
! (NOT) at the start to match any label that doesn't
match any of the alternatives.
Here's an annotated example of lquery:
Top.*{0,2}.sport*@.!football|tennis.Russ*|Spain
a. b. c. d. e.
This query will match any label path that:
begins with the label Top
and next has zero to two labels before
a label beginning with the case-insensitive prefix sport
then a label not matching football nor
tennis
and then ends with a label beginning with Russ or
exactly matching Spain.
ltxtquery represents a full-text-search-like
pattern for matching ltree values. An
ltxtquery value contains words, possibly with the
modifiers @, *, % at the end;
the modifiers have the same meanings as in lquery.
Words can be combined with & (AND),
| (OR), ! (NOT), and parentheses.
The key difference from
lquery is that ltxtquery matches words without
regard to their position in the label path.
Here's an example ltxtquery:
Europe & Russia*@ & !Transportation
This will match paths that contain the label Europe and
any label beginning with Russia (case-insensitive),
but not paths containing the label Transportation.
The location of these words within the path is not important.
Also, when % is used, the word can be matched to any
underscore-separated word within a label, regardless of position.
Note: ltxtquery allows whitespace between symbols, but
ltree and lquery do not.
Operators and Functions
Type ltree has the usual comparison operators
=, <>,
<, >, <=, >=.
Comparison sorts in the order of a tree traversal, with the children
of a node sorted by label text. In addition, the specialized
operators shown in are available.
ltree Operators
Operator
Returns
Description
ltree @> ltree
boolean
is left argument an ancestor of right (or equal)?
ltree <@ ltree
boolean
is left argument a descendant of right (or equal)?
ltree ~ lquery
boolean
does ltree match lquery?
lquery ~ ltree
boolean
does ltree match lquery?
ltree ? lquery[]
boolean
does ltree match any lquery in array?
lquery[] ? ltree
boolean
does ltree match any lquery in array?
ltree @ ltxtquery
boolean
does ltree match ltxtquery?
ltxtquery @ ltree
boolean
does ltree match ltxtquery?
ltree || ltree
ltree
concatenate ltree paths
ltree || text
ltree
convert text to ltree and concatenate
text || ltree
ltree
convert text to ltree and concatenate
ltree[] @> ltree
boolean
does array contain an ancestor of ltree?
ltree <@ ltree[]
boolean
does array contain an ancestor of ltree?
ltree[] <@ ltree
boolean
does array contain a descendant of ltree?
ltree @> ltree[]
boolean
does array contain a descendant of ltree?
ltree[] ~ lquery
boolean
does array contain any path matching lquery?
lquery ~ ltree[]
boolean
does array contain any path matching lquery?
ltree[] ? lquery[]
boolean
does ltree array contain any path matching any lquery?
lquery[] ? ltree[]
boolean
does ltree array contain any path matching any lquery?
ltree[] @ ltxtquery
boolean
does array contain any path matching ltxtquery?
ltxtquery @ ltree[]
boolean
does array contain any path matching ltxtquery?
ltree[] ?@> ltree
ltree
first array entry that is an ancestor of ltree; NULL if none
ltree[] ?<@ ltree
ltree
first array entry that is a descendant of ltree; NULL if none
ltree[] ?~ lquery
ltree
first array entry that matches lquery; NULL if none
ltree[] ?@ ltxtquery
ltree
first array entry that matches ltxtquery; NULL if none
The operators <@, @>,
@ and ~ have analogues
^<@, ^@>, ^@,
^~, which are the same except they do not use
indexes. These are useful only for testing purposes.
The available functions are shown in .
ltree Functions
Function
Return Type
Description
Example
Result
subltree(ltree, int start, int end)subltree
ltree
subpath of ltree from position start to
position end-1 (counting from 0)
subltree('Top.Child1.Child2',1,2)
Child1
subpath(ltree, int offset, int len)subpath
ltree
subpath of ltree starting at position
offset, length len.
If offset is negative, subpath starts that far from the
end of the path. If len is negative, leaves that many
labels off the end of the path.
subpath('Top.Child1.Child2',0,2)
Top.Child1
subpath(ltree, int offset)
ltree
subpath of ltree starting at position
offset, extending to end of path.
If offset is negative, subpath starts that far from the
end of the path.
subpath('Top.Child1.Child2',1)
Child1.Child2
nlevel(ltree)nlevel
integer
number of labels in path
nlevel('Top.Child1.Child2')
3
index(ltree a, ltree b)index
integer
position of first occurrence of b in
a; -1 if not found
index('0.1.2.3.5.4.5.6.8.5.6.8','5.6')
6
index(ltree a, ltree b, int offset)
integer
position of first occurrence of b in
a, searching starting at offset;
negative offset means start -offset
labels from the end of the path
index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',-4)
9
text2ltree(text)text2ltree
ltree
cast text to ltree
ltree2text(ltree)ltree2text
text
cast ltree to text
lca(ltree, ltree, ...)lca
ltree
longest common ancestor of paths
(up to 8 arguments supported)
lca('1.2.3','1.2.3.4.5.6')
1.2
lca(ltree[])
ltree
longest common ancestor of paths in array
lca(array['1.2.3'::ltree,'1.2.3.4'])
1.2
Indexes
ltree supports several types of indexes that can speed
up the indicated operators:
B-tree index over ltree:
<, <=, =,
>=, >
GiST index over ltree:
<, <=, =,
>=, >,
@>, <@,
@, ~, ?
Example of creating such an index:
CREATE INDEX path_gist_idx ON test USING GIST (path);
GiST index over ltree[]:
ltree[] <@ ltree, ltree @> ltree[],
@, ~, ?
Example of creating such an index:
CREATE INDEX path_gist_idx ON test USING GIST (array_path);
Note: This index type is lossy.
Example
This example uses the following data (also available in file
contrib/ltree/ltreetest.sql in the source distribution):
CREATE TABLE test (path ltree);
INSERT INTO test VALUES ('Top');
INSERT INTO test VALUES ('Top.Science');
INSERT INTO test VALUES ('Top.Science.Astronomy');
INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics');
INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology');
INSERT INTO test VALUES ('Top.Hobbies');
INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy');
INSERT INTO test VALUES ('Top.Collections');
INSERT INTO test VALUES ('Top.Collections.Pictures');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts');
CREATE INDEX path_gist_idx ON test USING GIST (path);
CREATE INDEX path_idx ON test USING BTREE (path);
Now, we have a table test populated with data describing
the hierarchy shown below:
Top
/ | \
Science Hobbies Collections
/ | \
Astronomy Amateurs_Astronomy Pictures
/ \ |
Astrophysics Cosmology Astronomy
/ | \
Galaxies Stars Astronauts
We can do inheritance:
ltreetest=> SELECT path FROM test WHERE path <@ 'Top.Science';
path
------------------------------------
Top.Science
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
(4 rows)
Here are some examples of path matching:
ltreetest=> SELECT path FROM test WHERE path ~ '*.Astronomy.*';
path
-----------------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
Top.Collections.Pictures.Astronomy
Top.Collections.Pictures.Astronomy.Stars
Top.Collections.Pictures.Astronomy.Galaxies
Top.Collections.Pictures.Astronomy.Astronauts
(7 rows)
ltreetest=> SELECT path FROM test WHERE path ~ '*.!pictures@.*.Astronomy.*';
path
------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
(3 rows)
Here are some examples of full text search:
ltreetest=> SELECT path FROM test WHERE path @ 'Astro*% & !pictures@';
path
------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
Top.Hobbies.Amateurs_Astronomy
(4 rows)
ltreetest=> SELECT path FROM test WHERE path @ 'Astro* & !pictures@';
path
------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
(3 rows)
Path construction using functions:
ltreetest=> SELECT subpath(path,0,2)||'Space'||subpath(path,2) FROM test WHERE path <@ 'Top.Science.Astronomy';
?column?
------------------------------------------
Top.Science.Space.Astronomy
Top.Science.Space.Astronomy.Astrophysics
Top.Science.Space.Astronomy.Cosmology
(3 rows)
We could simplify this by creating a SQL function that inserts a label
at a specified position in a path:
CREATE FUNCTION ins_label(ltree, int, text) RETURNS ltree
AS 'select subpath($1,0,$2) || $3 || subpath($1,$2);'
LANGUAGE SQL IMMUTABLE;
ltreetest=> SELECT ins_label(path,2,'Space') FROM test WHERE path <@ 'Top.Science.Astronomy';
ins_label
------------------------------------------
Top.Science.Space.Astronomy
Top.Science.Space.Astronomy.Astrophysics
Top.Science.Space.Astronomy.Cosmology
(3 rows)
Transforms
Additional extensions are available that implement transforms for
the ltree type for PL/Python. The extensions are
called ltree_plpythonu, ltree_plpython2u,
and ltree_plpython3u
(see for the PL/Python naming
convention). If you install these transforms and specify them when
creating a function, ltree values are mapped to Python lists.
(The reverse is currently not supported, however.)
Authors
All work was done by Teodor Sigaev (teodor@stack.net) and
Oleg Bartunov (oleg@sai.msu.su). See
for
additional information. Authors would like to thank Eugeny Rodichev for
helpful discussions. Comments and bug reports are welcome.