Just-in-Time Compilation (<acronym>JIT</acronym>)

This chapter explains what just-in-time compilation is, and how it can be configured in PostgreSQL.

What is <acronym>JIT</acronym> compilation?

Just-in-time compilation (JIT) is the process of turning some form of interpreted program evaluation into a native program, and doing so at run time. For example, instead of using a general-purpose facility that can evaluate arbitrary SQL expressions to evaluate a particular SQL predicate like WHERE a.col = 3, it is possible to generate a function that is specific to that expression and can be natively executed by the CPU, yielding a speedup.

PostgreSQL has built-in support to perform JIT compilation using LLVM when PostgreSQL is built with --with-llvm (see ).

See src/backend/jit/README for further details.

<acronym>JIT</acronym> Accelerated Operations

Currently PostgreSQL's JIT implementation has support for accelerating expression evaluation and tuple deforming. Several other operations could be accelerated in the future.

Expression evaluation is used to evaluate WHERE clauses, target lists, aggregates and projections. It can be accelerated by generating code specific to each case.

Tuple deforming is the process of transforming an on-disk tuple (see ) into its in-memory representation. It can be accelerated by creating a function specific to the table layout and the number of columns to be extracted.

Optimization

LLVM has support for optimizing generated code. Some of the optimizations are cheap enough to be performed whenever JIT is used, while others are only beneficial for longer-running queries. See for more details about optimizations.

Inlining

PostgreSQL is very extensible and allows new data types, functions, operators and other database objects to be defined; see . In fact, the built-in objects are implemented using nearly the same mechanisms. This extensibility implies some overhead, for example due to function calls (see ).
To reduce that overhead, JIT compilation can inline the bodies of small functions into the expressions using them. That allows a significant percentage of the overhead to be optimized away.

When to <acronym>JIT</acronym>?

JIT compilation is beneficial primarily for long-running CPU-bound queries. Frequently these will be analytical queries. For short queries the added overhead of performing JIT compilation will often be higher than the time it can save.

To determine whether JIT compilation should be used, the total estimated cost of a query (see and ) is used. The estimated cost of the query will be compared with GUC. If the cost is higher, JIT compilation will be performed.

Once the planner has decided, based on the above criterion, that JIT compilation is beneficial, two further decisions are made. Firstly, if the query is more costly than the GUC, expensive optimizations are used to improve the generated code. Secondly, if the query is more costly than the GUC, short functions and operators used in the query will be inlined. Both of these operations increase the JIT overhead, but can reduce query execution time considerably.

These cost-based decisions are made at plan time, not execution time. This means that when prepared statements are in use, and the generic plan is used (see ), the values of the GUCs set at prepare time take effect, not the settings at execution time.

If is set to off, or no JIT implementation is available (for example because the server was compiled without --with-llvm), JIT will not be performed, even if it would be considered beneficial based on the above criteria. Setting to off takes effect both at plan and at execution time.

can be used to see whether JIT is used or not.
As an example, here is a query that is not using JIT:

=# EXPLAIN ANALYZE SELECT SUM(relpages) FROM pg_class;
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                 QUERY PLAN                                                  │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate  (cost=16.27..16.29 rows=1 width=8) (actual time=0.303..0.303 rows=1 loops=1)                     │
│   ->  Seq Scan on pg_class  (cost=0.00..15.42 rows=342 width=4) (actual time=0.017..0.111 rows=356 loops=1) │
│ Planning Time: 0.116 ms                                                                                     │
│ Execution Time: 0.365 ms                                                                                    │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(4 rows)

Given the cost of the plan, it is entirely reasonable that no JIT was used; the cost of JIT would have been bigger than the potential savings. Adjusting the cost limits will lead to JIT use:

=# SET jit_above_cost = 10;
SET
=# EXPLAIN ANALYZE SELECT SUM(relpages) FROM pg_class;
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                 QUERY PLAN                                                  │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate  (cost=16.27..16.29 rows=1 width=8) (actual time=6.049..6.049 rows=1 loops=1)                     │
│   ->  Seq Scan on pg_class  (cost=0.00..15.42 rows=342 width=4) (actual time=0.019..0.052 rows=356 loops=1) │
│ Planning Time: 0.133 ms                                                                                     │
│ JIT:                                                                                                        │
│   Functions: 3                                                                                              │
│   Generation Time: 1.259 ms                                                                                 │
│   Inlining: false                                                                                           │
│   Inlining Time: 0.000 ms                                                                                   │
│   Optimization: false                                                                                       │
│   Optimization Time: 0.797 ms                                                                               │
│   Emission Time: 5.048 ms                                                                                   │
│ Execution Time: 7.416 ms                                                                                    │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

As visible here, JIT was used, but inlining and expensive optimization were not. If , were lowered, just like , that would change.

Configuration

determines whether JIT compilation is enabled or disabled.
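The threshold logic described under "When to JIT?" can be illustrated with a small, self-contained C sketch. It is not PostgreSQL's planner code: the struct and function names (JitThresholds, JitDecision, decide_jit) are hypothetical, and the three cost fields merely stand in for the three cost-limit settings discussed above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the JIT-related settings; not PostgreSQL's own types. */
typedef struct JitThresholds
{
    bool   jit_enabled;          /* master switch */
    double above_cost;           /* cost above which JIT is performed */
    double optimize_above_cost;  /* cost above which expensive optimization is applied */
    double inline_above_cost;    /* cost above which short functions are inlined */
} JitThresholds;

typedef struct JitDecision
{
    bool perform;
    bool optimize;
    bool inline_functions;
} JitDecision;

/*
 * Decide, from the plan's total estimated cost, whether to JIT at all and,
 * only if so, whether to also optimize expensively and to inline.
 */
JitDecision
decide_jit(double total_cost, const JitThresholds *t)
{
    JitDecision d = { false, false, false };

    if (!t->jit_enabled)
        return d;

    if (total_cost > t->above_cost)
    {
        d.perform = true;
        d.optimize = total_cost > t->optimize_above_cost;
        d.inline_functions = total_cost > t->inline_above_cost;
    }
    return d;
}
```

With above_cost set to 10 and the other two limits set to some much larger value (500000 is used in the checks for illustration), a plan with total cost 16.29 yields perform = true, optimize = false and inline_functions = false, which matches the shape of the second EXPLAIN output above.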
As explained in the configuration variables , , decide whether JIT compilation is performed for a query, and how much effort is spent doing so.

For development and debugging purposes a few additional GUCs exist. allows the generated bitcode to be inspected. allows GDB to see generated functions. emits information so the perf profiler can interpret JIT-generated functions sensibly.

determines which JIT implementation is used. It rarely needs to be changed. See .

Extensibility

Inlining Support for Extensions

PostgreSQL's JIT implementation can inline the implementation of operators and functions (of type C and internal). See . To do so for functions in extensions, the definitions of those functions need to be made available. When using PGXS to build an extension against a server that has been compiled with LLVM support, the relevant files will be installed automatically.

The relevant files have to be installed into $pkglibdir/bitcode/$extension/ and a summary of them into $pkglibdir/bitcode/$extension.index.bc, where $pkglibdir is the directory returned by pg_config --pkglibdir and $extension is the base name of the extension's shared library.

For functions built into PostgreSQL itself, the bitcode is installed into $pkglibdir/bitcode/postgres.

Pluggable <acronym>JIT</acronym> Provider

PostgreSQL provides a JIT implementation based on LLVM. The interface to the JIT provider is pluggable and the provider can be changed without recompiling. The provider is chosen via the GUC.

<acronym>JIT</acronym> Provider Interface

A JIT provider is loaded by dynamically loading the named shared library. The normal library search path is used to locate the library. To provide the required JIT provider callbacks and to indicate that the library is actually a JIT provider, it needs to provide a function named _PG_jit_provider_init. This function is passed a struct that needs to be filled with the callback function pointers for individual actions:
struct JitProviderCallbacks
{
    JitProviderResetAfterErrorCB reset_after_error;
    JitProviderReleaseContextCB release_context;
    JitProviderCompileExprCB compile_expr;
};

extern void _PG_jit_provider_init(JitProviderCallbacks *cb);
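As an illustration of how a provider might fill in this struct, here is a minimal do-nothing sketch. To keep it self-contained it declares its own stand-ins for the callback typedefs, JitContext and ExprState. These signatures are assumptions modelled on the struct above and should be checked against the server's JIT header before use. A real provider would include the PostgreSQL headers instead, and the noop_* function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in declarations (assumptions); a real provider gets these from PostgreSQL headers. */
typedef struct JitContext JitContext;
struct ExprState;

typedef void (*JitProviderResetAfterErrorCB) (void);
typedef void (*JitProviderReleaseContextCB) (JitContext *context);
typedef bool (*JitProviderCompileExprCB) (struct ExprState *state);

typedef struct JitProviderCallbacks
{
    JitProviderResetAfterErrorCB reset_after_error;
    JitProviderReleaseContextCB release_context;
    JitProviderCompileExprCB compile_expr;
} JitProviderCallbacks;

/* Nothing to clean up in this sketch. */
static void
noop_reset_after_error(void)
{
}

/* This sketch allocates no per-context resources, so there is nothing to release. */
static void
noop_release_context(JitContext *context)
{
    (void) context;
}

/*
 * Never compiles anything. Returning false is assumed here to signal
 * "not compiled", leaving the expression to be evaluated the regular way.
 */
static bool
noop_compile_expr(struct ExprState *state)
{
    (void) state;
    return false;
}

/* Entry point looked up in the shared library: fill in every callback. */
void
_PG_jit_provider_init(JitProviderCallbacks *cb)
{
    cb->reset_after_error = noop_reset_after_error;
    cb->release_context = noop_release_context;
    cb->compile_expr = noop_compile_expr;
}
```

Filling in all three pointers, even with empty functions, avoids leaving any callback NULL for the caller to trip over. Whether the server tolerates missing callbacks is not stated here, so the sketch does not rely on it.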