find out about it is to read the documentation that tells you how
dangerous it is. Add default_transaction_read_only to documentation;
seems to have been overlooked in patch that added read-only transactions.
Clean up check_guc comparison script, which has been suffering bit rot.
page when it's read in, per pghackers discussion around 17-Feb. Add a
GUC variable zero_damaged_pages that causes the response to be a WARNING
followed by zeroing the page, rather than the normal ERROR; this is per
Hiroshi's suggestion that there needs to be a way to get at the data
in the rest of the table.
(materialization into a tuple store) discussed on pgsql-hackers earlier.
I've updated the documentation and the regression tests.
Notes on the implementation:
- I needed to change the tuple store API slightly -- it assumes that it
won't be used to hold data across transaction boundaries, so the temp
files that it uses for on-disk storage are automatically reclaimed at
end-of-transaction. I added a flag to tuplestore_begin_heap() to control
this behavior. Is changing the tuple store API in this fashion OK?
- in order to store executor results in a tuple store, I added a new
CommandDest. This works well for the most part, with one exception: the
current DestFunction API doesn't provide enough information to allow the
Executor to store results into an arbitrary tuple store (where the
particular tuple store to use is chosen by the call site of
ExecutorRun). To workaround this, I've temporarily hacked up a solution
that works, but is not ideal: since the receiveTuple DestFunction is
passed the portal name, we can use that to lookup the Portal data
structure for the cursor and then use that to get at the tuple store the
Portal is using. This unnecessarily ties the Portal code with the
tupleReceiver code, but it works...
The proper fix for this is probably to change the DestFunction API --
Tom suggested passing the full QueryDesc to the receiveTuple function.
In that case, callers of ExecutorRun could "subclass" QueryDesc to add
any additional fields that their particular CommandDest needed to get
access to. This approach would work, but I'd like to think about it for
a little bit longer before deciding which route to go. In the mean time,
the code works fine, so I don't think a fix is urgent.
- (semi-related) I added a NO SCROLL keyword to DECLARE CURSOR, and
adjusted the behavior of SCROLL in accordance with the discussion on
-hackers.
- (unrelated) Cleaned up some SGML markup in sql.sgml, copy.sgml
Neil Conway
them as arrays of the internal datatype. This requires treating the
stavalues columns as 'anyarray' rather than 'text[]', which is not 100%
kosher but seems to work fine for the purposes we need for pg_statistic.
Perhaps in the future 'anyarray' will be allowed more generally.
some of the algorithms for higher functions. I see about a factor of ten
speedup on the 'numeric' regression test, but it's unlikely that that test
is representative of real-world applications.
initdb forced due to change of on-disk representation for NUMERIC.
what is capable using integer-datatime timestamps. It does attempt
to exercise the maximum allowable timestamp range.
Also is a small error check when converting a timestamp from external
to internal format that prevents out of range timestamps from being
entered.
Files patched:
Index: src/backend/utils/adt/timestamp.c
Added range check to prevent out of range timestamps
from being used.
Index: src/test/regress/sql/horology.sql
Index: src/test/regress/expected/horology-no-DST-before-1970.out
Index: src/test/regress/expected/horology-solaris-1947.out
Limited range of timestamps being checked to
Jan 1, 4713 BC to Dec 31, 294276
In creating this patch, I have seen some definite problems with integer
timestamps and how they react when used near their limits. For example,
the following statement gives the correct result:
SELECT timestamp without time zone 'Jan 1, 4713 BC'
+ interval '109203489 days' AS "Dec 31, 294276";
However, this statement which is the logical inverse of the above
gives incorrect results:
SELECT timestamp without time zone '12/31/294276'
- timestamp without time zone 'Jan 1, 4713 BC' AS "109203489 Days";
John Cochran
patch fix it -- but this patch doesn't contains tests or docs fixes. I
will send it later.
Fixed outputs:
select to_char(x, '9999.999') as x,
to_char(x, 'S9999.999') as s,
to_char(x, 'SG9999.999') as sg,
to_char(x, 'MI9999.999') as mi,
to_char(x, 'PL9999.999') as pl,
to_char(x, 'PLMI9999.999') as plmi,
to_char(x, '9999.999SG') as sg2,
to_char(x, '9999.999PL') as pl2,
to_char(x, '9999.999MI') as mi2 from num;
Karel Zak
> weird behavior across fork boundaries; (b) the additional memory space
> that has to be duplicated into child processes will cost something per
> child launch, even if the child never uses it. But these are only
> arguments that it might not *always* be a prudent thing to do, not that
> we shouldn't give the DBA the tool to do it if he wants. So fire away.
Here is a patch for the above, including a documentation update. It
creates a new GUC variable "preload_libraries", that accepts a list in
the form:
preload_libraries = '$libdir/mylib1:initfunc,$libdir/mylib2'
If ":initfunc" is omitted or not found, no initialization function is
executed, but the library is still preloaded. If "$libdir/mylib" isn't
found, the postmaster refuses to start.
In my testing with PL/R, it reduces the first call to a PL/R function
(after connecting) from almost 2 seconds, down to about 8 ms.
Joe Conway
division and modulo functions, to avoid problems on OS X (which fails to
trap 0 divide at all) and Windows (which traps it in some bizarre
nonstandard fashion). Standardize on 'division by zero' as the one true
spelling of this error message. Add regression tests as suggested by
Neil Conway.
utility statement (DeclareCursorStmt) with a SELECT query dangling from
it, rather than a SELECT query with a few unusual fields in it. Add
code to determine whether a planned query can safely be run backwards.
If DECLARE CURSOR specifies SCROLL, ensure that the plan can be run
backwards by adding a Materialize plan node if it can't. Without SCROLL,
you get an error if you try to fetch backwards from a cursor that can't
handle it. (There is still some discussion about what the exact
behavior should be, but this is necessary infrastructure in any case.)
Along the way, make EXPLAIN DECLARE CURSOR work.
entire contents of the subplan into the tuplestore before we can return
any tuples. Instead, the tuplestore holds what we've already read, and
we fetch additional rows from the subplan as needed. Random access to
the previously-read rows works with the tuplestore, and doesn't affect
the state of the partially-read subplan. This is a step towards fixing
the problems with cursors over complex queries --- we don't want to
stick in Materialize nodes if they'll prevent quick startup for a cursor.
Adjustable threshold is gone in favor of keeping track of total requested
page storage and doling out proportional fractions to each relation
(with a minimum amount per relation, and some quantization of the results
to avoid thrashing with small changes in page counts). Provide special-
case code for indexes so as not to waste space storing useless page
free space counts. Restructure internal data storage to be a flat array
instead of list-of-chunks; this may cost a little more work in data
copying when reorganizing, but allows binary search to be used during
lookup_fsm_page_entry().
is assumed to be in local time, not GMT. This improves consistency with
other operations, which all assume local timezone when it matters. Per
bug #897.
setting timezone-related variables during transaction start. They were
not used anyway in platforms that HAVE_TM_ZONE or HAVE_INT_TIMEZONE,
which it appears is *all* the platforms we are currently supporting.
For platforms that have neither, we now only support UTC or numeric-
offset-from-UTC timezones.
answer when SET TIMEZONE has been done since the start of the current
transaction. Per bug report from Robert Haas.
I plan some futher cleanup in HEAD, but this is a low-risk patch for
the immediate issue in 7.3.
correctly. See following thread for more details.
Subject: [HACKERS] client_encoding directive is ignored in postgresql.conf
From: Tatsuo Ishii <t-ishii@sra.co.jp>
Date: Wed, 29 Jan 2003 22:24:04 +0900 (JST)
RelOid_pg_class, and transaction locks XactLockTableId. RelId is renamed
to objId.
- LockObject() and UnlockObject() functions created, and their use
sprinkled throughout the code to do descent locking for domains and
types. They accept lock modes AccessShare and AccessExclusive, as we
only really need a 'read' and 'write' lock at the moment. Most locking
cases are held until the end of the transaction.
This fixes the cases Tom mentioned earlier in regards to locking with
Domains. If the patch is good, I'll work on cleaning up issues with
other database objects that have this problem (most of them).
Rod Taylor
functions which limited the maximum date for a timestamp to AD 1465001.
The new limit is AD 5874897.
The files affected are:
doc/src/sgml/datatype.sgml:
Documentation change due to patch. Included is a notice about
the reduced range when using an eight-byte integer for timestamps.
src/backend/utils/adt/datetime.c:
Replacement functions for j2date() and date2j() functions.
src/include/utils/datetime.h:
Corrected a bug with the limit on the earliest possible date,
Nov 23,-4713 has a Julian day count of -1. The earliest possible
date should be Nov 24, -4713 with a day count of 0.
src/test/regress/expected/horology-no-DST-before-1970.out:
src/test/regress/expected/horology-solaris-1947.out:
src/test/regress/expected/horology.out:
Copies of expected output for regression testing.
Note: Only horology.out has been physically tested. I do not have access
to a Solaris box and I don't know how to provoke the "pre-1970" test.
src/test/regress/sql/horology.sql:
Added some test cases to check extended range.
John Cochran
takes two parameters, an OID x and an integer y, and returns "true" with
probability 1/y (the OID argument is ignored). This can be useful -- for
example, it can be used to select a random sampling of the rows in a
table (which is what the "random" regression test uses it for).
This patch removes that function, because it was old and messy. The old
function had the following problems:
- it was undocumented
- it was poorly named
- it was designed to workaround an optimizer bug that no longer exists
(the OID argument is to ensure that the optimizer won't optimize away
calls to the function; AFAIK marking the function as 'volatile' suffices
nowadays)
- it used a different random-number generation technique than the other
PSRNG-related functions in the backend do (it called random() like they
do, but it had its own logic for setting a set and deciding when to
reseed the RNG).
Ok, this patch removes oidrand(), oidsrand(), and userfntest(), and
improves the SGML docs a little bit (un-commenting the setseed()
documentation).
Neil Conway
CHECK constraints.
There are apparently no other types of constraint in pg_constraint, so
now all bases are covered. Also, this patch assumes that consrc for a
CHECK constraint is always bracketed so that it's not necessary to add
extra brackets.
Christopher Kings-Lynne
rid of the assumption that sizeof(Oid)==sizeof(int). This is one small
step towards someday supporting 8-byte OIDs. For the moment, it doesn't
do much except get rid of a lot of unsightly casts.
expression accepted by the regex operators, per discussion yesterday.
Along the way, reduce deadlock_timeout from PGC_POSTMASTER to PGC_SIGHUP
category. It is probably best to insist that all backends share the same
setting, but that doesn't mean it has to be frozen at startup.
(extracted from Tcl 8.4.1 release, as Henry still hasn't got round to
making it a separate library). This solves a performance problem for
multibyte, as well as upgrading our regexp support to match recent Tcl
and nearly match recent Perl.
startup, not in the parser; this allows ALTER DOMAIN to work correctly
with domain constraint operations stored in rules. Rod Taylor;
code review by Tom Lane.
for type 'time without time zone', as we already did for type
'timestamp without time zone'. This patch was proposed by Tom Lockhart
on 7-Nov-02, but he never got around to applying it. Adjust regression
tests and documentation to match.
cannot actually happen at present because ArrayCount() is only called
on strings beginning with '{', but seems best to prevent it going forward.
Per report from Yichen Xie.
value of MAX_TIME_PRECISION in floating-point-timestamp-storage case
from 13 to 10, which is as much as time_out is actually willing to print.
(The alternative of increasing the number of digits we are willing to
print looks risky; we might find ourselves printing roundoff garbage.)
passed to join selectivity estimators. Make use of this in eqjoinsel
to derive non-bogus selectivity for IN clauses. Further tweaking of
cost estimation for IN.
initdb forced because of pg_proc.h changes.
Try to model the effect of rescanning input tuples in mergejoins;
account for JOIN_IN short-circuiting where appropriate. Also, recognize
that mergejoin and hashjoin clauses may now be more than single operator
calls, so we have to charge appropriate execution costs.
necessarily following the JOIN syntax to develop the query plan. The old
behavior is still available by setting GUC variable JOIN_COLLAPSE_LIMIT
to 1. Also create a GUC variable FROM_COLLAPSE_LIMIT to control the
similar decision about when to collapse sub-SELECT lists into their parent
lists. (This behavior existed already, but the limit was always
GEQO_THRESHOLD/2; now it's separately adjustable.)
of the socket file and socket lock file; this should prevent both of them
from being removed by even the stupidest varieties of /tmp-cleaning
script. Per suggestion from Giles Lean.
of known-equal expressions includes any constant expressions (including
Params from outer queries), we actively suppress any 'var = var'
clauses that are or could be deduced from the set, generating only the
deducible 'var = const' clauses instead. The idea here is to push down
the restrictions implied by the equality set to base relations whenever
possible. Once we have applied the 'var = const' clauses, the 'var = var'
clauses are redundant, and should be suppressed both to save work at
execution and to avoid double-counting restrictivity.
There are two implementation techniques: the executor understands a new
JOIN_IN jointype, which emits at most one matching row per left-hand row,
or the result of the IN's sub-select can be fed through a DISTINCT filter
and then joined as an ordinary relation.
Along the way, some minor code cleanup in the optimizer; notably, break
out most of the jointree-rearrangement preprocessing in planner.c and
put it in a new file prep/prepjointree.c.
datetime token tables. Even more embarrassing, the regression tests
revealed some of the problems --- but evidently the bogus output wasn't
questioned. Add code to postmaster startup to directly check the tables
for correct ordering, in hopes of not being embarrassed like this again.
containing a volatile function), rather than only on 'Var = Var' clauses
as before. This makes it practical to do flatten_join_alias_vars at the
start of planning, which in turn eliminates a bunch of klugery inside the
planner to deal with alias vars. As a free side effect, we now detect
implied equality of non-Var expressions; for example in
SELECT ... WHERE a.x = b.y and b.y = 42
we will deduce a.x = 42 and use that as a restriction qual on a. Also,
we can remove the restriction introduced 12/5/02 to prevent pullup of
subqueries whose targetlists contain sublinks.
Still TODO: make statistical estimation routines in selfuncs.c and costsize.c
smarter about expressions that are more complex than plain Vars. The need
for this is considerably greater now that we have to be able to estimate
the suitability of merge and hash join techniques on such expressions.
costs for expression evaluation, not only per-tuple cost as before.
This extension is needed in order to deal realistically with hashed or
materialized sub-selects.
>
> I'd suggest that the runtime.sgml description explicitly say "values of
> at least a few thousand are recommended for production installations".
Neil Conway
Simplify SubLink by storing just a List of operator OIDs, instead of
a list of incomplete OpExprs --- that was a bizarre and bulky choice,
with no redeeming social value since we have to build new OpExprs
anyway when forming the plan tree.
'NOT (x IN (subselect))', that is 'NOT (x = ANY (subselect))',
rather than 'x <> ALL (subselect)' as we formerly did. This
opens the door to optimizing NOT IN the same way as IN, whereas
there's no hope of optimizing the expression using <>. Also,
convert 'x <> ALL (subselect)' to the NOT(IN) style, so that
the optimization will be available when processing rules dumped
by older Postgres versions.
initdb forced due to small change in SubLink node representation.
per gripe from Csaba Nagy. There is still potential for platform-specific
behavior for values that are exactly halfway between integers, but at
least we now get the expected answer for all other cases.
> The big problem is that while pg_dump's dump_trigger() looks at
> tginitdeferred and dumps accordingly, pg_get_constraintdef doesn't look
> at tginitdeferred, and therefore doesn't record the requirement as part
> of ALTER TABLE ADD CONSTRAINT.
pg_get_constraintdef should probably be looking at condeferrable and
condeferred in the pg_constraint row it's looking at. Maybe something
like the attached.
(Added, output only non-default values.)
Stephan Szabo
disallowed by CREATE TABLE (eg, pseudo-types); also disallow these types
from being introduced by the range-function syntax. While at it, allow
CREATE TABLE to create zero-column tables, per recent pghackers discussion.
I am back-patching this into 7.3 since failure to disallow pseudo-types
is arguably a security hole.
practice of evaluating MemSet's arguments multiple times, except for
the special case of newNode(), where we can assume the argument is
a constant sizeof() operator.
Also, add GetMemoryChunkContext() to mcxt.c's API, in preparation for
fixing recent GEQO breakage.
given any malloc block until something is first allocated in it; but
thereafter, MemoryContextReset won't release that first malloc block.
This preserves the quick-reset property of the original policy, without
forcing 8K to be allocated to every context whether any of it is ever
used or not. Also, remove some more no-longer-needed explicit freeing
during ExecEndPlan.
in the planned representation of a subplan at all any more, only SubPlan.
This means subselect.c doesn't scribble on its input anymore, which seems
like a good thing; and there are no longer three different possible
interpretations of a SubLink. Simplify node naming and improve comments
in primnodes.h. No change to stored rules, though.
execution state trees, and ExecEvalExpr takes an expression state tree
not an expression plan tree. The plan tree is now read-only as far as
the executor is concerned. Next step is to begin actually exploiting
this property.
so that all executable expression nodes inherit from a common supertype
Expr. This is somewhat of an exercise in code purity rather than any
real functional advance, but getting rid of the extra Oper or Func node
formerly used in each operator or function call should provide at least
a little space and speed improvement.
initdb forced by changes in stored-rules representation.
documentation and regression test mods. It seemed small and unobtrusive enough
to not require a specific proposal on the hackers list -- but if not, let me
know and I'll make a pitch. Otherwise, if there are no objections please apply.
Joe Conway
cleaning up locale names and nothing else. Since all the locale names
are in plain ASCII I think it will be safe to use ASCII-only lower-case
conversion.
Nicolai Tufar
to plan nodes, not vice-versa. All executor state nodes now inherit from
struct PlanState. Copying of plan trees has been simplified by not
storing a list of SubPlans in Plan nodes (eliminating duplicate links).
The executor still needs such a list, but it can build it during
ExecutorStart since it has to scan the plan tree anyway.
No initdb forced since no stored-on-disk structures changed, but you
will need a full recompile because of node-numbering changes.
('SELECT expression') inline, like macros, during the constant-folding
phase of planning. The actual expansion is not difficult, but checking
that we're not changing the semantics of the call turns out to be more
subtle than one might think; in particular must pay attention to
permissions issues, strictness, and volatility.
operations: make sure we use operators that are compatible, as determined
by a mergejoin link in pg_operator. Also, add code to planner to ensure
we don't try to use hashed grouping when the grouping operators aren't
marked hashable.
sublink results and COPY's domain constraint checking. A Const that
isn't really constant is just a Bad Idea(tm). Remove hacks in
parse_coerce and other places that were needed because of the former
klugery.
-hackers a couple days ago.
Notes/caveats:
- added regression tests for the new functionality, all
regression tests pass on my machine
- added pg_dump support
- updated PL/PgSQL to support per-statement triggers; didn't
look at the other procedural languages.
- there's (even) more code duplication in trigger.c than there
was previously. Any suggestions on how to refactor the
ExecXXXTriggers() functions to reuse more code would be
welcome -- I took a brief look at it, but couldn't see an
easy way to do it (there are several subtly-different
versions of the code in question)
- updated the documentation. I also took the liberty of
removing a big chunk of duplicated syntax documentation in
the Programmer's Guide on triggers, and moving that
information to the CREATE TRIGGER reference page.
- I also included some spelling fixes and similar small
cleanups I noticed while making the changes. If you'd like
me to split those into a separate patch, let me know.
Neil Conway
results due to doing arithmetic on uninitialized values. Add some
documentation about the AT TIME ZONE construct. Update some other
date/time documentation that seemed out of date for 7.3.
database access outside a transaction; revert bogus performance improvement
in SIBackendInit(); improve comments; add documentation (this part courtesy
Neil Conway).
parameter to allow it to be forced off for comparison purposes.
Add ORDER BY clauses to a bunch of regression test queries that will
otherwise produce randomly-ordered output in the new regime.
of groups produced by GROUP BY. This improves the accuracy of planning
estimates for grouped subselects, and is needed to check whether a
hashed aggregation plan risks memory overflow.
The code was not making TupleConstr structs for such catalogs in
several places; with the consequence that the not-null constraint
wasn't actually enforced. With this change,
INSERT INTO pg_proc VALUES('sdf');
generates a 'Fail to add null value' error instead of a core dump.
anymore given the mktime() workaround now done in DetermineLocalTimeZone.
This has now been confirmed by Robert Bruccoleri for Irix, and I'm going
to extrapolate to AIX as well.
postgres.h or c.h includes a system header (such as stdio.h or
stdlib.h), there's no need to specifically include it in any of the .c
files in the backend.
Neil Conway
in hopes of reducing platform-to-platform variations in its results.
This will cause the geometry regression test to start failing on some
platforms. I plan to update the test later today.
precision for float4, float8, and geometric types. Set it in pg_dump
so that float data can be dumped/reloaded exactly (at least on platforms
where the float I/O support is properly implemented). Initial patch by
Pedro Ferreira, some additional work by Tom Lane.
now)" item on the open items, and subsequent plpgsql function I sent in,
made me realize it was too hard to get the upper and lower bound of an
array. The attached creates two functions that I think will be very
useful when combined with the ability of plpgsql to return sets.
array_lower(array, dim_num)
- and -
array_upper(array, dim_num)
They return the value (as an int) of the upper and lower bound of the
requested dim in the provided array.
Joe Conway
where it's safe to do database access. Along the way, fix core dump
for 'DEFAULT' parameters to CREATE DATABASE. initdb forced due to
change in pg_proc entry.
(usually bison output files), not as standalone files. This hack
works around flex's insistence on including <stdio.h> before we are
able to include postgres.h; postgres.h will already be read before
the compiler starts to read the flex output file. Needed for largefile
support on some platforms.
core file to be produced for debugging, and avoids trying to run the
normal proc-exit cleanup hooks, which are likely to cause additional
problems if the system is hosed.
between signal handler and enable/disable code, avoid accumulation of
timing error due to trying to maintain remaining-time instead of
absolute-end-time, disable timeout before commit not after.
Ray Ontko 28-June-02. Also, fix prefix_selectivity for NAME lefthand
variables (it was bogusly assuming binary compatibility), and adjust
make_greater_string() to not call pg_mbcliplen() with invalid multibyte
data (this last per bug report that I can't find at the moment, but it
was in July '02).
specifically ceil(), floor(), and sign(). There may be other functions
that need to be added, but this is a start. I've included some simple
regression tests.
Neil Conway