Commit Graph

31 Commits

Author SHA1 Message Date
Bruce Momjian
4baaf863ec Update copyright for 2015
Backpatch certain files through 9.0
2015-01-06 11:43:47 -05:00
Andres Freund
d1c575230d Fix off-by-one in pg_xlogdump's fuzzy_open_file().
In the unlikely case of stdin (fd 0) being closed, the off-by-one
would lead to pg_xlogdump failing to open files.

Spotted by Coverity.

Backpatch to 9.3 where pg_xlogdump was introduced.
2015-01-04 15:35:46 +01:00
Alvaro Herrera
dcbfc00aba pg_xlogdump/.gitignore: add committsdesc.c
Author: Michael Paquier
2014-12-09 09:54:14 -03:00
Heikki Linnakangas
ebc2b681b8 Fix pg_xlogdump's calculation of full-page image data.
The old formula was completely bogus with the new WAL record format.
2014-12-05 11:40:27 +02:00
Alvaro Herrera
73c986adde Keep track of transaction commit timestamps
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData().  This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.

This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.

A new test in src/test/modules is included.

Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.

Authors: Álvaro Herrera and Petr Jelínek

Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
2014-12-03 11:53:02 -03:00
Heikki Linnakangas
2c03216d83 Revamp the WAL record format.
Each WAL record now carries information about the modified relation and
block(s) in a standardized format. That makes it easier to write tools that
need that information, like pg_rewind, prefetching the blocks to speed up
recovery, etc.

There's a whole new API for building WAL records, replacing the XLogRecData
chains used previously. The new API consists of XLogRegister* functions,
which are called for each buffer and chunk of data that is added to the
record. The new API also gives more control over when a full-page image is
written, by passing flags to the XLogRegisterBuffer function.

This also simplifies the XLogReadBufferForRedo() calls. The function can dig
the relation and block number from the WAL record, so they no longer need to
be passed as arguments.

For the convenience of redo routines, XLogReader now disects each WAL record
after reading it, copying the main data part and the per-block data into
MAXALIGNed buffers. The data chunks are not aligned within the WAL record,
but the redo routines can assume that the pointers returned by XLogRecGet*
functions are. Redo routines are now passed the XLogReaderState, which
contains the record in the already-disected format, instead of the plain
XLogRecord.

The new record format also makes the fixed size XLogRecord header smaller,
by removing the xl_len field. The length of the "main data" portion is now
stored at the end of the WAL record, and there's a separate header after
XLogRecord for it. The alignment padding at the end of XLogRecord is also
removed. This compansates for the fact that the new format would otherwise
be more bulky than the old format.

Reviewed by Andres Freund, Amit Kapila, Michael Paquier, Alvaro Herrera,
Fujii Masao.
2014-11-20 18:46:41 +02:00
Robert Haas
99e8f08fab Update pg_xlogdump's .gitignore for brindesc.c. 2014-11-07 15:41:52 -05:00
Alvaro Herrera
7516f52594 BRIN: Block Range Indexes
BRIN is a new index access method intended to accelerate scans of very
large tables, without the maintenance overhead of btrees or other
traditional indexes.  They work by maintaining "summary" data about
block ranges.  Bitmap index scans work by reading each summary tuple and
comparing them with the query quals; all pages in the range are returned
in a lossy TID bitmap if the quals are consistent with the values in the
summary tuple, otherwise not.  Normal index scans are not supported
because these indexes do not store TIDs.

As new tuples are added into the index, the summary information is
updated (if the block range in which the tuple is added is already
summarized) or not; in the latter case, a subsequent pass of VACUUM or
the brin_summarize_new_values() function will create the summary
information.

For data types with natural 1-D sort orders, the summary info consists
of the maximum and the minimum values of each indexed column within each
page range.  This type of operator class we call "Minmax", and we
supply a bunch of them for most data types with B-tree opclasses.
Since the BRIN code is generalized, other approaches are possible for
things such as arrays, geometric types, ranges, etc; even for things
such as enum types we could do something different than minmax with
better results.  In this commit I only include minmax.

Catalog version bumped due to new builtin catalog entries.

There's more that could be done here, but this is a good step forwards.

Loosely based on ideas from Simon Riggs; code mostly by Álvaro Herrera,
with contribution by Heikki Linnakangas.

Patch reviewed by: Amit Kapila, Heikki Linnakangas, Robert Haas.
Testing help from Jeff Janes, Erik Rijkers, Emanuel Calvo.

PS:
  The research leading to these results has received funding from the
  European Union's Seventh Framework Programme (FP7/2007-2013) under
  grant agreement n° 318633.
2014-11-07 16:38:14 -03:00
Heikki Linnakangas
2076db2aea Move the backup-block logic from XLogInsert to a new file, xloginsert.c.
xlog.c is huge, this makes it a little bit smaller, which is nice. Functions
related to putting together the WAL record are in xloginsert.c, and the
lower level stuff for managing WAL buffers and such are in xlog.c.

Also move the definition of XLogRecord to a separate header file. This
causes churn in the #includes of all the files that write WAL records, and
redo routines, but it avoids pulling in xlog.h into most places.

Reviewed by Michael Paquier, Alvaro Herrera, Andres Freund and Amit Kapila.
2014-11-06 13:55:36 +02:00
Andres Freund
604f7956b9 Improve code around the recently added rm_identify rmgr callback.
There are four weaknesses in728f152e07f998d2cb4fe5f24ec8da2c3bda98f2:

* append_init() in heapdesc.c was ugly and required that rm_identify
  return values are only valid till the next call. Instead just add a
  couple more switch() cases for the INIT_PAGE cases. Now the returned
  value will always be valid.
* a couple rm_identify() callbacks missed masking xl_info with
  ~XLR_INFO_MASK.
* pg_xlogdump didn't map a NULL rm_identify to UNKNOWN or a similar
  string.
* append_init() was called when id=NULL - which should never actually
  happen. But it's better to be careful.
2014-09-22 17:49:34 +02:00
Andres Freund
bdd5726c34 Add the capability to display summary statistics to pg_xlogdump.
The new --stats/--stats=record options to pg_xlogdump display per
rmgr/per record statistics about the parsed WAL. This is useful to
understand what the WAL primarily consists of, to allow targeted
optimizations on application, configuration, and core code level.

It is likely that we will want to fine tune the statistics further,
but the feature already is quite helpful.

Author: Abhijit Menon-Sen, slightly editorialized by me
Reviewed-By: Andres Freund, Dilip Kumar and Furuya Osamu
Discussion: 20140604104716.GA3989@toroid.org
2014-09-19 16:33:16 +02:00
Andres Freund
728f152e07 Add rmgr callback to name xlog record types for display purposes.
This is primarily useful for the upcoming pg_xlogdump --stats feature,
but also allows to remove some duplicated code in the rmgr_desc
routines.

Due to the separation and harmonization, the output of dipsplayed
records changes somewhat. But since this isn't enduser oriented
content that's ok.

It's potentially desirable to further change pg_xlogdump's display of
records. It previously wasn't possible to show the record type
separately from the description forcing it to be in the last
column. But that's better done in a separate commit.

Author: Abhijit Menon-Sen, slightly editorialized by me
Reviewed-By: Álvaro Herrera, Andres Freund, and Heikki Linnakangas
Discussion: 20140604104716.GA3989@toroid.org
2014-09-19 16:20:29 +02:00
Noah Misch
0ffc201a51 Add file version information to most installed Windows binaries.
Prominent binaries already had this metadata.  A handful of minor
binaries, such as pg_regress.exe, still lack it; efforts to eliminate
such exceptions are welcome.

Michael Paquier, reviewed by MauMau.
2014-07-14 14:07:52 -04:00
Heikki Linnakangas
0ef0b6784c Change the signature of rm_desc so that it's passed a XLogRecord.
Just feels more natural, and is more consistent with rm_redo.
2014-06-14 10:46:48 +03:00
Bruce Momjian
0a78320057 pgindent run for 9.4
This includes removing tabs after periods in C comments, which was
applied to back branches, so this change should not effect backpatching.
2014-05-06 12:12:18 -04:00
Tom Lane
2d00190495 Rationalize common/relpath.[hc].
Commit a730183926 created rather a mess by
putting dependencies on backend-only include files into include/common.
We really shouldn't do that.  To clean it up:

* Move TABLESPACE_VERSION_DIRECTORY back to its longtime home in
catalog/catalog.h.  We won't consider this symbol part of the FE/BE API.

* Push enum ForkNumber from relfilenode.h into relpath.h.  We'll consider
relpath.h as the source of truth for fork numbers, since relpath.c was
already partially serving that function, and anyway relfilenode.h was
kind of a random place for that enum.

* So, relfilenode.h now includes relpath.h rather than vice-versa.  This
direction of dependency is fine.  (That allows most, but not quite all,
of the existing explicit #includes of relpath.h to go away again.)

* Push forkname_to_number from catalog.c to relpath.c, just to centralize
fork number stuff a bit better.

* Push GetDatabasePath from catalog.c to relpath.c; it was rather odd
that the previous commit didn't keep this together with relpath().

* To avoid needing relfilenode.h in common/, redefine the underlying
function (now called GetRelationPath) as taking separate OID arguments,
and make the APIs using RelFileNode or RelFileNodeBackend into macro
wrappers.  (The macros have a potential multiple-eval risk, but none of
the existing call sites have an issue with that; one of them had such a
risk already anyway.)

* Fix failure to follow the directions when "init" fork type was added;
specifically, the errhint in forkname_to_number wasn't updated, and neither
was the SGML documentation for pg_relation_size().

* Fix tablespace-path-too-long check in CreateTableSpace() to account for
fork-name component of maximum-length pathnames.  This requires putting
FORKNAMECHARS into a header file, but it was rather useless (and
actually unreferenced) where it was.

The last couple of items are potentially back-patchable bug fixes,
if anyone is sufficiently excited about them; but personally I'm not.

Per a gripe from Christoph Berg about how include/common wasn't
self-contained.
2014-04-30 17:30:50 -04:00
Heikki Linnakangas
28475f8e58 Use pg_usleep() instead of plain sleep(), to fix Windows build
Per buildfarm.
2014-03-26 15:25:39 +02:00
Heikki Linnakangas
ce9bb92f8f Add -f/--follow option to pg_xlogdump.
This is useful for seeing what WAL records are inserted in real-time, by
pointing pg_xlogdump to a live server.
2014-03-26 13:48:20 +02:00
Heikki Linnakangas
033dc1c92c Fix compilation of pg_xlogdump, now that rm_safe_restartpoint is no more.
Oops. Pointed out by Andres Freund.
2014-03-18 22:23:00 +02:00
Bruce Momjian
7e04792a1c Update copyright for 2014
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
2014-01-07 16:05:30 -05:00
Alvaro Herrera
dddc91ddd3 Remove broken PGXS code for pg_xlogdump
With the PGXS boilerplate in place, pg_xlogdump currently fails with an
ominous error message that certain targets cannot be built because
certain files do not exist.  Remove that and instead throw a quick error
message alerting the user of the actual problem, which should be easier
to diagnose that the statu quo.

Andres Freund
2013-10-01 17:36:15 -03:00
Heikki Linnakangas
79e15c7d86 Fix off-by-one in pg_xlogdump -r option.
Because of the bug, -r would not accept the rmgr with the highest ID.
2013-06-04 18:51:43 +03:00
Bruce Momjian
9af4159fce pgindent run for release 9.3
This is the first run of the Perl-based pgindent script.  Also update
pgindent instructions.
2013-05-29 16:58:43 -04:00
Peter Eisentraut
a4fd3366a6 pg_xlogdump: Improve --help output 2013-05-11 22:02:53 -04:00
Heikki Linnakangas
26b45dc54f Fix typo in "pg_xlogdump --help" and error message.
Fujii Masao and me.
2013-02-27 21:27:01 +02:00
Tom Lane
08f9728057 Add missing .gitignore file. 2013-02-26 15:58:34 -05:00
Tom Lane
1418e6e07b Clean up "stopgap" implementation of timestamptz_to_str().
Use correct type for "result", fix bogus strftime argument, don't use
unnecessary static variables, improve comments.

Andres Freund and Tom Lane
2013-02-26 15:50:22 -05:00
Tom Lane
e5bf0c376e Fix build of contrib/pg_xlogdump.
rmgrdesc.c is not auto-generated now, though it apparently was the last
time the Makefile was updated.
2013-02-24 08:58:00 -05:00
Alvaro Herrera
4591933549 Fix some typos and grammatical mistakes
... as well a update copyrights statements to 2013.

Noted by Thom Brown and Peter Geoghegan
2013-02-22 18:52:59 -03:00
Alvaro Herrera
f03a779751 Fix copy-and-pasteo
Harmless, but it's certainly better like this.

Noticed by Andres Freund
2013-02-22 17:04:12 -03:00
Alvaro Herrera
639ed4e84b Add pg_xlogdump contrib program
This program relies on rm_desc backend routines and the xlogreader
infrastructure to emit human-readable rendering of WAL records.

Author: Andres Freund, with many reworks by Álvaro
Reviewed (in a much earlier version) by Peter Eisentraut
2013-02-22 16:56:55 -03:00