postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	b5b0db19b8	Fix handling of extended statistics during ALTER COLUMN TYPE. ALTER COLUMN TYPE on a column used by a statistics object fails since commit `928c4de30`, because the relevant switch in ATExecAlterColumnType is unprepared for columns to have dependencies from OCLASS_STATISTIC_EXT objects. Although the existing types of extended statistics don't actually need us to do any work for a column type change, it seems completely indefensible that that assumption is hidden behind the failure of an unrelated module to contain any code for the case. Hence, create and call an API function in statscmds.c where the assumption can be explained, and where we could add code to deal with the problem when it inevitably becomes real. Also, the reason this wasn't handled before, neither for extended stats nor for the last half-dozen new OCLASS kinds :-(, is that the default: in that switch suppresses compiler warnings, allowing people to miss the need to consider it when adding an OCLASS. We don't really need a default because surely getObjectClass should only return valid values of the enum; so remove it, and add the missed OCLASS entries where they should be. Discussion: https://postgr.es/m/20170512221010.nglatgt5azzdxjlj@alvherre.pgsql	2017-05-14 12:22:25 -04:00
Tom Lane	f04c9a6146	Standardize terminology for pg_statistic_ext entries. Consistently refer to such an entry as a "statistics object", not just "statistics" or "extended statistics". Previously we had a mismash of terms, accompanied by utter confusion as to whether the term was singular or plural. That's not only grating (at least to the ear of a native English speaker) but could be outright misleading, eg in error messages that seemed to be referring to multiple objects where only one could be meant. This commit fixes the code and a lot of comments (though I may have missed a few). I also renamed two new SQL functions, pg_get_statisticsextdef -> pg_get_statisticsobjdef pg_statistic_ext_is_visible -> pg_statistics_obj_is_visible to conform better with this terminology. I have not touched the SGML docs other than fixing those function names; the docs certainly need work but it seems like a separable task. Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us	2017-05-14 10:55:01 -04:00
Tom Lane	928c4de309	Fix dependencies for extended statistics objects. A stats object ought to have a dependency on each individual column it reads, not the entire table. Doing this honestly lets us get rid of the hard-wired logic in RemoveStatisticsExt, which seems to have been misguidedly modeled on RemoveStatistics; and it will be far easier to extend to multiple tables later. Also, add overlooked dependency on owner, and make the dependency on schema be NORMAL like every other such dependency. There remains some unfinished work here, which is to allow statistics objects to be extension members. That takes more effort than just adding the dependency call, though, so I left it out for now. initdb forced because this changes the set of pg_depend records that should exist for a statistics object. Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us	2017-05-12 16:26:31 -04:00
Alvaro Herrera	bc085205c8	Change CREATE STATISTICS syntax Previously, we had the WITH clause in the middle of the command, where you'd specify both generic options as well as statistic types. Few people liked this, so this commit changes it to remove the WITH keyword from that clause and makes it accept statistic types only. (We currently don't have any generic options, but if we invent in the future, we will gain a new WITH clause, probably at the end of the command). Also, the column list is now specified without parens, which makes the whole command look more similar to a SELECT command. This change will let us expand the command to supporting expressions (not just columns names) as well as multiple tables and their join conditions. Tom added lots of code comments and fixed some parts of the CREATE STATISTICS reference page, too; more changes in this area are forthcoming. He also fixed a potential problem in the alter_generic regression test, reducing verbosity on a cascaded drop to avoid dependency on message ordering, as we do in other tests. Tom also closed a security bug: we documented that table ownership was required in order to create a statistics object on it, but didn't actually implement it. Implement tab-completion for statistics objects. This can stand some more improvement. Authors: Alvaro Herrera, with lots of cleanup by Tom Lane Discussion: https://postgr.es/m/20170420212426.ltvgyhnefvhixm6i@alvherre.pgsql	2017-05-12 14:59:35 -03:00
Alvaro Herrera	93bbeec6a2	extstats: change output functions to emit valid JSON Manipulating extended statistics is more convenient as JSON than the current ad-hoc format, so let's change before it's too late. Discussion: https://postgr.es/m/20170420193828.k3fliiock5hdnehn@alvherre.pgsql	2017-05-02 18:49:32 -03:00
Tom Lane	4b34624daa	Code review for commands/statscmds.c. Fix machine-dependent sorting of column numbers. (Odd behavior would only materialize for column numbers above 255, but that's certainly legal.) Fix poor choice of SQLSTATE for some errors, and improve error message wording. (Notably, "is not a scalar type" is a totally misleading way to explain "does not have a default btree opclass".) Avoid taking AccessExclusiveLock on the associated relation during DROP STATISTICS. That's neither necessary nor desirable, and it could easily have put us into situations where DROP fails (compare commit `68ea2b7f9`). Adjust/improve comments. David Rowley and Tom Lane Discussion: https://postgr.es/m/CAKJS1f-GmCfPvBbAEaM5xoVOaYdVgVN1gicALSoYQ77z-+vLbw@mail.gmail.com	2017-04-24 11:15:15 -04:00
Alvaro Herrera	ee6922112e	Rename columns in new pg_statistic_ext catalog The new catalog reused a column prefix "sta" from pg_statistic, but this is undesirable, so change the catalog to use prefix "stx" instead. Also, rename the column that lists enabled statistic kinds as "stxkind" rather than "enabled". Discussion: https://postgr.es/m/CAKJS1f_2t5jhSN7huYRFH3w3rrHfG2QU7hiUHsu-Vdjd1rYT3w@mail.gmail.com	2017-04-17 18:34:29 -03:00
Alvaro Herrera	8c5cdb7f4f	Tighten up relation kind checks for extended statistics We were accepting creation of extended statistics only for regular tables, but they can usefully be created for foreign tables, partitioned tables, and materialized views, too. Allow those cases. While at it, make sure all the rejected cases throw a consistent error message, and add regression tests for the whole thing. Author: David Rowley, Álvaro Herrera Discussion: https://postgr.es/m/CAKJS1f-BmGo410bh5RSPZUvOO0LhmHL2NYmdrC_Jm8pk_FfyCA@mail.gmail.com	2017-04-17 17:55:55 -03:00
Alvaro Herrera	bf2a691e02	Fix extended statistics with partial analyzes Either because of a previous ALTER TABLE .. SET STATISTICS 0 or because of being invoked with a partial column list, ANALYZE could fail to acquire sufficient data to build extended statistics. Previously, this would draw an ERROR and fail to collect any statistics at all (extended and regular). Change things so that we raise a WARNING instead, and remove a hint that was wrong in half the cases. Reported by: David Rowley Discussion: https://postgr.es/m/CAKJS1f9Kk0NF6Fg7TA=JUXsjpS9kX6NVu27pb5QDCpOYAvb-Og@mail.gmail.com	2017-04-17 14:00:47 -03:00
Simon Riggs	2686ee1b7c	Collect and use multi-column dependency stats Follow on patch in the multi-variate statistics patch series. CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t; ANALYZE; will collect dependency stats on (a, b) and then use the measured dependency in subsequent query planning. Commit `7b504eb282` added CREATE STATISTICS with n-distinct coefficients. These are now specified using the mutually exclusive option WITH (ndistinct). Author: Tomas Vondra, David Rowley Reviewed-by: Kyotaro HORIGUCHI, Álvaro Herrera, Dean Rasheed, Robert Haas and many other comments and contributions Discussion: https://postgr.es/m/56f40b20-c464-fad2-ff39-06b668fac47c@2ndquadrant.com	2017-04-05 18:00:42 -04:00
Alvaro Herrera	bed9ef5a16	Rework the stats_ext test As suggested by Tom Lane, avoid printing specific estimated cost values, because they vary across architectures; instead, verify plan shapes (in this case, HashAggregate vs. GroupAggregate), as we do in other planner tests. We can now remove expected/stats_ext_1.out. Author: Tomas Vondra	2017-03-27 12:43:04 -03:00
Alvaro Herrera	2c3e47527a	Fix a couple of problems in pg_get_statisticsextdef There was a thinko whereby we tested the wrong tuple after fetching it from cache; avoid that by using generate_relation_name instead, which is simpler. Also, the statistics name was not qualified, so add that. (It could be argued that qualification should be conditional on the schema not being on search path. We can add that later, but at least this form is correct.) Author: David Rowley, Álvaro Herrera Discussion: https://postgr.es/m/CAKJS1f8RjLeVZJ2+93pdQGuZJeBF-ifsHaFMR-q-6-Z0qxA8cA@mail.gmail.com	2017-03-27 01:03:50 -03:00
Alvaro Herrera	7b504eb282	Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql	2017-03-24 14:06:10 -03:00

13 Commits