diff --git a/doc/TODO.detail/qsort b/doc/TODO.detail/qsort index 695cabcc5f..e7833d01ff 100644 --- a/doc/TODO.detail/qsort +++ b/doc/TODO.detail/qsort @@ -1077,3 +1077,1382 @@ a pointer list). +From pgsql-hackers-owner+M81165@postgresql.org Thu Mar 16 18:37:28 2006 +Return-path: +Received: from ams.hub.org (ams.hub.org [200.46.204.13]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2GNbOu11277 + for ; Thu, 16 Mar 2006 18:37:25 -0500 (EST) +Received: from postgresql.org (postgresql.org [200.46.204.71]) + by ams.hub.org (Postfix) with ESMTP id A609567BADC; + Thu, 16 Mar 2006 19:37:21 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (av.hub.org [200.46.204.144]) + by postgresql.org (Postfix) with ESMTP id 8E8E19DC828 + for ; Thu, 16 Mar 2006 19:36:50 -0400 (AST) +Received: from postgresql.org ([200.46.204.71]) + by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) + with ESMTP id 31174-02 + for ; + Thu, 16 Mar 2006 19:36:52 -0400 (AST) +X-Greylist: from auto-whitelisted by SQLgrey- +Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130]) + by postgresql.org (Postfix) with ESMTP id 8CA419DC840 + for ; Thu, 16 Mar 2006 19:36:46 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2GNagfd023078; + Thu, 16 Mar 2006 18:36:42 -0500 (EST) +To: "Dann Corbit" +cc: "Jonah H. 
Harris" , pgsql-hackers@postgresql.org, + "Jerry Sievers" +Subject: Re: [HACKERS] qsort, once again +In-Reply-To: +References: +Comments: In-reply-to "Dann Corbit" + message dated "Thu, 16 Mar 2006 13:27:33 -0800" +MIME-Version: 1.0 +Content-Type: multipart/mixed; boundary="----- =_aaaaaaaaaa0" +Content-ID: <23060.1142551929.0@sss.pgh.pa.us> +Date: Thu, 16 Mar 2006 18:36:42 -0500 +Message-ID: <23077.1142552202@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at hub.org +X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113] +X-Spam-Score: 0.113 +X-Mailing-List: pgsql-hackers +List-Archive: +List-Help: +List-Id: +List-Owner: +List-Post: +List-Subscribe: +List-Unsubscribe: +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + +------- =_aaaaaaaaaa0 +Content-Type: text/plain; charset="us-ascii" +Content-ID: <23060.1142551929.1@sss.pgh.pa.us> + +>> So at least on randomized data, the swap_cnt thing is a serious loser. +>> Need to run some tests on special-case inputs though. Anyone have a +>> test suite they like? + +> Here is a distribution maker that will create some torture tests for +> sorting programs. + +I fleshed out the sort tester that Bentley & McIlroy give pseudocode for +in their paper (attached below in case anyone wants to hack on it). 
Not +very surprisingly, it shows the unmodified B&M algorithm as +significantly better than the BSD-lite version: + +Our current BSD qsort: +distribution SAWTOOTH: max cratio 12.9259, average 0.870261 over 252 tests +distribution RAND: max cratio 1.07917, average 0.505924 over 252 tests +distribution STAGGER: max cratio 12.9259, average 1.03706 over 252 tests +distribution PLATEAU: max cratio 12.9259, average 0.632514 over 252 tests +distribution SHUFFLE: max cratio 12.9259, average 1.21631 over 252 tests +method COPY: max cratio 3.87533, average 0.666927 over 210 tests +method REVERSE: max cratio 5.6248, average 0.710284 over 210 tests +method FREVERSE: max cratio 12.9259, average 1.58323 over 210 tests +method BREVERSE: max cratio 5.72661, average 1.13674 over 210 tests +method SORT: max cratio 0.758625, average 0.350092 over 210 tests +method DITHER: max cratio 3.13417, average 0.667222 over 210 tests +Overall: average cratio 0.852415 over 1260 tests + +without the swap_cnt code: +distribution SAWTOOTH: max cratio 5.6248, average 0.745818 over 252 tests +distribution RAND: max cratio 1.07917, average 0.510097 over 252 tests +distribution STAGGER: max cratio 5.6248, average 1.0494 over 252 tests +distribution PLATEAU: max cratio 3.57655, average 0.411549 over 252 tests +distribution SHUFFLE: max cratio 5.72661, average 1.05988 over 252 tests +method COPY: max cratio 3.87533, average 0.712122 over 210 tests +method REVERSE: max cratio 5.6248, average 0.751011 over 210 tests +method FREVERSE: max cratio 4.80869, average 0.690224 over 210 tests +method BREVERSE: max cratio 5.72661, average 1.13673 over 210 tests +method SORT: max cratio 0.806618, average 0.539829 over 210 tests +method DITHER: max cratio 3.13417, average 0.702174 over 210 tests +Overall: average cratio 0.755348 over 1260 tests + +("cratio" is the ratio of the actual number of comparison function calls +to the theoretical expectation of N*lg2(N).) 
The insertion sort +switchover is a loser for both average and worst-case measurements. + +I tried Dann's distributions too, with N = 100000: + +Our current BSD qsort: +dist fib: cratio 0.0694229 +dist camel: cratio 0.0903228 +dist constant: cratio 0.0602126 +dist five: cratio 0.132288 +dist ramp: cratio 4.29937 +dist random: cratio 1.09286 +dist reverse: cratio 0.5663 +dist sorted: cratio 0.18062 +dist ten: cratio 0.174781 +dist twenty: cratio 0.238098 +dist two: cratio 0.090365 +dist perverse: cratio 0.334503 +dist trig: cratio 0.679846 +Overall: max cratio 4.29937, average cratio 0.616076 over 13 tests + +without the swap_cnt code: +dist fib: cratio 0.0694229 +dist camel: cratio 0.0903228 +dist constant: cratio 0.0602126 +dist five: cratio 0.132288 +dist ramp: cratio 4.29937 +dist random: cratio 1.09286 +dist reverse: cratio 0.89184 +dist sorted: cratio 0.884907 +dist ten: cratio 0.174781 +dist twenty: cratio 0.238098 +dist two: cratio 0.090365 +dist perverse: cratio 0.334503 +dist trig: cratio 0.679846 +Overall: max cratio 4.29937, average cratio 0.695293 over 13 tests + +In this set of tests the behavior is just about identical, except for +the case of already-sorted input, where the BSD coding runs in O(N) +instead of O(N lg2 N) time. So that evidently is why some unknown +person put in the special case. + +Some further experimentation destroys my original proposal to limit the +size of subfile we'll use the swap_cnt code for: it turns out that that +eliminates the BSD code's advantage for presorted input (at least for +inputs bigger than the limit) without doing anything much in return. + +So my feeling is we should just remove the swap_cnt code and return to +the original B&M algorithm. Being much faster than expected for +presorted input doesn't justify being far slower than expected for +other inputs, IMHO. In the context of Postgres I doubt that perfectly +sorted input shows up very often anyway. + +Comments? 
+ + regards, tom lane + + +------- =_aaaaaaaaaa0 +Content-Type: application/octet-stream +Content-ID: <23060.1142551929.2@sss.pgh.pa.us> +Content-Description: sorttester.c +Content-Transfer-Encoding: base64 + +I2luY2x1ZGUgPHN0ZGlvLmg+CiNpbmNsdWRlIDxzdGRsaWIuaD4KI2luY2x1 +ZGUgPHN0cmluZy5oPgojaW5jbHVkZSA8bWF0aC5oPgoKI2lmZGVmIFVTRV9R +U09SVAojZGVmaW5lIHRlc3RfcXNvcnQgcXNvcnQKI2Vsc2UKZXh0ZXJuIHZv +aWQgdGVzdF9xc29ydCh2b2lkICphLCBzaXplX3Qgbiwgc2l6ZV90IGVzLAoJ +CQkJCSAgIGludCAoKmNtcCkgKGNvbnN0IHZvaWQgKiwgY29uc3Qgdm9pZCAq +KSk7CiNlbmRpZgoKc3RhdGljIGNvbnN0IGludCBuX3ZhbHVlc1tdID0geyAx +MDAsIDEwMjMsIDEwMjQsIDEwMjUgfSA7CiNkZWZpbmUgTUFYX04gMTAyNQoK +dHlwZWRlZiBlbnVtCnsKCVNBV1RPT1RILCBSQU5ELCBTVEFHR0VSLCBQTEFU +RUFVLCBTSFVGRkxFCn0gRElTVDsKI2RlZmluZSBESVNUX0ZJUlNUCVNBV1RP +T1RICiNkZWZpbmUgRElTVF9MQVNUCVNIVUZGTEUKCnN0YXRpYyBjb25zdCBj +aGFyICogY29uc3QgZGlzdG5hbWVbXSA9IHsKCSJTQVdUT09USCIsICJSQU5E +IiwgIlNUQUdHRVIiLCAiUExBVEVBVSIsICJTSFVGRkxFIgp9OwoKdHlwZWRl +ZiBlbnVtCnsKCU1DT1BZLCBNUkVWRVJTRSwgTUZSRVZFUlNFLCBNQlJFVkVS +U0UsIE1TT1JULCBNRElUSEVSCn0gTU9ETUVUSE9EOwojZGVmaW5lIE1FVEhf +RklSU1QJTUNPUFkKI2RlZmluZSBNRVRIX0xBU1QJTURJVEhFUgoKc3RhdGlj +IGNvbnN0IGNoYXIgKiBjb25zdCBtZXRobmFtZVtdID0gewoJIkNPUFkiLCAi +UkVWRVJTRSIsICJGUkVWRVJTRSIsICJCUkVWRVJTRSIsICJTT1JUIiwgIkRJ +VEhFUiIKfTsKCi8qIHBlci10ZXN0IGNvdW50ZXIgKi8Kc3RhdGljIGxvbmcg +bmNvbXBhcmVzOwoKLyogYWNjdW11bGF0ZSByZXN1bHRzIGFjcm9zcyB0ZXN0 +cywgZGlzdC13aXNlICovCnN0YXRpYyBkb3VibGUgc3VtY3JhdGlvX2RbRElT +VF9MQVNUKzFdOwpzdGF0aWMgZG91YmxlIG1heGNyYXRpb19kW0RJU1RfTEFT +VCsxXTsKc3RhdGljIGludCBudGVzdHNfZFtESVNUX0xBU1QrMV07Ci8qIGFj +Y3VtdWxhdGUgcmVzdWx0cyBhY3Jvc3MgdGVzdHMsIG1vZC1tZXRob2Qtd2lz +ZSAqLwpzdGF0aWMgZG91YmxlIHN1bWNyYXRpb19tW01FVEhfTEFTVCsxXTsK +c3RhdGljIGRvdWJsZSBtYXhjcmF0aW9fbVtNRVRIX0xBU1QrMV07CnN0YXRp +YyBpbnQgbnRlc3RzX21bTUVUSF9MQVNUKzFdOwoKCnN0YXRpYyBpbnQKaW50 +X2NtcChjb25zdCB2b2lkICphLCBjb25zdCB2b2lkICpiKQp7CglpbnQJCWFh +ID0gKihjb25zdCBpbnQgKikgYTsKCWludAkJYmIgPSAqKGNvbnN0IGludCAq 
+KSBiOwoKCW5jb21wYXJlcysrOwoKCWlmIChhYSA8IGJiKQoJCXJldHVybiAt +MTsKCWlmIChhYSA+IGJiKQoJCXJldHVybiAxOwoJcmV0dXJuIDA7Cn0KCgpz +dGF0aWMgdm9pZAp0ZXN0X2NvbW1vbihESVNUIGRpc3QsIE1PRE1FVEhPRCBt +ZXRoLCB2b2lkICp4dCwgc2l6ZV90IG4sIHNpemVfdCBzeiwKCQkJaW50ICgq +Y21wKSAoY29uc3Qgdm9pZCAqLCBjb25zdCB2b2lkICopKQp7Cglkb3VibGUg +bmxvZ247Cglkb3VibGUgY3JhdGlvOwoKCW5jb21wYXJlcyA9IDA7Cgl0ZXN0 +X3Fzb3J0KHh0LCBuLCBzeiwgY21wKTsKCW5sb2duID0gbiAqIGxvZygoZG91 +YmxlKSBuKSAvIGxvZygyLjApOwoJY3JhdGlvID0gbmNvbXBhcmVzIC8gbmxv +Z247CglzdW1jcmF0aW9fZFtkaXN0XSArPSBjcmF0aW87CglpZiAoY3JhdGlv +ID4gbWF4Y3JhdGlvX2RbZGlzdF0pCgkJbWF4Y3JhdGlvX2RbZGlzdF0gPSBj +cmF0aW87CgludGVzdHNfZFtkaXN0XSsrOwoJc3VtY3JhdGlvX21bbWV0aF0g +Kz0gY3JhdGlvOwoJaWYgKGNyYXRpbyA+IG1heGNyYXRpb19tW21ldGhdKQoJ +CW1heGNyYXRpb19tW21ldGhdID0gY3JhdGlvOwoJbnRlc3RzX21bbWV0aF0r +KzsKfQoKCi8qIHdvcmsgb24gYSBjb3B5IG9mIHggKi8Kc3RhdGljIHZvaWQK +dGVzdF9pbnRfY29weShESVNUIGRpc3QsIGludCB4W10sIGludCBuKQp7Cglp +bnQJCXh0W01BWF9OXTsKCgltZW1jcHkoeHQsIHgsIG4gKiBzaXplb2YoaW50 +KSk7Cgl0ZXN0X2NvbW1vbihkaXN0LCBNQ09QWSwgKHZvaWQgKikgeHQsIG4s +IHNpemVvZihpbnQpLCBpbnRfY21wKTsKfQoKLyogcmV2ZXJzZSB0aGUgcmFu +Z2Ugc3RhcnQgPD0gaSA8IHN0b3AgKi8Kc3RhdGljIHZvaWQKdGVzdF9pbnRf +cmV2ZXJzZShESVNUIGRpc3QsIE1PRE1FVEhPRCBtZXRoLCBpbnQgeFtdLCBp +bnQgbiwgaW50IHN0YXJ0LCBpbnQgc3RvcCkKewoJaW50CQl4dFtNQVhfTl07 +CglpbnQJCWk7CgoJbWVtY3B5KHh0LCB4LCBuICogc2l6ZW9mKGludCkpOwoJ +Zm9yIChpID0gc3RhcnQ7IGkgPCBzdG9wOyBpKyspCgl7CgkJeHRbaV0gPSB4 +W3N0b3AgLSAxIC0gaV07Cgl9Cgl0ZXN0X2NvbW1vbihkaXN0LCBtZXRoLCAo +dm9pZCAqKSB4dCwgbiwgc2l6ZW9mKGludCksIGludF9jbXApOwp9CgovKiBw +cmUtc29ydCB1c2luZyBhIHRydXN0ZWQgc29ydCAqLwpzdGF0aWMgdm9pZAp0 +ZXN0X2ludF9zb3J0KERJU1QgZGlzdCwgaW50IHhbXSwgaW50IG4pCnsKCWlu +dAkJeHRbTUFYX05dOwoKCW1lbWNweSh4dCwgeCwgbiAqIHNpemVvZihpbnQp +KTsKCXFzb3J0KCh2b2lkICopIHh0LCBuLCBzaXplb2YoaW50KSwgaW50X2Nt +cCk7Cgl0ZXN0X2NvbW1vbihkaXN0LCBNU09SVCwgKHZvaWQgKikgeHQsIG4s +IHNpemVvZihpbnQpLCBpbnRfY21wKTsKfQoKLyogYWRkIGklNSB0byB4W2ld +ICovCnN0YXRpYyB2b2lkCnRlc3RfaW50X2RpdGhlcihESVNUIGRpc3QsIGlu 
+dCB4W10sIGludCBuKQp7CglpbnQJCXh0W01BWF9OXTsKCWludAkJaTsKCglm +b3IgKGkgPSAwOyBpIDwgbjsgaSsrKQoJewoJCXh0W2ldID0geFtpXSArIGkl +NTsKCX0KCXRlc3RfY29tbW9uKGRpc3QsIE1ESVRIRVIsICh2b2lkICopIHh0 +LCBuLCBzaXplb2YoaW50KSwgaW50X2NtcCk7Cn0KCgppbnQKbWFpbigpCnsK +CWludAkJeFtNQVhfTl07CglpbnQJCWlfbjsKCWludAkJbjsKCWludAkJbTsK +CURJU1QJZGlzdDsKCU1PRE1FVEhPRCBtZXRoOwoJaW50CQlpOwoJaW50CQlq +OwoJaW50CQlrOwoJZG91YmxlCXN1bWNyYXRpbzsKCWludAkJbnRlc3RzOwoK +CWZvciAoaV9uID0gMDsgaV9uIDwgc2l6ZW9mKG5fdmFsdWVzKS9zaXplb2Yo +bl92YWx1ZXNbMF0pOyBpX24rKykKCXsKCQluID0gbl92YWx1ZXNbaV9uXTsK +CQlmb3IgKG0gPSAxOyBtIDwgMipuOyBtICo9IDIpCgkJewoJCQlmb3IgKGRp +c3QgPSBESVNUX0ZJUlNUOyBkaXN0IDw9IERJU1RfTEFTVDsgZGlzdCsrKQoJ +CQl7CgkJCQlzd2l0Y2ggKGRpc3QpCgkJCQl7CgkJCQkJY2FzZSBTQVdUT09U +SDoKCQkJCQkJZm9yIChpID0gaiA9IDAsIGsgPSAxOyBpIDwgbjsgaSsrKQoJ +CQkJCQl7CgkJCQkJCQl4W2ldID0gaSAlIG07CgkJCQkJCX0KCQkJCQkJYnJl +YWs7CgkJCQkJY2FzZSBSQU5EOgoJCQkJCQlmb3IgKGkgPSBqID0gMCwgayA9 +IDE7IGkgPCBuOyBpKyspCgkJCQkJCXsKCQkJCQkJCXhbaV0gPSByYW5kKCkg +JSBtOwoJCQkJCQl9CgkJCQkJCWJyZWFrOwoJCQkJCWNhc2UgU1RBR0dFUjoK +CQkJCQkJZm9yIChpID0gaiA9IDAsIGsgPSAxOyBpIDwgbjsgaSsrKQoJCQkJ +CQl7CgkJCQkJCQl4W2ldID0gKGkqbSArIGkpICUgbjsKCQkJCQkJfQoJCQkJ +CQlicmVhazsKCQkJCQljYXNlIFBMQVRFQVU6CgkJCQkJCWZvciAoaSA9IGog +PSAwLCBrID0gMTsgaSA8IG47IGkrKykKCQkJCQkJewoJCQkJCQkJeFtpXSA9 +IGkgPCBtID8gaSA6IG07CgkJCQkJCX0KCQkJCQkJYnJlYWs7CgkJCQkJY2Fz +ZSBTSFVGRkxFOgoJCQkJCQlmb3IgKGkgPSBqID0gMCwgayA9IDE7IGkgPCBu +OyBpKyspCgkJCQkJCXsKCQkJCQkJCXhbaV0gPSAocmFuZCgpJW0pID8gKGor +PTIpIDogKGsrPTIpOwoJCQkJCQl9CgkJCQkJCWJyZWFrOwoJCQkJfQoJCQkJ +dGVzdF9pbnRfY29weShkaXN0LCB4LCBuKTsgLyogd29yayBvbiBhIGNvcHkg +b2YgeCAqLwoJCQkJdGVzdF9pbnRfcmV2ZXJzZShkaXN0LCBNUkVWRVJTRSwg +eCwgbiwgMCwgbik7IC8qIG9uIGEgcmV2ZXJzZWQgY29weSAqLwoJCQkJdGVz +dF9pbnRfcmV2ZXJzZShkaXN0LCBNRlJFVkVSU0UsIHgsIG4sIDAsIG4vMik7 +CS8qIGZyb250IGhhbGYgcmV2ZXJzZWQgKi8KCQkJCXRlc3RfaW50X3JldmVy +c2UoZGlzdCwgTUJSRVZFUlNFLCB4LCBuLCBuLzIsIG4pOwkvKiBiYWNrIGhh +bGYgcmV2ZXJzZWQgKi8KCQkJCXRlc3RfaW50X3NvcnQoZGlzdCwgeCwgbik7 
+IC8qIGFuIG9yZGVyZWQgY29weSAqLwoJCQkJdGVzdF9pbnRfZGl0aGVyKGRp +c3QsIHgsIG4pOyAvKiBhZGQgaSU1IHRvIHhbaV0gKi8KCQkJfQoJCX0KCX0K +Cglmb3IgKGRpc3QgPSBESVNUX0ZJUlNUOyBkaXN0IDw9IERJU1RfTEFTVDsg +ZGlzdCsrKQoJewoJCXByaW50ZigiZGlzdHJpYnV0aW9uICVzOiBtYXggY3Jh +dGlvICVnLCBhdmVyYWdlICVnIG92ZXIgJWQgdGVzdHNcbiIsCgkJCSAgIGRp +c3RuYW1lW2Rpc3RdLAoJCQkgICBtYXhjcmF0aW9fZFtkaXN0XSwKCQkJICAg +c3VtY3JhdGlvX2RbZGlzdF0gLyBudGVzdHNfZFtkaXN0XSwKCQkJICAgbnRl +c3RzX2RbZGlzdF0pOwoJfQoKCXN1bWNyYXRpbyA9IDA7CgludGVzdHMgPSAw +OwoKCWZvciAobWV0aCA9IE1FVEhfRklSU1Q7IG1ldGggPD0gTUVUSF9MQVNU +OyBtZXRoKyspCgl7CgkJcHJpbnRmKCJtZXRob2QgJXM6IG1heCBjcmF0aW8g +JWcsIGF2ZXJhZ2UgJWcgb3ZlciAlZCB0ZXN0c1xuIiwKCQkJICAgbWV0aG5h +bWVbbWV0aF0sCgkJCSAgIG1heGNyYXRpb19tW21ldGhdLAoJCQkgICBzdW1j +cmF0aW9fbVttZXRoXSAvIG50ZXN0c19tW21ldGhdLAoJCQkgICBudGVzdHNf +bVttZXRoXSk7CgkJc3VtY3JhdGlvICs9IHN1bWNyYXRpb19tW21ldGhdOwoJ +CW50ZXN0cyArPSBudGVzdHNfbVttZXRoXTsKCX0KCglwcmludGYoIk92ZXJh +bGw6IGF2ZXJhZ2UgY3JhdGlvICVnIG92ZXIgJWQgdGVzdHNcbiIsCgkJICAg +c3VtY3JhdGlvIC8gbnRlc3RzLAoJCSAgIG50ZXN0cyk7CgoJcmV0dXJuIDA7 +Cn0K + +------- =_aaaaaaaaaa0 +Content-Type: text/plain +Content-Disposition: inline +Content-Transfer-Encoding: 8bit +MIME-Version: 1.0 + + +---------------------------(end of broadcast)--------------------------- +TIP 6: explain analyze is your friend + +------- =_aaaaaaaaaa0-- + +From pgsql-hackers-owner+M81167@postgresql.org Thu Mar 16 18:48:37 2006 +Return-path: +Received: from ams.hub.org (ams.hub.org [200.46.204.13]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2GNmbu12770 + for ; Thu, 16 Mar 2006 18:48:37 -0500 (EST) +Received: from postgresql.org (postgresql.org [200.46.204.71]) + by ams.hub.org (Postfix) with ESMTP id 570D567BADC; + Thu, 16 Mar 2006 19:48:35 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (av.hub.org [200.46.204.144]) + by postgresql.org (Postfix) with ESMTP id B49219DCBC2 + for ; Thu, 16 Mar 2006 19:48:12 -0400 (AST) +Received: from 
postgresql.org ([200.46.204.71]) + by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) + with ESMTP id 28142-10 + for ; + Thu, 16 Mar 2006 19:48:15 -0400 (AST) +X-Greylist: from auto-whitelisted by SQLgrey- +Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130]) + by postgresql.org (Postfix) with ESMTP id 9E95A9DCBAD + for ; Thu, 16 Mar 2006 19:48:10 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2GNm9Kt023199; + Thu, 16 Mar 2006 18:48:09 -0500 (EST) +To: Darcy Buskermolen +cc: pgsql-hackers@postgresql.org, "Dann Corbit" , + "Jonah H. Harris" , + "Jerry Sievers" +Subject: Re: [HACKERS] qsort, once again +In-Reply-To: <200603161541.25929.darcy@wavefire.com> +References: <19646.1142539750@sss.pgh.pa.us> <200603161541.25929.darcy@wavefire.com> +Comments: In-reply-to Darcy Buskermolen + message dated "Thu, 16 Mar 2006 15:41:24 -0800" +Date: Thu, 16 Mar 2006 18:48:09 -0500 +Message-ID: <23198.1142552889@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at hub.org +X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113] +X-Spam-Score: 0.113 +X-Mailing-List: pgsql-hackers +List-Archive: +List-Help: +List-Id: +List-Owner: +List-Post: +List-Subscribe: +List-Unsubscribe: +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + +Darcy Buskermolen writes: +> On Thursday 16 March 2006 12:09, Tom Lane wrote: +>> So we still have a problem of software archaeology: who added the +>> insertion sort switch to the NetBSD version, and on what grounds? + +> This is when that particular code was pushed in, as to why exactly, you'll +> have to ask mycroft. +> http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/stdlib/qsort.c.diff?r1=1.3&r2=1.4&only_with_tag=MAIN + +Interesting. It looks to me like he replaced the former +vaguely-Knuth-based coding with B&M's code, but kept the insertion- +sort-after-no-swap special case that was in the previous code. 
I'll +betcha he didn't test to see whether this was actually such a great +idea ... + + regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 1: if posting/reading through Usenet, please send an appropriate + subscribe-nomail command to majordomo@postgresql.org so that your + message can get through to the mailing list cleanly + +From pgsql-hackers-owner+M81168@postgresql.org Thu Mar 16 19:42:51 2006 +Return-path: +Received: from ams.hub.org (ams.hub.org [200.46.204.13]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2H0gpu19805 + for ; Thu, 16 Mar 2006 19:42:51 -0500 (EST) +Received: from postgresql.org (postgresql.org [200.46.204.71]) + by ams.hub.org (Postfix) with ESMTP id 5F20967BADE; + Thu, 16 Mar 2006 20:42:48 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (av.hub.org [200.46.204.144]) + by postgresql.org (Postfix) with ESMTP id AA2239DCBAD + for ; Thu, 16 Mar 2006 20:42:20 -0400 (AST) +Received: from postgresql.org ([200.46.204.71]) + by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) + with ESMTP id 46728-01 + for ; + Thu, 16 Mar 2006 20:42:23 -0400 (AST) +X-Greylist: from auto-whitelisted by SQLgrey- +Received: from postal.corporate.connx.com (postal.corporate.connx.com [65.212.159.187]) + by postgresql.org (Postfix) with ESMTP id 062EC9DCBA4 + for ; Thu, 16 Mar 2006 20:42:17 -0400 (AST) +X-MimeOLE: Produced By Microsoft Exchange V6.5 +Content-class: urn:content-classes:message +MIME-Version: 1.0 +Content-Type: text/plain; + charset="us-ascii" +Subject: Re: [HACKERS] qsort, once again +Date: Thu, 16 Mar 2006 16:42:20 -0800 +Message-ID: +Thread-Topic: [HACKERS] qsort, once again +Thread-Index: AcZJUnwKofAzJ+OKTcqF67UTsFWJEQACP7AA +From: "Dann Corbit" +To: "Tom Lane" +cc: "Jonah H. 
Harris" , , + "Jerry Sievers" +X-Virus-Scanned: by amavisd-new at hub.org +X-Spam-Status: No, score=0.096 required=5 tests=[AWL=0.096] +X-Spam-Score: 0.096 +X-Mailing-List: pgsql-hackers +List-Archive: +List-Help: +List-Id: +List-Owner: +List-Post: +List-Subscribe: +List-Unsubscribe: +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Content-Transfer-Encoding: 8bit +X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id k2H0gpu19805 +Status: OR + +> So my feeling is we should just remove the swap_cnt code and return to +> the original B&M algorithm. Being much faster than expected for +> presorted input doesn't justify being far slower than expected for +> other inputs, IMHO. In the context of Postgres I doubt that perfectly +> sorted input shows up very often anyway. +> +> Comments? + +Checking for presorted input is O(n). +If the input is random, an average of 3 elements will be tested. +So adding an in-order check of the data should not be too expensive. + +I would benchmark several approaches and see which one is best when used +in-place. + + +---------------------------(end of broadcast)--------------------------- +TIP 3: Have you checked our extensive FAQ? 
+ + http://www.postgresql.org/docs/faq + +From pgsql-hackers-owner+M81169@postgresql.org Thu Mar 16 20:13:08 2006 +Return-path: +Received: from ams.hub.org (ams.hub.org [200.46.204.13]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2H1D7u25008 + for ; Thu, 16 Mar 2006 20:13:07 -0500 (EST) +Received: from postgresql.org (postgresql.org [200.46.204.71]) + by ams.hub.org (Postfix) with ESMTP id 6B48F67BADE; + Thu, 16 Mar 2006 21:13:04 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (av.hub.org [200.46.204.144]) + by postgresql.org (Postfix) with ESMTP id 19FAD9DCC5C + for ; Thu, 16 Mar 2006 21:12:36 -0400 (AST) +Received: from postgresql.org ([200.46.204.71]) + by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) + with ESMTP id 53608-01 + for ; + Thu, 16 Mar 2006 21:12:37 -0400 (AST) +X-Greylist: from auto-whitelisted by SQLgrey- +Received: from postal.corporate.connx.com (postal.corporate.connx.com [65.212.159.187]) + by postgresql.org (Postfix) with ESMTP id 839069DCC2D + for ; Thu, 16 Mar 2006 21:12:32 -0400 (AST) +X-MimeOLE: Produced By Microsoft Exchange V6.5 +Content-class: urn:content-classes:message +MIME-Version: 1.0 +Content-Type: text/plain; + charset="us-ascii" +Subject: Re: [HACKERS] qsort, once again +Date: Thu, 16 Mar 2006 17:12:35 -0800 +Message-ID: +Thread-Topic: [HACKERS] qsort, once again +Thread-Index: AcZJUnwKofAzJ+OKTcqF67UTsFWJEQACP7AAAADU8dA= +From: "Dann Corbit" +To: "Dann Corbit" , "Tom Lane" +cc: "Jonah H. 
Harris" , , + "Jerry Sievers" +X-Virus-Scanned: by amavisd-new at hub.org +X-Spam-Status: No, score=0.096 required=5 tests=[AWL=0.096] +X-Spam-Score: 0.096 +X-Mailing-List: pgsql-hackers +List-Archive: +List-Help: +List-Id: +List-Owner: +List-Post: +List-Subscribe: +List-Unsubscribe: +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Content-Transfer-Encoding: 8bit +X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id k2H1D7u25008 +Status: OR + +> -----Original Message----- +> From: pgsql-hackers-owner@postgresql.org +> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Dann Corbit +> Sent: Thursday, March 16, 2006 4:42 PM +> To: Tom Lane +> Cc: Jonah H. Harris; pgsql-hackers@postgresql.org; Jerry Sievers +> Subject: Re: [HACKERS] qsort, once again +> +> > So my feeling is we should just remove the swap_cnt code and return to +> > the original B&M algorithm. Being much faster than expected for +> > presorted input doesn't justify being far slower than expected for +> > other inputs, IMHO. In the context of Postgres I doubt that perfectly +> > sorted input shows up very often anyway. +> > +> > Comments? +> +> Checking for presorted input is O(n). +> If the input is random, an average of 3 elements will be tested. +> So adding an in-order check of the data should not be too expensive. +> +> I would benchmark several approaches and see which one is best when used +> in-place. + +Even if "hunks" of the input are sorted, the test is a very good idea. + +Recall that we are sorting recursively and so we divide the data into +chunks. + +Consider an example... + +Quicksort of a field that contains Sex as 'M' for male, 'F' for female, +or NULL for unknown. + +The median selection is going to pick one of 'M', 'F', or NULL. +After pass 1 of qsort we will have two partitions. One partition will +have all of one type and the other partition will have the other two +types. 
+ +An in-order check will tell us that the monotone partition is sorted and +we are done with it. + +Imagine also a table that was clustered but for which we have not +updated statistics. Perhaps it is 98% sorted. Checking for order in +our partitions is probably a good idea. + +I think you could also get a good optimization if you are checking for +partitions and find a big section of the partition is ordered (even +though the whole thing is not). If you could percolate the ordered size up +the tree, you could just add another partition to the merge list and +sort the unordered part. + +In "C Unleashed" I call this idea partition discovery mergesort. + +---------------------------(end of broadcast)--------------------------- +TIP 5: don't forget to increase your free space map settings + +From pgsql-hackers-owner+M81172@postgresql.org Fri Mar 17 00:27:41 2006 +Return-path: +Received: from ams.hub.org (ams.hub.org [200.46.204.13]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2H5Reu14258 + for ; Fri, 17 Mar 2006 00:27:40 -0500 (EST) +Received: from postgresql.org (postgresql.org [200.46.204.71]) + by ams.hub.org (Postfix) with ESMTP id CE86767BAEA; + Fri, 17 Mar 2006 01:27:36 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (av.hub.org [200.46.204.144]) + by postgresql.org (Postfix) with ESMTP id 465549DC874 + for ; Fri, 17 Mar 2006 01:27:11 -0400 (AST) +Received: from postgresql.org ([200.46.204.71]) + by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) + with ESMTP id 76897-01 + for ; + Fri, 17 Mar 2006 01:27:10 -0400 (AST) +X-Greylist: from auto-whitelisted by SQLgrey- +Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130]) + by postgresql.org (Postfix) with ESMTP id 2456F9DC871 + for ; Fri, 17 Mar 2006 01:27:08 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2H5R6PF025295; + Fri, 17 Mar 2006 
00:27:06 -0500 (EST) +To: "Dann Corbit" +cc: "Jonah H. Harris" , pgsql-hackers@postgresql.org, + "Jerry Sievers" +Subject: Re: [HACKERS] qsort, once again +In-Reply-To: +References: +Comments: In-reply-to "Dann Corbit" + message dated "Thu, 16 Mar 2006 17:12:35 -0800" +Date: Fri, 17 Mar 2006 00:27:05 -0500 +Message-ID: <25294.1142573225@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at hub.org +X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113] +X-Spam-Score: 0.113 +X-Mailing-List: pgsql-hackers +List-Archive: +List-Help: +List-Id: +List-Owner: +List-Post: +List-Subscribe: +List-Unsubscribe: +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + +"Dann Corbit" writes: +>> So my feeling is we should just remove the swap_cnt code and return +>> to the original B&M algorithm. + +> Even if "hunks" of the input are sorted, the test is a very good idea. + +Yah know, guys, Bentley and McIlroy are each smarter than any five of +us, and I'm quite certain it occurred to them to try prechecking for +sorted input. If that case is not in their code then it's probably +because it's a net loss. Unless you have reason to think that sorted +input is *more* common than other cases for the Postgres environment, +which is certainly a fact not in evidence. + +(Bentley was my thesis adviser for awhile before he went to Bell Labs, +so my respect for him is based on direct personal experience. McIlroy +I only know by reputation, but he's sure got a ton of that.) + +> Imagine also a table that was clustered but for which we have not +> updated statistics. Perhaps it is 98% sorted. Checking for order in +> our partitions is probably a good idea. + +If we are using the sort code rather than the recently-clustered index +for such a case, then we have problems elsewhere. 
This scenario is not +a good argument that the sort code needs to be specialized to handle +this case at the expense of other cases; the place to be fixing it is +the planner or the statistics-management code. + + regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 9: In versions below 8.0, the planner will ignore your desire to + choose an index scan if your joining column's datatypes do not + match + +From pgsql-hackers-owner+M81173@postgresql.org Fri Mar 17 00:29:24 2006 +Return-path: +Received: from ams.hub.org (ams.hub.org [200.46.204.13]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2H5TNu14505 + for ; Fri, 17 Mar 2006 00:29:23 -0500 (EST) +Received: from postgresql.org (postgresql.org [200.46.204.71]) + by ams.hub.org (Postfix) with ESMTP id A271D67BAEA; + Fri, 17 Mar 2006 01:29:19 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (av.hub.org [200.46.204.144]) + by postgresql.org (Postfix) with ESMTP id C96D79DCA40 + for ; Fri, 17 Mar 2006 01:28:55 -0400 (AST) +Received: from postgresql.org ([200.46.204.71]) + by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) + with ESMTP id 44062-07 + for ; + Fri, 17 Mar 2006 01:28:54 -0400 (AST) +X-Greylist: from auto-whitelisted by SQLgrey- +Received: from postal.corporate.connx.com (postal.corporate.connx.com [65.212.159.187]) + by postgresql.org (Postfix) with ESMTP id CDE6C9DCA4B + for ; Fri, 17 Mar 2006 01:28:53 -0400 (AST) +X-MimeOLE: Produced By Microsoft Exchange V6.5 +Content-class: urn:content-classes:message +MIME-Version: 1.0 +Content-Type: text/plain; + charset="us-ascii" +Subject: Re: [HACKERS] qsort, once again +Date: Thu, 16 Mar 2006 21:28:52 -0800 +Message-ID: +Thread-Topic: [HACKERS] qsort, once again +Thread-Index: AcZJg2/Nvc2IdeFUT0id8WtNczZvGQAACfYw +From: "Dann Corbit" +To: "Tom Lane" +cc: "Jonah H. 
Harris" , , + "Jerry Sievers" +X-Virus-Scanned: by amavisd-new at hub.org +X-Spam-Status: No, score=0.097 required=5 tests=[AWL=0.097] +X-Spam-Score: 0.097 +X-Mailing-List: pgsql-hackers +List-Archive: +List-Help: +List-Id: +List-Owner: +List-Post: +List-Subscribe: +List-Unsubscribe: +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Content-Transfer-Encoding: 8bit +X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id k2H5TNu14505 +Status: OR + +Well, my point was that it is a snap to implement and test. + +It will be better, worse, or the same. + +I agree that Bentley is a bloody genius. + +> -----Original Message----- +> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] +> Sent: Thursday, March 16, 2006 9:27 PM +> To: Dann Corbit +> Cc: Jonah H. Harris; pgsql-hackers@postgresql.org; Jerry Sievers +> Subject: Re: [HACKERS] qsort, once again +> +> "Dann Corbit" writes: +> >> So my feeling is we should just remove the swap_cnt code and return +> >> to the original B&M algorithm. +> +> > Even if "hunks" of the input are sorted, the test is a very good idea. +> +> Yah know, guys, Bentley and McIlroy are each smarter than any five of +> us, and I'm quite certain it occurred to them to try prechecking for +> sorted input. If that case is not in their code then it's probably +> because it's a net loss. Unless you have reason to think that sorted +> input is *more* common than other cases for the Postgres environment, +> which is certainly a fact not in evidence. +> +> (Bentley was my thesis adviser for awhile before he went to Bell Labs, +> so my respect for him is based on direct personal experience. McIlroy +> I only know by reputation, but he's sure got a ton of that.) +> +> > Imagine also a table that was clustered but for which we have not +> > updated statistics. Perhaps it is 98% sorted. Checking for order in +> > our partitions is probably a good idea. 
+> +> If we are using the sort code rather than the recently-clustered index +> for such a case, then we have problems elsewhere. This scenario is not +> a good argument that the sort code needs to be specialized to handle +> this case at the expense of other cases; the place to be fixing it is +> the planner or the statistics-management code. +> +> regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 2: Don't 'kill -9' the postmaster + +From pgsql-hackers-owner+M81277@postgresql.org Tue Mar 21 13:53:08 2006 +Return-path: +Received: from ams.hub.org (ams.hub.org [200.46.204.13]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2LIr6M03797 + for ; Tue, 21 Mar 2006 13:53:06 -0500 (EST) +Received: from postgresql.org (postgresql.org [200.46.204.71]) + by ams.hub.org (Postfix) with ESMTP id 01A8467BBF2; + Tue, 21 Mar 2006 14:53:00 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (av.hub.org [200.46.204.144]) + by postgresql.org (Postfix) with ESMTP id 0ED1D9DCCD2 + for ; Tue, 21 Mar 2006 14:52:29 -0400 (AST) +Received: from postgresql.org ([200.46.204.71]) + by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) + with ESMTP id 38232-04-3 + for ; + Tue, 21 Mar 2006 14:52:26 -0400 (AST) +X-Greylist: from auto-whitelisted by SQLgrey- +Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130]) + by postgresql.org (Postfix) with ESMTP id 1A0EE9DCC2C + for ; Tue, 21 Mar 2006 14:52:22 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2LIqHPc018733; + Tue, 21 Mar 2006 13:52:17 -0500 (EST) +To: "Dann Corbit" +cc: "Jonah H. 
Harris" , pgsql-hackers@postgresql.org, + "Jerry Sievers" +Subject: Re: [HACKERS] qsort, once again +In-Reply-To: +References: +Comments: In-reply-to "Dann Corbit" + message dated "Thu, 16 Mar 2006 21:28:52 -0800" +MIME-Version: 1.0 +Content-Type: multipart/mixed; boundary="----- =_aaaaaaaaaa0" +Content-ID: <18685.1142966822.0@sss.pgh.pa.us> +Date: Tue, 21 Mar 2006 13:52:17 -0500 +Message-ID: <18732.1142967137@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at hub.org +X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113] +X-Spam-Score: 0.113 +X-Mailing-List: pgsql-hackers +List-Archive: +List-Help: +List-Id: +List-Owner: +List-Post: +List-Subscribe: +List-Unsubscribe: +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + +------- =_aaaaaaaaaa0 +Content-Type: text/plain; charset="us-ascii" +Content-ID: <18685.1142966822.1@sss.pgh.pa.us> + +"Dann Corbit" writes: +> Well, my point was that it is a snap to implement and test. + +Well, having done this, I have to eat my words: it does seem to be a +pretty good idea. + +The following test numbers are using Bentley & McIlroy's test framework, +but modified to test only the case N=10000 rather than the four smaller +N values they originally used. I did that because it exposes quadratic +behavior more obviously, and the variance in N made it harder to compare +comparison ratios for different cases. I also added a "NEARSORT" test +method, which sorts the input distribution and then exchanges two +elements chosen at random. I did that because I was concerned that +nearly sorted input would be the worst case for the presorted-input +check, as it would waste the most cycles before failing on such input. 
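For readers who would rather not decode the attachment, the NEARSORT idea described above (sort the input, then exchange two elements chosen at random) can be sketched as follows. This is a minimal illustration under stated assumptions, not a copy of the attached sorttester.c: the names `make_nearsorted` and `count_descents` are hypothetical, and the library `qsort` stands in for the trusted pre-sort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Plain three-way integer comparator. */
static int int_cmp(const void *a, const void *b)
{
    int aa = *(const int *) a;
    int bb = *(const int *) b;

    return (aa > bb) - (aa < bb);
}

/*
 * Hypothetical helper: sort x[0..n-1] with the library qsort, then
 * exchange two elements chosen at random, giving "nearly sorted" input.
 */
static void make_nearsorted(int *x, size_t n)
{
    size_t i, j;
    int t;

    assert(x != NULL || n == 0);
    if (n < 2)
        return;
    qsort(x, n, sizeof(int), int_cmp);
    i = (size_t) rand() % n;
    j = (size_t) rand() % n;
    t = x[i];
    x[i] = x[j];
    x[j] = t;
}

/* Hypothetical helper: count adjacent pairs that are out of order. */
static size_t count_descents(const int *x, size_t n)
{
    size_t i, d = 0;

    for (i = 1; i < n; i++)
        if (x[i - 1] > x[i])
            d++;
    return d;
}
```

A single exchange in a sorted array can break order at no more than two adjacent positions, so `count_descents` returns at most 2 on the result. That is what makes this input a plausible worst case for a presorted-input check: the scan may run nearly to the end of the array before it fails.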
+ +With our existing qsort code, the results look like + +distribution SAWTOOTH: max cratio 94.17, min 0.08, average 1.56 over 105 tests +distribution RAND: max cratio 1.06, min 0.08, average 0.51 over 105 tests +distribution STAGGER: max cratio 6.08, min 0.23, average 1.01 over 105 tests +distribution PLATEAU: max cratio 94.17, min 0.08, average 2.12 over 105 tests +distribution SHUFFLE: max cratio 94.17, min 0.23, average 1.92 over 105 tests +method COPY: max cratio 6.08, min 0.08, average 0.72 over 75 tests +method REVERSE: max cratio 5.34, min 0.08, average 0.69 over 75 tests +method FREVERSE: max cratio 94.17, min 0.08, average 5.71 over 75 tests +method BREVERSE: max cratio 3.86, min 0.08, average 1.41 over 75 tests +method SORT: max cratio 0.82, min 0.08, average 0.31 over 75 tests +method NEARSORT: max cratio 0.82, min 0.08, average 0.36 over 75 tests +method DITHER: max cratio 5.52, min 0.18, average 0.77 over 75 tests +Overall: average cratio 1.42 over 525 tests + +("cratio" is the ratio of the actual number of comparison function calls +to the theoretical expectation, N log2(N)) + +That's pretty awful: there are several test cases that make it use +nearly 100 times the expected number of comparisons. 
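To put the cratio figures above in absolute terms, here is the arithmetic for N = 10000. The symbol C (the number of comparison function calls) is notation introduced here for illustration, and the N^2/8 remark is only an observation about the magnitude, not a claim from the message.

```latex
\[
\text{cratio} = \frac{C}{N \log_2 N},
\qquad
N = 10000 \;\Rightarrow\; N \log_2 N \approx 10000 \times 13.2877 \approx 132{,}877
\]
\[
\text{cratio} = 94.17 \;\Rightarrow\; C \approx 94.17 \times 132{,}877 \approx 1.25 \times 10^{7} \approx N^{2}/8
\]
\[
C = N - 1 = 9999 \;\Rightarrow\; \text{cratio} = \frac{9999}{132{,}877} \approx 0.075 \;\;(\text{printed as } 0.08)
\]
```

So the worst cases cost on the order of twelve million comparisons for ten thousand elements, while the recurring minimum of 0.08 is consistent with a single linear pass over the input.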
+ +Removing the swap_cnt test to bring it close to B&M's original +recommendations, we get + +distribution SAWTOOTH: max cratio 3.85, min 0.08, average 0.70 over 105 tests +distribution RAND: max cratio 1.06, min 0.08, average 0.52 over 105 tests +distribution STAGGER: max cratio 6.08, min 0.58, average 1.12 over 105 tests +distribution PLATEAU: max cratio 3.70, min 0.08, average 0.34 over 105 tests +distribution SHUFFLE: max cratio 3.86, min 0.86, average 1.24 over 105 tests +method COPY: max cratio 6.08, min 0.08, average 0.76 over 75 tests +method REVERSE: max cratio 5.34, min 0.08, average 0.75 over 75 tests +method FREVERSE: max cratio 4.56, min 0.08, average 0.73 over 75 tests +method BREVERSE: max cratio 3.86, min 0.08, average 1.41 over 75 tests +method SORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests +method NEARSORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests +method DITHER: max cratio 3.73, min 0.18, average 0.72 over 75 tests +Overall: average cratio 0.78 over 525 tests + +which is a whole lot better as to both average and worst cases. 
+ +I then added some code to check for presorted input (just after the +n<7 insertion sort code): + +#ifdef CHECK_SORTED + presorted = 1; + for (pm = (char *) a + es; pm < (char *) a + n * es; pm += es) + { + if (cmp(pm - es, pm) > 0) + { + presorted = 0; + break; + } + } + if (presorted) + return; +#endif + +This gives + +distribution SAWTOOTH: max cratio 3.88, min 0.08, average 0.62 over 105 tests +distribution RAND: max cratio 1.06, min 0.08, average 0.46 over 105 tests +distribution STAGGER: max cratio 6.15, min 0.08, average 0.98 over 105 tests +distribution PLATEAU: max cratio 3.79, min 0.08, average 0.31 over 105 tests +distribution SHUFFLE: max cratio 3.91, min 0.08, average 1.09 over 105 tests +method COPY: max cratio 6.15, min 0.08, average 0.72 over 75 tests +method REVERSE: max cratio 5.34, min 0.08, average 0.76 over 75 tests +method FREVERSE: max cratio 4.58, min 0.08, average 0.73 over 75 tests +method BREVERSE: max cratio 3.91, min 0.08, average 1.44 over 75 tests +method SORT: max cratio 0.08, min 0.08, average 0.08 over 75 tests +method NEARSORT: max cratio 0.89, min 0.08, average 0.39 over 75 tests +method DITHER: max cratio 3.73, min 0.18, average 0.72 over 75 tests +Overall: average cratio 0.69 over 525 tests + +So the worst case seems only very marginally worse, and there is a +definite improvement in the average case, even for inputs that aren't +entirely sorted. Importantly, the "near sorted" case that I thought +might send it into quadratic behavior doesn't seem to do that. + +So, unless anyone wants to do further testing, I'll go ahead and commit +these changes. 
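The CHECK_SORTED fragment quoted above can be tried in isolation with a sketch like the one below. It is an illustration under assumptions, not the committed code: `is_presorted`, `sort_with_precheck` and `counting_int_cmp` are hypothetical names, the library `qsort` stands in for the Postgres sort routine, and the wrapper runs the check only once at the top level, whereas the check in the message sits inside the sort routine itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Number of comparator calls since the counter was last reset. */
static long ncompares;

/* Integer comparator that counts how often it is called. */
static int counting_int_cmp(const void *a, const void *b)
{
    int aa = *(const int *) a;
    int bb = *(const int *) b;

    ncompares++;
    return (aa > bb) - (aa < bb);
}

/*
 * Same loop as the CHECK_SORTED fragment: walk adjacent pairs and stop
 * at the first one that is out of order.  Costs at most n - 1 compares.
 */
static int is_presorted(const void *a, size_t n, size_t es,
                        int (*cmp) (const void *, const void *))
{
    const char *pm;
    const char *end;

    if (n < 2)
        return 1;
    end = (const char *) a + n * es;
    for (pm = (const char *) a + es; pm < end; pm += es)
    {
        if (cmp(pm - es, pm) > 0)
            return 0;
    }
    return 1;
}

/* Hypothetical wrapper: skip the sort entirely when input is in order. */
static void sort_with_precheck(void *a, size_t n, size_t es,
                               int (*cmp) (const void *, const void *))
{
    assert(a != NULL || n == 0);
    if (is_presorted(a, n, es, cmp))
        return;
    qsort(a, n, es, cmp);
}
```

On already sorted input the wrapper makes exactly n - 1 comparator calls and returns, which matches the constant 0.08 cratio reported for the SORT method with the check enabled. On unsorted input the scan stops at the first out-of-order pair, so the extra cost depends on how far into the array that pair lies.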
+ + regards, tom lane + +PS: Just as a comparison point, here are the results when testing HPUX's +library qsort: + +distribution SAWTOOTH: max cratio 7.00, min 0.08, average 0.76 over 105 tests +distribution RAND: max cratio 1.11, min 0.08, average 0.53 over 105 tests +distribution STAGGER: max cratio 7.05, min 0.58, average 1.24 over 105 tests +distribution PLATEAU: max cratio 7.00, min 0.08, average 0.43 over 105 tests +distribution SHUFFLE: max cratio 7.00, min 0.86, average 1.54 over 105 tests +method COPY: max cratio 6.70, min 0.08, average 0.79 over 75 tests +method REVERSE: max cratio 7.05, min 0.08, average 0.78 over 75 tests +method FREVERSE: max cratio 7.00, min 0.08, average 0.77 over 75 tests +method BREVERSE: max cratio 7.00, min 0.08, average 2.11 over 75 tests +method SORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests +method NEARSORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests +method DITHER: max cratio 4.06, min 0.16, average 0.74 over 75 tests +Overall: average cratio 0.90 over 525 tests + +and here are the results using glibc's qsort, which of course isn't +quicksort at all but some kind of merge sort: + +distribution SAWTOOTH: max cratio 0.90, min 0.49, average 0.65 over 105 tests +distribution RAND: max cratio 0.91, min 0.49, average 0.76 over 105 tests +distribution STAGGER: max cratio 0.92, min 0.49, average 0.70 over 105 tests +distribution PLATEAU: max cratio 0.84, min 0.49, average 0.54 over 105 tests +distribution SHUFFLE: max cratio 0.64, min 0.49, average 0.52 over 105 tests +method COPY: max cratio 0.92, min 0.49, average 0.66 over 75 tests +method REVERSE: max cratio 0.92, min 0.49, average 0.68 over 75 tests +method FREVERSE: max cratio 0.92, min 0.49, average 0.67 over 75 tests +method BREVERSE: max cratio 0.92, min 0.49, average 0.68 over 75 tests +method SORT: max cratio 0.49, min 0.49, average 0.49 over 75 tests +method NEARSORT: max cratio 0.55, min 0.49, average 0.51 over 75 tests +method DITHER: max cratio 
0.92, min 0.50, average 0.74 over 75 tests +Overall: average cratio 0.63 over 525 tests + +PPS: final version of test framework attached for the archives. + + +------- =_aaaaaaaaaa0 +Content-Type: application/octet-stream +Content-ID: <18685.1142966822.2@sss.pgh.pa.us> +Content-Description: sorttester.c +Content-Transfer-Encoding: base64 + +I2luY2x1ZGUgPHN0ZGlvLmg+CiNpbmNsdWRlIDxzdGRsaWIuaD4KI2luY2x1 +ZGUgPHN0cmluZy5oPgojaW5jbHVkZSA8bWF0aC5oPgoKI2lmZGVmIFVTRV9R +U09SVAojZGVmaW5lIHRlc3RfcXNvcnQgcXNvcnQKI2Vsc2UKZXh0ZXJuIHZv +aWQgdGVzdF9xc29ydCh2b2lkICphLCBzaXplX3Qgbiwgc2l6ZV90IGVzLAoJ +CQkJCSAgIGludCAoKmNtcCkgKGNvbnN0IHZvaWQgKiwgY29uc3Qgdm9pZCAq +KSk7CiNlbmRpZgoKLy9zdGF0aWMgY29uc3QgaW50IG5fdmFsdWVzW10gPSB7 +IDEwMCwgMTAyMywgMTAyNCwgMTAyNSwgMTAwMDAgfSA7CnN0YXRpYyBjb25z +dCBpbnQgbl92YWx1ZXNbXSA9IHsgMTAwMDAgfSA7CiNkZWZpbmUgTUFYX04g +MTAwMDAKCnR5cGVkZWYgZW51bQp7CglTQVdUT09USCwgUkFORCwgU1RBR0dF +UiwgUExBVEVBVSwgU0hVRkZMRQp9IERJU1Q7CiNkZWZpbmUgRElTVF9GSVJT +VAlTQVdUT09USAojZGVmaW5lIERJU1RfTEFTVAlTSFVGRkxFCgpzdGF0aWMg +Y29uc3QgY2hhciAqIGNvbnN0IGRpc3RuYW1lW10gPSB7CgkiU0FXVE9PVEgi +LCAiUkFORCIsICJTVEFHR0VSIiwgIlBMQVRFQVUiLCAiU0hVRkZMRSIKfTsK +CnR5cGVkZWYgZW51bQp7CglNQ09QWSwgTVJFVkVSU0UsIE1GUkVWRVJTRSwg +TUJSRVZFUlNFLCBNU09SVCwgTU5FQVJTT1JULCBNRElUSEVSCn0gTU9ETUVU +SE9EOwojZGVmaW5lIE1FVEhfRklSU1QJTUNPUFkKI2RlZmluZSBNRVRIX0xB +U1QJTURJVEhFUgoKc3RhdGljIGNvbnN0IGNoYXIgKiBjb25zdCBtZXRobmFt +ZVtdID0gewoJIkNPUFkiLCAiUkVWRVJTRSIsICJGUkVWRVJTRSIsICJCUkVW +RVJTRSIsICJTT1JUIiwgIk5FQVJTT1JUIiwgIkRJVEhFUiIKfTsKCi8qIHBl +ci10ZXN0IGNvdW50ZXIgKi8Kc3RhdGljIGxvbmcgbmNvbXBhcmVzOwoKLyog +YWNjdW11bGF0ZSByZXN1bHRzIGFjcm9zcyB0ZXN0cywgZGlzdC13aXNlICov +CnN0YXRpYyBkb3VibGUgc3VtY3JhdGlvX2RbRElTVF9MQVNUKzFdOwpzdGF0 +aWMgZG91YmxlIG1heGNyYXRpb19kW0RJU1RfTEFTVCsxXTsKc3RhdGljIGRv +dWJsZSBtaW5jcmF0aW9fZFtESVNUX0xBU1QrMV07CnN0YXRpYyBpbnQgbnRl +c3RzX2RbRElTVF9MQVNUKzFdOwovKiBhY2N1bXVsYXRlIHJlc3VsdHMgYWNy +b3NzIHRlc3RzLCBtb2QtbWV0aG9kLXdpc2UgKi8Kc3RhdGljIGRvdWJsZSBz 
+dW1jcmF0aW9fbVtNRVRIX0xBU1QrMV07CnN0YXRpYyBkb3VibGUgbWF4Y3Jh +dGlvX21bTUVUSF9MQVNUKzFdOwpzdGF0aWMgZG91YmxlIG1pbmNyYXRpb19t +W01FVEhfTEFTVCsxXTsKc3RhdGljIGludCBudGVzdHNfbVtNRVRIX0xBU1Qr +MV07CgoKc3RhdGljIGludAppbnRfY21wKGNvbnN0IHZvaWQgKmEsIGNvbnN0 +IHZvaWQgKmIpCnsKCWludAkJYWEgPSAqKGNvbnN0IGludCAqKSBhOwoJaW50 +CQliYiA9ICooY29uc3QgaW50ICopIGI7CgoJbmNvbXBhcmVzKys7CgoJaWYg +KGFhIDwgYmIpCgkJcmV0dXJuIC0xOwoJaWYgKGFhID4gYmIpCgkJcmV0dXJu +IDE7CglyZXR1cm4gMDsKfQoKCnN0YXRpYyB2b2lkCnRlc3RfY29tbW9uKERJ +U1QgZGlzdCwgTU9ETUVUSE9EIG1ldGgsIHZvaWQgKnh0LCBzaXplX3Qgbiwg +c2l6ZV90IHN6LAoJCQlpbnQgKCpjbXApIChjb25zdCB2b2lkICosIGNvbnN0 +IHZvaWQgKikpCnsKCWRvdWJsZSBubG9nbjsKCWRvdWJsZSBjcmF0aW87CgoJ +bmNvbXBhcmVzID0gMDsKCXRlc3RfcXNvcnQoeHQsIG4sIHN6LCBjbXApOwoJ +LyogbG9nIHRoZSBjb3N0IGJlZm9yZSBkb2luZyBtb3JlIGNtcHMgKi8KCW5s +b2duID0gbiAqIGxvZygoZG91YmxlKSBuKSAvIGxvZygyLjApOwoJY3JhdGlv +ID0gbmNvbXBhcmVzIC8gbmxvZ247CglzdW1jcmF0aW9fZFtkaXN0XSArPSBj +cmF0aW87CglpZiAobnRlc3RzX2RbZGlzdF0gPT0gMCkKCXsKCQltYXhjcmF0 +aW9fZFtkaXN0XSA9IG1pbmNyYXRpb19kW2Rpc3RdID0gY3JhdGlvOwoJfQoJ +ZWxzZQoJewoJCWlmIChjcmF0aW8gPiBtYXhjcmF0aW9fZFtkaXN0XSkKCQkJ +bWF4Y3JhdGlvX2RbZGlzdF0gPSBjcmF0aW87CgkJaWYgKGNyYXRpbyA8IG1p +bmNyYXRpb19kW2Rpc3RdKQoJCQltaW5jcmF0aW9fZFtkaXN0XSA9IGNyYXRp +bzsKCX0KCW50ZXN0c19kW2Rpc3RdKys7CglzdW1jcmF0aW9fbVttZXRoXSAr +PSBjcmF0aW87CglpZiAobnRlc3RzX21bbWV0aF0gPT0gMCkKCXsKCQltYXhj +cmF0aW9fbVttZXRoXSA9IG1pbmNyYXRpb19tW21ldGhdID0gY3JhdGlvOwoJ +fQoJZWxzZQoJewoJCWlmIChjcmF0aW8gPiBtYXhjcmF0aW9fbVttZXRoXSkK +CQkJbWF4Y3JhdGlvX21bbWV0aF0gPSBjcmF0aW87CgkJaWYgKGNyYXRpbyA8 +IG1pbmNyYXRpb19tW21ldGhdKQoJCQltaW5jcmF0aW9fbVttZXRoXSA9IGNy +YXRpbzsKCX0KCW50ZXN0c19tW21ldGhdKys7CgkvKiBub3cgY2hlY2sgZm9y +IGNvcnJlY3Qgc29ydGluZyAqLwoJewoJCWNoYXIgKnAgPSB4dDsKCQl3aGls +ZSAobi0tID4gMSkKCQl7CgkJCWlmIChjbXAocCwgcCArIHN6KSA+IDApCgkJ +CXsKCQkJCWZwcmludGYoc3RkZXJyLCAid3Jvbmcgc29ydCByZXN1bHQgZm9y +ICVzLyVzIVxuIiwKCQkJCQkJZGlzdG5hbWVbZGlzdF0sIG1ldGhuYW1lW21l +dGhdKTsKCQkJCWV4aXQoMSk7CgkJCX0KCQkJcCArPSBzejsKCQl9Cgl9Cn0K 
+CgovKiB3b3JrIG9uIGEgY29weSBvZiB4ICovCnN0YXRpYyB2b2lkCnRlc3Rf +aW50X2NvcHkoRElTVCBkaXN0LCBpbnQgeFtdLCBpbnQgbikKewoJaW50CQl4 +dFtNQVhfTl07CgoJbWVtY3B5KHh0LCB4LCBuICogc2l6ZW9mKGludCkpOwoJ +dGVzdF9jb21tb24oZGlzdCwgTUNPUFksICh2b2lkICopIHh0LCBuLCBzaXpl +b2YoaW50KSwgaW50X2NtcCk7Cn0KCi8qIHJldmVyc2UgdGhlIHJhbmdlIHN0 +YXJ0IDw9IGkgPCBzdG9wICovCnN0YXRpYyB2b2lkCnRlc3RfaW50X3JldmVy +c2UoRElTVCBkaXN0LCBNT0RNRVRIT0QgbWV0aCwgaW50IHhbXSwgaW50IG4s +IGludCBzdGFydCwgaW50IHN0b3ApCnsKCWludAkJeHRbTUFYX05dOwoJaW50 +CQlpOwoKCW1lbWNweSh4dCwgeCwgbiAqIHNpemVvZihpbnQpKTsKCWZvciAo +aSA9IHN0YXJ0OyBpIDwgc3RvcDsgaSsrKQoJewoJCXh0W2ldID0geFtzdG9w +IC0gMSAtIGldOwoJfQoJdGVzdF9jb21tb24oZGlzdCwgbWV0aCwgKHZvaWQg +KikgeHQsIG4sIHNpemVvZihpbnQpLCBpbnRfY21wKTsKfQoKLyogcHJlLXNv +cnQgdXNpbmcgYSB0cnVzdGVkIHNvcnQgKi8Kc3RhdGljIHZvaWQKdGVzdF9p +bnRfc29ydChESVNUIGRpc3QsIGludCB4W10sIGludCBuKQp7CglpbnQJCXh0 +W01BWF9OXTsKCgltZW1jcHkoeHQsIHgsIG4gKiBzaXplb2YoaW50KSk7Cglx +c29ydCgodm9pZCAqKSB4dCwgbiwgc2l6ZW9mKGludCksIGludF9jbXApOwoJ +dGVzdF9jb21tb24oZGlzdCwgTVNPUlQsICh2b2lkICopIHh0LCBuLCBzaXpl +b2YoaW50KSwgaW50X2NtcCk7Cn0KCi8qIG5lYXItc29ydGVkICovCnN0YXRp +YyB2b2lkCnRlc3RfaW50X25lYXJzb3J0KERJU1QgZGlzdCwgaW50IHhbXSwg +aW50IG4pCnsKCWludAkJeHRbTUFYX05dOwoKCW1lbWNweSh4dCwgeCwgbiAq +IHNpemVvZihpbnQpKTsKCXFzb3J0KCh2b2lkICopIHh0LCBuLCBzaXplb2Yo +aW50KSwgaW50X2NtcCk7CgkvKiBzd2FwIGEgcmFuZG9tIHR3byBlbGVtZW50 +cyAqLwoJewoJCWludCBpID0gcmFuZCgpICUgbjsKCQlpbnQgaiA9IHJhbmQo +KSAlIG47CgkJaW50IHQgPSB4dFtpXTsKCQl4dFtpXSA9IHh0W2pdOwoJCXh0 +W2pdID0gdDsKCX0KCXRlc3RfY29tbW9uKGRpc3QsIE1ORUFSU09SVCwgKHZv +aWQgKikgeHQsIG4sIHNpemVvZihpbnQpLCBpbnRfY21wKTsKfQoKLyogYWRk +IGklNSB0byB4W2ldICovCnN0YXRpYyB2b2lkCnRlc3RfaW50X2RpdGhlcihE +SVNUIGRpc3QsIGludCB4W10sIGludCBuKQp7CglpbnQJCXh0W01BWF9OXTsK +CWludAkJaTsKCglmb3IgKGkgPSAwOyBpIDwgbjsgaSsrKQoJewoJCXh0W2ld +ID0geFtpXSArIGklNTsKCX0KCXRlc3RfY29tbW9uKGRpc3QsIE1ESVRIRVIs +ICh2b2lkICopIHh0LCBuLCBzaXplb2YoaW50KSwgaW50X2NtcCk7Cn0KCgpp +bnQKbWFpbigpCnsKCWludAkJeFtNQVhfTl07CglpbnQJCWlfbjsKCWludAkJ 
+bjsKCWludAkJbTsKCURJU1QJZGlzdDsKCU1PRE1FVEhPRCBtZXRoOwoJaW50 +CQlpOwoJaW50CQlqOwoJaW50CQlrOwoJZG91YmxlCXN1bWNyYXRpbzsKCWlu +dAkJbnRlc3RzOwoKCWZvciAoaV9uID0gMDsgaV9uIDwgc2l6ZW9mKG5fdmFs +dWVzKS9zaXplb2Yobl92YWx1ZXNbMF0pOyBpX24rKykKCXsKCQluID0gbl92 +YWx1ZXNbaV9uXTsKCQlmb3IgKG0gPSAxOyBtIDwgMipuOyBtICo9IDIpCgkJ +ewoJCQlmb3IgKGRpc3QgPSBESVNUX0ZJUlNUOyBkaXN0IDw9IERJU1RfTEFT +VDsgZGlzdCsrKQoJCQl7CgkJCQlzd2l0Y2ggKGRpc3QpCgkJCQl7CgkJCQkJ +Y2FzZSBTQVdUT09USDoKCQkJCQkJZm9yIChpID0gaiA9IDAsIGsgPSAxOyBp +IDwgbjsgaSsrKQoJCQkJCQl7CgkJCQkJCQl4W2ldID0gaSAlIG07CgkJCQkJ +CX0KCQkJCQkJYnJlYWs7CgkJCQkJY2FzZSBSQU5EOgoJCQkJCQlmb3IgKGkg +PSBqID0gMCwgayA9IDE7IGkgPCBuOyBpKyspCgkJCQkJCXsKCQkJCQkJCXhb +aV0gPSByYW5kKCkgJSBtOwoJCQkJCQl9CgkJCQkJCWJyZWFrOwoJCQkJCWNh +c2UgU1RBR0dFUjoKCQkJCQkJZm9yIChpID0gaiA9IDAsIGsgPSAxOyBpIDwg +bjsgaSsrKQoJCQkJCQl7CgkJCQkJCQl4W2ldID0gKGkqbSArIGkpICUgbjsK +CQkJCQkJfQoJCQkJCQlicmVhazsKCQkJCQljYXNlIFBMQVRFQVU6CgkJCQkJ +CWZvciAoaSA9IGogPSAwLCBrID0gMTsgaSA8IG47IGkrKykKCQkJCQkJewoJ +CQkJCQkJeFtpXSA9IGkgPCBtID8gaSA6IG07CgkJCQkJCX0KCQkJCQkJYnJl +YWs7CgkJCQkJY2FzZSBTSFVGRkxFOgoJCQkJCQlmb3IgKGkgPSBqID0gMCwg +ayA9IDE7IGkgPCBuOyBpKyspCgkJCQkJCXsKCQkJCQkJCXhbaV0gPSAocmFu +ZCgpJW0pID8gKGorPTIpIDogKGsrPTIpOwoJCQkJCQl9CgkJCQkJCWJyZWFr +OwoJCQkJfQoJCQkJdGVzdF9pbnRfY29weShkaXN0LCB4LCBuKTsgLyogd29y +ayBvbiBhIGNvcHkgb2YgeCAqLwoJCQkJdGVzdF9pbnRfcmV2ZXJzZShkaXN0 +LCBNUkVWRVJTRSwgeCwgbiwgMCwgbik7IC8qIG9uIGEgcmV2ZXJzZWQgY29w +eSAqLwoJCQkJdGVzdF9pbnRfcmV2ZXJzZShkaXN0LCBNRlJFVkVSU0UsIHgs +IG4sIDAsIG4vMik7CS8qIGZyb250IGhhbGYgcmV2ZXJzZWQgKi8KCQkJCXRl +c3RfaW50X3JldmVyc2UoZGlzdCwgTUJSRVZFUlNFLCB4LCBuLCBuLzIsIG4p +OwkvKiBiYWNrIGhhbGYgcmV2ZXJzZWQgKi8KCQkJCXRlc3RfaW50X3NvcnQo +ZGlzdCwgeCwgbik7IC8qIGFuIG9yZGVyZWQgY29weSAqLwoJCQkJdGVzdF9p +bnRfbmVhcnNvcnQoZGlzdCwgeCwgbik7IC8qIGFuIGFsbW9zdCBvcmRlcmVk +IGNvcHkgKi8KCQkJCXRlc3RfaW50X2RpdGhlcihkaXN0LCB4LCBuKTsgLyog +YWRkIGklNSB0byB4W2ldICovCgkJCX0KCQl9Cgl9CgoJZm9yIChkaXN0ID0g +RElTVF9GSVJTVDsgZGlzdCA8PSBESVNUX0xBU1Q7IGRpc3QrKykKCXsKCQlw 
+cmludGYoImRpc3RyaWJ1dGlvbiAlczogbWF4IGNyYXRpbyAlLjJmLCBtaW4g +JS4yZiwgYXZlcmFnZSAlLjJmIG92ZXIgJWQgdGVzdHNcbiIsCgkJCSAgIGRp +c3RuYW1lW2Rpc3RdLAoJCQkgICBtYXhjcmF0aW9fZFtkaXN0XSwKCQkJICAg +bWluY3JhdGlvX2RbZGlzdF0sCgkJCSAgIHN1bWNyYXRpb19kW2Rpc3RdIC8g +bnRlc3RzX2RbZGlzdF0sCgkJCSAgIG50ZXN0c19kW2Rpc3RdKTsKCX0KCglz +dW1jcmF0aW8gPSAwOwoJbnRlc3RzID0gMDsKCglmb3IgKG1ldGggPSBNRVRI +X0ZJUlNUOyBtZXRoIDw9IE1FVEhfTEFTVDsgbWV0aCsrKQoJewoJCXByaW50 +ZigibWV0aG9kICVzOiBtYXggY3JhdGlvICUuMmYsIG1pbiAlLjJmLCBhdmVy +YWdlICUuMmYgb3ZlciAlZCB0ZXN0c1xuIiwKCQkJICAgbWV0aG5hbWVbbWV0 +aF0sCgkJCSAgIG1heGNyYXRpb19tW21ldGhdLAoJCQkgICBtaW5jcmF0aW9f +bVttZXRoXSwKCQkJICAgc3VtY3JhdGlvX21bbWV0aF0gLyBudGVzdHNfbVtt +ZXRoXSwKCQkJICAgbnRlc3RzX21bbWV0aF0pOwoJCXN1bWNyYXRpbyArPSBz +dW1jcmF0aW9fbVttZXRoXTsKCQludGVzdHMgKz0gbnRlc3RzX21bbWV0aF07 +Cgl9CgoJcHJpbnRmKCJPdmVyYWxsOiBhdmVyYWdlIGNyYXRpbyAlLjJmIG92 +ZXIgJWQgdGVzdHNcbiIsCgkJICAgc3VtY3JhdGlvIC8gbnRlc3RzLAoJCSAg +IG50ZXN0cyk7CgoJcmV0dXJuIDA7Cn0K + +------- =_aaaaaaaaaa0 +Content-Type: text/plain +Content-Disposition: inline +Content-Transfer-Encoding: 8bit +MIME-Version: 1.0 + + +---------------------------(end of broadcast)--------------------------- +TIP 5: don't forget to increase your free space map settings + +------- =_aaaaaaaaaa0-- + +From pgsql-hackers-owner+M81283@postgresql.org Tue Mar 21 15:18:07 2006 +Return-path: +Received: from ams.hub.org (ams.hub.org [200.46.204.13]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2LKI7M12970 + for ; Tue, 21 Mar 2006 15:18:07 -0500 (EST) +Received: from postgresql.org (postgresql.org [200.46.204.71]) + by ams.hub.org (Postfix) with ESMTP id AABA167BBF8; + Tue, 21 Mar 2006 16:18:04 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (av.hub.org [200.46.204.144]) + by postgresql.org (Postfix) with ESMTP id 3E6009DC827 + for ; Tue, 21 Mar 2006 16:17:38 -0400 (AST) +Received: from postgresql.org ([200.46.204.71]) + by localhost (av.hub.org 
[200.46.204.144]) (amavisd-new, port 10024) + with ESMTP id 56406-07 + for ; + Tue, 21 Mar 2006 16:17:38 -0400 (AST) +X-Greylist: from auto-whitelisted by SQLgrey- +Received: from stark.xeocode.com (stark.xeocode.com [216.58.44.227]) + by postgresql.org (Postfix) with ESMTP id 21DCF9DC809 + for ; Tue, 21 Mar 2006 16:17:35 -0400 (AST) +Received: from localhost ([127.0.0.1] helo=stark.xeocode.com) + by stark.xeocode.com with smtp (Exim 3.36 #1 (Debian)) + id 1FLnI7-0004xB-00; Tue, 21 Mar 2006 15:17:19 -0500 +To: Tom Lane +cc: "Dann Corbit" , + "Jonah H. Harris" , pgsql-hackers@postgresql.org, + "Jerry Sievers" +Subject: Re: [HACKERS] qsort, once again +References: + <18732.1142967137@sss.pgh.pa.us> +In-Reply-To: <18732.1142967137@sss.pgh.pa.us> +From: Greg Stark +Organization: The Emacs Conspiracy; member since 1992 +Date: 21 Mar 2006 15:17:19 -0500 +Message-ID: <87lkv3mu28.fsf@stark.xeocode.com> +Lines: 13 +User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4 +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +X-Virus-Scanned: by amavisd-new at hub.org +X-Spam-Status: No, score=0.128 required=5 tests=[AWL=0.128] +X-Spam-Score: 0.128 +X-Mailing-List: pgsql-hackers +List-Archive: +List-Help: +List-Id: +List-Owner: +List-Post: +List-Subscribe: +List-Unsubscribe: +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + + +Tom Lane writes: + +> and here are the results using glibc's qsort, which of course isn't +> quicksort at all but some kind of merge sort: +> ... +> Overall: average cratio 0.63 over 525 tests + +That looks better both on average and in the worst case. Are the time +constants that much worse that the merge sort still takes longer? 
+ +-- +greg + + +---------------------------(end of broadcast)--------------------------- +TIP 5: don't forget to increase your free space map settings + +From pgsql-hackers-owner+M81285@postgresql.org Tue Mar 21 15:38:06 2006 +Return-path: +Received: from ams.hub.org (ams.hub.org [200.46.204.13]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2LKc6M14799 + for ; Tue, 21 Mar 2006 15:38:06 -0500 (EST) +Received: from postgresql.org (postgresql.org [200.46.204.71]) + by ams.hub.org (Postfix) with ESMTP id 0917767BBF8; + Tue, 21 Mar 2006 16:38:03 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (av.hub.org [200.46.204.144]) + by postgresql.org (Postfix) with ESMTP id 069389DC843 + for ; Tue, 21 Mar 2006 16:37:39 -0400 (AST) +Received: from postgresql.org ([200.46.204.71]) + by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) + with ESMTP id 60037-07 + for ; + Tue, 21 Mar 2006 16:37:39 -0400 (AST) +X-Greylist: from auto-whitelisted by SQLgrey- +Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130]) + by postgresql.org (Postfix) with ESMTP id BDC039DC827 + for ; Tue, 21 Mar 2006 16:37:36 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2LKbZid019858; + Tue, 21 Mar 2006 15:37:35 -0500 (EST) +To: Greg Stark +cc: "Dann Corbit" , + "Jonah H. 
Harris" , pgsql-hackers@postgresql.org, + "Jerry Sievers" +Subject: Re: [HACKERS] qsort, once again +In-Reply-To: <87lkv3mu28.fsf@stark.xeocode.com> +References: <18732.1142967137@sss.pgh.pa.us> <87lkv3mu28.fsf@stark.xeocode.com> +Comments: In-reply-to Greg Stark + message dated "21 Mar 2006 15:17:19 -0500" +Date: Tue, 21 Mar 2006 15:37:35 -0500 +Message-ID: <19857.1142973455@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at hub.org +X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113] +X-Spam-Score: 0.113 +X-Mailing-List: pgsql-hackers +List-Archive: +List-Help: +List-Id: +List-Owner: +List-Post: +List-Subscribe: +List-Unsubscribe: +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + +Greg Stark writes: +> That looks better both on average and in the worst case. Are the time +> constants that much worse that the merge sort still takes longer? + +Keep in mind that this is only counting the number of +comparison-function calls; it's not accounting for any other effects. +In particular, for a large sort operation quicksort might win because of +its more cache-friendly memory access patterns. + +The whole question of our qsort vs the system library's qsort probably +needs to be revisited, however, now that we've identified and fixed this +particular performance issue. 
+ + regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 5: don't forget to increase your free space map settings + +From pgsql-hackers-owner+M81289@postgresql.org Tue Mar 21 16:27:30 2006 +Return-path: +Received: from ams.hub.org (ams.hub.org [200.46.204.13]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2LLRTM20101 + for ; Tue, 21 Mar 2006 16:27:30 -0500 (EST) +Received: from postgresql.org (postgresql.org [200.46.204.71]) + by ams.hub.org (Postfix) with ESMTP id ACB4F67BBFD; + Tue, 21 Mar 2006 17:27:27 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (av.hub.org [200.46.204.144]) + by postgresql.org (Postfix) with ESMTP id 16E0E9DCA0F + for ; Tue, 21 Mar 2006 17:27:01 -0400 (AST) +Received: from postgresql.org ([200.46.204.71]) + by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) + with ESMTP id 69903-02 + for ; + Tue, 21 Mar 2006 17:27:02 -0400 (AST) +X-Greylist: from auto-whitelisted by SQLgrey- +Received: from stark.xeocode.com (stark.xeocode.com [216.58.44.227]) + by postgresql.org (Postfix) with ESMTP id 107429DC867 + for ; Tue, 21 Mar 2006 17:26:58 -0400 (AST) +Received: from localhost ([127.0.0.1] helo=stark.xeocode.com) + by stark.xeocode.com with smtp (Exim 3.36 #1 (Debian)) + id 1FLoNU-0006M4-00; Tue, 21 Mar 2006 16:26:56 -0500 +To: Tom Lane +cc: Greg Stark , "Dann Corbit" , + "Jonah H. 
Harris" , pgsql-hackers@postgresql.org, + "Jerry Sievers" +Subject: Re: [HACKERS] qsort, once again +References: + <18732.1142967137@sss.pgh.pa.us> <87lkv3mu28.fsf@stark.xeocode.com> + <19857.1142973455@sss.pgh.pa.us> +In-Reply-To: <19857.1142973455@sss.pgh.pa.us> +From: Greg Stark +Organization: The Emacs Conspiracy; member since 1992 +Date: 21 Mar 2006 16:26:55 -0500 +Message-ID: <87acbjmqu8.fsf@stark.xeocode.com> +Lines: 22 +User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4 +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +X-Virus-Scanned: by amavisd-new at hub.org +X-Spam-Status: No, score=0.128 required=5 tests=[AWL=0.128] +X-Spam-Score: 0.128 +X-Mailing-List: pgsql-hackers +List-Archive: +List-Help: +List-Id: +List-Owner: +List-Post: +List-Subscribe: +List-Unsubscribe: +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + +Tom Lane writes: + +> Greg Stark writes: +> > That looks better both on average and in the worst case. Are the time +> > constants that much worse that the merge sort still takes longer? +> +> Keep in mind that this is only counting the number of +> comparison-function calls; it's not accounting for any other effects. +> In particular, for a large sort operation quicksort might win because of +> its more cache-friendly memory access patterns. + +My question explicitly recognized that possibility. I'm just a little +skeptical since the comparison function in Postgres is often not some simple +bit of tightly optimized C code, but rather a complex locale sensitive +comparison function or even a bit of SQL expression to evaluate. + +Cache effectiveness may be a minimal factor anyways when the comparison is +executing more than a minimal amount of code. And one extra comparison is +going to cost a lot more too. 
+ +-- +greg + + +---------------------------(end of broadcast)--------------------------- +TIP 9: In versions below 8.0, the planner will ignore your desire to + choose an index scan if your joining column's datatypes do not + match + +From pgsql-hackers-owner+M81290@postgresql.org Tue Mar 21 16:48:00 2006 +Return-path: +Received: from ams.hub.org (ams.hub.org [200.46.204.13]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k2LLlxM22215 + for ; Tue, 21 Mar 2006 16:47:59 -0500 (EST) +Received: from postgresql.org (postgresql.org [200.46.204.71]) + by ams.hub.org (Postfix) with ESMTP id 28A9867BBFD; + Tue, 21 Mar 2006 17:47:57 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (av.hub.org [200.46.204.144]) + by postgresql.org (Postfix) with ESMTP id 1B4849DCC25 + for ; Tue, 21 Mar 2006 17:47:27 -0400 (AST) +Received: from postgresql.org ([200.46.204.71]) + by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) + with ESMTP id 72535-05 + for ; + Tue, 21 Mar 2006 17:47:28 -0400 (AST) +X-Greylist: from auto-whitelisted by SQLgrey- +Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130]) + by postgresql.org (Postfix) with ESMTP id D27239DCC21 + for ; Tue, 21 Mar 2006 17:47:24 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.13.1/8.13.1) with ESMTP id k2LLlNpJ002194; + Tue, 21 Mar 2006 16:47:23 -0500 (EST) +To: Greg Stark +cc: "Dann Corbit" , + "Jonah H. 
Harris" , pgsql-hackers@postgresql.org, + "Jerry Sievers" +Subject: Re: [HACKERS] qsort, once again +In-Reply-To: <87acbjmqu8.fsf@stark.xeocode.com> +References: <18732.1142967137@sss.pgh.pa.us> <87lkv3mu28.fsf@stark.xeocode.com> <19857.1142973455@sss.pgh.pa.us> <87acbjmqu8.fsf@stark.xeocode.com> +Comments: In-reply-to Greg Stark + message dated "21 Mar 2006 16:26:55 -0500" +Date: Tue, 21 Mar 2006 16:47:23 -0500 +Message-ID: <2193.1142977643@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at hub.org +X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113] +X-Spam-Score: 0.113 +X-Mailing-List: pgsql-hackers +List-Archive: +List-Help: +List-Id: +List-Owner: +List-Post: +List-Subscribe: +List-Unsubscribe: +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + +Greg Stark writes: +> My question explicitly recognized that possibility. I'm just a little +> skeptical since the comparison function in Postgres is often not some simple +> bit of tightly optimized C code, but rather a complex locale sensitive +> comparison function or even a bit of SQL expression to evaluate. + +Yeah, I'd guess the same way, but OTOH at least a few people have +reported that our qsort code is consistently faster than glibc's (and +that was before this fix). See this thread: +http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php + +Currently I believe that we only use our qsort on Solaris, not any other +platform, so if you think that glibc's qsort is better then you've +already got your wish. It seems to need more investigation though. +In particular, I'm thinking that the various adjustments we've made +to the sort support code over the past month probably invalidate any +previous testing of the point, and that we ought to go back and redo +those comparisons. + + regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 5: don't forget to increase your free space map settings +