From 944b41fc00600b74f518005ac314cc222bf6abd5 Mon Sep 17 00:00:00 2001
From: Tom Lane
Date: Tue, 10 Nov 2015 15:59:59 -0500
Subject: [PATCH] Improve our workaround for 'TeX capacity exceeded' in building PDF files.

In commit a5ec86a7c787832d28d5e50400ec96a5190f2555 I wrote a quick hack
that reduced the number of TeX string pool entries created while
converting our documentation to PDF form. That held the fort for awhile,
but as of HEAD we're back up against the same limitation. It turns out
that the original coding of \FlowObjectSetup actually results in *three*
string pool entries being generated for every "flow object" (that is,
potential cross-reference target) in the documentation, and my previous
hack only got rid of one of them. With a little more care, we can reduce
the string count to one per flow object plus one per actually-cross-referenced
flow object (about 115000 + 5000 as of current HEAD); that should work
until the documentation volume roughly doubles from where it is today.

As a not-incidental side benefit, this change also causes pdfjadetex to
stop emitting unreferenced hyperlink anchors (bookmarks) into the PDF file.
It had been making one willy-nilly for every flow object; now it's just one
per actually-cross-referenced object. This results in close to a 2X savings
in PDF file size. We will still want to run the output through "jpdftweak"
to get it to be compressed; but we no longer need removal of unreferenced
bookmarks, so we might be able to find a quicker tool for that step.

Although the failure only affects HEAD and US-format output at the moment,
9.5 cannot be more than a few pages short of failing likewise, so it
will inevitably fail after a few rounds of minor-version release notes.
I don't have a lot of faith that we'll never hit the limit in the older
branches; and anyway it would be nice to get rid of jpdftweak across the
board. Therefore, back-patch to all supported branches.
---
 doc/src/sgml/jadetex.cfg | 79 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 69 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/jadetex.cfg b/doc/src/sgml/jadetex.cfg
index 25b79312eb..875598f151 100644
--- a/doc/src/sgml/jadetex.cfg
+++ b/doc/src/sgml/jadetex.cfg
@@ -1,14 +1,37 @@
 % doc/src/sgml/jadetex.cfg
 %
-% This file redefines FlowObjectSetup to eliminate one of the two control
-% sequences it normally creates, thereby substantially reducing string usage
-% and permitting the complete Postgres documentation to be built without
-% overflowing a hard-to-expand TeX limit. The only known penalty is an
-% increased number of TeX warnings about ignoring duplicate definitions.
+% This file redefines \FlowObjectSetup and some related macros to greatly
+% reduce the number of control sequence names created, and also to avoid
+% creation of many useless hyperlink anchors (bookmarks) in PDF files.
 %
-% Curiously, we only see the failure when building PDF output --- plain PS
-% output does not come anywhere close to overflowing the string table.
-% There may be another solution hidden in that observation.
+% The original coding of \FlowObjectSetup defined a control sequence x@LABEL
+% for pretty nearly every flow object in the file, whether that object was
+% cross-referenced or not. Worse yet, it created a hyperlink anchor for
+% every such object, which not only bloated the output PDF with useless
+% anchors but consumed an additional control sequence name per anchor.
+% This results in overrunning TeX's limited-size string pool.
+%
+% To fix, extend \PageLabel's already-existing mechanism whereby a p@LABEL
+% control sequence is filled in only for labels that are referenced by at
+% least one \Pageref call. We now also fill in p@LABEL for labels that are
+% referenced by a \Link. Then, we can drop x@LABEL entirely, and use p@LABEL
+% to control emission of both a hyperlink anchor and a page-number label.
+% Now, both of those things are emitted for all and only the flow objects
+% that have either a hyperlink reference or a page-number reference.
+% We consume about one control sequence name per flow object plus one per
+% referenced object, which is a lot better than three per flow object.
+%
+% (With a more invasive patch, we could track the need for an anchor and a
+% page-number label separately, but that would probably require two control
+% sequences for every flow object. Besides, many objects that have one kind
+% of reference will have the other one too; that's certainly true for objects
+% referenced in either the TOC or the index, for example.)
+%
+%
+% In addition to checking p@LABEL not x@LABEL, this version of \FlowObjectSetup
+% is fixed to clear \Label and \Element whether or not it emits an anchor
+% and page label. Failure to do that seems to explain some pre-existing bugs
+% in which certain SGML constructs weren't correctly cross-referenced.
 %
 \def\FlowObjectSetup#1{%
 \ifDoFOBSet
@@ -16,6 +39,8 @@
 \ifx\Label\@empty\let\Label\Element\fi
 \fi
 \ifx\Label\@empty\else
+ \expandafter\ifx\csname p@\Label\endcsname\relax
+ \else
 \bgroup
 \ifNestedLink
 \else
@@ -23,8 +48,42 @@
 \PageLabel{\Label}%
 \fi
 \egroup
- \let\Label\@empty
- \let\Element\@empty
+ \fi
+ \let\Label\@empty
+ \let\Element\@empty
 \fi
 \fi
 }
+%
+% Adjust \PageLabel so that the p@NAME control sequence acquires a correct
+% value immediately; this seems to be needed to avoid scenarios wherein
+% additional TeX runs are needed to reach a stable state of the .aux file.
+%
+\def\PageLabel#1{%
+ \@bsphack
+ \expandafter\ifx\csname p@#1\endcsname\relax
+ \else
+ \protected@write\@auxout{}%
+ {\string\pagelabel{#1}{\thepage}}%
+ % Ensure the p@NAME control sequence acquires correct value immediately
+ \expandafter\xdef\csname p@#1\endcsname{\thepage}%
+ \fi
+ \@esphack}
+%
+% In \Link, add code to emit an aux-file entry if the p@NAME sequence isn't
+% defined. Much as in \@Setref, this ensures we'll process the referenced
+% item correctly on the next TeX run.
+%
+\def\Link#1{%
+ \begingroup
+ \SetupICs{#1}%
+ \ifx\Label\@empty\let\Label\Element\fi
+% \typeout{Made a Link at \the\inputlineno, to \Label}%
+ \hyper@linkstart{\LinkType}{\Label}%
+ \NestedLinktrue
+ % If p@NAME control sequence isn't defined, emit dummy def to aux file
+ % so it will get defined properly on next run, much as in \@Setref
+ \expandafter\ifx\csname p@\Label\endcsname\relax
+ \immediate\write\@mainaux{\string\pagelabel{\Label}{qqq}}%
+ \fi
+}