See prior commit for an explanation for the goal of the change and why it had
to be split into two commits.
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/20230808021541.7lbzdefvma7qmn3w@awork3.anarazel.de
Backpatch: 15-, where CI support was added
cirrus-ci will soon restrict the amount of free resources every user gets (as
have many other CI providers). For most users of CI that should not be an
issue. But e.g. for cfbot it will be an issue.
To allow configuring different resources on a per-repository basis, introduce
infrastructure for overriding the task execution environment. Unfortunately
this is not entirely trivial, as yaml anchors have to be defined before their
use, and cirrus-ci only allows injecting additional contents at the end of
.cirrus.yml.
To deal with that, move the definition of the CI tasks to
.cirrus.tasks.yml. The main .cirrus.yml is loaded first, then, if defined, the
file referenced by the REPO_CI_CONFIG_GIT_URL variable, will be added,
followed by the contents of .cirrus.tasks.yml. That allows
REPO_CI_CONFIG_GIT_URL to override the yaml anchors defined in .cirrus.yml.
Unfortunately git's default merge / rebase strategy does not handle copied
files, just renamed ones. To avoid painful rebasing over this change, this
commit just renames .cirrus.yml to .cirrus.tasks.yml, without adding a new
.cirrus.yml. That's done in the followup commit, which moves the relevant
portion of .cirrus.tasks.yml to .cirrus.yml. Until that is done,
REPO_CI_CONFIG_GIT_URL does not fully work.
The subsequent commit adds documentation for how to configure custom compute
resources to src/tools/ci/README
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/20230808021541.7lbzdefvma7qmn3w@awork3.anarazel.de
Backpatch: 15-, where CI support was added
The main reason for this change is to reduce different ways of executing
tasks, making it easier to use custom compute resources for cfbot. A secondary
benefit is that the tasks seem slightly faster this way, apparently the
increased startup overhead is outweighed by reduced runtime overhead.
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20230808021541.7lbzdefvma7qmn3w@awork3.anarazel.de
Backpatch: 15-, where CI support was added
The number of CPUs is the cost-determining factor. Most instance types that
run tests have more memory/core than what we specified, there's no real
benefit in wasting that.
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20230808021541.7lbzdefvma7qmn3w@awork3.anarazel.de
Backpatch: 15-, where CI support was added
A significant chunk of the time on the macos CI task is spent installing
packages using homebrew. The downloads of the packages are cached, but the
installation needs to happen every time. We can't cache the whole homebrew
installation, because it is too large due to pre-installed packages.
Speed this up by installing packages using macports and caching the
installation as .dmg. That's a lot faster than unpacking a tarball.
In addition, don't install llvm - it wasn't enabled when building, so it's
just a waste of time/space.
This substantially speeds up the mac CI time, both in the cold cache and in
the warm cache case (the latter from ~1m20s to ~5s).
It doesn't seem great to have diverging sources of packages for CI between
branches, so backpatch to 15 (where CI was added).
Discussion: https://postgr.es/m/20230805202539.r3umyamsnctysdc7@awork3.anarazel.de
Backpatch: 15-, where CI was added
So far we have used containers for testing windows on cirrus-ci. Unfortunately
they come with substantial overhead: First, the container images are pulled
onto the host on-demand. Due to the large size of windows containers, that
ends up taking nearly 4 minutes. Secondly, IO is slow, leading to CI runs
taking long.
Thus switch to windows VMs, improving windows CI times by well over 2x.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/211afb88-6df6-b74d-f1b7-84b5f21ad875@gmail.com
Backpatch: 15-, where CI was added
While on it, newlines are removed from the end of two elog() strings.
The others are simple grammar mistakes. One comment in pg_upgrade
referred incorrectly to sequences since a7e5457.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20221230231257.GI1153@telsasoft.com
Backpatch-through: 11
Cirrus is about to shut down its macOS-on-Intel support, so it's time to
move our CI testing over to ARM instances. The Homebrew package manager
changed its default installation prefix for the new architecture, so a
couple of tests need tweaks to find binaries.
Back-patch to 15, where in-tree CI began.
Author: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20221122225744.GF11463%40telsasoft.com
Test performance regresses noticably when using all cores. This is more
pronounced with meson than with autoconf, presumably because meson will
schedule the "full number" of tests more consistently. 8 seems to work
OK.
Discussion: https://postgr.es/m/20220927040208.l3shfcidovpzqxfh@awork3.anarazel.de
Backpatch: 15-, where CI was introduced
Cirrus defaults to SetErrorMode(SEM_NOGPFAULTERRORBOX | ...). That prevents
crash reporting from working unless binaries do SetErrorMode()
themselves. Furthermore, it appears that either python or, more likely, the C
runtime has a bug where SEM_NOGPFAULTERRORBOX can very occasionally *trigger*
a crash on process exit - which is hard to debug, given that it explicitly
prevents crash dumps from working...
Discussion: https://postgr.es/m/20220909235836.lz3igxtkcjb5w7zb%40awork3.anarazel.de
Backpatch: 15-, where CI was added
CI builds recently started failing with:
"Memory size for 4.0 vCPU instance should be between 3840MiB and
26624MiB, while 2048MiB is requested."
Ok then, let's ask for 4G instead of 2G.
This may be due to a change in the type of instance used to work around
an outage, per:
https://twitter.com/cirrus_labs/status/1572657320093712384
freebsd 13.0 is out of support, switch to 13.1. It might be a good idea to
remove the minor version number from the image name, but there's not been a
response to that so far...
Discussion: https://postgr.es/m/20220728095704.ryywoaz4dqqrwstc@alap3.anarazel.de
Backpatch: 15-, just like the CI support
322becb has changed upgradecheck to use the TAP tests, discarding
pg_upgrade's tests in bincheck. However, this is proving to be a bad
idea for the Windows buildfarm clients that use MSVC when TAP tests are
disabled as this causes a hard failure at the pg_upgrade step.
This commit disables upgradecheck, moving the execution of the tests of
pg_upgrade to bincheck, as per an initial suggestion from Andres
Freund, so as the buildfarm is able to live happily with those changes.
While on it, remove the routine that was used by upgradecheck to
create databases whose names are generated with a range of ASCII
characters as it is not used since 322becb. upgradecheck is removed
from the CI script for Windows, as bincheck takes care of that now.
Per report from buildfarm member hamerkop (MSVC 2017 without a TAP
setup).
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/YkbnpriYEAagZ2wH@paquier.xyz
cirrus-ci now defaults to killing processes still running at the end of a
script. Unfortunately we start postgres in the background, which seems
nontrivial to fix. Previously we worked around that in 770011e3f3 by using an
older agent version, but now that CIRRUS_ESCAPING_PROCESSES we should use that.
This reverts commit 770011e3f3 "ci: windows:
Work around cirrus-ci bug causing test failures.
Discussion: https://postgr.es/m/CA+hUKGKx7k14n2nAALSvv6M_AB6oHasNBA65X6Dvo8hwfi9y0A@mail.gmail.com
To improve performance of check-world, and improve debugging, without
significantly slower builds (they're cached anyway).
This makes freebsd check-world run in 8.5 minutes rather than 15 minutes.
Author: Justin Pryzby <pryzbyj@telsasoft.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220310220611.GH28503@telsasoft.com
This is useful for patches during development, but once a feature is merged,
new libraries should be added to the OS image files, rather than installed
during every CI run forever into the future.
Author: Justin Pryzby <pryzbyj@telsasoft.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220310220611.GH28503@telsasoft.com
Having the different types of logs in artifacts instruction makes it quicker
to navigate between them. Either that didn't use to be possible, or I just did
it wrong. I saw that / how it works in a patch by Justin Pryzby.
Also remove a few unnecessary uses of **, suggested by Justin Pryzby.
Discussion: https://postgr.es/m/20220203195718.smqo5xg4ygp5qktq@alap3.anarazel.de
startup_hacks() called SetErrorMode() with the SEM_NOGPFAULTERRORBOX argument
to prevent GUI popups on error. While that likely was sufficient at some
point, there are other sources of error popups.
At the same time SEM_NOGPFAULTERRORBOX unfortunately also prevents
"just-in-time debuggers" from working reliably, i.e. the ability to attach to
a process on crash. This prevents collecting crash dumps as part of CI.
The error popups are particularly problematic when they occur during automated
testing, as they can cause the tests to hang, waiting for a button to be
clicked.
This commit improves the error handling setup in startup_hacks() to address
those problems. SEM_NOGPFAULTERRORBOX is not used anymore, instead various
other APIs are used to disable popups and to redirect output to stderr where
possible.
While this improves the situation for postgres.exe, it doesn't address similar
issues in all the other executables. There currently is no codepath that's
called early on for all frontend programs.
I've tested that this prevents GUI popups and allows JIT debugging in case of
crashes due to:
- abort()
- assert()
- C runtime errors
- unhandled exceptions
both in debug and non-debug mode, on Win10 with MSVC 2019 and with MinGW.
Now that crash reports are generated on windows, collect them in windows CI.
Discussion: https://postgr.es/m/20211005193033.tg4pqswgvu3hcolm@alap3.anarazel.de
On windows ci/cfbot currently regularly hangs / times out. Presumably this is due
to the issues discussed in
https://postgr.es/m/CA%2BhUKG%2BG5DUNJfdE-qusq5pcj6omYTuWmmFuxCvs%3Dq1jNjkKKA%40mail.gmail.com
which lead to reverting 75674c7ec1 everywhere but master.
But it's hard to tell - because the entire test task times out, we don't get
to see debugging information.
This commit adds a timeout (using git's timeout binary) to all vcregress
invocations, which should address the problem for now. It might also be a good
idea to add timeouts to the other operating systems at a later time.
The diff is a bit larger than one might think necessary: Yaml doesn't like % -
from the windows command variable syntax - at the start of an unquoted
string...
Discussion: https://postgr.es/m/20220110005704.es4el6i2nxlxzwof@alap3.anarazel.de
The build summary was disabled unintentionally by setting the build verbosity
lower.
While at it, also add ForceNoAlign to prevent msbuild from introducing
linebreaks in the middle of filenames etc - they make it harder to copy
output.
Discussion: https://postgr.es/m/20220113175554.u6gw7olrdfzivl3n@alap3.anarazel.de
Currently FreeBSD, Linux, macOS and Windows (Visual Studio) are tested.
The main goal of this integration is to make it easier to test in-development
patches across multiple platforms. This includes improving the testing done
automatically by cfbot [1] for commitfest entries. It is *not* the goal to
supersede the buildfarm.
cirrus-ci [2] was chosen because it was already in use for cfbot, allows using
full VMs, has good OS coverage and allows accessing the full test results
without authentication (like a github account). It might be worth adding
support for further CI providers, particularly ones supporting other git
forges, in the future.
To keep CI times tolerable, most platforms use pre-generated images. Some
platforms use containers, others use full VMs.
For instructions on how to enable the CI integration in a repository and
further details, see src/tools/ci/README
[1] http://cfbot.cputube.org/
[2] https://cirrus-ci.org/
Author: Andres Freund <andres@anarazel.de>
Author: Thomas Munro <tmunro@postgresql.org>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: Thomas Munro <tmunro@postgresql.org>
Reviewed-By: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/20211001222752.wrz7erzh4cajvgp6@alap3.anarazel.de