It uses the 'common' proc.c found in various OpenBSD daemons.
gmid grew organically bit by bit and it was also the first place where I
tried to implement privsep. It wasn't done very well: the parent
process (which retains root privileges) just fork()s a generation of
servers, all sharing *exactly* the same address space. No good!
Now we fork() and re-exec ourselves, so that each process has a fresh
address space.
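A minimal sketch of the fork-then-exec pattern described above. The
names `spawn_fresh' and `reap' are made up for illustration; gmid
re-executes its own binary, while this helper takes an arbitrary path
to keep the example self-contained.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork, then replace the child's image with a freshly exec'd program
 * so that it no longer carries a copy of the parent's memory.
 * Returns the child's pid in the parent, or -1 if fork() failed.
 */
pid_t
spawn_fresh(const char *path, char *const argv[])
{
	pid_t pid;

	if ((pid = fork()) == -1)
		return -1;
	if (pid == 0) {
		/* child: execv only returns on failure */
		execv(path, argv);
		_exit(127);
	}
	return pid;
}

/* Wait for a child; return its exit code, or -1 on abnormal exit. */
int
reap(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child keeps only what is passed explicitly (arguments, inherited
file descriptors), which is the point of re-exec'ing.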
Some features (`require client ca' for example) are temporarily
disabled and will be fixed in subsequent commits. The "ge" program is
also temporarily disabled as it needs tweaks to do privsep too.
The FreeBSD and Linux sandboxes can't deal with `fastcgi' and `proxy'
configuration rules: new sockets need to be opened and that's either
impossible (the former) or a huge pain in the arse (the latter).
The sandbox is still always used when only static files are served.
I really want to get rid of the `executor' process hack for CGI scripts
and its escalation to allow fastcgi and proxying to work on non-OpenBSD.
This drops the CGI support and the `executor' process entirely and is
the first step towards gmid 2.0. It also allows for more secure
defaults.
On non-OpenBSD systems this means that the sandbox will be deactivated
as soon as fastcgi or proxying are used: you can't open sockets under
FreeBSD's capsicum(4) and I don't want to go through the pain of making
it work under Linux's seccomp/landlock. Patches are always welcome,
however.
For folks using CGI scripts (hey, I'm one of you!) not all hope is lost:
fcgiwrap or OpenBSD's slowcgi(8) are ways to run CGI scripts as if they
were FastCGI applications.
Fixes for the documentation and for the non-OpenBSD sandboxes will
follow.
add_mime now dynamically allocates copies of the passed strings, so
that we can actually free what we parse from the config file.
This matters a lot, especially with lengthy `types' blocks: strings
that reach the internal mapping are never free'd, so every manual
addition is leaked.
A long time ago, client->req was a static buffer so the memcpy was safe.
However, it has since been moved to a dynamically allocated string that
is very often smaller than sizeof(req.buf) (1024), hence the
out-of-bounds access, which very often results in a SIGSEGV on OpenBSD
thanks to Otto's malloc.
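To illustrate the class of fix (not necessarily the exact change made
here): bound the copy by the length of the source string rather than by
the size of the destination. `struct req_msg' and `copy_request' are
hypothetical names; the 1024 comes from the sizeof(req.buf) above.

```c
#include <stddef.h>
#include <string.h>

#define REQ_BUF_SIZE	1024

struct req_msg {
	char	buf[REQ_BUF_SIZE];
};

/*
 * Copy a NUL-terminated, heap-allocated request into the fixed
 * buffer without reading past the end of src.
 * Returns 0 on success, -1 if the request doesn't fit.
 */
int
copy_request(struct req_msg *dst, const char *src)
{
	size_t len = strlen(src);

	if (len >= sizeof(dst->buf))
		return -1;
	/* zero the whole buffer so the tail is never uninitialized */
	memset(dst->buf, 0, sizeof(dst->buf));
	memcpy(dst->buf, src, len);
	return 0;
}
```

A memcpy of sizeof(dst->buf) bytes would instead read up to 1024 bytes
from an allocation that may only be a few dozen bytes long.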
The situation with the iri parser, client->req and how the request is
forwarded to the other process needs to be improved: this is just a fix
to address the issue quickly, a better one would be to restructure the
iri parser APIs and rethink how the info is forwarded to the ex process.
i.e. allow client_by_id to fail and return NULL.
Initially I thought it was a good idea to shut down a server process
if we receive an invalid client id as reply from one of our requests
to the executor process. This turned out not to be correct since a
client can (read: will) disconnect in the time between our
acknowledging their request and the CGI script execution.
The fastcgi and proxy handler already handled this situation, so
they're unaffected.
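A simplified sketch of the behaviour described above. The real
client_by_id has a different signature and storage; the linked list and
`handle_reply' are assumptions made to keep the example
self-contained.

```c
#include <stddef.h>

struct client {
	unsigned int	 id;
	int		 got_reply;
	struct client	*next;
};

/* Return the client with the given id, or NULL if it's gone. */
struct client *
client_by_id(struct client *head, unsigned int id)
{
	struct client *c;

	for (c = head; c != NULL; c = c->next)
		if (c->id == id)
			return c;
	return NULL;
}

/*
 * Handle a reply addressed to client `id'.  A missing client is not
 * a fatal error: it may simply have disconnected in the meantime.
 * Returns 1 if the reply was delivered, 0 if it was dropped.
 */
int
handle_reply(struct client *head, unsigned int id)
{
	struct client *c;

	if ((c = client_by_id(head, id)) == NULL)
		return 0;
	c->got_reply = 1;
	return 1;
}
```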
This allows an attacker to make gmid unresponsive by just making
enough requests until they hit the right timing.
It doesn't make any sense to keep the proxying info per-location:
there is only one proxy per vhost. It can't work differently, and it
wouldn't make sense anyway.
Add to gmid the ability to forward a request to another gemini server
and thus act like a reverse proxy. The current syntax for the config
file is
	server "example.com" {
		...
		proxy relay-to host:port
	}
Further options (like the use of custom certificates) are planned.
cf. github issue #7
Currently dogfooding this patch at gemini.sgregoratto.me. To test,
run the following command and look for the "OCSP response" header:
openssl s_client -connect "gemini.sgregoratto.me:1965" -status
This adds a bare-bones dump of the parsed configuration. It is not
complete, but I'm interested in dumping the full path to `cert' and
`key' in order to write some scripts that can inspect the
configuration, extract the certificates and automatically renew them
when they expire.
It's not easy to parse gmid configuration otherwise because the syntax
is flexible and users can use macros. Instead, the idea is to run
gmid and let it dump the configuration once it's been parsed in a
static and predictable format.
Now it's possible to parse the gmid configuration with, say, awk or
perl.
From day one we've been using a static array of client structs to hold
the clients' data. This has various drawbacks, among which:
* reuse of the storage ("shades of heartbleed")
* maximum fixed amount of clients connected at the same time
* bugs are harder to debug
The last point in particular is important because if we mess up the
client ids, or try to execute some functions (e.g. the various fcgi_*)
after a client has been disconnected, it's harder to "see" this "use
after free"-tier kind of bug.
Now I'm using a splay tree to hold the data about the live connections.
Each client's data is managed by malloc. If we try to access a client's
data after the disconnection we'll probably crash with a SIGSEGV, and
finding the bug is easier.
Performance-wise the connection phase should be faster since we don't
have to loop anymore to find an empty spot in the clients array, but
some operations could be slightly slower (compare the O(1) access in an
array with a SPLAY_FIND operation -- still faster than O(n), though.)
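To show the shape of the change without depending on the SPLAY macros
(which aren't available everywhere), here's a sketch that swaps in a
plain unbalanced binary search tree keyed by client id. It's a stand-in
and not the splay tree used here: it doesn't rebalance, so lookups can
degrade to O(n) in the worst case. All names are hypothetical.

```c
#include <stdlib.h>

struct client {
	unsigned int	 id;
	struct client	*left, *right;
};

/* Return the link where the client with `id' is, or would be. */
static struct client **
find_slot(struct client **root, unsigned int id)
{
	while (*root != NULL && (*root)->id != id)
		root = id < (*root)->id ? &(*root)->left : &(*root)->right;
	return root;
}

/* Look up a live client; NULL if it was never there or is gone. */
struct client *
client_find(struct client **root, unsigned int id)
{
	return *find_slot(root, id);
}

/* Allocate and insert a client; NULL on failure or duplicate id. */
struct client *
client_new(struct client **root, unsigned int id)
{
	struct client **slot = find_slot(root, id);
	struct client *c;

	if (*slot != NULL || (c = calloc(1, sizeof(*c))) == NULL)
		return NULL;
	c->id = id;
	*slot = c;
	return c;
}

/* Unlink and free a client, returning its memory to malloc. */
void
client_close(struct client **root, unsigned int id)
{
	struct client **slot = find_slot(root, id), **succ;
	struct client *c = *slot, *s;

	if (c == NULL)
		return;
	if (c->left == NULL)
		*slot = c->right;
	else if (c->right == NULL)
		*slot = c->left;
	else {
		/* two children: splice in the in-order successor */
		succ = &c->right;
		while ((*succ)->left != NULL)
			succ = &(*succ)->left;
		s = *succ;
		*succ = s->right;
		s->left = c->left;
		s->right = c->right;
		*slot = s;
	}
	free(c);
}
```

Because each client is its own allocation, a stale pointer used after
client_close() is a genuine use-after-free that a hardened malloc or a
memory checker can catch, rather than a silent read of a reused slot.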
FastCGI is designed to multiplex requests over a single connection, so
ideally the server can open only one connection per worker to the
FastCGI application and that's that.
Doing this kind of multiplexing makes the code harder to follow and
easier to break/leak etc. on the gmid side, however. OpenBSD's httpd
seems to open one connection per client, so why can't we too?
One connection per request is still way better (lighter) than using
CGI, and we can avoid all the pitfalls of the multiplexing (keeping
track of "live ids", properly shut down etc...)
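One consequence worth sketching: with a single request per connection
there are no "live ids" to track, so the request id in every FastCGI
record header can be a constant. Whether gmid does exactly this is an
assumption; `fcgi_pack_header' and FCGI_REQUEST_ID are hypothetical
names, while the 8-byte header layout is the one from the FastCGI
specification.

```c
#include <stdint.h>

#define FCGI_VERSION_1		1
#define FCGI_BEGIN_REQUEST	1
#define FCGI_HEADER_LEN		8

/* One request per connection: the id never needs to change. */
#define FCGI_REQUEST_ID		1

/* Fill a FastCGI record header; multi-byte fields are big-endian. */
void
fcgi_pack_header(uint8_t hdr[FCGI_HEADER_LEN], uint8_t type,
    uint16_t content_len, uint8_t padding)
{
	hdr[0] = FCGI_VERSION_1;
	hdr[1] = type;
	hdr[2] = (FCGI_REQUEST_ID >> 8) & 0xff;
	hdr[3] = FCGI_REQUEST_ID & 0xff;
	hdr[4] = (content_len >> 8) & 0xff;
	hdr[5] = content_len & 0xff;
	hdr[6] = padding;
	hdr[7] = 0;	/* reserved */
}
```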
This is a big change in how gmid handles I/O. Initially we used a
hand-written loop over poll(2), which then evolved into something
powered by libevent's basic API. This meant that there were a lot of
small "asynchronous" functions that did one step, eventually scheduling
the re-execution, and that called each other in a chain.
The new implementation revolves completely around libevent's
bufferevents. It's clearer, as everything is implemented around the
client_read and client_write functions.
There is still space for improvements, like adding timeouts for one, but
it's solid enough to be committed as is and then further improved.
This changes the fastcgi implementation from blocking I/O to an
async implementation on top of libevent's bufferevents.
This should improve the responsiveness of gmid, especially when using
remote fastcgi applications.