Include gmid.h as the first header in every file, as it in turn
includes config.h (which defines _GNU_SOURCE, for instance).
Also fix a warning about unsigned vs signed const char pointers in
openssl.
Using BUFSIZ in sbuf is not OK. Its value varies across systems, and
in various places we assume that sbuf is 1024 bytes (e.g. in
handle_cgi_reply). We could patch those, but we aren't sure BUFSIZ
is >= 1024! Let's keep the hardcoded number.
(Found while debugging on Arch on amd64, where BUFSIZ is bigger.)
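One way to make that assumption explicit is a named constant plus a
compile-time check. This is only a minimal sketch: SBUF_SIZE and
struct client_sketch are hypothetical names, not gmid's actual
definitions.

```c
#include <stddef.h>
#include <stdio.h>	/* BUFSIZ, only for the comparison below */

/* Hypothetical name: the fixed size the rest of the code assumes. */
#define SBUF_SIZE 1024

struct client_sketch {
	char	sbuf[SBUF_SIZE];	/* not BUFSIZ: that varies per libc */
};

/* Break the build if someone changes the buffer size by accident. */
_Static_assert(sizeof(((struct client_sketch *)0)->sbuf) == 1024,
    "sbuf must be exactly 1024 bytes");

/* Return non-zero if this platform's BUFSIZ differs from our size. */
int
bufsiz_differs(void)
{
	return BUFSIZ != SBUF_SIZE;
}
```

With this, code that relies on the 1024-byte size fails at compile
time instead of misbehaving at runtime on systems with a different
BUFSIZ.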
Before, we used mmap(2) to read files and a buffer to handle CGI
scripts. It turns out that, for sequential access over the whole
file, mmap isn't better than our loop on read. This also has the
additional advantage that we can use handle_cgi (now handle_copy) for
both files and CGI, which is pretty cool.
This also fixes a nasty bug where we could hang a connection forever,
because we scheduled the wrong type of event (read on POLLOUT and
write on POLLIN, it's the other way around!)
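The correct pairing can be sketched as follows. enum want_sketch and
want_to_events are hypothetical names used only for illustration; the
POLLIN/POLLOUT meanings are those of poll(2).

```c
#include <poll.h>

/* Hypothetical stand-ins for "the I/O layer wants to read/write". */
enum want_sketch { WANT_READ, WANT_WRITE };

/* Map what we are waiting for onto the poll(2) event to schedule. */
short
want_to_events(enum want_sketch w)
{
	switch (w) {
	case WANT_READ:
		return POLLIN;	/* wake up when there is data to read */
	case WANT_WRITE:
		return POLLOUT;	/* wake up when a write won't block */
	}
	return 0;
}
```

Scheduling POLLOUT while waiting to read can leave the connection
stuck: the socket may stay writable forever without the read ever
being retried at the right time, or never become readable at all.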
keep mark_nonblock in utils.c, as otherwise the build for the regress
suite will fail (mark_nonblock needs fatal which is in gmid.c, and
we can't link gmid.o with the regress suite...)
Now that I got rid of the enum+switch, adding more state is easier.
Before, we used a hack to remember whether we had read the CGI reply
or not (c->code = -1).
This introduces a new state, handle_cgi_reply that reads the CGI
script reply, logs it, and only then switches to handle_cgi.
handle_cgi itself is cleaner: now it only reads into c->sbuf and sends
what it has read.
We even get, almost for free, the 42 error. If read returns -1 or 0
in handle_cgi_reply, we return a proper error to the client.
We can extend this further in the future and also try to validate the
CGI reply (for now we're only looking for a \n).
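The transition described above could look roughly like this, with the
state kept as a function pointer. It is a sketch under assumptions:
every name ending in _sketch, and cgi_reply_next_state itself, are
hypothetical, and the placeholder states do no real I/O.

```c
#include <string.h>
#include <sys/types.h>

struct client_sketch;
typedef void (*statefn)(struct client_sketch *);

struct client_sketch {
	statefn	 state;		/* handler to run on the next event */
	char	 sbuf[1024];
	size_t	 len;		/* bytes of the reply read so far */
	int	 code;		/* status to send on failure, 0 if none */
};

/* Placeholders for the real states, which would do the actual I/O. */
void handle_cgi_reply_sketch(struct client_sketch *c) { (void)c; }
void handle_copy_sketch(struct client_sketch *c) { c->len = 0; }
void send_error_sketch(struct client_sketch *c) { c->state = NULL; }

/*
 * Pick the next state after read(2) on the CGI fd returned r.  The
 * caller is assumed to have read at most sizeof(sbuf) - len bytes
 * into c->sbuf + c->len.
 */
void
cgi_reply_next_state(struct client_sketch *c, ssize_t r)
{
	if (r == -1 || r == 0) {
		/* the script died or closed without a reply: error 42 */
		c->code = 42;
		c->state = send_error_sketch;
		return;
	}

	c->len += (size_t)r;

	/* the only validation for now: did we see the end of the line? */
	if (memchr(c->sbuf, '\n', c->len) != NULL)
		c->state = handle_copy_sketch;
	/* otherwise stay in the current state and read some more */
}
```

Adding stricter validation of the reply later would only touch this
one function; the copy state never needs to know about it.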
instead of having a flag to discern between two different behaviours
in S_SENDING, split that state into S_SENDING_FILE and S_SENDING_CGI
(this will also make it easier in the future to add other sending
states). While there, also get rid of `goodbye' and make start_reply
advance the state machine by itself.
was probably added to distinguish between AF_INET and AF_INET6, to use
different functions for logging. But it wasn't really used, and now
we use getnameinfo anyway.
cgi.c wasn't really needed; it's better to group all the
server-related functions together, CGI or not. Now gmid.c contains
only startup and
utility code.
this way, we can sandbox the listener with seccomp (todo) or capsicum
(already done) and still have CGI scripts. When we want to exec, we
tell the executor what to do; the executor executes the script and
sends the fd back to the listener.
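Sending a file descriptor between two processes is usually done with
SCM_RIGHTS over a UNIX-domain socket. The following is a minimal
sketch of that technique, not necessarily how the executor does it;
send_fd_sketch, recv_fd_sketch and union fd_cmsgbuf are hypothetical
names.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Properly aligned space for one control message carrying one fd. */
union fd_cmsgbuf {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd over the UNIX-domain socket sock; 0 on success, -1 on error. */
int
send_fd_sketch(int sock, int fd)
{
	struct msghdr		 msg;
	struct cmsghdr		*cmsg;
	struct iovec		 iov;
	union fd_cmsgbuf	 cmsgbuf;
	char			 c = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));

	/* at least one byte of real data must go with the fd */
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive an fd from sock; returns the new fd, or -1 on error. */
int
recv_fd_sketch(int sock)
{
	struct msghdr		 msg;
	struct cmsghdr		*cmsg;
	struct iovec		 iov;
	union fd_cmsgbuf	 cmsgbuf;
	char			 c;
	int			 fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

The sandboxed listener then only needs recvmsg(2) on an already-open
socket; it never has to call fork or exec itself.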
* gmid.c (main): changed behaviour: daemon off by default
(main): changed -c to -C (cert option)
(main): changed -k to -K (key option, for consistency with -C)
(main): added -c to load a configuration
(main): cert, key and doc (-C, -K and -d) don't have default values anymore
(handle_handshake): add vhosts support
This alters the current state machine by adding S_HANDSHAKE as the
initial state. There, we ensure we did the handshake and we check
SNI. At the moment we simply continue in S_OPEN, but later we can add
virtual host checks there, and skip to S_INITIALIZING with an error
state if the client is accessing a wrong host.
and while there replace SAFE_SETENV with an inline function. LOG is
more difficult to transform into an inline function, given the string
concatenations it does. The other LOG* and FATAL macros are fine as
they already are.
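For illustration, an inline replacement for a setenv-wrapping macro
might look like the sketch below. It ASSUMES the macro's job is to
turn a NULL value into an empty string before calling setenv(3); that
behaviour and the name safe_setenv_sketch are assumptions made for
the example.

```c
#include <stdlib.h>

/*
 * Hypothetical inline version of a SAFE_SETENV-like macro.
 * Assumption: a NULL value should become an empty string.
 * Unlike a macro, val is evaluated exactly once and type-checked.
 */
static inline int
safe_setenv_sketch(const char *name, const char *val)
{
	if (val == NULL)
		val = "";
	return setenv(name, val, 1);	/* 1: overwrite existing value */
}
```

A macro like LOG, which relies on pasting string literals together at
compile time, has no such direct translation, since a function only
sees the strings at runtime.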
SIGHUP is sent when the tty is detached and by default kills the
process. When we run in the background we don't care anymore about
the tty, so it should be safe for us to ignore SIGHUP. (frankly,
I expected daemon(3) to do stuff like this for us).
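Ignoring the signal is a one-liner with signal(3). A minimal sketch;
ignore_sighup_sketch is a hypothetical name, and where exactly the
call goes relative to daemon(3) is left open here.

```c
#include <signal.h>

/* Ignore SIGHUP from now on; returns 0 on success, -1 on error. */
int
ignore_sighup_sketch(void)
{
	if (signal(SIGHUP, SIG_IGN) == SIG_ERR)
		return -1;
	return 0;
}
```

After this call a SIGHUP delivered to the process is discarded
instead of triggering the default action, which is to terminate.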
Up until now I used a "poor man's" approach: the URI parser was barely
a parser. It tried to extract the path from the request, with some
minor checking, and that's all. This obviously is not
RFC3986-compliant.
The new RFC3986 (URI) parser should be fully compliant. It may accept
some invalid URIs, but shouldn't reject or mis-parse valid ones. (In
particular, the rule for the path is way more relaxed in this parser
than it is in the RFC text.)
A difference with RFC3986 is that we don't even try to parse the
(optional) userinfo part of a URI: following the Gemini spec we treat
it as an error.
A further caveat is that %2F in the path part of the URI is
indistinguishable from a literal '/': this is NOT conforming, but
given the scope and use of gmid, I don't see how else to treat a %2F
sequence in the path (reject the URI?).
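The %2F caveat is easy to see with a small in-place percent-decoder.
This is an illustrative sketch, not the parser's actual code:
pct_decode_sketch and hexval are hypothetical names, and rejecting
%00 is an extra assumption made here so that the decoded string
cannot be truncated.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of a hex digit, or -1 if c is not one. */
static int
hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = tolower(c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Decode every %XX sequence of s in place.  Returns 0 on success and
 * -1 on a malformed escape or on %00 (an assumption of this sketch).
 * On failure s may be left partially decoded.
 */
int
pct_decode_sketch(char *s)
{
	char	*out = s;
	int	 hi, lo;

	while (*s != '\0') {
		if (*s != '%') {
			*out++ = *s++;
			continue;
		}
		/* checking hi first keeps us from reading past the NUL */
		if ((hi = hexval((unsigned char)s[1])) == -1)
			return -1;
		if ((lo = hexval((unsigned char)s[2])) == -1)
			return -1;
		if (hi == 0 && lo == 0)
			return -1;
		*out++ = (char)(hi * 16 + lo);
		s += 3;
	}
	*out = '\0';
	return 0;
}
```

After decoding, "/foo%2Fbar" and "/foo/bar" are the same string, so
any later code that splits the path on '/' cannot tell the two
requests apart.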