libevent2 can still somehowe call client_read even in code paths
that never enable reading from the evbuffer. Can't reproduce on
the libevent in base on OpenBSD. It's a bit ugly, but it's a small
workaround for something that otherwise *always* make gmid crash
when linked against libevent2. (client_read works under the
assumption that c->host != NULL, matched_proxy crashes otherwise.)
it doesn't make any sense to keep the proxying info per-location:
proxying only one per-vhost. It can't work differently, it doesn't make
sense anyway.
Add to gmid the ability to forwad a request to another gemini server and
thus acting like a reverse proxy. The current syntax for the config
file is
server "example.com" {
...
proxy relay-to host:port
}
Further options (like the use of custom certificates) are planned.
cf. github issue #7
Don't refuse to serve the request if the port number doesn't match the
one we're listening on, as initially suggested by Allen Sobot.
Complex setup may have a gmid instance reachable from multiple ports and
the meaning of the check in the first places was to avoid tricking
clients into thinking that we're serving for those domains: the port
number is way less important than the schema or domain name.
In the long run, the best way would probably to add a `listen on'
keyword for the servers blocks, just like OpenBSD' httpd, but gmid can't
listen on multiple ports/interfaces yet
the error path needs an initialized bufferevent too, otherwise it'll
crash when trying to write the response.
This moves the initialisation early, right after the tls_handshake.
Another option would be to initialise it in do_accept, but that may be
too early.
ECONNABORTED is returned if a connections gets aborted after being
queued before the accept(2). I had some cases of
accept: Software caused connection abort
on FreeBSD, this should avoid that.
It's been there for a long time, and it's frankly annoying to pretend
to use parameters. Most of the time, they're there to satisfy an
interface and nothings more.
From day one we've been using a static array of client struct to hold
the clients data. This has variuos drawbacks, among which:
* reuse of the storage ("shades of heartbleed")
* maximum fixed amount of clients connected at the same time
* bugs are harder to debug
The last point in particular is important because if we mess the client
ids, or try to execute some functions (e.g. the various fcgi_*) after a
client has been disconnected, it's harder to "see" this "use after
free"-tier kind of bug.
Now I'm using a splay tree to hold the data about the live connections.
Each client' data is managed by malloc. If we try to access a client
data after the disconnection we'll probably crash with a SIGSEGV and
find the bug is more easy.
Performance-wise the connection phase should be faster since we don't
have to loop anymore to find an empty spot in the clients array, but
some operations could be slightly slower (compare the O(1) access in an
array with a SPLAY_FIND operation -- still be faster than O(n) thought.)
FastCGI is designed to multiplex requests over a single connection, so
ideally the server can open only one connection per worker to the
FastCGI application and that's that.
Doing this kind of multiplexing makes the code harder to follow and
easier to break/leak etc on the gmid side however. OpenBSD' httpd
seems to open one connection per client, so why can't we too?
One connection per request is still way better (lighter) than using
CGI, and we can avoid all the pitfalls of the multiplexing (keeping
track of "live ids", properly shut down etc...)
libevent2 has this concept of "freezeness" of a buffer. It's a way to
avoid accidentally write/remove data from the wrong "edge" of the
buffer. The client_tls_{read,write} functions need to add/drain data
from the opposite edge, hence the need for the unfreeze call.
This is the minimum change in order to work on libevent2 too. Another
way would be to define evbuffer_{un,}freeze as NOP on libevent 1, but
it's ugly IMHO.
This is a big change in how gmid handles I/O. Initially we used a
hand-written loop over poll(2), that then was evolved into something
powered by libevent basic API. This meant that there were a lot of
small "asynchronous" function that did one step, eventually scheduling
the re-execution, that called each others in a chain.
The new implementation revolves completely around libevent'
bufferevents. It's more clear, as everything is implemented around the
client_read and client_write functions.
There is still space for improvements, like adding timeouts for one, but
it's solid enough to be committed as is and then further improved.
This changes the fastcgi implementation from a blocking I/O to an
async implementation on top of libevent' bufferevents.
Should improve the responsiveness of gmid especially when using remote
fastcgi applications.
While computing the parent directory it an out-of-bound access can
occur, which usually means the server process dies.
In particular, it can be triggered by making a request for a
non-existent file in the root of a virtual host if the path matches
the `cgi` pattern.
Thanks cage for helping in debugging!
the whole struct client is already memset'd to 0 in do_accept.
handle_handshake doesn't touch the request or iri buffer in the code
path that leads to handle_open_conn. (It does so in the error router
alone.)
Missing SNI (i.e. servname == NULL) is already handled correctly.
puny_decode refuses to work on NULL servname, c->domain is still the
empty string and everything flows as expected towards the error at the
end. However, it's better to bail out early and make more explicit
how the case of missing SNI is handled.
O_NOFOLLOW acts only on *the last component*, so on
open("/foo/bar/baz") only when baz is a symlink open fails.
Checking every path component is not viable.
gh issue #5 related (sort of)
we need to delete the events associated with the backends, otherwise
the server process won't ever quit.
Here, we add a pending counter to every backend and shut down
immediately if they aren't handling any client; otherwise we try to
close them as soon as possible (i.e. when they close the connection to
the last connected client.)
quoting strncpy(3)
strncpy() only NUL terminates the destination string when the
length of the source string is less than the length parameter.
strlcpy is more intuitive.
this is another warning gcc 8 found that clang didn't.
Yep, fd should be the file descriptor, but for lazyness when manually
calling the function sometimes we supply 0 as fd and event. Instead of
fixing the usage, do as other of such functions do in this
circumstances: use c->fd.
Not production-ready yet, but it's a start.
This adds a third ``backend'' for gmid: until now there it served
local files or CGI scripts, now FastCGI applications too.
FastCGI is meant to be an improvement over CGI: instead of exec'ing a
script for every request, it allows to open a single connection to an
``application'' and send the requests/receive the responses over that
socket using a simple binary protocol.
At the moment gmid supports three different methods of opening a
fastcgi connection:
- local unix sockets, with: fastcgi "/path/to/sock"
- network sockets, with: fastcgi tcp "host" [port]
port defaults to 9000 and can be either a string or a number
- subprocess, with: fastcgi spawn "/path/to/program"
the fastcgi protocol is done over the executed program stdin
of these, the last is only for testing and may be removed in the
future.
P.S.: the fastcgi rule is per-location of course :)
log.o is linked to some regress/ stuff. Calling from there a vhost_*
function means that we should link the regress/stuff to server.o too
(and that would pull in other stuff...). Moving the call is easier,
and also probably better.
Include gmid.h as first header in every file, as it then includes
config.h (that defines _GNU_SOURCE for instance).
Fix also a warning about unsigned vs signed const char pointers in
openssl.
accept4(2) isn't part of any standard (even though it'll be part in
the future) and raises warnings on some linux distro. Moreover, we
don't have thread that may fork at any time, so doing a mark_nonblock
after isn't a big deal.
Before we mmap(2) file for reading, and use a buffer to handle CGI
scripts. Turns out, for sequential access over the whole mmap isn't
better than our loop on read. This has also the additional advantage
that we can use handle_cgi (now handle_copy) for both files and CGI,
which is pretty cool.
This also fixes a nasty bug where we could hang a connection forever,
because we scheduled the wrong type of event (read on POLLOUT and
write on POLLIN, it's the other way around!)
the old server processes would stick around waiting on the signals
events. While there, also drop the `struct server_events' and define
events as globals.
reported by devel at datenbrei dot de. The first location would
overwrite the default value for a server, triggering the "`foo' rule
specified more than once" error. This also needed a small tweak on
how we match locations to avoid breaking other tests.
it's faster (statistically speaking) to first compute if the option is
set and then fnmatch than the inverse. This way we can avoid
unnecessary fnmatch.
keep mark_nonblock in utils.c, as otherwise the build for the regress
suite will fail (mark_nonblock needs fatal which is in gmid.c, and
we can't link gmid.o with the regress suite...)
Now that I got rid of the enum+switch, adding more state is easier.
Before, we used an hack to remember if we had read the CGI reply or
not (c->code = -1).
This introduces a new state, handle_cgi_reply that reads the CGI
script reply, logs it, and only then switches to handle_cgi.
handle_cgi itself is cleaner, now it only reads into c->sbuf and send
what it had red.
We even get, almost for free, the 42 error. If read exists with -1 or
0 from in handle_cgi_reply, we return a proper error to the client.
We can extend this further in the future and also try to validate the
CGI reply (for now we're only looking for a \n).
So we don't re-enter the handle_handsahke and re-do the loop on
fnmatch etc. This way, once we're successfully past the handshake,
we'll re-enter no handle_open_conn.
instead of having a flag to discern between two different behaviours
in S_SENDING, split that state into S_SENDING_FILE and S_SENDING_CGI
(this will also make it easier in the future to add other sending
states). While there, also get rid of `goodbye' and make start_reply
advance the state machine by itself.