Commit Graph

184 Commits

Author SHA1 Message Date
Omar Polo 18bd83915e sort the MIME mappings and do a binary search to match 2022-04-08 15:14:09 +00:00
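A minimal sketch of the sort-then-binary-search idea named in this commit, using the standard qsort(3) and bsearch(3). The struct and function names are assumptions made for illustration; gmid's actual types and helpers may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical mapping type; gmid's real struct may differ. */
struct mime_entry {
	const char	*ext;	/* file extension, without the dot */
	const char	*mime;	/* MIME type for that extension */
};

/* Order two entries by extension; shared by qsort and bsearch. */
int
mime_cmp(const void *a, const void *b)
{
	const struct mime_entry *x = a, *y = b;

	return strcmp(x->ext, y->ext);
}

/* Sort the table once, after every mapping has been added. */
void
sort_mimes(struct mime_entry *tab, size_t len)
{
	qsort(tab, len, sizeof(tab[0]), mime_cmp);
}

/* Binary search in a sorted table; returns NULL for unknown extensions. */
const char *
lookup_mime(const struct mime_entry *tab, size_t len, const char *ext)
{
	struct mime_entry key = { ext, NULL };
	const struct mime_entry *hit;

	hit = bsearch(&key, tab, len, sizeof(tab[0]), mime_cmp);
	return hit != NULL ? hit->mime : NULL;
}
```

The table has to be sorted with the same comparison function that bsearch uses, otherwise lookups can miss entries that are present.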
Omar Polo 54203115cd don't load the built-in list when using `types' 2022-04-08 13:52:35 +00:00
Omar Polo d8d170aa5e allow add_mime to fail
add_mime now dynamically allocates copies of the passed strings, so
that we can actually free what we parse from the config file.

This matters a lot especially with lengthy `types' block: strings that
reach the internal mapping are never free'd, so every manual addition
is leaked.
2022-04-08 13:44:49 +00:00
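The ownership change described above could look roughly like the following sketch: the table keeps its own heap copies of both strings, reports allocation failures to the caller, and can release everything later. All names here (struct mapping, add_mapping, free_mappings) are hypothetical and not taken from gmid.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; gmid's own structs may differ. */
struct mapping {
	char	*mime;
	char	*ext;
};

struct mime_table {
	struct mapping	*t;
	size_t		 len;
	size_t		 cap;
};

/* Heap-allocate a copy of s; NULL on allocation failure. */
static char *
copy_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p != NULL)
		memcpy(p, s, n);
	return p;
}

/* Append a mapping, copying both strings.  Returns 0, or -1 on failure. */
int
add_mapping(struct mime_table *m, const char *mime, const char *ext)
{
	char *mcopy, *ecopy;

	if (m->len == m->cap) {
		size_t ncap = m->cap == 0 ? 16 : m->cap * 2;
		struct mapping *t = realloc(m->t, ncap * sizeof(*t));

		if (t == NULL)
			return -1;
		m->t = t;
		m->cap = ncap;
	}

	if ((mcopy = copy_string(mime)) == NULL)
		return -1;
	if ((ecopy = copy_string(ext)) == NULL) {
		free(mcopy);
		return -1;
	}

	m->t[m->len].mime = mcopy;
	m->t[m->len].ext = ecopy;
	m->len++;
	return 0;
}

/* Release every copied string and the table itself. */
void
free_mappings(struct mime_table *m)
{
	for (size_t i = 0; i < m->len; i++) {
		free(m->t[i].mime);
		free(m->t[i].ext);
	}
	free(m->t);
	m->t = NULL;
	m->len = m->cap = 0;
}
```

Because the table owns its copies, the caller (for example a config parser) remains free to release its own strings right after the call.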
Omar Polo ea27eaaa83 fix an out-of-bound access in start_cgi
A long time ago, client->req was a static buffer so the memcpy was safe.
However, it has since been moved to a dynamically allocated string, so
it's very often smaller than sizeof(req.buf) (1024), hence the out of
bound access which results in a SIGSEGV very often on OpenBSD thanks to
Otto's malloc.

The situation with the iri parser, client->req and how the request is
forwarded to the other process needs to be improved: this is just a fix
to address the issue quickly, a better one would be to restructure the
iri parser APIs and rethink how the info is forwarded to the ex process.
2022-03-27 12:52:59 +00:00
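One way to avoid this class of bug, sketched under the assumption that the request is a NUL-terminated string: copy only strlen(src) bytes instead of the full size of the destination buffer. The struct and function names are hypothetical and this is not the actual patch.

```c
#include <string.h>

#define REQ_BUF_LEN 1024

/* Hypothetical stand-in for the message sent to the executor process. */
struct cgi_req {
	char	buf[REQ_BUF_LEN];
};

/*
 * Copy a NUL-terminated request into the fixed-size buffer, reading
 * only as many bytes as the source string really has.
 * Returns 0 on success, -1 if the request does not fit.
 */
int
fill_req(struct cgi_req *req, const char *src)
{
	size_t len = strlen(src);

	if (len >= sizeof(req->buf))
		return -1;

	/* Zero the whole buffer so no stale bytes cross the process boundary. */
	memset(req->buf, 0, sizeof(req->buf));
	memcpy(req->buf, src, len);
	return 0;
}
```

A memcpy of sizeof(req->buf) bytes from a shorter heap string reads past the end of the allocation, which is the out-of-bound access the commit message describes.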
Omar Polo 3fdc457c8d swap try_client_by_id with client_by_id
i.e. allow client_by_id to fail and return NULL.

Initially I thought it was a good idea to shut down a server process
if we receive an invalid client id as reply from one of our requests
to the executor process.  This turned out not to be correct since a
client can (read: will) disconnect in the delay between our acknowledging
their request and the cgi script execution.

The fastcgi and proxy handler already handled this situation, so
they're unaffected.

The old behaviour allowed an attacker to make gmid unresponsive by just
making enough requests until they hit the right timing.
2022-03-26 11:32:26 +00:00
Omar Polo 1cdea97b6c allow using a custom hostname for SNI during proxying
add a `sni' option for the `proxy' block: the given name is used instead
of the one extracted by the `relay-to' rule.
2022-01-30 10:14:44 +00:00
Omar Polo e0f6dc646d improve proxy error path
properly release everything during client_close if the request
was managed by a proxy.
2022-01-27 09:55:52 +00:00
Omar Polo ba94a608a8 add `require client ca' for proxy blocks
refactor the code that calls validate_against_ca into a helper
function to reuse it in both apply_require_ca and (optionally) in
apply_reverse_proxy.
2022-01-04 23:14:34 +00:00
Omar Polo b7967bc1f6 proxy: allow multiple proxy blocks, matching options and validations
as a side effect the order of the content of a server block is relaxed:
options, location or proxy blocks can be put in any order.
2022-01-02 16:33:28 +00:00
Omar Polo 593e412b49 allow to disable TLS when proxying requests 2022-01-01 20:16:14 +00:00
Omar Polo c7c8ef448b add `protocols' option to `proxy' rule 2022-01-01 19:04:50 +00:00
Omar Polo 5128c0b0e3 add `verifyname' option for `proxy' rule 2022-01-01 18:50:10 +00:00
Omar Polo 7bdcc91ec7 simplify the proxying code
it doesn't make any sense to keep the proxying info per-location:
there's only one proxy per vhost.  It can't work differently, it doesn't
make sense anyway.
2022-01-01 17:08:39 +00:00
Omar Polo bd5f79542c update copyright years 2022-01-01 16:36:17 +00:00
Omar Polo d49093c105 support optional client certificate for proxy rule 2022-01-01 16:33:44 +00:00
Omar Polo 72b033ef18 add ability to proxy requests
Add to gmid the ability to forward a request to another gemini server and
thus act like a reverse proxy.  The current syntax for the config
file is

	server "example.com" {
		...
		proxy relay-to host:port
	}

Further options (like the use of custom certificates) are planned.

cf. github issue #7
2021-12-29 20:36:54 +00:00
Omar Polo 054387bb26 move struct iri and parser at the top of the file 2021-12-29 19:24:33 +00:00
Stephen Gregoratto ff05125eb8 Implement OCSP stapling support
Currently dogfooding this patch at gemini.sgregoratto.me. To test,
run the following command and look for the "OCSP response" header:

 openssl s_client -connect "gemini.sgregoratto.me:1965" -status
2021-10-15 09:58:23 +02:00
Omar Polo f0a01fc742 two -n to dump the parsed configuration
This adds a barebone dumping of the parsed configuration.  It is not
complete, but I'm interested in dumping the full path to `cert' and
`key' in order to write some scripts that can inspect the
configuration, extract the certificates and renew them when expired
automatically.

It's not easy to parse gmid configuration otherwise because the syntax
is flexible and users can use macros.  Instead, the idea is to run
gmid and let it dump the configuration once it's been parsed in a
static and predictable format.

Now it's possible to parse gmid configuration with, say, awk or perl.
2021-10-09 21:40:55 +00:00
Omar Polo 492a274fd7 add compat for sys/tree.h 2021-10-07 11:36:25 +00:00
Omar Polo 207b3e80d8 Store clients inside a splay tree
From day one we've been using a static array of client struct to hold
the clients data.  This has various drawbacks, among which:

 * reuse of the storage  ("shades of heartbleed")
 * maximum fixed amount of clients connected at the same time
 * bugs are harder to debug

The last point in particular is important because if we mess the client
ids, or try to execute some functions (e.g. the various fcgi_*) after a
client has been disconnected, it's harder to "see" this "use after
free"-tier kind of bug.

Now I'm using a splay tree to hold the data about the live connections.
Each client's data is managed by malloc.  If we try to access a client's
data after the disconnection we'll probably crash with a SIGSEGV and
finding the bug is easier.

Performance-wise the connection phase should be faster since we don't
have to loop anymore to find an empty spot in the clients array, but
some operations could be slightly slower (compare the O(1) access in an
array with a SPLAY_FIND operation -- still faster than O(n) though.)
2021-10-07 11:20:34 +00:00
Omar Polo 4cd2520965 one FastCGI connection per client
FastCGI is designed to multiplex requests over a single connection, so
ideally the server can open only one connection per worker to the
FastCGI application and that's that.

Doing this kind of multiplexing makes the code harder to follow and
easier to break/leak etc on the gmid side however.  OpenBSD's httpd
seems to open one connection per client, so why can't we too?

One connection per request is still way better (lighter) than using
CGI, and we can avoid all the pitfalls of the multiplexing (keeping
track of "live ids", properly shut down etc...)
2021-10-07 10:47:02 +00:00
Omar Polo efb6210d77 improve libevent2 handling
* add configure check
* change the way the headers are required (copied from tmux)
2021-10-02 17:20:56 +00:00
Omar Polo fa0299a26d drop now unused trim_req_iri 2021-10-02 17:20:56 +00:00
Omar Polo efe7d18029 new I/O handling on top of bufferevents
This is a big change in how gmid handles I/O.  Initially we used a
hand-written loop over poll(2), which then evolved into something
powered by libevent's basic API.  This meant that there were a lot of
small "asynchronous" functions that did one step, eventually scheduling
the re-execution, and that called each other in a chain.

The new implementation revolves completely around libevent's
bufferevents.  It's clearer, as everything is implemented around the
client_read and client_write functions.

There is still space for improvements, like adding timeouts for one, but
it's solid enough to be committed as is and then further improved.
2021-10-02 17:20:56 +00:00
Omar Polo 741b69be96 fastcgi completely asynchronous
This changes the fastcgi implementation from a blocking I/O to an
async implementation on top of libevent's bufferevents.

Should improve the responsiveness of gmid especially when using remote
fastcgi applications.
2021-09-26 17:00:07 +00:00
Omar Polo 41395640e0 introduce IMSG_LOG_REQUEST 2021-07-19 07:31:40 +00:00
Omar Polo f361f79979 move version number to configure script 2021-07-10 09:19:44 +00:00
Omar Polo 391825e360 move parse_portno to gmid.c
it's used only to parse the -p flag.  While there add check_port_num
to check the range for the port.
2021-07-09 13:06:58 +00:00
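A hedged sketch of parsing a port number and checking its range, written with the portable strtol(3). The function name parse_port is hypothetical; it is not gmid's parse_portno or check_port_num, whose implementation may differ.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a decimal TCP port.  Returns the port (1..65535) on success,
 * or -1 if the string is empty, has trailing garbage or is out of range.
 */
int
parse_port(const char *s)
{
	char *end;
	long n;

	errno = 0;
	n = strtol(s, &end, 10);
	if (errno != 0 || end == s || *end != '\0')
		return -1;
	if (n < 1 || n > 65535)
		return -1;
	return (int)n;
}
```

Checking both `end == s` and `*end != '\0'` rejects the empty string as well as inputs such as "19x" that only start with digits.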
Omar Polo a8a1f43921 style(9)-ify 2021-07-07 09:46:37 +00:00
Omar Polo 090b8a89fa gracefully shut down fastcgi backends
we need to delete the events associated with the backends, otherwise
the server process won't ever quit.

Here, we add a pending counter to every backend and shut down
immediately if they aren't handling any client; otherwise we try to
close them as soon as possible (i.e. when they close the connection to
the last connected client.)
2021-07-06 10:54:27 +00:00
Omar Polo 3b21cca385 allow to define macros in the config file
Macros can be defined at the top of the configuration file:

	dir = "/var/gemini"
	cert = "/etc/keys"

and re-used later, for example

	server "foo" {
		root "$dir/foo"       # -> /var/gemini/foo
		cert "$cert/foo.pem" # -> /etc/keys/foo.pem
	}
2021-06-29 12:17:40 +00:00
Omar Polo fdb43a4c0e define GMID_STRING and reuse it
GMID_VERSION follows the CGI/FastCGI style, i.e. project_name/version.

Define GMID_STRING with a more "human" variant "project_name version",
and reuse that in the --help and --version codepath.
2021-06-29 10:56:07 +00:00
Omar Polo 74f0778b9a drop the dependency on lex by implementing yylex by ourselves
The actual implementation is based off doas' parse.y.  This gave us
various benefits, like cleaner code, \ to break long lines, better
handling of quotes etc...
2021-06-16 14:43:16 +00:00
Omar Polo e952c5052a allow sending fd to log on to the logger process
the logger process now can receive a file descriptor to write logs
to.  At the moment the logic is simple, if it receives a file it logs
there, otherwise it logs to syslog.  This will allow logging to custom
log files.
2021-06-15 08:06:10 +00:00
Omar Polo f740b61b03 more params from and send a custom list 2021-06-11 15:48:56 +00:00
Omar Polo ce2c9edbc2 define and use GMID_VERSION 2021-05-15 11:11:56 +00:00
Omar Polo 1feaf2a618 use the correct document root
pass the correct loc_off to the executor, so the various variables
that depend on the matched location (like DOCUMENT_ROOT) are computed
correctly.
2021-05-15 10:31:43 +00:00
Omar Polo 23a5aab4ce drop forward declaration of struct client
it's no longer needed since the switch to libevent.
2021-05-11 19:41:16 +00:00
Omar Polo 8ad1c57024 fastcgi: a first implementation
Not production-ready yet, but it's a start.

This adds a third ``backend'' for gmid: until now it served
local files or CGI scripts, now FastCGI applications too.

FastCGI is meant to be an improvement over CGI: instead of exec'ing a
script for every request, it allows to open a single connection to an
``application'' and send the requests/receive the responses over that
socket using a simple binary protocol.

At the moment gmid supports three different methods of opening a
fastcgi connection:

 - local unix sockets, with: fastcgi "/path/to/sock"
 - network sockets, with: fastcgi tcp "host" [port]
   port defaults to 9000 and can be either a string or a number
 - subprocess, with: fastcgi spawn "/path/to/program"
   the fastcgi protocol is done over the executed program stdin

of these, the last is only for testing and may be removed in the
future.

P.S.: the fastcgi rule is per-location of course :)
2021-05-09 18:23:36 +00:00
Omar Polo 50310aff33 added missing prototype 2021-05-04 18:19:47 +00:00
Omar Polo fdea6aa0bc allow ``root'' rule to be specified per-location block 2021-04-30 17:16:34 +00:00
Omar Polo cc8c2901ad added ``alias'' option to define hostname aliases for a server 2021-04-29 18:23:35 +00:00
Omar Polo 9cc630aa63 added ``env'' option to define environment vars for CGI scripts 2021-04-28 12:43:17 +00:00
Omar Polo 11c986679a sort the auto index alphabetically 2021-04-25 12:06:54 +00:00
Omar Polo 89541eeec0 define TLS_VERSION, TLS_CIPHER and TLS_CIPHER_STRENGTH for CGI scripts 2021-04-13 06:59:54 +00:00
Omar Polo b8e64ccd44 list instead of fixed-size array for vhosts and locations
saves some bytes of memory and removes the limit on the maximum number
of vhosts and location blocks.
2021-03-31 16:32:18 +00:00
Omar Polo 7e1df73d7d fix mkdirs: create directories recursively 2021-03-31 13:56:58 +00:00
Omar Polo c836cdfadb handle CGI scripts that reply with the maximum header length allowed
the 1024 bytes limit is for the META only, not for the whole
response.  That means that the maximum size for the header line is
1029!
2021-03-29 09:42:06 +00:00
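The 1029 figure follows from the Gemini response header layout, `<STATUS><SPACE><META><CR><LF>`: two status digits, one space, up to 1024 bytes of META, and the two-byte CRLF. The constant and function names in this sketch are made up for illustration.

```c
#include <stddef.h>

enum {
	STATUS_LEN	= 2,	/* two-digit status code */
	SPACE_LEN	= 1,	/* single space separator */
	META_MAX	= 1024,	/* limit that applies to META only */
	CRLF_LEN	= 2,	/* "\r\n" terminator */
	HEADER_MAX	= STATUS_LEN + SPACE_LEN + META_MAX + CRLF_LEN
};

/* Total length of a header line whose META is meta_len bytes long. */
size_t
header_len(size_t meta_len)
{
	return STATUS_LEN + SPACE_LEN + meta_len + CRLF_LEN;
}
```

A buffer meant to hold a complete header line therefore needs at least HEADER_MAX bytes, plus one more if it is to be NUL-terminated.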
Omar Polo 62e001b067 move all sandbox-related code to sandbox.c
while there, add capsicum for the logger process
2021-03-20 08:42:08 +00:00
Omar Polo bc99d868bc refactoring: imsg everywhere
use imsg to handle ALL kinds of IPC in gmid.  This simplifies and shortens the
code, and makes everything more uniform too.
2021-03-19 19:21:29 +00:00
Omar Polo 2c3e53dac6 give each server process its own socket for the executor
this fixes a bug introduced with the prefork mechanics: every server
process shared the same socket, and this would cause a race condition
when multiple server processes asked for a CGI script to be executed.

This gives each server process its own socket to talk to the executor,
so the race cannot happen.
2021-03-03 17:22:01 +00:00
Omar Polo 376a540764 move log_init & vars to gmid.c, retain logger_main in log.c
this is to let the regression suite compile
2021-02-23 13:44:20 +01:00
Omar Polo 793835cb26 add `log on/off' to enable/disable logs per-location 2021-02-23 13:43:24 +01:00
Omar Polo d278a0c3c5 moving logging to its own process 2021-02-23 13:40:59 +01:00
Omar Polo 52418c8d82 fix various compilation errors
Include gmid.h as first header in every file, as it then includes
config.h (that defines _GNU_SOURCE for instance).

Fix also a warning about unsigned vs signed const char pointers in
openssl.
2021-02-12 12:47:20 +00:00
Omar Polo 9356f61a63 BUFSIZ is variable, we need *exactly* 1024
Using BUFSIZ in sbuf is not OK.  It's variable, and in various places
we assume that sbuf is 1024 (like handle_cgi_reply).  We could patch
those, but we aren't sure BUFSIZ is >= 1024!  Let's keep the hardcoded
number.

(found by debugging on arch on amd64, where BUFSIZ is bigger)
2021-02-12 11:56:13 +00:00
Omar Polo 27b2fa9ae5 don't mmap
Before, we mmap(2)'d files for reading and used a buffer to handle CGI
scripts.  Turns out, for sequential access over the whole file, mmap isn't
better than our loop on read.  This has also the additional advantage
that we can use handle_cgi (now handle_copy) for both files and CGI,
which is pretty cool.

This also fixes a nasty bug where we could hang a connection forever,
because we scheduled the wrong type of event (read on POLLOUT and
write on POLLIN, it's the other way around!)
2021-02-12 11:27:33 +00:00
Omar Polo 02be96c6dd add `require client ca' rule to require certs signed by a CA 2021-02-09 22:30:04 +00:00
Omar Polo d090dc8491 define config_path as global variable 2021-02-08 12:44:34 +00:00
Omar Polo abc007d2b3 rewrite main loop using libevent 2021-02-08 10:01:45 +00:00
Omar Polo b63e30ff44 define TLS_CLIENT_NOT_BEFORE/NOT_AFTER in CGI scripts 2021-02-07 21:47:01 +00:00
Omar Polo 9f006a2127 [cgi] split the query in words if needed and add them to the argv 2021-02-07 18:55:04 +00:00
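The CGI specification (RFC 3875, section 4.4) lets a server pass the query string as command-line words when it contains no unencoded `=`, splitting it on `+`. The sketch below shows only that splitting step; the function name is hypothetical, percent-decoding of each word is deliberately left out, and gmid's implementation may differ.

```c
#include <string.h>

/*
 * Split query in place on '+' and store up to max word pointers in argv.
 * Returns the number of words stored, or 0 if the query is empty or
 * contains '=' (in which case it must not be split).  Words beyond max
 * are dropped.  Percent-decoding is left to the caller.
 */
size_t
split_query(char *query, char **argv, size_t max)
{
	size_t n = 0;
	char *p = query;

	if (*query == '\0' || strchr(query, '=') != NULL)
		return 0;

	while (n < max) {
		argv[n++] = p;
		if ((p = strchr(p, '+')) == NULL)
			break;
		*p++ = '\0';	/* terminate this word, move to the next */
	}
	return n;
}
```

The query buffer is modified, so callers that still need the original string should pass a copy.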
Omar Polo 3abf91b0b4 improve logs management 2021-02-07 15:30:28 +00:00
Omar Polo a709ddf5eb added prefork option 2021-02-07 12:05:32 +00:00
Omar Polo e3ddf39095 add the ``entrypoint'' option 2021-02-06 18:28:43 +00:00
Omar Polo 6abda252e9 added ``block return'' and ``strip'' options 2021-02-06 17:22:37 +00:00
Omar Polo ca21e10043 reload configuration on SIGHUP 2021-02-04 13:23:15 +00:00
Omar Polo 4e2e2ab1d3 refactor executor_main
now it's symmetrical to listener_main().
2021-02-03 16:37:53 +00:00
Omar Polo d672b8fba1 refactoring startup logic 2021-02-03 16:28:00 +00:00
Omar Polo 9b8f5ed2c0 revert commit 346f28eeaa
keep mark_nonblock in utils.c, as otherwise the build for the regress
suite will fail (mark_nonblock needs fatal which is in gmid.c, and
we can't link gmid.o with the regress suite...)
2021-02-03 14:16:39 +00:00
Omar Polo 346f28eeaa move mark_nonblock to utils.c 2021-02-02 23:03:33 +00:00
Omar Polo fe40638928 mark various functions as static
By marking all those functions as static, the compiler is free to do
more optimizations.  In addition, those functions are not used outside
server.c
2021-02-02 23:01:09 +00:00
Omar Polo 5f715ce43f print the header in the directory listing 2021-02-02 09:48:32 +00:00
Omar Polo 35744950aa simplify handle_cgi
Now that I got rid of the enum+switch, adding more state is easier.
Before, we used a hack to remember if we had read the CGI reply or
not (c->code = -1).

This introduces a new state, handle_cgi_reply that reads the CGI
script reply, logs it, and only then switches to handle_cgi.
handle_cgi itself is cleaner, now it only reads into c->sbuf and sends
what it has read.

We even get, almost for free, the 42 error.  If read returns -1 or
0 in handle_cgi_reply, we return a proper error to the client.
We can extend this further in the future and also try to validate the
CGI reply (for now we're only looking for a \n).
2021-02-01 22:04:51 +00:00
Omar Polo 92da82858b document the DFA 2021-02-01 20:24:14 +00:00
Omar Polo 112802ea31 client state machine: function pointers instead of enum+switch 2021-02-01 20:00:33 +00:00
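As a rough illustration of replacing an enum+switch with function pointers, each state can be a function that does its work and then stores the next state in the client. The state names and fields below are invented for this sketch and are not gmid's.

```c
#include <stddef.h>

struct client;
typedef void (*statefn)(struct client *);

/* Hypothetical client: the current state is just a function pointer. */
struct client {
	statefn	 state;	/* next step to run; NULL once finished */
	int	 steps;	/* number of states executed, for illustration */
};

void	handle_request(struct client *);
void	handle_reply(struct client *);
void	close_conn(struct client *);

void
handle_request(struct client *c)
{
	c->steps++;
	c->state = handle_reply;	/* advance the machine */
}

void
handle_reply(struct client *c)
{
	c->steps++;
	c->state = close_conn;
}

void
close_conn(struct client *c)
{
	c->steps++;
	c->state = NULL;		/* terminal state */
}

/* Drive the machine until it reaches the terminal state. */
void
run_client(struct client *c)
{
	while (c->state != NULL)
		c->state(c);
}
```

Adding a state then means writing one more function and pointing an existing state at it, with no central switch to update. In an event-driven server each state would normally return to the event loop rather than run in a tight loop like run_client does here.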
Omar Polo 2fafa2d23e bring the CGI implementation on par with GLV-1.12556 2021-02-01 11:11:43 +00:00
Omar Polo 6016a593a3 invert the location precedence: first match wins
It's how httpd(8) does it, and it allows us to call fnmatch fewer times
2021-01-30 12:04:20 +00:00
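A small sketch of "first match wins" using POSIX fnmatch(3): the loop returns at the first pattern that matches, so later patterns are never tested. The struct layout and function name are assumptions for illustration only.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical location block: a glob pattern plus one setting. */
struct location {
	const char	*match;
	const char	*root;
};

/* Return the first location whose pattern matches path, or NULL. */
const struct location *
find_location(const struct location *locs, size_t len, const char *path)
{
	for (size_t i = 0; i < len; i++) {
		if (fnmatch(locs[i].match, path, 0) == 0)
			return &locs[i];	/* stop at the first hit */
	}
	return NULL;
}
```

With this rule the order of the blocks matters: a catch-all pattern such as `*` placed first shadows every pattern after it.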
Omar Polo a2fd801327 puny_decode: set an error string 2021-01-29 17:11:03 +00:00
Omar Polo 35cf19e3f3 punycode: refactoring
use the correct term "label" instead of "component" when speaking
about the parts of a hostname.  Also, fix the sizes for max hostname
and label.
2021-01-28 16:25:59 +00:00
Omar Polo 473e4531d5 delete unused constant 2021-01-28 16:24:48 +00:00
Omar Polo ef0f2ae88c remove unused function 2021-01-28 14:21:57 +00:00
Omar Polo 13ed2fb63f improve configuration parsing error 2021-01-27 16:45:45 +00:00
Omar Polo fe5967cd02 const-ify strings in struct location 2021-01-27 15:53:30 +00:00
Omar Polo 44ee1bac8b use starts_with in puny.c 2021-01-27 15:35:09 +00:00
Omar Polo c4f682f855 trim_req_iri: set error string 2021-01-27 15:05:16 +00:00
Omar Polo 46af8c6cfe drop the daemon config in favour of the -f flag
Now it daemonizes by default when running with a config, unless the -f
flag is given.
2021-01-27 12:04:37 +00:00
Omar Polo 7957cbd9aa const-ify puny_decode (and add puny.c) 2021-01-27 11:21:23 +00:00
Omar Polo 3300cbe06a initial punycode support 2021-01-27 10:47:49 +00:00
Omar Polo 8443bff77a rework the configless mode: change flags and generate certs 2021-01-25 14:08:31 +00:00
Omar Polo ae08ec7da5 chroot & drop privileges 2021-01-25 10:30:07 +00:00
Omar Polo 252908e6bb added support for location blocks 2021-01-24 18:53:26 +00:00
Omar Polo c8b7433918 added support for location blocks 2021-01-24 14:11:40 +00:00
Omar Polo 07b0a14218 void-ify some functions
their return value is no longer used, it's only confusing at this point.
2021-01-24 09:54:44 +00:00
Omar Polo a87f662565 refactoring state management
instead of having a flag to discern between two different behaviours
in S_SENDING, split that state into S_SENDING_FILE and S_SENDING_CGI
(this will also make it easier in the future to add other sending
states).  While there, also get rid of `goodbye' and make start_reply
advance the state machine by itself.
2021-01-24 09:49:09 +00:00
Omar Polo e7a2a99b5a added index option 2021-01-24 09:14:01 +00:00
Omar Polo f890c8c54d use a helper to handle no-body replies 2021-01-22 13:58:54 +00:00
Omar Polo 36162ed86d rename goodbye -> close_conn 2021-01-22 13:45:27 +00:00
Omar Polo f77a8c867e add ends_with 2021-01-21 22:45:49 +00:00
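A helper with this name could be written as in the sketch below. The signature is an assumption, since the commit shows only the function's name.

```c
#include <string.h>

/* Return 1 if str ends with suffix, 0 otherwise. */
int
ends_with(const char *str, const char *suffix)
{
	size_t slen = strlen(str);
	size_t xlen = strlen(suffix);

	if (xlen > slen)
		return 0;
	return memcmp(str + slen - xlen, suffix, xlen) == 0;
}
```

Comparing lengths first keeps the pointer arithmetic inside the string when the suffix is longer than the input.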