30 Commits

Author SHA1 Message Date
Omar Polo 06035a0237 more is*() unsigned char cast
continuation of 6130e0eeac
2022-11-29 23:03:55 +00:00
Omar Polo 97b306cbee add an implicit fastcgi parameter: GEMINI_SEARCH_STRING
It’s the QUERY_STRING decoded if it’s a search-string (i.e. not a
key-value pair).  It spares scripts from percent-decoding the
querystring in the most common case of a query, because key-value
querystrings are not common in Gemini.

Idea from a discussion with Allen Sobot.
2022-11-27 15:35:10 +00:00
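
A minimal sketch of the idea described in 97b306cbee, not gmid's actual code.  The function name search_string, the buffer-based interface, and the rule "a query containing '=' is a key-value pair" are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Value of a hexadecimal digit, or -1 if c is not one. */
static int
hex_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: percent-decode query into out when it looks
 * like a search string.  Returns 0 on success, -1 if the query looks
 * like a key-value pair (assumed here: it contains '='), is
 * malformed, decodes to a NUL byte, or does not fit in out.
 */
int
search_string(const char *query, char *out, size_t outlen)
{
	size_t i = 0;
	int hi, lo;

	if (outlen == 0 || strchr(query, '=') != NULL)
		return -1;

	for (; *query != '\0'; query++) {
		if (i + 1 >= outlen)	/* keep room for the terminator */
			return -1;
		if (*query == '%') {
			if ((hi = hex_value(query[1])) == -1 ||
			    (lo = hex_value(query[2])) == -1)
				return -1;
			if (hi == 0 && lo == 0)
				return -1;
			out[i++] = (char)(hi * 16 + lo);
			query += 2;
		} else
			out[i++] = *query;
	}
	out[i] = '\0';
	return 0;
}
```

With such a helper, a script that receives the query "hello%20world" could read "hello world" directly, while a query such as "k=v" would be left for the script to parse itself.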
Omar Polo 6130e0eeac always cast is*() arguments to unsigned char 2022-11-17 09:21:38 +00:00
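
To illustrate why 6130e0eeac and 06035a0237 add the cast: the is*() functions from <ctype.h> take an int whose value must be representable as an unsigned char (or be EOF).  Where plain char is signed, a byte above 0x7F becomes a negative int and passing it is undefined behaviour.  The function below is a hypothetical example, not taken from gmid.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Count the leading alphabetic bytes of s.  The cast keeps bytes
 * such as 0xE8 in the 0..255 range that isalpha() is defined for.
 */
size_t
count_leading_alpha(const char *s)
{
	size_t n = 0;

	while (s[n] != '\0' && isalpha((unsigned char)s[n]))
		n++;
	return n;
}
```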
Omar Polo a555e0d67b copyright years 2022-07-04 09:48:39 +00:00
Omar Polo f2f8eb35c8 encode file names in the directory index
Spotted the hard way by cage
2022-07-04 09:31:36 +00:00
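
A sketch of what encoding a file name for a directory index link might look like.  This is not gmid's implementation: the name encode_name, the buffer interface, and the choice to leave only the RFC 3986 "unreserved" characters untouched are assumptions for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: percent-encode name into out.  ASCII letters,
 * digits and "-._~" are copied as-is, every other byte becomes %XX.
 * Returns 0 on success, -1 if out is too small.
 */
int
encode_name(const char *name, char *out, size_t outlen)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t i = 0;
	unsigned char c;

	for (; (c = (unsigned char)*name) != '\0'; name++) {
		if (c < 0x80 && (isalnum(c) || strchr("-._~", c) != NULL)) {
			if (i + 1 >= outlen)
				return -1;
			out[i++] = (char)c;
		} else {
			if (i + 3 >= outlen)
				return -1;
			out[i++] = '%';
			out[i++] = hex[c >> 4];
			out[i++] = hex[c & 0x0F];
		}
	}
	if (i >= outlen)
		return -1;
	out[i] = '\0';
	return 0;
}
```

A file called "a b.gmi" would then be linked as "a%20b.gmi", so that a space or a '%' in the name does not break the generated link line.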
Omar Polo 5e41063f1b bugfix: allow @ and : in paths
gmid would disallow the '@' and ':' characters in paths (unless
percent-encoded.)  Issue reported by freezr.
2022-07-04 08:15:39 +00:00
Omar Polo 4842c72d9f fmt 2021-10-18 10:05:55 +00:00
Omar Polo fa0299a26d drop now unused trim_req_iri 2021-10-02 17:20:56 +00:00
Omar Polo e15fc95736 change struct initialization
makes more explicit which fields we're setting.

(and kill an extra empty line)
2021-09-24 08:12:40 +00:00
Omar Polo df0c2926cc use memset(3) rather than bzero(3)
There's no difference, but bzero(3) says

STANDARDS
     The bzero() function conforms to the X/Open System Interfaces option of
     the IEEE Std 1003.1-2004 (“POSIX.1”) specification.  It was removed from
     the standard in IEEE Std 1003.1-2008 (“POSIX.1”), which recommends using
     memset(3) instead.

so here we are.
2021-09-24 08:08:49 +00:00
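
For reference, the two calls are interchangeable when zeroing memory: bzero(p, n) does the same as memset(p, 0, n).  The struct below is a made-up example, not one of gmid's types.

```c
#include <string.h>

/* Hypothetical state struct, used only for this illustration. */
struct conn_state {
	int	fd;
	char	buf[16];
	size_t	len;
};

/* Zero every byte of *s; previously this could have been bzero(s, sizeof(*s)). */
void
reset_state(struct conn_state *s)
{
	memset(s, 0, sizeof(*s));
}
```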
Omar Polo a8a1f43921 style(9)-ify 2021-07-07 09:46:37 +00:00
Omar Polo 80fbf1e934 make sure l is always initialized
I can't think of cases where we reach serialize_iri and path is NULL,
but let's stay on the safe side and initialize l.  gcc 8 found this,
clang didn't.
2021-06-16 15:04:42 +00:00
Omar Polo 9d092b607a fix IRI-parsing bug
Some particularly crafted IRIs can cause a denial of service (DoS).
IRIs which have a trailing `..' segment and resolve to a valid IRI
(i.e. a .. that's not escaping the root directory) will make the
server process loop forever.

This is """just""" a DoS vulnerability, it doesn't expose anything
sensitive or give an attacker anything else.
2021-04-12 20:11:47 +00:00
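
The sketch below shows one way to resolve `.' and `..' segments so that the loop always terminates, including on a trailing `..'.  It is an independent illustration, not the code fixed in 9d092b607a; the name clean_path and the assumption that the path carries no leading slash are mine.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: resolve "." and ".." segments of path in
 * place and drop empty segments.  Returns 0 on success, -1 if a ".."
 * would escape the root.  Every iteration consumes at least one byte
 * of input, so the loop cannot spin forever.
 */
int
clean_path(char *path)
{
	char	*src = path, *dst = path, *next;
	size_t	 len;
	int	 has_slash;

	while (*src != '\0') {
		len = strcspn(src, "/");	/* current segment length */
		has_slash = src[len] == '/';
		next = src + len + has_slash;

		if (len == 0 || (len == 1 && src[0] == '.')) {
			/* empty or "." segment: nothing to copy */
		} else if (len == 2 && src[0] == '.' && src[1] == '.') {
			if (dst == path)
				return -1;	/* escaping the root */
			/* step over the slash, then drop one segment */
			dst--;
			while (dst > path && dst[-1] != '/')
				dst--;
		} else {
			memmove(dst, src, len);
			dst += len;
			if (has_slash)
				*dst++ = '/';
		}
		src = next;
	}
	*dst = '\0';
	return 0;
}
```

Under these assumptions "foo/bar/.." resolves to "foo/", while "../x" is rejected.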
Omar Polo 52418c8d82 fix various compilation errors
Include gmid.h as first header in every file, as it then includes
config.h (that defines _GNU_SOURCE for instance).

Fix also a warning about unsigned vs signed const char pointers in
openssl.
2021-02-12 12:47:20 +00:00
Omar Polo 9f006a2127 [cgi] split the query in words if needed and add them to the argv 2021-02-07 18:55:04 +00:00
Omar Polo 19e7bd00a3 [iri] accept also : and @
again, to be RFC3986 compliant.
2021-02-06 09:33:48 +00:00
Omar Polo 8404ec301f don't %-decode the query 2021-02-05 14:31:53 +00:00
Omar Polo 2fafa2d23e bring the CGI implementation on par with GLV-1.12556 2021-02-01 11:11:43 +00:00
Omar Polo 57d0d0adba ensure iri.host isn't NULL 2021-01-31 11:50:01 +00:00
Omar Polo 117ac52cdd accept a wider range of UNICODE codepoints while parsing hostnames 2021-01-29 17:26:23 +00:00
Omar Polo 9a672b3712 legibility: use p[n] instead of (*(p + n)) 2021-01-28 16:26:49 +00:00
Omar Polo c4f682f855 trim_req_iri: set error string 2021-01-27 15:05:16 +00:00
Omar Polo 42bbdc7978 trim initial forward slashes
this parses gemini://example.com///foo into an IRI whose path is
"foo".  I'm not 100% sure this is standard-compliant but:

1. it seems a logical consequence of the URI/IRI cleaning algo (where
   we drop sequential slashes)
2. practically speaking, serving a file whose path starts with a
   sequence of forward slashes doesn't really make sense, even in the
   case of CGI scripts
2021-01-21 22:48:16 +00:00
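
A tiny sketch of the behaviour described in 42bbdc7978, assuming the path is available as a NUL-terminated string.  The function name is hypothetical and this is not the code from the commit.

```c
/*
 * Hypothetical helper: return a pointer past any leading '/' in
 * path, so that "///foo" yields "foo".
 */
const char *
skip_leading_slashes(const char *path)
{
	while (*path == '/')
		path++;
	return path;
}
```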
Omar Polo 881dc835d0 wording 2021-01-16 20:14:02 +00:00
Omar Polo b777bf4b2b check also that the port number matches 2021-01-15 18:24:24 +00:00
Omar Polo f7b816dc39 style 2021-01-15 15:21:51 +00:00
Omar Polo e4d82becb7 normalize host name when parsing the IRI
RFC3986 3.2.2 "Host" says that

> Although host is case-insensitive, producers and normalizers should
> use lowercase for registered names and hexadecimal addresses for the
> sake of uniformity, while only using uppercase letters for
> percent-encodings.

so we cope with that.
2021-01-15 09:27:42 +00:00
Omar Polo de428fff65 normalize scheme when parsing the IRI
RFC3986 in section 3.1 "Scheme" says that

> Although schemes are case-insensitive, the canonical form is
> lowercase and documents that specify schemes must do so with
> lowercase letters.  An implementation should accept uppercase
> letters as equivalent to lowercase in scheme names (e.g., allow
> "HTTP" as well as "http") for the sake of robustness but should only
> produce lowercase scheme names for consistency.

so we cope with that.  The other possibility would have been to use
strcasecmp instead of strcmp when checking the protocol, but since
the "case" version, although popular, is not part of any standard
AFAIK, I prefer to downcase while parsing and be done with it.
2021-01-13 19:00:53 +00:00
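
A sketch of the "downcase while parsing" approach, using hypothetical function names rather than gmid's own.  For context, strcasecmp is specified by POSIX (in <strings.h>) but is not part of ISO C, so lowercasing once and then using strcmp needs only the standard C library.

```c
#include <ctype.h>
#include <string.h>

/* Lowercase s in place; the cast keeps tolower() well defined. */
void
lowercase_ascii(char *s)
{
	for (; *s != '\0'; s++)
		*s = (char)tolower((unsigned char)*s);
}

/*
 * Hypothetical check: normalize the scheme once, then compare it
 * with a plain strcmp.  Returns 1 if the scheme is "gemini".
 */
int
scheme_is_gemini(char *scheme)
{
	lowercase_ascii(scheme);
	return strcmp(scheme, "gemini") == 0;
}
```

The same lowercase_ascii helper could be applied to the host component, which RFC 3986 also treats as case-insensitive.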
Omar Polo 6a9ae70773 remove infinite loop 2021-01-11 14:26:43 +00:00
Omar Polo 3c1cf9d07c s/uri/iri since we accept IRIs 2021-01-11 13:08:00 +00:00