normalize host name when parsing the IRI

RFC3986 3.2.2 "Host" says that

> Although host is case-insensitive, producers and normalizers should
> use lowercase for registered names and hexadecimal addresses for the
> sake of uniformity, while only using uppercase letters for
> percent-encodings.

so lowercase the host name while parsing the authority.
Omar Polo 2021-01-15 09:27:42 +00:00
parent 1590277007
commit e4d82becb7
2 changed files with 9 additions and 1 deletions

iri.c

@@ -146,8 +146,12 @@ parse_authority(struct parser *p)
 	while (unreserved(*p->iri)
 	    || sub_delimiters(*p->iri)
-	    || parse_pct_encoded(p))
+	    || parse_pct_encoded(p)) {
+		/* normalize the host name. */
+		if (*p->iri < 0x7F)
+			*p->iri = tolower(*p->iri);
 		p->iri++;
+	}
 	if (p->err != NULL)
 		return 0;


@@ -123,6 +123,10 @@ main(void)
 	    FAIL,
 	    empty,
 	    "FAIL with invalid port number");
+	TEST("gemini://OmArPoLo.CoM",
+	    PASS,
+	    IRI("gemini", "omarpolo.com", "", "", "", ""),
+	    "host is case-insensitive");
 	/* path */
 	TEST("gemini://omarpolo.com/foo/bar/baz",